Releases · NVIDIA/spark-rapids-tools

14 Jan 03:32

github-actions

v24.12.1

4c898ee

v24.12.1 Latest


Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.1/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.1/
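
The package links follow the same pattern in every release on this page, so they can be derived from the release tag. The snippet below is a minimal sketch of that pattern; the function name `release_urls` is hypothetical and is not part of spark-rapids-tools, and the two base URLs are copied from the links listed here.

```python
# Base URLs copied from the "Packages" links on this page.
MAVEN_BASE = "https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12"
PYPI_BASE = "https://pypi.org/project/spark-rapids-user-tools"


def release_urls(tag: str) -> dict:
    """Hypothetical helper: map a release tag such as 'v24.12.1' to its package URLs."""
    # Release tags carry a leading 'v'; the package URLs do not.
    version = tag[1:] if tag.startswith("v") else tag
    return {
        "maven": f"{MAVEN_BASE}/{version}/",
        "pypi": f"{PYPI_BASE}/{version}/",
    }


urls = release_urls("v24.12.1")
print(urls["maven"])
print(urls["pypi"])
```

The helper only formats strings; it does not check that a given version was actually published.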

Changes

User Tools

Add compute_precision_recall utility function (#1500)
Fix additional FutureWarning issues (#1499)
Qualx model updates from weekly KPI run 2025-01-10 (#1495)
Fix future warnings for pandas>=2.2 (#1494)
Pin scikit-learn dependency for shap (#1491)
Make spill heuristic 1 TB by default (#1488)
Support Python 3.9-3.12 (#1486)
Update models for latest code/datasets (#1485)
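
For the "Support Python 3.9-3.12 (#1486)" entry above, an environment can be checked against that range before installing. This is a minimal sketch that assumes the range is inclusive and applies to the major.minor version only; `is_supported_python` is a hypothetical helper, not a function shipped with the tools.

```python
import sys

# Range taken from the release note "Support Python 3.9-3.12 (#1486)".
# Treating both ends as inclusive is an assumption.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 12)


def is_supported_python(version_info=None) -> bool:
    """Hypothetical helper: True if major.minor falls within 3.9-3.12."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch releases are ignored.
    major, minor = version_info[0], version_info[1]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


print("Current interpreter supported:", is_supported_python())
```

Passing an explicit tuple, for example `is_supported_python((3, 8))`, checks a version other than the running interpreter.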

Core

Improve scalastyle rules to detect spaces (#1493)
Improve shuffle manager recommendation in AutoTuner with version validation (#1483)
Support group-limit optimization for ROW_NUMBER in Qualification (#1487)
Bump minimum Spark version to 3.2.0 and improve AutoTuner unit tests for multiple Spark versions (#1482)
Fix inconsistent shuffle write time sum results in Profiler output (#1450)
Refine Qualification AutoTuner recommendations for shuffle partitions for CPU event logs (#1479)
Split AutoTuner for Profiling and Qualification and Override BATCH_SIZE_BYTES (#1471)


20 Dec 20:23

github-actions

v24.12.0

7bb5cdd

v24.12.0

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.0/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.0/

Changes

Core

Skip processing apps with invalid platform and spark runtime configurations (#1421)
Improve implementation of finding median in StatisticsMetrics (#1474)
Optimize implementation of getAggregateRawMetrics in core-tools (#1468)
Adding Spark 3.5.2 support in auto tuner for EMR (#1466)
Mark RunningWindowFunction as supported in Qual tool (#1465)
Deduplicate calls to aggregateSparkMetricsBySql (#1464)

Miscellaneous

Follow Up: Make '--platform' argument mandatory in CLI (#1473)


13 Dec 01:57

github-actions

v24.10.3

ec40e07

v24.10.3

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.10.3/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.10.3/

Changes

v24.10.2

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.10.2/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.10.2/

Changes

Core

Count expressions per Exec in SQLPlanParser (#1449)
Report all operators in the output file (#1444)
Fix missing exec-to-stageId mapping in Qual tool (#1437)
[BUG] Fix Profiler tool index out of bound exception when generating diagnostic metrics (#1439)
Sort Qual execs report by sqlId and nodeId (#1436)
Include expression parsers for HashAggregate and ObjectHashAggregate (#1432)
[FEA] Add stage/task level diagnostic output for GPU slowness in Profiler tool (#1375)
Reduce the log noise caused by core report summary (#1426)
Trigger GC at the beginning of each benchmark iteration (#1424)

Miscellaneous

[BUG] Fix sync plugin files script to handle empty or non-existing CSV files (#1446)
Enable license header check (#1440)


15 Nov 02:36

github-actions

v24.10.1

07b9a0f

v24.10.1

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.10.1/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.10.1/

Changes

Core

Adding EMR-specific tunings for shuffle manager and ignoring jar (#1419)
Changing autotuner memory error to warning in comments (#1418)
Add sparkRuntime property to capture runtime type in application_information (#1414)
Refactor Exec Parsers - remove individual parser classes (#1396)
Remove estimated GPU duration from qualification output (#1412)


04 Nov 23:23

github-actions

v24.10.0

3867a5c

v24.10.0

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.10.0/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.10.0/

Changes

User Tools

[FEA] Allow users to specify custom Dependency jars (#1395)
Reduce default memory allocation to the java process (#1407)
Update error handling in python for parsing cluster information (#1394)
user-tools should add xms argument to java cmd (#1391)
Use environment variables to set thresholds in static yaml configurations (#1389)
Use StorageLib to download dependencies (#1383)
Remove total core second heuristic and filter apps only in top candidate view (#1376)
Generate log files for Python Profiling cli (#1366)
Update models for updated datasets and latest code (#1365)
Isolate dataset for qualx plugin invocations (#1361)
[FEA] Add total core seconds into top candidate view (#1342)
Fix python tool picking up wrong JAR version in Fat wheel mode (#1357)
[FOLLOWUP-1326] Set Spark version to 3.4.2 by default for onprem environment (#1358)
Disable too-many-positional-arguments in pylintrc (#1353)
Reduce console output tree level, exclude JAR tool output files and remove incorrect logging (#1340)

Core

Add support for Photon-specific SQL Metrics (#1390)
Add support for processing Photon event logs in Scala (#1338)
Add Reflection to support custom Spark Implementation at Runtime (#1362)
Improve AQE support by capturing SQLPlan versions (#1354)
Add PartitionFilters and DataFilters to the dataSourceInfo table (#1346)
Add support to ArrayJoin in Qualification tool (#1345)

Miscellaneous

Cluster information should handle dynamic allocation and nodes being removed and added (#1369)
Rename tag core to core_tools (#1350)


10 Sep 21:25

github-actions

v24.08.2

9613aa1

v24.08.2

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.08.2/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.08.2/

Changes

Core

[FEA] Add total core seconds in Qualification core tool output (#1320)
Add support to MaxBy and MinBy in Qualification tool (#1335)
Add safeguards to prevent older attempts from generating metrics output in Scala Tool (#1324)
Sync up DAYTIME and YEARMONTH fields with CSV plugin files (#1328)

Miscellaneous

Update signoff usage [skip ci] (#1332)


04 Sep 01:06

github-actions

v24.08.1

a224a0f

v24.08.1

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.08.1/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.08.1/

Changes

User Tools

[DOC] spark_rapids CLI help cmd still shows cost savings (#1317)
Fix Qualification and Profiling tools CLI argument shorthands (#1312)
Raise error for enum creation from invalid string values (#1300)
Append HADOOP_CONF_DIR to the tools CLASSPATH execution cmd (#1308)
Fix key error and cross-join error during qualx evaluate (#1298)
Qual tool: Print more useful log messages when failures happen downloading dependencies (#1292)
Fix --help text for custom_model_file option (#1285)

Core

Remove legacy SpeedupFactor from core output files (#1318)
Mark decimalsum as supported in Qualification tool (#1323)
Mark SMJ as unsupported operator for corner cases in left join (#1309)
Remove arguments and code related to the html-report (#1311)
Handle SparkRapidsBuildInfoEvent in GPU event logs (#1203)
Enable recursive search for event logs by default and optional --no-recursion flag (#1297)
Qualification tool support filtering by a filesystem time range (#1299)
Skip generating timeline for stages that do not have completion time (#1290)
Save core tools logs to output log file (#1269)
Qualification tool - Add option to filter by minimum event log size (#1291)
Include exception message for unknown app status in core tool (#1281)

Miscellaneous

Remove restricted google sheets link and outdated TCO section (#1289)


13 Aug 02:52

github-actions

v24.08.0

1deeae7

v24.08.0

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.08.0/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.08.0/

Changes

User Tools

Remove calculation of gpu cluster recommendation from python tool when cluster argument is passed (#1278)
Remove unused argument --target_platform in Python Tool (#1279)
Qualification tool: Add output stats file for Execs(operators) (#1225)
Include GPU information in the cluster recommendation for Dataproc and OnPrem (#1265)
Remove speedup based recommendation column from qual_summary csv (#1268)
Fix prediction CSV files for multiple qual directories (#1267)
Clean up tools after removing CLI dependency (#1256)
Rename cluster shape columns to use 'worker' prefix in the output files and rename metadata file (#1258)
Remove CLI dependency in Dataproc _pull_gpu_hw_info implementation (#1245)
Replace split_nds with split_train_val (#1252)
Update xgboost models and metrics (#1244)
Add footnotes for config recommendations and speedup category in top candidate view (#1243)
[BUG] Update Dataproc instance catalog for n1 series GPU info (#1242)
Improvements in Cluster Config Recommender (#1241)
Improve console output from python tool for failed/gpu/photon event logs (#1235)
[FEA] Generate and use instance description file for Databricks-Azure platform (#1232)
Remove arguments related to cost-savings (#1230)
Updated models for latest databricks-aws datasets (#1231)
Refactor QualX for Linter and Test Compatibility (#1228)
Generate summary metadata file and fix node recommendation in python (#1216)
[FEA] Remove gcloud CLI dependency for Dataproc platform (#1223)
Updated models for latest dataproc eventlogs (#1226)
Remove estimation-model column from qualification summary (#1220)
Add option to add features.csv files to training set (#1212)
Disable cost saving functionality (#1218)
[FEA] Remove CLI dependency for EMR and Databricks-AWS platforms in user tool (#1196)
Fix some basic pylint errors in qualx code (#1210)
Qual tool tuning rec based on CPU event log coherently recommend tunings and node setup and infer cluster from eventlog (#1188)
Add shap command to internal CLI for debugging (#1197)
Add internal CLI to generate instance descriptions for CSPs (#1137)
[FEA] Support custom XGBoost model file via user tools CLI (#1184)
Updated models for new training data (#1186)
Add evaluate_summary command to internal CLI (#1185)
[DOC] Fix broken link to qualX docs and update python prerequisites (#1180)
Bump to certifi-2024.7.4 and urllib3-1.26.19 (#1173)
Disable UI-HTML report by default in Qualification tool (#1168)
Fix parsing App IDs inside metrics directory in QualX (#1167)
Refactor Databricks-AWS Qual tool to cache and process pricing info from DB website (#1141)
Add plugin mechanism for dataset-specific preprocessing in qualx (#1148)
Unsupported op logic should read action column from qual's output (#1150)
Update qualx readme for training (#1140)
Disable pylint-unreachable code in tox.ini (#1145)

Core

Include GPU information in the cluster recommendation for Dataproc and OnPrem (#1265)
[TASK] Optimize the storage of accumulables in core tools (#1263)
Sync GetJsonObject support with Rapids-Plugin (#1266)
Do not create new StageInfo object (#1261)
[FEA] Add support for map_from_arrays in qualification tools (#1248)
Rename cluster shape columns to use 'worker' prefix in the output files and rename metadata file (#1258)
Fix stage level metrics output csv file (#1251)
Handle event logs with wildcards in status report generation (#1237)
Fix duplicate records in DataSourceInfo report (#1227)
Reduce memory footprint of stageInfo (#1222)
Ensure UTF-8 encoding for reading non-english characters (#1211)
Sync plugin support for hash-hive and shift operators (#1198)
Sync-up the support of parse_url in qualification tool (#1195)
Include status information for failed event logs in core tool (#1187)
[FEA] Adding Benchmarking classes to evaluate core tools performance (#1169)
[BUG] Fix handling of non-english characters in tools output files (#1189)
[BUG] Fix Java Qual tool handling of --platform argument (#1161)
Add all stage metrics to tools output (#1151)
Follow-up 1142: remove TODO line (#1146)
Mark wholestageCodeGen as shouldRemove when child nodes are removed (#1142)
[FEA] Display full failure messages in failed CSV files (#1135)

Miscellaneous

Qualification tool: Add option to filter event logs for a maximum file system size (#1275)
Qualification tool should print Kryo related recommendations (#1204)
Fix header check script to exclude files (#1224)
Update header check script for pre-commit hooks (#1219)
Follow-up 1189: handle non-english characters in data-output.js (#1208)
Update pre-commit hooks to check for headers and white-spaces (#1205)
user-tools: Update --help for cluster argument (#1178)
Support fine-tuning models (#1174)


18 Jun 22:44

github-actions

v24.06.1

b71c22f

v24.06.1

Packages

Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.06.1/
PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.06.1/

Changes

Core

Handle different exception thrown by incomplete eventlogs (#1124)
Include number of executors per node in cluster information (#1119)

