Enable Hybrid test cases in premerge/nightly CIs [databricks] #11906
base: branch-25.02
Conversation
Signed-off-by: Chong Gao <[email protected]>
…park 322,331,343,351
…ly supports 3.2.2, 3.3.1, 3.4.2, and 3.5.1.
Signed-off-by: sperlingxx <[email protected]>
Signed-off-by: Chong Gao <[email protected]>
Force-pushed a523285 to 889f32e
build

jenkins/spark-premerge-build.sh (outdated)

prepare_for_hybrid_feature() {
    mvn -B dependency:get -DgroupId=com.nvidia \
        -DartifactId=gluten-velox-bundle \
What artifacts are we going to release?
From the scripts, we'll release gluten-velox-bundle, rapids-4-spark-hybrid_2.12, and gluten-thirdparty-lib?
Or we could consider packaging all these dependencies into one jar.
Yes, we will release 3 jars. We cannot package all these dependencies into one jar, because a customer may have their own customized gluten-velox-bundle jar, e.g. a customer who has developed a file system like AFS.
jenkins/spark-premerge-build.sh (outdated)

@@ -160,6 +163,36 @@ rapids_shuffle_smoke_test() {
    $SPARK_HOME/sbin/stop-master.sh
}

# parameters for Hybrid feature
spark_prefix="${SPARK_VER:0:3}"  # get prefix from SPARK_VER, e.g.: 3.2, 3.3 ... 3.5
GLUTEN_BUNDLE_JAR="gluten-velox-bundle-spark${spark_prefix}_2.12-ubuntu_20.04_x86_64-1.2.0.jar"
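The `${SPARK_VER:0:3}` expression is bash substring expansion: `${var:offset:length}`. A minimal sketch of what it yields, using a hypothetical `SPARK_VER` value (in the real script this comes from the CI environment):

```shell
# Bash substring expansion: ${var:offset:length}
SPARK_VER="3.2.2"                # hypothetical value; set by the CI environment in practice
spark_prefix="${SPARK_VER:0:3}"  # first 3 characters
echo "$spark_prefix"             # -> 3.2
```

Note this assumes the major.minor prefix is always exactly 3 characters, which holds for the Spark 3.x versions listed in this PR.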
Can we define a global Gluten version 1.2.0 somewhere, e.g. in pom.xml?
Then we can share this version between the docs and CI scripts, and it will be easy to update when the Gluten version changes.
Let me add a config in version-def.sh
GLUTEN_VERSION=1.2.0
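Combining the two suggestions in this thread (a shared version in version-def.sh plus the default-value pattern), the config might look like the following sketch; the variable names match the snippets quoted elsewhere in this review:

```shell
# jenkins/version-def.sh (sketch): default-value pattern, so CI can override
# these from the environment without editing the script
GLUTEN_VERSION=${GLUTEN_VERSION:-"1.2.0"}
GLUTEN_FOR_OS=${GLUTEN_FOR_OS:-"20.04"}

# derived jar name, built from the shared variables
GLUTEN_THIRD_PARTY_JAR="gluten-thirdparty-lib-${GLUTEN_VERSION}-ubuntu-${GLUTEN_FOR_OS}-x86_64.jar"
echo "$GLUTEN_THIRD_PARTY_JAR"
```

With this pattern an invocation like `GLUTEN_VERSION=1.3.0 ./jenkins/spark-premerge-build.sh` would pick up the override without any script change.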
Force-pushed 889f32e to de7a570
Signed-off-by: Chong Gao <[email protected]>
Force-pushed de7a570 to fd2b513
build

-DartifactId=gluten-velox-bundle \
    -Dversion=${GLUTEN_VERSION} \
    -Dpackaging=jar \
    -Dclassifier=spark${spark_prefix}_2.12-ubuntu_${GLUTEN_FOR_OS}
typo?
\
--> -Dclassifier=spark${spark_prefix}_2.12-ubuntu_${GLUTEN_FOR_OS} '
@@ -26,6 +26,9 @@ for VAR in $OVERWRITE_PARAMS; do
done
IFS=$PRE_IFS

# configs for Hybrid feature
Better to use the default-value pattern instead of hardcoding; then we can override these vars from outside without changing this script:
GLUTEN_VERSION=${GLUTEN_VERSION:-"1.2.0"}
GLUTEN_FOR_OS=${GLUTEN_FOR_OS:-"20.04"}
GLUTEN_THIRD_PARTY_JAR="gluten-thirdparty-lib-${GLUTEN_VERSION}-ubuntu-${GLUTEN_FOR_OS}-x86_64.jar"

# download Gluten, Hybrid jars
mvn -B dependency:get -DgroupId=com.nvidia \
nit: avoid using mvn dependency:get if possible; the IT run does not otherwise require a mvn dependency.
Use wget/curl instead.
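A hedged sketch of the wget-based alternative being suggested. The repository URL below is a placeholder (the real repo hosting these jars may differ), and the jar name simply follows the standard Maven `artifactId-version-classifier.jar` layout for the coordinates quoted in this thread:

```shell
# Hypothetical wget-based replacement for `mvn dependency:get`.
# MAVEN_REPO is a placeholder; point it at the repository that actually hosts the jars.
spark_prefix="${spark_prefix:-3.2}"
GLUTEN_FOR_OS="${GLUTEN_FOR_OS:-20.04}"
GLUTEN_VERSION="${GLUTEN_VERSION:-1.2.0}"
MAVEN_REPO="${MAVEN_REPO:-https://my.internal.repo/maven2}"

# Maven layout: <repo>/<groupPath>/<artifactId>/<version>/<artifactId>-<version>-<classifier>.jar
CLASSIFIER="spark${spark_prefix}_2.12-ubuntu_${GLUTEN_FOR_OS}"
JAR="gluten-velox-bundle-${GLUTEN_VERSION}-${CLASSIFIER}.jar"
URL="${MAVEN_REPO}/com/nvidia/gluten-velox-bundle/${GLUTEN_VERSION}/${JAR}"

# the actual download would be:
echo "wget -q ${URL} -O /tmp/${JAR}"
```

Unlike `dependency:get`, this skips the local Maven repository cache, so the CI image would not need mvn installed for the IT run.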
@@ -109,6 +109,10 @@ mvn_verify() {
    do
        TZ=$tz ./integration_tests/run_pyspark_from_build.sh -m tz_sensitive_test
    done

# test Hybrid feature
source "${WORKSPACE}/jenkins/hybrid_execution.sh"
How long will this test take? We may need the duration to determine whether to put it into the ci1 or ci2 stage to balance the workloads.
@@ -0,0 +1,51 @@
#!/bin/bash
It's better to make this script prepare artifacts only and have the actual caller run the test command directly,
or at least separate the artifact-preparation function from the actual test-call function.
Something like:
prepare() {
    # detect the OS, JVM, etc.
    echo "success"  # or "skip/fail/<actual reason>", which is non-success to skip run_test
    # the func still errors out in case it passes the env detection but fails to do the
    # preparation work, like a download error
}

run_test() {
    # run the case
}

# from the caller:
# source the script
result=$(prepare)
if [[ "$result" == "success" ]]; then
    run_test
else
    echo "skipped test run blahblah: $result"
fi
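A minimal runnable rendering of this prepare-gate pattern. The environment check is a deterministic stand-in (a `HYBRID_ARCH` variable) rather than real OS/JVM detection, which is what the actual script would do:

```shell
# Runnable sketch of the prepare/run_test gating pattern.
# HYBRID_ARCH is an illustrative stand-in for real environment detection.
HYBRID_ARCH="${HYBRID_ARCH:-x86_64}"

prepare() {
    if [ "$HYBRID_ARCH" != "x86_64" ]; then
        echo "skip: unsupported arch ${HYBRID_ARCH}"
        return 0
    fi
    # artifact downloads would happen here; a download error should still
    # fail the function rather than echo "success"
    echo "success"
}

run_test() {
    echo "running hybrid tests..."
}

result=$(prepare)
if [ "$result" = "success" ]; then
    run_test
else
    echo "skipped test run: $result"
fi
```

Because `prepare` communicates its result on stdout, the caller can distinguish "skip" (environment unsupported, not an error) from a hard failure (non-zero exit from the function).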
Please help document the steps for running the hybrid cases in local development, in a section like https://github.com/NVIDIA/spark-rapids/blob/branch-25.02/integration_tests/README.md#enabling-cudf_udf-tests
Enable Hybrid test cases in premerge/nightly.
Depends on: This PR.
Temporarily based on #11720.
Please review the tail commits; the other commits are from #11720.
Will merge #11720 and this PR after the premerge/nightly CIs pass.
Signed-off-by: Chong Gao [email protected]