-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Almost match Cast Floats to Decimal #10909
Conversation
Signed-off-by: Haoyang Li <[email protected]>
Signed-off-by: Haoyang Li <[email protected]>
cc @ttnghia Here are some integration tests. scala test and Spark UT are also useful. # Integration test:
./integration_tests/run_pyspark_from_build.sh -s -k test_cast_float_to_decimal
# Spark UT:
mvn test -Dbuildver=330 -DwildcardSuites=org.apache.spark.sql.rapids.suites.RapidsCastSuite
# Scala Test:
mvn test -Dbuildver=330 -DwildcardSuites=com.nvidia.spark.rapids.CastOpSuite |
@@ -877,8 +877,8 @@ class CastOpSuite extends GpuExpressionTestSuite { | |||
|
|||
overflowCase(DataTypes.FloatType, precision = 10, scale = 6, | |||
generator = floatGenerator(Seq(12345.678f))) | |||
overflowCase(DataTypes.DoubleType, precision = 15, scale = -5, | |||
generator = doubleGenerator(Seq(1.23e21))) | |||
// overflowCase(DataTypes.DoubleType, precision = 15, scale = -5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two line failed because #10908
@@ -261,6 +261,23 @@ def test_cast_long_to_decimal_overflow(): | |||
lambda spark : unary_op_df(spark, long_gen).select( | |||
f.col('a').cast(DecimalType(18, -1)))) | |||
|
|||
@datagen_overrides(seed=0, reason='edge cases') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed seed because #10890
Close because we have a better approach in #10917 |
Post this pr for sharing tests.
Fixes #9682
Fixes #10809
This PR rewrite Cast floats to decimal with a new floats => string => decimal path, to make the results closer to CPU.
There are still some differences from the known limits of ryu float to string and two edge cases in string to decimal #10890 and #10908.
Performance test to cast 5000000 floats to 10 kinds of decimal types (in ms)
We can make it a bit faster by combining the two kernels in jni.