Commit
Toward release 0.5.0 - fast TDigest and Spark-3 Aggregator API (#20)
* update isarn-sketches dep to 0.2.2
* test design with thin shim class for new fast TDigest to clean up the API
* update initial commands
* scope t digest shim class, TDigestAggregator companion obj
* bump isarn-sketches to 0.3.0
* example of a java/python binding
* modify python tdigest UDT and a test UDF
* ScalarNumeric, data type functions, python renaming, commenting out old code
* spark 3.0 supports scala 2.12 only
* http -> https
* TDigestArrayAggregator
* array function overloadings
* add instructions for cleaning out ivy on local publish
* spark vector aggregations
* no longer need UDT for tdigest array
* old tdigest UDTs are obsolete
* remove package object
* sketches.spark.tdigest._
* tdigest.scala
* TDigestReduceAggregator
* TDigestArrayReduceAggregator
* TDigestUDAF.scala is obsolete
* TDigestArrayReduceAggregator inherit from TDigestArrayAggregatorBase
* factor out compression and maxdiscrete from TDigestArrayAggregatorBase
* /udaf/ -> /spark/
* /udaf/ -> /spark/
* move python TDigestUDT into spark/tdigest.py
* update sbt build mappings for python refactor
* update readme python for new organization
* copyright
* unused imports
* more unused imports
* switch to fast java TDigest
* explicit import of JavaPredictionModel
* /pipelines/ -> /pipelines/spark/
* python/isarnproject/pipelines/__init__.py
* update build mappings for new python organization
* update package paths for new organization
* fix package object path
* update copyright
* update pyspark tdigest to be cleaner and analogous to java implementation
* spark pipeline param delta -> compression
* fi.scala
* update assembly dep and move it into plugins
* add scaladoc
* move ScalarNumeric out of tdigest specific package
* update README examples for scala
* spark.sparkContext
* update python tdigest examples
* update feature importance examples
* isarn-sketches-java
* utest harness for spark testing
* TDigestAggregator test
* TDigestAggregator test
* KSD cumulative distribution divergence measure for unit testing
* test counts
* BigDecimal range
* local build against spark-3.0.1-SNAPSHOT
* test TDigestArrayAggregator
* tests for spark ML vector types
* cache test data sets
* test tdigest reducing aggregators
* epsD
* move approx to base class
* disable parallel test execution to prevent spark cluster teardown race conditions
* feature importance unit test
* build against spark 3.0.1
* xsbt -> sbt
* 0.5.0
1 parent a8ec2b4, commit e7d3136
Showing 20 changed files with 1,642 additions and 1,573 deletions.