Releases · neo4j/graph-data-science

29 Nov 10:39

laeg

2.2.5

dcb851c

Graph Data Science 2.2.5

Neo4j Graph Data Science 2.2.5 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.

Bug Fixes

Some functions would not work as expected with Neo4j 5.x versions:
- gds.alpha.linkprediction.adamicAdar
- gds.alpha.linkprediction.commonNeighbors
- gds.alpha.linkprediction.resourceAllocation
- gds.alpha.linkprediction.totalNeighbors

21 Nov 10:41

laeg

2.3.0-alpha02

8c92e84

Graph Data Science 2.3.0-Alpha02 Pre-release

GDS 2.3.0-alpha02 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

Leiden is promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
K-means is promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
The parameter startNodeId in Spanning Tree algorithms has been replaced with sourceNode.
The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behavior by specifying the new parameter objective in gds.alpha.spanningTree.
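
For client code that builds Cypher strings, the two tier promotions above could be handled with a small rewrite helper. This is only a minimal sketch: the helper name `migrate_query` and the graph name `'g'` are hypothetical, and the Spanning Tree changes (startNodeId, minimum/maximum) are deliberately left out because they require editing the configuration map rather than renaming a prefix.

```python
import re

# Prefix renames taken from the breaking changes listed above.
PROCEDURE_RENAMES = {
    "gds.alpha.leiden": "gds.beta.leiden",
    "gds.alpha.kmeans": "gds.beta.kmeans",
}


def migrate_query(query: str) -> str:
    """Rewrite renamed procedure prefixes in a Cypher query string.

    Hypothetical helper: it only touches whole name segments, so an
    unrelated name such as gds.alpha.leidenFoo is left unchanged.
    """
    for old, new in PROCEDURE_RENAMES.items():
        # The lookarounds ensure the prefix is not part of a longer identifier.
        pattern = r"(?<![A-Za-z0-9_.])" + re.escape(old) + r"(?![A-Za-z0-9_])"
        query = re.sub(pattern, new, query)
    return query


if __name__ == "__main__":
    print(migrate_query("CALL gds.alpha.leiden.stream('g')"))
```

Queries that use other alpha procedures pass through unchanged, so the helper can be applied to a whole query catalogue in one pass.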

New features

Leiden

New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
New parameter seedProperty to seed initial communities for nodes.
New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
Now available in progress tracking - gds.list.progress()
Added memory estimation mode:
- gds.beta.leiden.mutate.estimate
- gds.beta.leiden.stats.estimate
- gds.beta.leiden.stream.estimate
- gds.beta.leiden.write.estimate

Logistic Regression & MLP

New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
- gds.beta.pipeline.nodeClassification.addLogisticRegression
- gds.beta.pipeline.nodeClassification.addMLP
- gds.beta.pipeline.linkPrediction.addLogisticRegression
- gds.beta.pipeline.linkPrediction.addMLP

HashGNN

New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Spanning Tree

New modes supported: gds.alpha.spanningTree.{stats,stream,mutate}
New yield output for gds.alpha.spanningTree that outputs the sum of weights in the discovered spanning tree.
New yield output for gds.alpha.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
Added memory estimation mode:
- gds.alpha.spanningTree.stream.estimate
- gds.alpha.spanningTree.mutate.estimate
- gds.alpha.spanningTree.stats.estimate
- gds.alpha.spanningTree.write.estimate

Bug fixes

Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow

graph import now fully supports external node ids in the 64 Bit space.
graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

Better parallelization and improved overall performance.

Other changes

Returned histograms, such as degreeDistribution in gds.graph.list, can have slightly different values for specific percentiles due to changes in floating point operations.
Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.

21 Nov 10:46

laeg

2.2.4

69004ee

Graph Data Science 2.2.4

GDS 2.2.4 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

gds.alpha.nodeSimilarity.filtered - would give incorrect node IDs.
Pregel framework - the computation would not stop after terminating the underlying transaction. This affects gds.pageRank, gds.articleRank, gds.eigenvector.
gds.alpha.hits and gds.alpha.sllpa could not be used as nodeProperty steps inside ML pipelines, including gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.
nodeProperty steps could not be added to ML pipelines when running against Neo4j 5.x. This affected gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.

Improvements

gds.graph.list will only calculate the graph size when the procedure is called without any YIELD or if the fields memoryUsage or sizeInBytes are explicitly YIELD-ed.
Using YIELD to return other fields but not one of memoryUsage or sizeInBytes can speed up the execution time of gds.graph.list.
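
The rule above can be sketched as a small predicate for clients that assemble the call themselves. This is an illustration under assumptions, not GDS code: the helper names are hypothetical, the field names graphName and nodeCount are used only as examples, and the behaviour of `YIELD *` is not covered by the note above, so it is not modelled here.

```python
# Fields that, per the note above, cause gds.graph.list to compute the graph size.
SIZE_FIELDS = {"memoryUsage", "sizeInBytes"}


def triggers_size_calculation(yield_fields=None):
    """Return True if the call would compute the graph size.

    No YIELD at all, or a YIELD that names memoryUsage or sizeInBytes,
    triggers the calculation; any other explicit YIELD list does not.
    """
    if not yield_fields:
        return True
    return any(field in SIZE_FIELDS for field in yield_fields)


def build_graph_list_query(yield_fields=None):
    """Build the Cypher call string for the given (hypothetical) field list."""
    if not yield_fields:
        return "CALL gds.graph.list()"
    return "CALL gds.graph.list() YIELD " + ", ".join(yield_fields)


if __name__ == "__main__":
    fields = ["graphName", "nodeCount"]
    print(build_graph_list_query(fields), triggers_size_calculation(fields))
```

Listing only the fields that are actually needed keeps the size calculation out of the call.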

07 Nov 09:55

laeg

2.2.3

22aa4c9

Graph Data Science 2.2.3

GDS 2.2.3 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

gds.graph.export failed to run on Neo4j 5.x.
gds.graph.export failed with InvalidRecordException when writeConcurrency was set to a value greater than 1.
Enterprise users were unable to load models trained with concurrency > 4.

Improvements

Arrow graph import now fully supports external node ids in the 64 Bit space.

23 Oct 13:42

laeg

2.3.0-alpha01

78a8fa6

Graph Data Science 2.3.0-alpha01 Pre-release

GDS 2.3.0-alpha01 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Added a parameter consecutiveIds to Leiden to assign consecutive ids for the discovered communities.
Added a parameter seedProperty to Leiden to seed initial communities for nodes.
Added new configuration parameter focusWeight for Logistic Regression training method, supported by procedures:
- gds.beta.pipeline.nodeClassification.addLogisticRegression
- gds.beta.pipeline.linkPrediction.addLogisticRegression

Bug fixes

Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow graph import now fully supports external node ids in the 64 Bit space.
Arrow graph import now supports 16, 32 or 64 Bit node identifiers.

Other changes

Returned histograms, such as degreeDistribution in gds.graph.list, can have slightly different values for specific percentiles due to changes in floating point operations.

21 Oct 10:17

laeg

2.2.2

09d5cd8

Graph Data Science 2.2.2

GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Improvements

Graph Data Science ≥2.2.2 now supports Neo4j 5

17 Oct 08:44

laeg

2.2.1

a51a4a3

Graph Data Science 2.2.1

GDS 2.2.1 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.

Breaking changes

Change the content of some fields from the output of gds.debug.arrow:
- listenAddress now always returns the same content as advertisedListenAddress
- serverLocation always returns NULL

10 Oct 11:45

laeg

2.2.0

55a96ce

Graph Data Science 2.2.0

GDS 2.2.0 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

Breaking changes

Link Prediction filtering:
- Change graph filtering in gds.beta.pipeline.linkPrediction.train
  - Replace parameter nodeLabels with sourceNodeLabel and targetNodeLabel.
  - Replace parameter relationshipTypes with targetRelationshipType.
- Change graph filtering in gds.beta.pipeline.linkPrediction.predict
  - Replace parameter nodeLabels with optional sourceNodeLabels and targetNodeLabels. By default, they will be derived from the model's train configuration.
  - Change the default value for relationshipTypes with the targetRelationshipType from the model's train configuration.
Node Classification & Regression filtering:
- Change graph filtering in gds.beta.pipeline.nodeClassification.train and gds.beta.pipeline.nodeRegression.train
  - Replace parameter nodeLabels with targetNodeLabels
- Change graph filtering in gds.beta.pipeline.nodeClassification.predict and gds.beta.pipeline.nodeRegression.predict
  - Replace parameter nodeLabels with targetNodeLabels. By default, they will be derived from the model's train configuration.
Promoting Collapse Path to beta tier
- Changed the procedure name to gds.beta.collapsePath.mutate
- Use parameter pathTemplates to specify multiple path templates.
Promoting CELF to beta tier
- Moved gds.alpha.influenceMaximization.celf.stream to gds.beta.influenceMaximization.celf.stream
For graphs created with gds.graph.project.cypher, the output of gds.graph.list is reduced to only print the names of parameters. This avoids printing the parameter values, which could lead to long procedure execution times.
RandomWalk algorithm promoted to product tier
- gds.beta.randomWalk.stats => gds.randomWalk.stats
- gds.beta.randomWalk.stats.estimate => gds.randomWalk.stats.estimate
- gds.beta.randomWalk.stream => gds.randomWalk.stream
- gds.beta.randomWalk.stream.estimate => gds.randomWalk.stream.estimate
Removed debug_log config field from Arrow Create Database action.
Node2Vec uses new embedding initializer NORMALIZED as default.
Dropped support for older patches:
- for 4.3, only 4.3.15 and later is supported
- for 4.4, only 4.4.9 and later is supported
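
The patch-level requirements above could be checked before installing the plugin. The following is a minimal sketch for GDS 2.2.0 only; the function name `is_supported` is hypothetical, and the version string is assumed to start with `major.minor.patch`.

```python
import re

# Minimum supported patch per Neo4j minor line for GDS 2.2.0,
# taken from the "Dropped support for older patches" list above.
MIN_PATCH = {(4, 3): 15, (4, 4): 9}


def is_supported(neo4j_version: str) -> bool:
    """Return True if the Neo4j version meets the GDS 2.2.0 requirements.

    Hypothetical helper: versions on other minor lines (for example 4.2
    or 5.x) are reported as unsupported for this GDS release.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", neo4j_version)
    if match is None:
        return False
    major, minor, patch = (int(part) for part in match.groups())
    minimum = MIN_PATCH.get((major, minor))
    return minimum is not None and patch >= minimum


if __name__ == "__main__":
    for version in ("4.3.14", "4.3.15", "4.4.9", "5.1.0"):
        print(version, is_supported(version))
```

For later GDS releases the table would need to be extended, since Neo4j 5 support only arrives with GDS 2.2.2.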

New features

Link Prediction filtering:
- Supports heterogeneous LinkPrediction pipelines by allowing configuring which node labels and relationship type to train and predict for.
- See Breaking changes above for more details.
K-means:
- Added centroids and average node-centroid distance to result for Mutate, Stats, and Write modes.
- Added distance to centroid per node result in Stream mode.
- Introduced a parameter numberOfRestarts that runs K-Means multiple times and picks the one with the minimum node-centroid distance.
- Introduced a parameter computeSilhouette that if enabled will compute silhouette related metrics.
- Introduced a parameter initialSampler which can select different sampling strategies for picking the first centroids.
  - Added the K-means++ initialization algorithm which can be enabled by setting initialSampler=kmeans++.
- Introduced the parameter seedCentroids, which seeds K-means with user-supplied input centroids (as an alternative to the samplers above).
Introduced a new scaler Center for ScaleProperties that subtracts the mean from each value.
Expose penaltyL2 to configure the l2 regularization term to the loss function in gds.beta.graphSage.train.
Add Multilayer Perceptron as a training method for node classification (gds.alpha.pipeline.nodeClassification.addMLP) and link prediction (gds.alpha.pipeline.linkPrediction.addMLP).
Add SAME_CATEGORY feature type to gds.beta.pipeline.linkPrediction.addFeature.
Added new procedure gds.beta.graph.relationships.stream that streams relationship topology.
Added arrow export endpoint gds.beta.graph.relationships.stream that streams relationship topology.
Added new procedure gds.alpha.graph.sample.rwr that creates a new graph projection by sampling using random walk with restarts.
Added the ability to collapse multiple paths using gds.beta.collapsePath.mutate.
Promoting CELF algorithm to beta tier.
- Added gds.beta.influenceMaximization.celf.stats
- Added gds.beta.influenceMaximization.celf.mutate
- Added gds.beta.influenceMaximization.celf.write
- Added progress tracking capabilities.
- Added memory estimation.
Progress tracking for KMeans algorithm.
Memory estimation for KMeans.
- added gds.alpha.kmeans.mutate.estimate
- added gds.alpha.kmeans.stats.estimate
- added gds.alpha.kmeans.stream.estimate
- added gds.alpha.kmeans.write.estimate
Added procedure to compute modularity for pre-computed communities.
- gds.alpha.modularity.stats
- gds.alpha.modularity.stream
Added new config options to the GDS Flight server.
- gds.arrow.encryption.never deactivates the server encryption even if it would otherwise be enabled.
- gds.arrow.advertised_listen_address sets the server location that clients should connect to.
Added support for importing String node identifiers for the Arrow CREATE_DATABASE action.
Added capability to run BetweennessCentrality using relationship weights.
- Added relationshipWeightProperty optional configuration parameter.
Added stats mode procedures for RandomWalk.
- gds.beta.randomWalk.stats
- gds.beta.randomWalk.stats.estimate
Introduced the ability to configure defaults and limits for configuration parameters.
- gds.alpha.config.defaults.list
- gds.alpha.config.defaults.set
- gds.alpha.config.limits.list
- gds.alpha.config.limits.set
Introduce new configuration parameters contextNodeLabels and contextRelationshipTypes in nodePropertySteps.
- gds.beta.pipeline.linkPrediction.addNodeProperty
- gds.beta.pipeline.nodeClassification.addNodeProperty
- gds.alpha.pipeline.nodeRegression.addNodeProperty
- The context is used to enlarge the input graph to the node property steps when running gds.beta.pipeline.linkPrediction.[train|predict], gds.beta.pipeline.nodeClassification.[train|predict] and gds.alpha.pipeline.nodeRegression.[train|predict].
Leiden
- Add capability to mutate intermediateCommunities when includeIntermediateCommunities is set to true.
- Add capability to write intermediateCommunities when includeIntermediateCommunities is set to true.
Node2Vec adds new embedding initializer NORMALIZED configured with the parameter embeddingInitializer.
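
Among the features listed above, the Center scaler for ScaleProperties is described as subtracting the mean from each value. As a plain illustration of that arithmetic (not the GDS implementation; the function name `center` is hypothetical):

```python
def center(values):
    """Subtract the arithmetic mean from each value.

    Illustrative sketch of the behaviour described for the Center
    scaler; an empty input yields an empty list.
    """
    if not values:
        return []
    mean = sum(values) / len(values)
    return [value - mean for value in values]


if __name__ == "__main__":
    print(center([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

The centered values always sum to zero (up to floating point error), which is the property that distinguishes this scaler from ones that also rescale the range.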

Bug fixes

Fixed a bug where eager checking for business rules around GDS on a Neo4j cluster would cause the cluster to fail to start.
Fixed a bug where Neo4j users with admin role could not see all graphs in the catalog on GDS enterprise.
Fixed a bug in random graph generation where the resulting graph can end up with an incorrect relationship schema.
Fixed a bug where a schema filter would not create a deep copy of the property schema map.
Fixed a bug where modularity could have been incorrectly updated in ModularityOptimization. This may affect the number of performed iterations for ModularityOptimization or number of levels for Louvain.
Fixed a bug where restoring from csv could not read values wrapped in quotes.
Fixed a bug where KNN did not use the expected search space. This will improve the result but also increase the runtime.
Fixed a bug in ML autotuning where maxTrials included model evaluations with concrete configs.
Fixed a bug where gds.triangleCount and gds.localClusteringCoefficient were allowed to run on directed graphs.
Fixed a bug in gds.graph.export and Arrow DB import where the writeConcurrency was not respected.
Fixed a bug with Node Operations where gds.graph.nodeProperties.write, gds.graph.nodeProperties.drop, gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream would not accept String input for parameters nodeLabels and/or nodeProperties.
Fixed a bug where Node2Vec would report negative losses.
Fixed a bug with gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream, where the wrong nodes were returned when specifying a nodeLabels filter and using Arrow.
Fixed a bug in the Louvain algorithm, where aggregating dense communities could potentially lead to an exception.
Fixed a bug where model loading was attempted even for unlicensed users, which could cause database startup to fail.

Improvements

Better error handling in K-means
Improve memory estimation for gds.beta.pipeline.linkPrediction.train when the nodePropertySteps used a weighted graph.
Improve runtime of feature generation in gds.beta.pipeline.linkPrediction.[train|predict].
Improve performance of gds.graph.project.cypher by using the subscriber API.
Improve convergence criteria for LogisticRegression and LinearRegression trainers, by making it independent of the number of batches. This affects gds.alpha.pipeline.nodeRegression.train, gds.beta.pipeline.[linkPrediction|nodeClassification].train.
Improve error handling on invalid user input.
Cypher on GDS projections is now capable of setting labels on nodes.
Promoting CELF algorithm to `bet...

29 Sep 18:22

laeg

2.1.13

a156150

Graph Data Science 2.1.13

GDS 2.1.13 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

Bug Fixes

gds.graph.nodeProperties.write, gds.graph.nodeProperties.drop, gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream now accept String input for parameters nodeLabels and/or nodeProperties.
gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream would return the wrong nodes when specifying a nodeLabels filter and using Arrow.
Louvain algorithm would throw an exception when aggregating dense communities.

Improvements

Export to CSV now enabled when GDS is running on a Causal Cluster Read Replica
- gds.beta.graph.export.csv
- gds.beta.graph.export.csv.estimate

15 Sep 16:30

laeg

2.1.12

e7c23e6

Graph Data Science 2.1.12

GDS 2.1.12 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

Improvements

New procedures for enabling and disabling Arrow database import (default: enabled)
- gds.features.enableArrowDatabaseImport
- gds.features.enableArrowDatabaseImport.reset
