Releases · neo4j/graph-data-science

16 Nov 07:52

gminneci

2.5.4

c5de5ed

Graph Data Science 2.5.4

Other changes

Updated Netty dependencies to 4.1.100.Final to fix CVE-2023-44487

13 Nov 15:31

gminneci

2.5.3

3e87502

Graph Data Science 2.5.3

New features

Add support for Neo4j 4.4.27

Bug fixes

Fixed a bug that led to an unresponsive DBMS or even OS when running GDS projections in Neo4j on MacBooks (x86 or ARM).
Avoided a possible race condition between random walking and training in Node2Vec. This also removes the need for a timeout in Node2Vec.

Improvements

Improved state synchronization during Arrow graph import to avoid errors due to out-of-sync messages.

23 Oct 09:05

Mats-SX

2.5.1

c317400

Graph Data Science 2.5.1

New features

Add support for Neo4j 5.13

13 Oct 09:43

gminneci

2.5.0

45df4b5

Graph Data Science 2.5.0

Breaking changes

Dropped support for earlier versions of Neo4j 5: versions 5.1, 5.2, 5.3, 5.4, and 5.5 are no longer supported, and GDS is no longer compatible with them.

New features

Major

Added new algorithms for directed acyclic graphs:
- gds.dag.topologicalSort.stream
- gds.dag.longestPath.stream
Deprecated the alpha and beta namespaces for a number of procedures and algorithms, promoting many of them to production grade. See 'Full list of procedure being promoted' for details.
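For readers unfamiliar with the two DAG algorithms named above, the following is a minimal Python sketch of what a topological sort and a longest-path computation produce. It is not the GDS implementation; the function names, the edge-tuple format, and the sample graph are assumptions made for illustration only.

```python
from collections import deque


def build_adjacency(nodes, edges):
    """Map each node to its outgoing (target, weight) pairs."""
    adjacency = {n: [] for n in nodes}
    for source, target, weight in edges:
        adjacency[source].append((target, weight))
    return adjacency


def topological_sort(nodes, edges):
    """Kahn's algorithm: every edge (u, v) has u before v in the result."""
    adjacency = build_adjacency(nodes, edges)
    indegree = {n: 0 for n in nodes}
    for _, target, _ in edges:
        indegree[target] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target, _ in adjacency[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(nodes):
        raise ValueError("input graph contains a cycle")
    return order


def longest_path_distances(nodes, edges):
    """Longest weighted distance into each node, relaxed in topological order."""
    adjacency = build_adjacency(nodes, edges)
    distance = {n: 0.0 for n in nodes}
    for node in topological_sort(nodes, edges):
        for target, weight in adjacency[node]:
            distance[target] = max(distance[target], distance[node] + weight)
    return distance


# Hypothetical sample DAG: (source, target, weight)
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 1.0), ("a", "c", 2.0), ("b", "d", 4.0), ("c", "d", 1.0)]
order = topological_sort(nodes, edges)
distances = longest_path_distances(nodes, edges)
```

Processing nodes in topological order is what makes the longest-path pass a single sweep: by the time a node is visited, all of its predecessors have already been finalized.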

Minor

Added procedure to retrieve the version of the installed GDS
- CALL gds.version
Added new procedure gds.license.state to verify the license state of the Graph Data Science library, along with an analogous new function gds.isLicensed().
Added memory estimation for modularity calculation via procedures gds.modularity.[stream|stats].estimate
Added memory estimation for filtered KNN via procedures gds.knn.filtered.[mutate|stream|stats|write].estimate
Added Stats and Write modes for Harmonic Closeness Centrality
Added new procedures for SCC:
- gds.scc.mutate
- gds.scc.stats
Added memory estimation to SCC:
- gds.scc.stream.estimate
- gds.scc.stats.estimate
- gds.scc.mutate.estimate
- gds.scc.write.estimate
Added consecutiveIds parameter to gds.scc procedures to output the components in a consecutive id space.
Added memory estimation for Steiner Tree via procedures gds.steinerTree.[mutate|stream|stats|write].estimate
Added stats mode for gds.modularityOptimization
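The consecutiveIds parameter mentioned above maps component ids into a consecutive id space. A rough Python sketch of that kind of relabeling is shown here. The order in which GDS assigns the new ids is not stated in these notes, so the first-appearance ordering below, like the function name and sample data, is an assumption.

```python
def to_consecutive_ids(component_of):
    """Relabel arbitrary component ids as 0, 1, 2, ... by first appearance."""
    mapping = {}
    relabeled = {}
    for node, component in component_of.items():
        if component not in mapping:
            mapping[component] = len(mapping)
        relabeled[node] = mapping[component]
    return relabeled


# Hypothetical raw output: node id -> component id
raw = {0: 17, 1: 17, 2: 42, 3: 99, 4: 42}
consecutive = to_consecutive_ids(raw)
```

Nodes that shared a component id before the relabeling still share one afterwards; only the id values change.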

Bug fixes

Fixed a bug in logging the progress of Prepare Batches in GraphSAGE training.
Fixed a bug where KNN would compute incorrect EUCLIDEAN similarity.
Fixed a bug where limits validation might not be triggered for configuration settings supplied via specified defaults.
Fixed a bug where gds.graph.filter would list a relationship type of __ALL__ even if all relationships were filtered out.
Fixed a bug where Triangle Count could compute an incorrect number of triangles when the maxDegree parameter was specified.
Fixed a bug where Triangle Count could compute an incorrect number of triangles when multiple relationship types are specified.

Improvements

The random graph generation procedure now returns a different graph each time gds.beta.graph.generate is called without a random seed. When a seed is specified, the resulting graph always has the same topology.
It is now possible to specify common node labels when importing nodes via Arrow.
A better error message is thrown when encountering null values in the nodeLabels column when importing nodes via Arrow.
Added the configuration option listNodeLabels for the node property stream procedures that will trigger listing all node labels for the respective node.
Added the configuration option list_node_labels for the node property stream arrow endpoints that will trigger listing all node labels for the respective node.
The Cypher projection now returns the executing query as part of the projection result as well as part of the gds.graph.list output.
Support passing startNodes to gds.graph.sample.cnarw as node objects instead of only node ids.
Support passing nodeId to gds.util.nodeProperty as node objects instead of only node id.
Improved validation for relationship projections: If a global SUM, MIN, MAX or COUNT aggregation is defined, there needs to be at least one property mapping.
HITS algorithm procedures have a default hitsIterations value of 20
More accurate progress tracking for the gds.scc algorithm.
The componentDistribution and communityDistribution parameters now also include the p1, p5, p10, and p25 percentiles. This affects algorithms in the Community Detection category.
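To make the added low percentiles concrete, here is a small Python sketch that summarizes community sizes with p1, p5, p10, and p25. It uses the nearest-rank method, which is an assumption; the notes do not say how GDS computes its percentiles, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def nearest_rank_percentile(sorted_values, p):
    """Smallest value with at least p percent of the data at or below it."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def size_distribution(community_of, percentiles=(1, 5, 10, 25)):
    """Summarize community sizes from a node -> community mapping."""
    sizes = sorted(Counter(community_of.values()).values())
    summary = {"min": sizes[0], "max": sizes[-1], "mean": sum(sizes) / len(sizes)}
    for p in percentiles:
        summary[f"p{p}"] = nearest_rank_percentile(sizes, p)
    return summary


# Hypothetical assignment: four communities of sizes 1, 2, 3 and 10
community_of = {}
for community, size in {"A": 1, "B": 2, "C": 3, "D": 10}.items():
    for i in range(size):
        community_of[f"{community}{i}"] = community

summary = size_distribution(community_of)
```

Low percentiles like these are mainly useful for spotting how many very small communities or components a run produced.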

Full list of procedure being promoted

Promoting Model Catalog procedures:
- gds.beta.model.drop, deprecated by gds.model.drop
  - Return column shared renamed to published
  - modelName, modelType extracted to separate return columns
- gds.beta.model.exists, deprecated by gds.model.exists
- gds.beta.model.list, deprecated by gds.model.list
  - Return column shared renamed to published
  - modelName, modelType extracted to separate return columns
- gds.alpha.model.delete, deprecated by gds.model.delete
- gds.alpha.model.load, deprecated by gds.model.load
- gds.alpha.model.publish, deprecated by gds.model.publish
  - Return column shared renamed to published
  - modelName, modelType extracted to separate return columns
- gds.alpha.model.store, deprecated by gds.model.store
Promoting Pipeline Catalog procedures:
- gds.beta.pipeline.drop, deprecated by gds.pipeline.drop
- gds.beta.pipeline.exists, deprecated by gds.pipeline.exists
- gds.beta.pipeline.list, deprecated by gds.pipeline.list
- Procedure gds.alpha.systemMonitor is deprecated by gds.systemMonitor
- Procedure gds.beta.listProgress is deprecated by gds.listProgress
- Procedure gds.alpha.triangles is deprecated by gds.triangles
Deprecating gds.beta.steinerTree procedures
- gds.beta.steinerTree.mutate, deprecated by gds.steinerTree.mutate
- gds.beta.steinerTree.stats, deprecated by gds.steinerTree.stats
- gds.beta.steinerTree.stream, deprecated by gds.steinerTree.stream
- gds.beta.steinerTree.write, deprecated by gds.steinerTree.write
Deprecating gds.beta.spanningTree procedures
- gds.beta.spanningTree.mutate[.estimate], deprecated by gds.spanningTree.mutate[.estimate]
- gds.beta.spanningTree.stats[.estimate], deprecated by gds.spanningTree.stats[.estimate]
- gds.beta.spanningTree.stream[.estimate], deprecated by gds.spanningTree.stream[.estimate]
- gds.beta.spanningTree.write[.estimate], deprecated by gds.spanningTree.write[.estimate]
Deprecating gds.alpha.maxkcut procedures
- gds.alpha.maxkcut.mutate[.estimate], deprecated by gds.maxkcut.mutate[.estimate]
- gds.alpha.maxkcut.stream[.estimate], deprecated by gds.maxkcut.stream[.estimate]
Deprecating gds.beta.closeness procedures
- gds.beta.closeness.mutate, deprecated by gds.closeness.mutate
  - The mutateProperty field has been removed; it can be accessed via the configuration.
- gds.beta.closeness.stats, deprecated by gds.closeness.stats
- gds.beta.closeness.stream, deprecated by gds.closeness.stream
- gds.beta.closeness.write, deprecated by gds.closeness.write
  - The writeProperty field has been removed; it can be accessed via the configuration.
Deprecating gds.beta.leiden procedures
- gds.beta.leiden.mutate[.estimate], deprecated by gds.leiden.mutate[.estimate]
- gds.beta.leiden.stats[.estimate], deprecated by gds.leiden.stats[.estimate]
- gds.beta.leiden.stream[.estimate], deprecated by gds.leiden.stream[.estimate]
- gds.beta.leiden.write[.estimate], deprecated by gds.leiden.write[.estimate]
Deprecating gds.alpha.conductance procedures
- gds.alpha.conductance.stream, deprecated by gds.conductance.stream
Deprecating gds.alpha.modularity procedures
- gds.alpha.modularity.stream, deprecated by gds.modularity.stream
- gds.alpha.modularity.stats, deprecated by gds.modularity.stats
Deprecating gds.beta.modularityOptimization procedures
- gds.beta.modularityOptimization.stream[.estimate], deprecated by gds.modularityOptimization.stream[.estimate]
- gds.beta.modularityOptimization.stats[.estimate], deprecated by gds.modularityOptimization.stats[.estimate]
Deprecating gds.beta.influenceMaximization.celf procedures
- gds.beta.influenceMaximization.celf.mutate[.estimate], deprecated by gds.influenceMaximization.celf.mutate[.estimate]
- gds.beta.influenceMaximization.celf.stats[.estimate], deprecated by gds.influenceMaximization.celf.stats[.estimate]
- gds.beta.influenceMaximization.celf.stream[.estimate], deprecated by gds.influenceMaximization.celf.stream[.estimate]
- gds.beta.influenceMaximization.celf.write[.estimate], deprecated by gds.influenceMaximization.celf.write[.estimate]
Deprecating gds.alpha.knn.filtered procedures
- gds.alpha.knn.filtered.mutate, deprecated by gds.knn.filtered.mutate
- gds.alpha.knn.filtered.stats, deprecated by gds.knn.filtered.stats
- gds.alpha.knn.filtered.stream, deprecated by gds.knn.filtered.stream
- gds.alpha.knn.filtered.write, deprecated by gds.knn.filtered.write
Deprecating gds.alpha.nodeSimilarity.filtered procedures
- gds.alpha.nodeSimilarity.filtered.mutate[.estimate], deprecated by gds.nodeSimilarity.filtered.mutate[.estimate]
- gds.alpha.nodeSimilarity.filtered.stats[.estimate], deprecated by gds.nodeSimilarity.filtered.stats[.estimate]
- gds.alpha.nodeSimilarity.filtered.stream[.estimate], deprecated by gds.nodeSimilarity.filtered.stream[.estimate]
- gds.alpha.nodeSimilarity.filtered.write[.estimate], deprecated by gds.nodeSimilarity.filtered.write[.estimate]
Deprecating gds.alpha.closeness.harmonic procedures
- gds.alpha.closeness.harmonic.stream, deprecated by gds.closeness.harmonic.stream
- gds.alpha.closeness.harmonic.write, deprecated by gds.closeness.harmonic.write
Deprecating gds.beta.graph.relationships procedures
- `gds.beta.graph.relati...

15 Sep 14:53

gminneci

2.4.6

32bc9fe

Graph Data Science 2.4.6

`neo4j-graph-data-science-2.4.6`

New features

Added compatibility with Neo4j database 5.12.0.

Bug fixes

Fix a bug where HITS write and mutate procedures failed to parse configuration.

24 Aug 15:20

gminneci

2.4.5

183d62e

2.4.5

`neo4j-graph-data-science-2.4.5`

Bug fixes

Fix a bug in the triangle-related procedures where triangles could be computed incorrectly on graphs with multiple relationship types. The following procedures are affected:
- gds.triangleCount.[stream|mutate|write|stats]
- gds.localClusteringCoefficient.[stream|mutate|write|stats]
- gds.alpha.triangles

17 Aug 13:01

jjaderberg

2.4.4

42f5ffc

Graph Data Science 2.4.4

Bug fixes

Fixed a bug where Arrow processes that are automatically removed after being aborted would not be properly cleaned up.

27 Jul 12:30

jjaderberg

2.4.3

fa1c2c9

Graph Data Science 2.4.3

Improvements

Added COSINE as an available similarityMetric for the gds.nodeSimilarity procedure
When exporting graphs to CSV or using backup and restore, label mapping now allows a wider variety of node label names.

Bug fixes

Fixed a bug where array default values would not be serialized to or deserialized from CSV correctly.
Fixed an issue where Speaker-Listener LabelPropagation and other Pregel procedures wouldn’t stream or mutate on graphs that are not persisted in a Neo4j database.
Fixed a bug in graph restore on AuraDS, which failed after shutdown when a node label name contained special characters or underscores.

27 Jun 11:41

gminneci

2.4.1

180a2ed

Graph Data Science 2.4.1

Bug fixes

Fix a bug in K-Core decomposition that can return invalid values if core values are not consecutive.
Fix a bug when using mutateProperty where using the same name as an existing node property could fail. Affected procedures include:
- gds.alpha.knn.filtered.mutate
- gds.alpha.nodeSimilarity.filtered.mutate
- gds.beta.pipeline.linkPrediction.predict.mutate
- gds.beta.steinerTree.mutate
- gds.beta.spanningTree.mutate
- gds.knn.mutate
- gds.nodeSimilarity.mutate

Improvements

Improved error handling when negative node ids are used as input in the sourceNode, targetNode, sourceNodes, and targetNodes fields.
Improved performance when projecting larger in-memory graphs.

14 Jun 15:39

gminneci

2.4.0

272fce3

Graph Data Science 2.4.0

Breaking changes

The concurrency specified when training a pipeline is now passed to the node property steps. Previously, these steps were executed with the default concurrency of 4 unless overridden. This affects:
- gds.beta.pipeline.linkPrediction.train
- gds.beta.pipeline.nodeClassification.train
- gds.alpha.pipeline.nodeClassification.train

New features

Major

Added Bellman-Ford algorithm
Added K-Core Decomposition algorithm
Added new Common Neighbour Aware Random Walk graph sampling algorithm
Add Random Forest and MLP classifier serialization support. This makes all node classification and link prediction models serializable
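As background on the first item above, the textbook Bellman-Ford algorithm computes single-source shortest paths on graphs that may have negative relationship weights, and can report whether a negative cycle is reachable. The Python sketch below shows the classic formulation only; it says nothing about how GDS implements or parallelizes the algorithm, and the function name and sample graph are assumptions.

```python
def bellman_ford(nodes, edges, source):
    """Textbook Bellman-Ford: returns (distances, has_negative_cycle)."""
    distance = {n: float("inf") for n in nodes}
    distance[source] = 0.0
    # At most |V| - 1 rounds of relaxing every edge are needed.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, weight in edges:
            if distance[u] + weight < distance[v]:
                distance[v] = distance[u] + weight
                changed = True
        if not changed:
            break
    # If any edge can still be relaxed, a negative cycle is reachable.
    has_negative_cycle = any(
        distance[u] + weight < distance[v] for u, v, weight in edges
    )
    return distance, has_negative_cycle


# Hypothetical sample graph with one negative weight and no negative cycle
nodes = ["a", "b", "c"]
edges = [("a", "b", 4.0), ("a", "c", 1.0), ("c", "b", -2.0)]
distances, has_negative_cycle = bellman_ford(nodes, edges, "a")
```

Here the cheapest route to "b" goes through "c" because of the negative weight, which is the case that Dijkstra-style algorithms cannot handle in general.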

Minor

You can rename node properties when writing them back to the Neo4j database using gds.nodeProperties.write by placing them inside a map of the form nodeProperty: 'renamedProperty'.
Added minCommunitySize|minComponentSize parameter to more procedures to allow filtering the result. (Contributed by @airtyon)
Added new procedure gds.alpha.drop.cypherdb to drop created in-memory databases
Added upperDegreeCutoff parameter to Node-Similarity and filtered Node-Similarity algorithm which allows skipping nodes if their degree is higher than the provided value.
Added aggregation to gds.beta.toUndirected to allow the aggregation of the new undirected relationships.
Added new optional parameter storeModelToDisk that automatically saves serializable models after training for licensed users. This affects gds.beta.pipeline.[linkPrediction|nodeClassification].train and gds.beta.graphSage.train.
Added procedure gds.graph.relationshipProperties.write that allows writing relationships with multiple properties to Neo4j.
Cypher Aggregation has graduated, which comes with a new name and API changes:
- The method of projection is now generally called "Cypher projection", possibly with an additional "new" or "v2" qualifier.
  - The existing 'Cypher projection' (gds.graph.project.cypher) is now called "Legacy Cypher projection"
- The procedure name is losing the alpha qualifier and is now called gds.graph.project.
- The old name gds.alpha.graph.project is deprecated and usages will forward to the new name while also adapting to the new API.
- The 4th and 5th parameters nodeConfig and relationshipConfig have been merged into a single dataConfig parameter.
- The properties configuration key in this merged dataConfig parameter has been renamed to relationshipProperties.
- The overall projection configuration (e.g. readConcurrency) has moved from the 6th parameter to the 5th parameter.
Graph data retrieved via the GDS Arrow endpoint can now be partitioned via the FlightInfo endpoint.

Bug fixes

Fixed: the Arrow server no longer allows projecting graphs with blank names
Fixed: Arrow validates dangling relationships when creating an in-memory graph
Fixed: if an arrow process is aborted, creating a new process with the same name is now possible
Fixed a bug where gds.graph.export could fail when exporting larger graphs
Fixed a bug where gds.alpha.kSpanningTree returned incorrect results when called with the nodeLabels parameter.
Fixed a bug where gds.triangleCount would throw an ArrayIndexOutOfBoundsException when called with the nodeLabels parameter.
Fixed a bug where link prediction mutate results could fail when predicted probability is extremely close to zero.

Improvements

Major

Improve parallel runtime of several algorithms due to improvements of our degree-based partitioning. Note that this is highly dataset dependent and may not be visible for all datasets. Affected algorithms are:
- FastRP
- HashGNN
- Leiden
- Approxmaxkcut
- Conductance
- LinkPrediction training
- ToUndirected
Improved partitioning. This affects the parallel runtime of gds.alpha.hits, gds.beta.graph.project.subgraph and gds.beta.pipeline.linkPrediction.predict if sampleRate = 0
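The notes do not describe how the degree-based partitioning works internally. As a general illustration of the idea only, the Python sketch below splits a node id range into contiguous partitions with roughly equal total degree, so that each worker gets a similar number of relationships instead of a similar number of nodes. The function name, the return format, and the greedy strategy are all assumptions.

```python
def degree_partitions(degrees, concurrency):
    """Split node ids 0..n-1 into contiguous ranges of similar total degree.

    Returns (start, end_exclusive, total_degree) tuples.
    """
    total = sum(degrees)
    # Ceiling division: the relationship count each partition should reach.
    target = max(1, -(-total // concurrency))
    partitions = []
    start = 0
    load = 0
    for node, degree in enumerate(degrees):
        load += degree
        if load >= target:
            partitions.append((start, node + 1, load))
            start, load = node + 1, 0
    # Any trailing nodes that did not fill a partition form the last one.
    if start < len(degrees):
        partitions.append((start, len(degrees), load))
    return partitions


# Hypothetical degrees: one hub node followed by low-degree nodes
degrees = [10, 1, 1, 1, 1, 1, 1, 4]
partitions = degree_partitions(degrees, concurrency=2)
```

With a node-count split, the first worker in this example would receive the hub plus three more nodes, which is why the benefit depends so much on the degree distribution of the dataset.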

Minor

Improve progress tracking for gds.beta.graphSage.train. This will enable progress bars on the python client.
Improve error message for invalid nodeLabels and relationshipTypes for procedures supporting memory estimation.
Allow running gds.debug.sysInfo and gds.debug.arrow against the system database.
Improve automatic conversion of array property values during graph projection.
The Yens algorithm can now be run in parallel.
Node regression now verifies upfront that all targetProperty values provided are valid when calling gds.alpha.pipeline.nodeRegression.train.
The scale properties algorithm has been promoted:
- Added new procedures gds.scaleProperties.[stream,mutate] which replace gds.alpha.scaleProperties.[stream,mutate] that are now deprecated
  - The scalers L1Norm and L2Norm are not supported in the new procedures.
- Added new procedures gds.scaleProperties.[stats,write] to return statistics from a scale properties computation and write scaled properties back to a database respectively
- Procedures gds.scaleProperties.[mutate,stats,stream,write] support progress tracking with volumes. This will enable progress bars on the python client
- Procedures gds.scaleProperties.[mutate,stats,write] return statistics from the performed scale computation
- Added new parameter offset to the log scaler. This also affects procedures:
  - gds.pageRank
  - gds.eigenvector
  - gds.articleRank
- Added new procedures gds.scaleProperties.[mutate|stats|stream|write].estimate for estimating the memory requirements of running the scale properties algorithm
- Nodes with missing properties (null or NaN) are now omitted in the scale computation. Their scale value is set to NaN in the output.
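Two of the scale properties changes above can be illustrated with a short Python sketch: missing values (null or NaN) are left out of the computation and come back as NaN, and the log scaler accepts an offset. This is a sketch under stated assumptions, not the GDS implementation: the function names are hypothetical, the min-max formula is the textbook one, the handling of a zero value range is a choice made here, and the log scaler is assumed to compute the natural logarithm of value plus offset.

```python
import math


def is_missing(value):
    """Treat None (null) and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def min_max_scale(values):
    """Min-max scaling that ignores missing values and emits NaN for them."""
    present = [v for v in values if not is_missing(v)]
    if not present:
        return [math.nan for _ in values]
    low, high = min(present), max(present)
    span = high - low
    scaled = []
    for v in values:
        if is_missing(v):
            scaled.append(math.nan)
        elif span == 0:
            scaled.append(0.0)  # assumption: constant input scales to 0
        else:
            scaled.append((v - low) / span)
    return scaled


def log_scale(values, offset=0.0):
    """Assumed form of the log scaler: natural log of (value + offset)."""
    return [math.nan if is_missing(v) else math.log(v + offset) for v in values]


scaled = min_max_scale([2.0, None, 4.0, float("nan"), 6.0])
logged = log_scale([0.0, None], offset=1.0)
```

An offset is useful for properties that can be zero, since the logarithm of zero is undefined.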
Reduce the memory footprint of the binary embeddings saved by gds.beta.hashgnn.mutate.
Promote random forest classifier to beta tier. Added gds.beta.pipeline.[nodeClassification,linkPrediction].addRandomForest which replace gds.alpha.pipeline.[nodeClassification,linkPrediction].addRandomForest that are now deprecated.
Reduced memory allocation for the Spanning Tree algorithm.
A more effective rerouting algorithm is applied for the minimum Directed Steiner-Tree algorithm when the inverted index is present.
Improve memory usage when projecting very large graphs with very high degree nodes.
Additional validation for Cypher projection configuration to guide migration and avoid common mistakes.
The import of nodes with negative id via arrow into a database is now forbidden.
Graph restore now attempts to use the same id map implementation that has been used for the original graph.
Setting the useBadCollector option to true for the arrow database import will now actually trigger errors if the collector encountered a problem.

Contributors

airtyon
