Releases: neo4j/graph-data-science

Graph Data Science 2.2.5

29 Nov 10:39

Neo4j Graph Data Science 2.2.5 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.
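
The compatibility statement above can be expressed as a small check. This is only a sketch of the rule as written in this release note; the function name `is_supported_neo4j` and the plain `major.minor.patch` version format are assumptions, and the Compatibility Matrix remains the authoritative source.

```python
# Minimum supported patch per 4.x minor line, taken from the note above.
MINIMUM_PATCH = {(4, 4): 9, (4, 3): 15}


def is_supported_neo4j(version: str) -> bool:
    """Return True if the Neo4j version satisfies the rule stated for GDS 2.2.5."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))  # treat "5.2" as "5.2.0"
    major, minor, patch = parts[:3]
    if major == 5:
        return True  # all Neo4j 5 versions are listed as compatible
    minimum = MINIMUM_PATCH.get((major, minor))
    return minimum is not None and patch >= minimum
```

For example, `is_supported_neo4j("4.4.8")` is false under this rule because 4.4 requires patch 9 or later.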

Bug Fixes

  • Some functions would not work as expected with Neo4j 5.x versions:
    • gds.alpha.linkprediction.adamicAdar
    • gds.alpha.linkprediction.commonNeighbors
    • gds.alpha.linkprediction.resourceAllocation
    • gds.alpha.linkprediction.totalNeighbors

Graph Data Science 2.3.0-Alpha02

21 Nov 10:41
Pre-release

GDS 2.3.0-alpha02 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden is promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means is promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • The parameter startNodeId in Spanning Tree algorithms has been replaced with sourceNode.
  • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behavior by specifying the new parameter objective in gds.alpha.spanningTree.
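
As a rough migration aid, the renames above can be applied mechanically to a procedure name and its configuration map. The sketch below is not part of GDS: `migrate_call` is a hypothetical helper, and the objective values "minimum" and "maximum" are an assumption, since the note only names the objective parameter.

```python
# Tier promotions listed in the breaking changes above.
PROMOTED = {
    "gds.alpha.leiden": "gds.beta.leiden",
    "gds.alpha.kmeans": "gds.beta.kmeans",
}

# Removed procedures mapped to an assumed value for the new objective parameter.
REMOVED_SPANNING_TREE = {
    "gds.alpha.spanningTree.minimum": "minimum",
    "gds.alpha.spanningTree.maximum": "maximum",
}


def _has_prefix(procedure, prefix):
    return procedure == prefix or procedure.startswith(prefix + ".")


def migrate_call(procedure, config):
    """Return (procedure, config) rewritten for 2.3.0-alpha02; the input is not modified."""
    config = dict(config)
    for old, objective in REMOVED_SPANNING_TREE.items():
        if _has_prefix(procedure, old):
            # Drop the .minimum/.maximum segment and keep any mode suffix such as .write.
            procedure = "gds.alpha.spanningTree" + procedure[len(old):]
            config["objective"] = objective
    if _has_prefix(procedure, "gds.alpha.spanningTree") and "startNodeId" in config:
        config["sourceNode"] = config.pop("startNodeId")
    for old, new in PROMOTED.items():
        if _has_prefix(procedure, old):
            procedure = new + procedure[len(old):]
    return procedure, config
```

The helper only rewrites names; it does not check that the resulting configuration is otherwise valid for the new procedure.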

New features

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.stream.estimate
    • gds.beta.leiden.write.estimate

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Spanning Tree

  • New modes supported: gds.alpha.spanningTree.(stats, stream, mutate)
  • New yield output for gds.alpha.spanningTree that outputs the sum of weights in the discovered spanning tree.
  • New yield output for gds.alpha.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.alpha.spanningTree.stream.estimate
    • gds.alpha.spanningTree.mutate.estimate
    • gds.alpha.spanningTree.stats.estimate
    • gds.alpha.spanningTree.write.estimate

Bug fixes

  • Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow

  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

  • Better parallelization and improved overall performance.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.

Graph Data Science 2.2.4

21 Nov 10:46

GDS 2.2.4 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

  • gds.alpha.nodeSimilarity.filtered - would give incorrect node IDs.
  • Pregel framework - the computation would not stop after terminating the underlying transaction. This affects gds.pageRank, gds.articleRank, gds.eigenvector.
  • gds.alpha.hits and gds.alpha.sllpa could not be used as a nodeProperty step inside ML pipelines. This affected gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.
  • nodeProperty steps could not be added to ml pipelines when running against Neo4j 5.x. This affected gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.

Improvements

  • gds.graph.list will only calculate the graph size when the procedure is called without any YIELD or if the fields memoryUsage or sizeInBytes are explicitly YIELD-ed.
    Using YIELD to return other fields but not one of memoryUsage or sizeInBytes can speed up the execution time of gds.graph.list.
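
The rule above can be modelled in a few lines. This is a sketch of the behaviour as described in this note, not GDS code; `computes_graph_size` and `graph_list_query` are hypothetical helper names.

```python
# Fields that, per the note above, trigger the graph size calculation.
SIZE_FIELDS = {"memoryUsage", "sizeInBytes"}


def computes_graph_size(yield_fields):
    """True if gds.graph.list would calculate the graph size for these YIELD fields."""
    # No YIELD at all returns every field, so the size is calculated.
    return not yield_fields or bool(SIZE_FIELDS & set(yield_fields))


def graph_list_query(yield_fields=()):
    """Build a Cypher call to gds.graph.list that yields only the given fields."""
    query = "CALL gds.graph.list()"
    if yield_fields:
        fields = ", ".join(yield_fields)
        query += f" YIELD {fields} RETURN {fields}"
    return query
```

Yielding only fields such as graphName and nodeCount therefore avoids the size calculation, while adding sizeInBytes brings it back.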

Graph Data Science 2.2.3

07 Nov 09:55

GDS 2.2.3 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

  • gds.graph.export failed to run on Neo4j 5.X
  • gds.graph.export failed with InvalidRecordException when writeConcurrency is set >1.
  • Enterprise users were unable to load models trained with concurrency > 4.

Improvements

  • Arrow graph import now fully supports external node ids in the 64 Bit space.

Graph Data Science 2.3.0-alpha01

23 Oct 13:42
Pre-release

GDS 2.3.0-alpha01 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

  • Added a parameter consecutiveIds to Leiden to assign consecutive ids for the discovered communities.
  • Added a parameter seedProperty to Leiden to seed initial communities for nodes.
  • Added new configuration parameter focusWeight for Logistic Regression training method, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addLogisticRegression

Bug fixes

  • Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

  • Arrow graph import now fully supports external node ids in the 64 Bit space.
  • Arrow graph import now supports 16, 32 or 64 Bit node identifiers.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.

Graph Data Science 2.2.2

21 Oct 10:17

GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Improvements

  • Graph Data Science ≥2.2.2 now supports Neo4j 5

Graph Data Science 2.2.1

17 Oct 08:44

GDS 2.2.1 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.

Breaking changes

  • Change the content of some fields from the output of gds.debug.arrow:
    • listenAddress now always returns the same content as advertisedListenAddress
    • serverLocation always returns NULL

Graph Data Science 2.2.0

10 Oct 11:45

GDS 2.2.0 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.

Breaking changes

  • Link Prediction filtering:
    • Change graph filtering in gds.beta.pipeline.linkPrediction.train
      • Replace parameter nodeLabels with sourceNodeLabel and targetNodeLabel.
      • Replace parameter relationshipTypes with targetRelationshipType.
    • Change graph filtering in gds.beta.pipeline.linkPrediction.predict
      • Replace parameter nodeLabels with optional sourceNodeLabels and targetNodeLabels. By default, they will be derived from the model's train configuration.
      • Change the default value for relationshipTypes to the targetRelationshipType from the model's train configuration.
  • Node Classification & Regression filtering:
    • Change graph filtering in gds.beta.pipeline.nodeClassification.train and gds.beta.pipeline.nodeRegression.train
      • Replace parameter nodeLabels with targetNodeLabels
    • Change graph filtering in gds.beta.pipeline.nodeClassification.predict and gds.beta.pipeline.nodeRegression.predict
      • Replace parameter nodeLabels with targetNodeLabels. By default, they will be derived from the model's train configuration.
  • Promoting Collapse Path to beta tier
    • Changed the procedure name to gds.beta.collapsePath.mutate
    • Use parameter pathTemplates to specify multiple path templates.
  • Promoting CELF to beta tier
    • Moved gds.alpha.influenceMaximization.celf.stream to gds.beta.influenceMaximization.celf.stream
  • For graphs created with gds.graph.project.cypher, reduce the output of gds.graph.list to only print the names of parameters. This avoids printing the parameter values, which could lead to long procedure execution times.
  • RandomWalk algorithm promoted to product tier
    • gds.beta.randomWalk.stats => gds.randomWalk.stats
    • gds.beta.randomWalk.stats.estimate => gds.randomWalk.stats.estimate
    • gds.beta.randomWalk.stream => gds.randomWalk.stream
    • gds.beta.randomWalk.stream.estimate => gds.randomWalk.stream.estimate
  • Removed debug_log config field from Arrow Create Database action.
  • Node2Vec uses new embedding initializer NORMALIZED as default.
  • Dropped support for older patches:
    • for 4.3, only 4.3.15 and later is supported
    • for 4.4, only 4.4.9 and later is supported

New features

  • Link Prediction filtering:
    • Supports heterogeneous LinkPrediction pipelines by allowing configuring which node labels and relationship type to train and predict for.
    • See Breaking changes above for more details.
  • K-means:
    • Added centroids and average node-centroid distance to result for Mutate, Stats, and Write modes.
    • Added distance to centroid per node result in Stream mode.
    • Introduced a parameter numberOfRestarts that runs K-Means multiple times and picks the one with the minimum node-centroid distance.
    • Introduced a parameter computeSilhouette that if enabled will compute silhouette related metrics.
    • Introduced a parameter initialSampler which can select different sampling strategies for picking the first centroids.
      • Added the K-means++ initialization algorithm which can be enabled by setting initialSampler=kmeans++.
    • Introduced the parameter seedCentroids which seeds input centroids to K-means (as an alternative to the sampling strategies above).
  • Introduced a new scaler Center for ScaleProperties that subtracts the mean from each value.
  • Expose penaltyL2 to configure the l2 regularization term to the loss function in gds.beta.graphSage.train.
  • Add Multilayer Perceptron as a training method for node classification (gds.alpha.pipeline.nodeClassification.addMLP) and link prediction (gds.alpha.pipeline.linkPrediction.addMLP).
  • Add SAME_CATEGORY feature type to gds.beta.pipeline.linkPrediction.addFeature.
  • Added new procedure gds.beta.graph.relationships.stream that streams relationship topology.
  • Added Arrow export endpoint gds.beta.graph.relationships.stream that streams relationship topology.
  • Added new procedure gds.alpha.graph.sample.rwr that creates a new graph projection by sampling using random walk with restarts.
  • Added the ability to collapse multiple paths using gds.beta.collapsePath.mutate.
  • Promoting CELF algorithm to beta tier.
    • Added gds.beta.influenceMaximization.celf.stats
    • Added gds.beta.influenceMaximization.celf.mutate
    • Added gds.beta.influenceMaximization.celf.write
    • Added progress tracking capabilities.
    • Added memory estimation.
  • Progress tracking for KMeans algorithm.
  • Memory estimation for KMeans.
    • added gds.alpha.kmeans.mutate.estimate
    • added gds.alpha.kmeans.stats.estimate
    • added gds.alpha.kmeans.stream.estimate
    • added gds.alpha.kmeans.write.estimate
  • Added procedure to compute modularity for pre-computed communities.
    • gds.alpha.modularity.stats
    • gds.alpha.modularity.stream
  • Added new config options to the GDS Flight server.
    • gds.arrow.encryption.never deactivates the server encryption even if it would otherwise be enabled.
    • gds.arrow.advertised_listen_address sets the server location that clients should connect to.
  • Added support for importing String node identifiers for the Arrow CREATE_DATABASE action.
  • Added capability to run BetweennessCentrality using relationship weights.
    • Added relationshipWeightProperty optional configuration parameter.
  • Added stats mode procedures for RandomWalk.
    • gds.beta.randomWalk.stats
    • gds.beta.randomWalk.stats.estimate
  • Introduced the ability to configure defaults and limits for configuration parameters.
    • gds.alpha.config.defaults.list
    • gds.alpha.config.defaults.set
    • gds.alpha.config.limits.list
    • gds.alpha.config.limits.set
  • Introduce new configuration parameters contextNodeLabels and contextRelationshipTypes in nodePropertySteps.
    • gds.beta.pipeline.linkPrediction.addNodeProperty
    • gds.beta.pipeline.nodeClassification.addNodeProperty
    • gds.alpha.pipeline.nodeRegression.addNodeProperty
    • The context is used to enlarge the input graph to the node property steps when running gds.beta.pipeline.linkPrediction.[train|predict], gds.beta.pipeline.nodeClassification.[train|predict] and gds.alpha.pipeline.nodeRegression.[train|predict].
  • Leiden
    • Add capability to mutate intermediateCommunities when includeIntermediateCommunities is set to true.
    • Add capability to write intermediateCommunities when includeIntermediateCommunities is set to true.
  • Node2Vec adds new embedding initializer NORMALIZED configured with the parameter embeddingInitializer.
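
The numberOfRestarts behaviour described under K-means above (run several times, keep the run with the minimum node-centroid distance) can be sketched as follows. This is an illustrative plain-Python version, not the GDS implementation: it assumes uniform initial sampling, a fixed iteration count, and that the parameter counts the total number of runs, and the function names are hypothetical.

```python
import math
import random


def run_kmeans(points, k, rng, iterations=10):
    """One K-means run; returns (centroids, average node-centroid distance)."""
    centroids = rng.sample(points, k)  # uniform initial sampling (assumption)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    average_distance = sum(
        min(math.dist(point, c) for c in centroids) for point in points
    ) / len(points)
    return centroids, average_distance


def kmeans_with_restarts(points, k, number_of_restarts, seed=0):
    """Run K-means number_of_restarts times and keep the run with the smallest distance."""
    rng = random.Random(seed)
    runs = [run_kmeans(points, k, rng) for _ in range(number_of_restarts)]
    return min(runs, key=lambda run: run[1])
```

With a fixed seed, raising number_of_restarts can only keep or lower the returned average distance, because earlier runs remain among the candidates.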

Bug fixes

  • Fixed a bug where eager checking for business rules around GDS on a Neo4j cluster would cause the cluster to fail to start.
  • Fixed a bug where Neo4j users with admin role could not see all graphs in the catalog on GDS enterprise.
  • Fixed a bug in random graph generation where the resulting graph can end up with an incorrect relationship schema.
  • Fixed a bug where a schema filter would not create a deep copy of the property schema map.
  • Fixed a bug where modularity could have been incorrectly updated in ModularityOptimization. This may affect the number of performed iterations for ModularityOptimization or number of levels for Louvain.
  • Fixed a bug where restoring from csv could not read values wrapped in quotes.
  • Fixed a bug where KNN did not use the expected search space. This will improve the result but also increase the runtime.
  • Fixed a bug in ML autotuning where maxTrials included model evaluations with concrete configs.
  • Fixed a bug where gds.triangleCount and gds.localClusteringCoefficient were allowed to run on directed graphs.
  • Fixed a bug in gds.graph.export and Arrow DB import where the writeConcurrency was not respected.
  • Fixed a bug with Node Operations where gds.graph.nodeProperties.write, gds.graph.nodeProperties.drop, gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream would not accept String input for parameters nodeLabels and/or nodeProperties.
  • Fixed a bug where Node2Vec would report negative losses.
  • Fixed a bug with gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream, where the wrong nodes were returned when specifying a nodeLabels filter and using Arrow.
  • Fixed a bug in the Louvain algorithm, where aggregating dense communities could potentially lead to an exception.
  • Fixed a bug where model loading was attempted even for unlicensed users, which could cause database startup to fail.

Improvements

  • Better error handling in K-means
  • Improve memory estimation for gds.beta.pipeline.linkPrediction.train when the nodePropertySteps used a weighted graph.
  • Improve runtime of feature generation in gds.beta.linkPrediction.[train|predict].
  • Improve performance of gds.graph.project.cypher by using the subscriber API.
  • Improve convergence criteria for LogisticRegression and LinearRegression trainers, by making it independent of the number of batches. This affects gds.alpha.pipeline.nodeRegression.train, gds.beta.pipeline.[linkPrediction|nodeClassification].train.
  • Improve error handling on invalid user input.
  • Cypher on GDS projections is now capable of setting labels on nodes.
  • Promoting CELF algorithm to `bet...

Graph Data Science 2.1.13

29 Sep 18:22

GDS 2.1.13 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.

Bug Fixes

  • gds.graph.nodeProperties.write, gds.graph.nodeProperties.drop, gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream now accept String input for parameters nodeLabels and/or nodeProperties.
  • gds.graph.nodeProperty.stream and gds.graph.nodeProperties.stream would return the wrong nodes when specifying a nodeLabels filter and using Arrow.
  • Louvain algorithm would throw an exception when aggregating dense communities.

Improvements

  • Export to CSV now enabled when GDS is running on a Causal Cluster Read Replica
    • gds.beta.graph.export.csv
    • gds.beta.graph.export.csv.estimate

Graph Data Science 2.1.12

15 Sep 16:30

GDS 2.1.12 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.

For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.

Improvements

  • New procedures for enabling and disabling Arrow database import (default: enabled)
    • gds.features.enableArrowDatabaseImport
    • gds.features.enableArrowDatabaseImport.reset