Path Pattern Queries #187
This proposal should eventually be able to fulfil the requirements outlined in #179.
Nice CIP!
The direction of each relationship is governed by the overall direction of the Regular Path Pattern.
It is however possible to explicitly define the direction for a particular part of the pattern.
This is done by either prefixing that part with `<` for a right-to-left direction or suffixing it with `>` for a left-to-right direction.
We could use 'outgoing' instead of 'left-to-right', and 'incoming' instead of 'right-to-left'.
That would imply a particular traversal order, which I think is saying too much. With `left-to-right` and `right-to-left` we are only talking about the direction with regard to how the pattern is written.
In the case of a Defined Path Predicate where both nodes are the same, the direction of the predicate is irrelevant.
In general the direction of a Defined Path Predicate is quite important, and used for mapping the pattern in the predicate into the Regular Path Patterns that reference it.
The only cases where it is allowed to omit the direction of a Defined Path Predicate is when the defined predicate is reflexive.
Will it be the job of the query compiler to determine reflexiveness of the defined predicate, and issue an error if non-reflexive predicates are defined (or used) without direction?
I'd say that either the query compiler issues an error, or the behaviour of the path predicate is undefined (i.e. the behaviour in such a case is outside of the scope of the specification).
I'd prefer for an implementation to issue an error, but I don't know how hard that analysis is to perform, and I thus don't want to mandate that at this point (it needs further analysis).
----

In the case of a Defined Path Predicate where both nodes are the same, the direction of the predicate is irrelevant.
In general the direction of a Defined Path Predicate is quite important, and used for mapping the pattern in the predicate into the Regular Path Patterns that reference it.
This applies to the definition of a predicate, right? I imagine that it will be perfectly possible to reverse the direction in the Regular Path Pattern, like so:
MATCH (a)<-/pred/-(b)
PATH (a)-/pred/->(b) IS
(a)-[:KNOWS]->(b)
How about it being used in an undirected Regular Path Pattern?
MATCH (a)-/pred/-(b)
PATH (a)-/pred/->(b) IS
(a)-[:KNOWS]->(b)
Both of those are perfectly valid. The undirected use of a directed defined path predicate is the same as an OR/UNION between the two directions.
I.e. `(a)-/pred/-(b)` is the same as `(a)-/<pred | pred>/-(b)` (which is the same as `(a)-/pred> | <pred/-(b)` or `(a)-/<pred>/-(b)` or `(a)<-/pred/->(b)`).
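The union reading described in the reply above can be sketched outside Cypher. The following Python snippet is only an illustration under stated assumptions: the toy `KNOWS` edge set and all function names are hypothetical, and `pred` is taken to be the single-relationship pattern `(a)-[:KNOWS]->(b)` from the question. It is not how any Cypher implementation evaluates path patterns.

```python
# Hypothetical toy graph: directed KNOWS relationships as (source, target) pairs.
KNOWS = {("alice", "bob"), ("bob", "carol")}


def pred_forward(edges):
    """Pairs (a, b) matched by (a)-/pred/->(b), with pred defined as (a)-[:KNOWS]->(b)."""
    return set(edges)


def pred_reverse(edges):
    """Pairs (a, b) matched by (a)<-/pred/-(b): the forward matches with the ends swapped."""
    return {(b, a) for (a, b) in edges}


def pred_undirected(edges):
    """Pairs (a, b) matched by (a)-/pred/-(b): the union of both directions."""
    return pred_forward(edges) | pred_reverse(edges)


forward = pred_forward(KNOWS)
reverse = pred_reverse(KNOWS)
undirected = pred_undirected(KNOWS)
```

In this sketch the undirected result contains every forward pair together with its mirror image, which is the OR/UNION behaviour the reply describes.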
== Regular Path Patterns

Above and beyond the types of patterns that can be expressed in Cypher using the normal path syntax, Cypher also supports what amounts to regular expressions over paths.
This functionality is called Regular Path Patterns.
I think it would be a good idea to mention somewhere that these are essentially RPQs, since RPQs are the standard term for these sorts of queries.
A Regular Path Pattern is defined as:
For 1 - 4 below, it would be great if these could be supplemented with regex notation
As in an example of what the syntax looks like?
Contrary to Relationship Patterns, Regular Path Patterns do _not_ allow binding a relationship to a variable.
In order to bind the matching path to a variable, a Path Assignment should be used, by preceding the path with an identifier and an equals sign (`=`).
This avoids a problem that existed in the past with repetition of relationships (a syntax that was deprecated with the introduction of Regular Path Patterns), where a relationship variable would bind to a list, making it hard to express predicates over the actual relationships.
Is it not meant to be "a syntax that is deprecated" (as RPPs are only being introduced now)?
I've tried writing the text as if Regular Path Patterns are already in the language, since when this text is merged, it will be in the language. Although "is" works in that case as well.
Aha - ok, that makes sense
The direction of each relationship is governed by the overall direction of the Regular Path Pattern.
It is however possible to explicitly define the direction for a particular part of the pattern.
This is done by either prefixing that part with `<` for a right-to-left direction or suffixing it with `>` for a left-to-right direction.
It is possible to both prefix the part with `<` and suffixing it with `>`, giving that part the interpretation of being undirected.
"suffixing" -> "suffix"
The "giving that part the interpretation of being undirected" section is a bit unwieldy - maybe something along the lines of "indicating that that part of the pattern is undirected"
5b64198 to b317753
I've updated the document to unify the Path Pattern syntax with normal Pattern syntax. I have not yet updated the grammar to reflect that; I'll do that imminently.
I think it is important to note which constructs that are currently valid Cypher will have changed semantics under this proposal:
Also update examples to fit updated syntax.
Notable example queries that cannot be expressed using this syntax include:
The direction of each relationship is governed by the overall direction of the Path Pattern.
It is however possible to explicitly define the direction for a particular part of the pattern.
This is done by either prefixing that part with `<` for a right-to-left direction or suffixing it with `>` for a left-to-right direction.
It is possible to both prefix the part with `<` and suffix it with `>`, indicating that this part of the pattern matches in any direction.
This section is mostly repetition of the section Directions above. Perhaps replace this with a reference to that section? Something like:
Using the arrowhead syntax introduced in [[directions]], consider the following query
[source, cypher]
.Find chains of co-authorship
----
PATH PATTERN co_author = (a)-[:AUTHORED]->(:Book)<-[:AUTHORED]-(b)
The example DPP does define direction of the relationships, which perhaps it should not, in order to be an example of when leaving the definition out is okay (reflexivity)?
But it is an example of when the named path pattern itself is undirected - and the direction is left out on the next row.
Ah, so the meaning is that it is allowed to omit the direction of a path pattern that references a DPP, but only when the DPP is reflexive. Got it.
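To make the point of this thread concrete, here is a small Python sketch. It assumes that "reflexive" in the discussion means the pattern matches the same pairs of end nodes when read in either direction; that reading, the toy `AUTHORED` data, and all function names are assumptions for illustration and are not taken from the CIP.

```python
# Hypothetical data: (author, book) pairs standing in for AUTHORED relationships.
AUTHORED = {
    ("ann", "book1"),
    ("ben", "book1"),
    ("ben", "book2"),
    ("cat", "book2"),
}


def co_author(authored):
    """Pairs (a, b) matched by (a)-[:AUTHORED]->(:Book)<-[:AUTHORED]-(b).

    Assumption: a and b must be distinct, since the two AUTHORED
    relationships in the pattern have to be different relationships.
    """
    return {
        (a, b)
        for (a, book_a) in authored
        for (b, book_b) in authored
        if book_a == book_b and a != b
    }


def direction_is_irrelevant(pairs):
    """True if every matched pair (a, b) is also matched as (b, a)."""
    return all((b, a) in pairs for (a, b) in pairs)


co_author_pairs = co_author(AUTHORED)
# A directed single-relationship pattern, for contrast (hypothetical data).
knows_pairs = {("ann", "ben")}
```

Under these assumptions `co_author` yields a mirrored pair for every match, so omitting the direction when referencing it loses nothing, whereas the one-way `knows_pairs` set does not have that property.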
Two things:
For example, drawn from life sciences, it would be good to see these queries:
Instead of 'Defined Path Predicates'
Including examples of what cannot be expressed, or is hard to express.
==== Differing property values along a path

While it is possible to express that a certain property should have the same value for all nodes in a path (by saying that each pair of nodes should have the same property value), it is not possible to express that all nodes should have a _different_ property value.
It has been shown that computing such paths would not be tractable in the general case, so perhaps it is a good thing to not be able to express this.
Perhaps we should add a reference to this claim.
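The asymmetry between "all the same" and "all different" in the quoted section can be illustrated with a short Python sketch over the list of property values along a path. The function names and sample values are hypothetical, and the sketch says nothing about tractability; it only shows why a per-pair check along the path is enough for equality but not for distinctness.

```python
def all_same(values):
    """Each adjacent pair is equal, so by transitivity every value on the path is equal."""
    return all(x == y for x, y in zip(values, values[1:]))


def adjacent_differ(values):
    """Each adjacent pair differs; this is all a per-step check along the path can say."""
    return all(x != y for x, y in zip(values, values[1:]))


def all_different(values):
    """Every value on the path is distinct; this needs a view of the whole path."""
    return len(set(values)) == len(values)


# Hypothetical property values along a three-node path.
path_values = [1, 2, 1]
```

For `path_values`, every adjacent pair differs, yet the first and last node share a value, so the per-pair check passes while the whole-path check fails.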
I wrote up some slides that give an overview of the history of this proposal, what it has been influenced by, and what other things have been influenced by it. This information should provide some insight into some of the design choices made in this CIP. The slides are available on the opencypher.org/references page.
[source, ebnf]
----
NamedPathPredicate = 'PATH', 'PATTERN', NamedPathName, '=', PathPattern, [Where] ;
It looks like the examples do not match the grammar for `NamedPathPredicate`. In examples like
PATH PATTERN unreciprocated_love = (a)-[:LOVES]->(b)
we see that what follows `=` in `NamedPathPredicate` should be something like `PatternPart` (maybe without the `[Variable, '=']` part) instead of `PathPattern`.
So I think the correct rule should be
NamedPathPredicate = 'PATH', 'PATTERN', NamedPathName, '=', NodePattern, {(EdgePattern | PathPattern), NodePattern}, [Where] ;
Yes, you are right, thank you!
This is a first draft of a proposal for adding Path Patterns to Cypher. There is still work to be done here before this is finalised.
CIP2017-02-06