Configurable pattern matching semantics in response to #174 #175
I like the direction of this.
The default uniqueness mode used by `MATCH` (without a further specification of the preferred uniqueness mode) is relationship-unique matching.

`MATCH ALL` does not reject any paths - not even paths containing cycles - and hence can lead to infinite result sets for the whole query.
It is recommended that implementations generate at least a warning when static analysis is not able to proof query termination due to the chosen uniqueness mode.
proof -> prove
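The termination concern in the quoted text can be illustrated with a small sketch. It is not taken from the CIP: the two-relationship graph `EDGES`, the helper `walks`, and the `max_len` cut-off are all hypothetical, and the cut-off exists only so the example halts. Without relationship uniqueness the number of results grows with the cut-off, while with it the result set stays fixed.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy graph: relationship id -> (start node, end node).
# r1 and r2 form a cycle between nodes "a" and "b".
EDGES: Dict[str, Tuple[str, str]] = {"r1": ("a", "b"), "r2": ("b", "a")}


def walks(start: str, max_len: int, unique_rels: bool) -> Iterator[List[str]]:
    """Yield every directed walk from `start` with 1..max_len relationships.

    With unique_rels=True only trails (no repeated relationship) are yielded.
    """
    stack: List[Tuple[str, List[str]]] = [(start, [])]
    while stack:
        node, rels = stack.pop()
        if rels:
            yield rels
        if len(rels) == max_len:
            continue  # artificial cut-off; an unbounded walk search would go on
        for rel, (src, dst) in EDGES.items():
            if src != node:
                continue
            if unique_rels and rel in rels:
                continue  # relationship-unique matching rejects this extension
            stack.append((dst, rels + [rel]))


def count(max_len: int, unique_rels: bool) -> int:
    """Number of results found from node "a" under the given cut-off."""
    return sum(1 for _ in walks("a", max_len, unique_rels))
```

Raising `max_len` keeps adding results when `unique_rels` is false, which is the behaviour the warning recommendation is aimed at.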
=== Proposal: Default uniqueness mode

Additionally, it is proposed that a conforming implementation should provide a pre-parser option for defining a default uniqueness level for use with regular pattern matching.
I'm not convinced this kind of recommendation belongs in a CIP. Is it not well understood that an implementing system would provide ways of changing defaults?
* `closed(p)`: true if the start and the end node of `p` are the same node
* `trail(p)`: true if `p` contains no duplicate relationships
* `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node
* `trek(p)`: true if `p` contains two identical consecutive relationships
What does `identical` mean here? Same rel-type? Same type and properties? Equal?
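For readers following the thread, the quoted predicates can be sketched in Python. This is an illustration under stated assumptions, not the CIP's definition: a path is modelled as a list of node ids plus a list of relationship ids (both representations are hypothetical), `simple` follows the quoted wording literally, and `trek` assumes that "identical" means the same relationship identity, which is exactly the point the question above leaves open.

```python
from typing import Hashable, Sequence


def closed(nodes: Sequence[Hashable]) -> bool:
    """True if the start node and the end node are the same node."""
    return nodes[0] == nodes[-1]


def trail(rels: Sequence[Hashable]) -> bool:
    """True if no relationship occurs more than once."""
    return len(set(rels)) == len(rels)


def simple(nodes: Sequence[Hashable], rels: Sequence[Hashable]) -> bool:
    """Literal reading of the quoted text: a trail that either has no
    duplicate nodes at all or starts and ends at the same node."""
    no_duplicate_nodes = len(set(nodes)) == len(nodes)
    return trail(rels) and (no_duplicate_nodes or closed(nodes))


def trek(rels: Sequence[Hashable]) -> bool:
    """True if two consecutive relationships are identical.

    Assumption: "identical" is taken to mean equal relationship ids.
    """
    return any(a == b for a, b in zip(rels, rels[1:]))
```

Under the identity reading, `trek` can only be true when a relationship is traversed back and forth or is a self-loop; a type-based reading would give a different predicate.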
* `trail(p)`: true if `p` contains no duplicate relationships
* `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node
* `trek(p)`: true if `p` contains two identical consecutive relationships
* `repetetive(p)`: true if `p` contains any closed subpath `q` of `size > 1` that is immediately repeated after itself in `p`
repetitive
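The last predicate in the quoted hunk is the least obvious one, so a rough sketch may help. It assumes, beyond what the text states, that `size` counts relationships and that a path is given as a node-id list plus a relationship-id list; the function name uses the corrected spelling.

```python
from typing import Hashable, Sequence


def repetitive(nodes: Sequence[Hashable], rels: Sequence[Hashable]) -> bool:
    """True if some closed subpath q with more than one relationship is
    immediately followed by a second copy of itself.

    Assumption: "size" is the number of relationships in q.
    """
    n = len(rels)
    for size in range(2, n // 2 + 1):
        for i in range(0, n - 2 * size + 1):
            # q covers rels[i : i + size]; it is closed if it returns to nodes[i]
            q_closed = nodes[i] == nodes[i + size]
            repeated = rels[i:i + size] == rels[i + size:i + 2 * size]
            if q_closed and repeated:
                return True
    return False
```

Going around the same two-relationship cycle twice is the smallest case this sketch reports as repetitive.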
RETURN p
----

Note that these functions naturally extend to lists.
Do you mean lists generally, or lists containing only nodes and relationships? I'm not sure I follow; what does `trail(list)` yielding `true` mean? That the list is a trail?
Changing the uniqueness mode of a sub query recursively changes the default uniqueness mode for all contained `MATCH` clauses unless it is overridden again. Examples:

* `MATCH <uniqueness-modes> { MATCH ... } ...`
* `DO <uniqueness-modes> { MATCH ... } ...`
Are `MATCH` and `DO` (this is the first time it appears on this repo I think) the two cases where you'd be able to supply these modes? What about `MERGE`?
== Motivation

Currently Cypher uses pattern matching semantics that treats all patterns that occur in a `MATCH` clause as a unit (called a *uniqueness scope*) and only considers pattern instances that bind different relationships to each fixed length relationship pattern variable and to each element of a variable length relationship pattern variable.
This has come to be called *cypermorphism* informally and is a variation of edge isomorphism.
I thought these two were synonymous; what is the variation?
'Academic' edge isomorphism only talks about a single, connected candidate walk while cyphermorphism considers all relationships bound by any pattern in the same match (even relationships bound by different, disconnected walks) for uniqueness.
Aha! Thanks for the clarification.
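The distinction drawn in the clarification above (uniqueness per connected walk versus uniqueness across everything bound in one match) can be sketched as two checks. Both function names and the list-of-relationship-ids representation are hypothetical and only serve the illustration.

```python
from typing import Hashable, List, Sequence


def unique_per_walk(walks: Sequence[Sequence[Hashable]]) -> bool:
    """Each walk on its own repeats no relationship (the per-walk reading)."""
    return all(len(set(walk)) == len(walk) for walk in walks)


def unique_per_match(walks: Sequence[Sequence[Hashable]]) -> bool:
    """No relationship is bound twice anywhere in the match, even by
    different, disconnected walks (the match-wide reading)."""
    bound: List[Hashable] = [rel for walk in walks for rel in walk]
    return len(set(bound)) == len(bound)
```

Two walks that share relationship `r2`, such as `[["r1", "r2"], ["r2", "r3"]]`, pass the first check but fail the second.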
I don't think that this is a difference of "morphism". If one followed strict isomorphism ("path isomorphism" in Walks, Trails, Paths terms, no repeated vertices, and therefore also no repeated edges), then Cypher's current "pattern gluing" rules would apply (unless we change those rules), and we would end up evaluating matches against the compound, glued pattern, but using isomorphic semantics. Gluing may be syntactic salt, but is orthogonal to "morphism". Cyphermorphism, in my view, is no different to "Trail morphism", or "edge isomorphism".
- Add more queries
- Add EBNF grammar of proposed changes
- Move definition of `disjoint()` to examples
* Now in Appendix
* "bind" -> "class" (former deprecated)
* Added example
Just for completeness: there is a fourth option (injective vertices, non-injective edges): (a)-[e1]->(b), (a)-[e2]->(b). In this case, a and b have to be distinct, but e1 and e2 can match the same edge.
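To make the combination in this comment concrete, here is a small sketch that checks node and relationship injectivity independently for one binding of pattern variables to graph elements. The `classify` helper and the element ids `n1`, `n2`, `x` are hypothetical; the two booleans together cover all four combinations.

```python
from typing import Dict, Hashable, Tuple


def classify(node_bindings: Dict[str, Hashable],
             rel_bindings: Dict[str, Hashable]) -> Tuple[bool, bool]:
    """Return (nodes_injective, rels_injective) for one match binding.

    A binding is injective when no two variables map to the same element.
    """
    nodes_injective = len(set(node_bindings.values())) == len(node_bindings)
    rels_injective = len(set(rel_bindings.values())) == len(rel_bindings)
    return nodes_injective, rels_injective


# The case from the comment: a and b are distinct, e1 and e2 share one edge.
fourth_option = classify({"a": "n1", "b": "n2"}, {"e1": "x", "e2": "x"})
```

Here `fourth_option` evaluates to `(True, False)`: injective vertices, non-injective edges.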
| 'DIFFERENT', ('RELATIONSHIPS' | 'EDGES'), [ VariableList ]
PatternMorphism = 'DIFFERENT', ('NODES' | 'VERTICES')
| 'DIFFERENT', ('RELATIONSHIPS' | 'EDGES')
| 'DIFFERENT', [ VariableList ]
Optional `VariableList`? Is that really right?
As we can see above, patterns in Cypher consist of a comma-separated list of _pattern parts_, where a pattern part is exemplified by `p = (e:Employee)-[:REPORTS_TO*1..3]->(m:Manager)`.
PathClass = 'WALK'
| 'TRAIL'
| [ 'SIMPLE' ], 'PATH'
Do `TRAIL`, `PATH` and `SIMPLE PATH` really encode three different classes? If not, I wonder why synonyms are allowed. (A `WALK` is obviously different from those three.)
Note that this CIP is in a heavy state of flux in order to allow for alignment with ongoing discussions.
Aims to solve #174
Hello, are there any updates regarding this CIP? I am very interested in the proposed
View it here:
https://github.com/boggle/openCypher/blob/isomatch/cip/1.accepted/CIP2017-01-18-configurable-pattern-matching-semantics.adoc