Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Cypher (Neo4j) lexer #1423

Merged
merged 15 commits into from
Apr 14, 2020
Merged

Add a Cypher (Neo4j) lexer #1423

merged 15 commits into from
Apr 14, 2020

Conversation

ggrossetie
Copy link
Contributor

@ggrossetie ggrossetie commented Feb 2, 2020

@pyrmont pyrmont added the needs-review The PR needs to be reviewed label Feb 5, 2020
@pyrmont pyrmont self-assigned this Feb 5, 2020
Copy link
Contributor

@pyrmont pyrmont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mogztter Sorry it's taken such a long time to get to this. Since it's been here for so long, I took the liberty of making the changes myself rather than just ask you to do it. I did have two questions to which I wasn't sure of the answers.

If anything is unclear, let me know!

lib/rouge/lexers/cypher.rb Outdated Show resolved Hide resolved
lib/rouge/lexers/cypher.rb Outdated Show resolved Hide resolved
@pyrmont pyrmont added author-action The PR has been reviewed but action by the author is needed and removed needs-review The PR needs to be reviewed labels Apr 3, 2020
@ggrossetie
Copy link
Contributor Author

Your edits look great, thanks! 👌

ggrossetie and others added 2 commits April 10, 2020 09:27
Co-Authored-By: Michael Camilleri <mike@inqk.net>
Co-Authored-By: Michael Camilleri <mike@inqk.net>
@ggrossetie
Copy link
Contributor Author

@pyrmont I've merged your edits since number can be negative. Please let me know if there anything else that needs to be fixed.

@pyrmont pyrmont added needs-review The PR needs to be reviewed and removed author-action The PR has been reviewed but action by the author is needed labels Apr 10, 2020
@pyrmont
Copy link
Contributor

pyrmont commented Apr 10, 2020

@Mogztter I just tested with the code sample RETURN sign(-17), sign(0.1) from the webpage you linked to before and it's not lexing correctly. I'll try to work out where it's going wrong.

@ggrossetie
Copy link
Contributor Author

@pyrmont I will also have a look 🧐

@ggrossetie
Copy link
Contributor Author

@Mogztter I just tested with the code sample RETURN sign(-17), sign(0.1) from the webpage you linked to before and it's not lexing correctly. I'll try to work out where it's going wrong.

The - symbol is sketchy because it can be used in relations/links:

MERGE (p)-[:DIRECTED]->(m)
RETURN [(a)-->(b) WHERE b:Movie | b.released] AS years
MATCH (joe { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)

But also as a math operator:

[x IN Released | date().year - x + 1] AS YearsAgo

(not sure if the space is mandatory or if we can write [x IN Released | date().year-10] AS YearsAgo)

And in number literal:

RETURN sign(-17), sign(0.1)

Interestingly, the current highlighter in https://neo4j.com/docs/cypher-manual/current/functions/mathematical-numeric/#functions-sign ignores the - symbol (ie. it's not a token)

@jexp
Copy link

jexp commented Apr 10, 2020

Yes, it's treated different in the cypher parser if it's part of a pattern (...)-[...]->(..)
or an expression (minus sign).

see the grammars here: https://www.opencypher.org/resources

@ggrossetie
Copy link
Contributor Author

Thanks @jexp that's very valuable!

(not sure if the space is mandatory or if we can write [x IN Released | date().year-10] AS YearsAgo)

The answer from the EBNF is no, the space is not mandatory:

oC_AddOrSubtractExpression
  :  oC_MultiplyDivideModuloExpression ( ( SP? '+' SP? oC_MultiplyDivideModuloExpression ) | ( SP? '-' SP? oC_MultiplyDivideModuloExpression ) )* ;

oC_UnaryAddOrSubtractExpression
  :  ( ( '+' | '-' ) SP? )* oC_StringListNullOperatorExpression ;

@pyrmont
Copy link
Contributor

pyrmont commented Apr 12, 2020

@Mogztter I added some rules for numbers. Let me know what you think.

@pyrmont pyrmont added author-action The PR has been reviewed but action by the author is needed and removed needs-review The PR needs to be reviewed labels Apr 12, 2020
@ggrossetie
Copy link
Contributor Author

@pyrmont That's great thanks Michael!

Unless I'm missing something in the grammar, -17 is an "UnaryAddOrSubtractExpression" not a negative number.
In fact, the following rule does not match -17 right?

ExponentDecimalReal
  :  ( ( Digit )+ | ( ( Digit )+ '.' ( Digit )+ ) | ( '.' ( Digit )+ ) ) ( 'E' | 'e' ) '-'? ( Digit )+ ;

I think we should also add the Unicode symbols for dash, right arrow head and left arrow head no?

oC_LeftArrowHead
  :  '<'
    | '\u27e8'
    | '\u3008'
    | '\ufe64'
    | '\uff1c'
    ;

oC_RightArrowHead
  :  '>'
    | '\u27e9'
    | '\u3009'
    | '\ufe65'
    | '\uff1e'
    ;

oC_Dash
  :  '-'
    | '\u00ad'
    | '\u2010'
    | '\u2011'
    | '\u2012'
    | '\u2013'
    | '\u2014'
    | '\u2015'
    | '\u2212'
    | '\ufe58'
    | '\ufe63'
    | '\uff0d'
    ;

Should we add the unicode reference or the symbol? I believe that the unicode reference is easier to edit (but less visual)

Dash = '-'
  | '­'
  | '‐'
  | '‑'
  | '‒'
  | '–'
  | '—'
  | '―'
  | '−'
  | '﹘'
  | '﹣'
  | '-'
  ;

@pyrmont
Copy link
Contributor

pyrmont commented Apr 14, 2020

@Mogztter Yeah, I agree—it doesn't look to me like the grammar specifies negative numbers.

I would suggest reverting the changes to the Unicode arrows and dashes. Unless that's actually something that's commonly used in Cypher code, it adds additional overhead to the lexer. Are these characters people use in practice?

@ggrossetie
Copy link
Contributor Author

ggrossetie commented Apr 14, 2020

@pyrmont You're right, I've checked with Neo4j and unicode arrows and dashes are not commonly used as it's really hard to input them anyway.
I will remove them.

I think we are now ready to go? Should I squash my commits?

@pyrmont pyrmont merged commit 66f45b7 into rouge-ruby:master Apr 14, 2020
@pyrmont pyrmont removed the author-action The PR has been reviewed but action by the author is needed label Apr 14, 2020
@pyrmont
Copy link
Contributor

pyrmont commented Apr 14, 2020

@Mogztter Thanks for all your work and, especially, your patience. I realise it took quite a while but it's in master now and will go out in the next version, v3.18.0 :) That's scheduled for release later today! 🎉

@ggrossetie
Copy link
Contributor Author

Yay! Thanks 👍

@jexp
Copy link

jexp commented Apr 16, 2020

Super, thanks so much @pyrmont and @Mogztter !! <3

@ggrossetie ggrossetie deleted the cypher-lexer branch April 16, 2020 12:16
mattt pushed a commit to NSHipster/rouge that referenced this pull request May 21, 2020
This commit adds a lexer for Cypher.

Co-authored-by: Michael Camilleri <mike@inqk.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Request for Cypher query language lexer
3 participants