Rewrote parser in tril #4345

Yuehan-Lin · 2019-09-23T18:10:34Z

Replaced the yacc and lex sources with a hand-written C++ file.
In parser.cpp, a lexer first scans and tokenizes the Tril input; the tokens are then parsed according to the grammar rules to build the AST.

Fixes: #4489

Signed-off-by: Yuehan-Lin <Yuehan.Lin@ibm.com>

fjeremic · 2019-09-23T21:42:59Z

We're going to need some Doxygen documentation here. There is not a single comment in the entire file that was added. We'll also need a more descriptive commit message for this rather large change.

Taking a quick glance here, it seems the lexer and parser are implemented using a bottom-up approach with an LR(x) parser. What is the value of x? Looking at the code, I'm a little surprised we went with an LR parser to parse Tril, given that the grammar for the language is very Lisp-like, so it is trivially parsable with a recursive descent LL(2) parser, which is easy to read (from a code perspective) and to extend to support more features in Tril. Will the current implementation be easy to extend to parse things like node flags, etc.?

Having said that, the lexer and parser here are quite concise, so kudos on that! Good work!

fjeremic · 2019-09-23T21:44:00Z

@genie-omr build all

Leonardo2718 · 2019-09-23T22:41:22Z

Sorry, I accidentally closed this PR while writing this comment.

> We're going to need some Doxygen documentation here. There is not a single comment in the entire file that was added. We'll also need a more descriptive commit message for this rather large change.

Completely agree. In particular, we should document the Tril grammar so it's easy to see what the parser is looking for.

> Taking a quick glance here, it seems the lexer and parser are implemented using a bottom-up approach with an LR(x) parser. What is the value of x? Looking at the code, I'm a little surprised we went with an LR parser to parse Tril, given that the grammar for the language is very Lisp-like, so it is trivially parsable with a recursive descent LL(2) parser, which is easy to read (from a code perspective) and to extend to support more features in Tril.

Actually, the parser is already in recursive descent form. Parsing is top-down, but the AST is built bottom-up. The distinction is that the decision of which production rule to apply is made in a top-down fashion. That is, we figure out which AST nodes to construct top-down. Once we know which node to construct, instead of constructing it right away, we wait for the children to be constructed. Basically, instead of

if (token == '(') {
  // construct the node first, then attach its children
  auto node = new Node();
  node->children = parseChildren();
}

we do

if (token == '(') {
  // parse the children first, then construct the node from them
  auto children = parseChildren();
  auto node = new Node(children);
}

The point in the code where we figure out which node to construct is the same; it's just where we construct it that's different.
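To make the distinction concrete, here is a minimal sketch of that pattern for a simplified s-expression grammar. The grammar, the `Node` type, and the `tokenize`/`parseNode` names are all hypothetical and chosen for illustration; they are not the actual Tril grammar or the code in this PR. The production is selected as soon as the '(' is seen (top-down), but the node object is only created once all of its children have been parsed (bottom-up construction).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical grammar, for illustration only:
//   node := '(' NAME node* ')'

// Hypothetical AST node; not Tril's actual AST type.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    Node(std::string n, std::vector<std::unique_ptr<Node>> c)
        : name(std::move(n)), children(std::move(c)) {}
};

using Tokens = std::vector<std::string>;

// Split the input into "(", ")" and NAME tokens, skipping whitespace.
Tokens tokenize(const std::string& src) {
    Tokens tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;
        } else if (c == '(' || c == ')') {
            tokens.push_back(std::string(1, c));
            ++i;
        } else {
            const std::size_t start = i;
            while (i < src.size() && src[i] != '(' && src[i] != ')' &&
                   !std::isspace(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        }
    }
    return tokens;
}

// Parse one node. On return, `it` points to the token immediately
// after the ')' that closes the parsed node.
std::unique_ptr<Node> parseNode(Tokens::const_iterator& it,
                                Tokens::const_iterator end) {
    if (it == end || *it != "(") throw std::runtime_error("expected '('");
    ++it;  // consume '(' -- the production is chosen here, top-down

    if (it == end || *it == "(" || *it == ")")
        throw std::runtime_error("expected node name");
    std::string name = *it;
    assert(!name.empty());  // tokenize() never emits empty tokens
    ++it;  // consume NAME

    // The node is not created yet: parse the children first.
    std::vector<std::unique_ptr<Node>> children;
    while (it != end && *it == "(") {
        children.push_back(parseNode(it, end));
    }

    if (it == end || *it != ")") throw std::runtime_error("expected ')'");
    ++it;  // consume ')'

    // Only now, with all children available, is the node constructed.
    return std::unique_ptr<Node>(new Node(std::move(name), std::move(children)));
}
```

Calling `parseNode` on the tokens of an input such as `(add (const) (const))` would return a root node named `add` with two children. Constructing the node last means it can take its children as constructor arguments and never exists in a half-built state.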

That said, I do think the current implementation has room for simplification.

> Will the current implementation be easy to extend to parse things like node flags, etc.?

This shouldn't be a problem. As with the old parser, the AST is already expressive enough to be able to represent things such as node flags. What will have to change is the IL generator, which will have to recognize new patterns in the AST.

fjeremic · 2019-09-24T13:05:16Z

Thanks for the explanation @Leonardo2718. This makes sense. What threw me off was that I was expecting to see a more canonical recursive descent implementation with function calls implementing each reduction rule. The current implementation makes this much more concise though.

Let's fix the compile errors seen in the build; in the meantime, we can start reviewing the code.

fjeremic · 2019-09-24T15:36:58Z

@genie-omr build all

Leonardo2718

Here is my initial review. First, some general comments:

- Please reword the commit and PR titles to be in the imperative mood. A high-level description of what this change does and why would also be good to have in the commit message.
- Please add comments for:
  - the Tril grammar; this can just be a block comment somewhere near the start of a file or just before the first function of the parser.
  - the various functions, using Doxygen format.
  - all occurrences of tokenIt++, so we know which token is being consumed.
- Flex/Lex and Bison/Yacc should be removed as dependencies from the CI build scripts.
- In other parts of the code, there is no space between an identifier and its template parameters (i.e. foo<int> instead of foo <int>). Please update the usage of templates to conform to that style.

fvtest/tril/tril/parser.cpp

Leonardo2718

A few more notes:

- I think it would be really helpful to specify in the comments of the different parse functions what the state of the iterator is after the function returns. For example, the comments for buildAST() could say something like "When the function returns, the token iterator points to the token immediately after the ')' of the parsed s-expression." Also, documenting which exceptions can be thrown would be helpful.
- There are still spaces that should be removed between the name of templates and the template arguments.
- There are several try-catch blocks that should be removed because there is no way to recover from the errors they are catching.
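As a rough illustration of these notes, the sketch below documents a small parse helper in Doxygen format, stating the iterator position on return and the exception that can be thrown. The `parseList` function, its token representation, and the list syntax are hypothetical and are not taken from parser.cpp. Errors are simply thrown and left to propagate, since the helper has no way to recover from them.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Note: no space between the template name and its arguments.
using TokenIter = std::vector<std::string>::const_iterator;

/**
 * @brief Parses a bracketed, comma-separated list such as `[ a , b ]`.
 *
 * Hypothetical helper for illustration; not part of the actual Tril parser.
 *
 * @param tokenIt iterator to the '[' token that opens the list; advanced in place
 * @param end     iterator one past the last token
 * @return the list elements, in source order
 * @throws std::runtime_error if the input ends early or a token is unexpected
 *
 * @pre  tokenIt points to a '[' token.
 * @post On normal return, tokenIt points to the token immediately after ']'.
 *       If an exception is thrown, the position of tokenIt is unspecified.
 */
std::vector<std::string> parseList(TokenIter& tokenIt, TokenIter end) {
    assert(tokenIt != end && *tokenIt == "[");  // documented precondition
    tokenIt++;  // consume '['

    std::vector<std::string> elements;
    if (tokenIt != end && *tokenIt == "]") {
        tokenIt++;  // consume ']' of an empty list
        return elements;
    }

    while (true) {
        if (tokenIt == end || *tokenIt == "[" || *tokenIt == "," || *tokenIt == "]")
            throw std::runtime_error("expected list element");
        elements.push_back(*tokenIt);
        tokenIt++;  // consume the element

        if (tokenIt == end) throw std::runtime_error("unterminated list");
        if (*tokenIt == "]") {
            tokenIt++;  // consume ']'
            return elements;
        }
        if (*tokenIt != ",") throw std::runtime_error("expected ',' or ']'");
        tokenIt++;  // consume ','
    }
}
```

With the post-condition written down, a caller knows it can continue parsing from `tokenIt` without re-checking for the closing bracket, and the per-increment comments make it clear which token each `tokenIt++` consumes.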

fvtest/tril/tril/parser.cpp

fjeremic · 2019-10-09T20:57:19Z

@Yuehan-Lin a suggestion which may make it much easier for reviewers to review the code is to add individual commits addressing review comments and post them in the corresponding review thread. If you paste the SHA and say something like "Fixed in 79c4baa" for example, GitHub will hyperlink the commit which addresses the review and it gives reviewers a much smaller scope to check.

If commits are being force pushed to the PR branch it can get very difficult for a reviewer to find what exactly changed between the last time they reviewed and the next. Sometimes changes will get lost and new code not properly reviewed. If changes addressing reviews are split up into commits it makes it very easy to follow. We can then squash unneeded commits at the end before a final merge.

Leonardo2718

Just a few minor comments.

fvtest/tril/tril/parser.cpp

Leonardo2718

I think this is a great change! Thanks @Yuehan-Lin!

There are still opportunities for further improvement but I think it's best to leave that work for subsequent PRs (I need to open issues for this). As is, the change is already a significant improvement over the existing solution and will enable other work once merged.

@fjeremic I'm planning on merging this by EOD today, so I am kindly requesting your approval before then. 🙂

Signed-off-by: Yuehan-Lin <Yuehan.Lin@ibm.com>

Leonardo2718 · 2019-10-25T17:36:57Z

@genie-omr build zos

Leonardo2718 · 2019-10-25T18:07:26Z

@genie-omr build all

fjeremic

Looks much better. Thanks @Leonardo2718 for the thorough review! Agreed we can address other improvements in separate PRs as merging this work unblocks other things. Great work everyone!

With the merge of eclipse-omr#4345 we no longer depend on Flex and Bison being installed for our testing. AppVeyor seems to be broken at the moment in that checksums are failing when installing this package. We fix the issue by removing the dependency as it is no longer needed.

Signed-off-by: Filip Jeremic <fjeremic@ca.ibm.com>

Yuehan-Lin requested review from 0xdaryl, fjeremic and Leonardo2718 as code owners September 23, 2019 18:10

Yuehan-Lin force-pushed the parser branch 2 times, most recently from c9b792d to 673f85c Compare September 23, 2019 18:29

Leonardo2718 added comp:test comp:tril labels Sep 23, 2019

Leonardo2718 closed this Sep 23, 2019

Leonardo2718 reopened this Sep 23, 2019

Yuehan-Lin force-pushed the parser branch from 673f85c to 30d75a6 Compare September 24, 2019 15:20

Leonardo2718 suggested changes Sep 25, 2019

fjeremic reviewed Sep 26, 2019

Yuehan-Lin force-pushed the parser branch from 30d75a6 to d76ec9f Compare September 30, 2019 15:50

aviansie-ben mentioned this pull request Oct 4, 2019

Enable Tril and JitBuilder tests on AIX #4411

Merged

Leonardo2718 suggested changes Oct 7, 2019

Yuehan-Lin force-pushed the parser branch 2 times, most recently from 79c4baa to 2f9747e Compare October 8, 2019 18:02

Leonardo2718 mentioned this pull request Oct 21, 2019

TRIL parser generator fails on RHEL #4489

Closed

Yuehan-Lin force-pushed the parser branch from 2f9747e to fdb8b67 Compare October 21, 2019 15:58

Leonardo2718 suggested changes Oct 25, 2019

Yuehan-Lin force-pushed the parser branch from fdb8b67 to 60f5d7f Compare October 25, 2019 14:59

Leonardo2718 approved these changes Oct 25, 2019

Yuehan-Lin force-pushed the parser branch from 60f5d7f to 1f59b97 Compare October 25, 2019 16:06

Rewrote parser in tril

393044a

Signed-off-by: Yuehan-Lin <Yuehan.Lin@ibm.com>

Yuehan-Lin force-pushed the parser branch from 1f59b97 to 393044a Compare October 25, 2019 17:31

Leonardo2718 self-assigned this Oct 25, 2019

fjeremic approved these changes Oct 25, 2019

Leonardo2718 merged commit 7caf559 into eclipse-omr:master Oct 25, 2019

fjeremic mentioned this pull request Oct 28, 2019

Remove Flex/Bison from AppVeyor and Travis #4515

Merged

aviansie-ben mentioned this pull request Oct 30, 2019

tril_compiler fails with a parse error when using stdin #4524

Closed
