-
-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug fixes, improvements, optimization & refactoring before parser generation #288
Bug fixes, improvements, optimization & refactoring before parser generation #288
Conversation
Both strconv.ParseInt and strconv.ParseFloat already report the input. The word invalid isn't always true - it could also just be out of range. The type is contained within the Parse* function being called. So as the extra context added nothing useful, I decided to remove it altogether, making things much easier for the generated parser.
If an optional group within a capture matches no input (either because of an unexpected token or EOF), obtaining partial AST can cause `setField` to be called with no tokens, causing a panic. Fixed by not doing anything.
It didn't properly escape the token, the error would be malformed if the token contained single quotes. It also used a custom error instead of `UnexpectedTokenError` for no good reason.
The last error in the added test explains the difference. Without this change, it would have been `1:15: unexpected token "(" (expected ";")`, which would be very confusing as '(' was actually allowed after the class name. After this change, it's `1:20: unexpected token ")" (expected <ident>)`, pointing out the real problem - that an identifier was expected after the comma.
…ction Optimize finding case insensitive tokens to be done at parser construction Doing it before each parsing was unnecessary. One could worry the results would be different if `ParseFromLexer` was called with a different lexer than used to build the grammar, but then case-insentive tokens would be the least of our problems - literal & reference would check agaist the wrong TokenType.
Not very significant for the reflective parser, more significant for the generated one.
The generated parser would be the second place that would construct a temporary disjunction in the implementation, I think it might as well contain the disjunction directly.
They could never be created and were private, so this certainly can't break anything.
It was n probably because I copied it from negation and assumed n was for node. Naming it l is more consistent with the convention for other nodes.
@@ -1874,11 +1874,11 @@ func TestParseNumbers(t *testing.T) { | |||
} | |||
parser := participle.MustBuild[grammar]() | |||
_, err := parser.ParseString("", `300 0 x`) | |||
require.EqualError(t, err, `grammar.Int: invalid integer "300": strconv.ParseInt: parsing "300": value out of range`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alecthomas I included the previous commit basically just to make it visible what the error message looked like before. IMHO none of the context being added was actually useful and just duplicated the stuff deeper in the error.
The main reason why I removed it had to do with generated parser performance - my parser avoids allocations for errors as much as possible as most errors are later suppressed, so it just collects some error information while parsing and then only allocates the final error once parsing is done. This made that approach tricky as I would need to depend on fmt
just because of these int/uint/float errors and allocate extra stuff that might later be thrown away. Hope it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, makes sense 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM 👍
@@ -1874,11 +1874,11 @@ func TestParseNumbers(t *testing.T) { | |||
} | |||
parser := participle.MustBuild[grammar]() | |||
_, err := parser.ParseString("", `300 0 x`) | |||
require.EqualError(t, err, `grammar.Int: invalid integer "300": strconv.ParseInt: parsing "300": value out of range`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, makes sense 👍
Hmm I realised I could have just rebase merged this. I'm so used to most multi-commit PRs being full of garbage commit messages that I just squash by default now. Will keep that in mind in future. |
This PR contains a whole bunch of changes I did to the core code in order to implement code generation, but doesn't contain code generation itself.
UnexpectedTokenError.Expect
thing allows creating the error without having access to the parser nodes.The next PR will be code generation itself (#213).