-
Notifications
You must be signed in to change notification settings - Fork 10
feat: add node helpers to scanner #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| return node | ||
| } | ||
|
|
||
| abort (node, expectedTokens) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea: if this is too much for every abort, we can also add fail or something like that. We can move the error to that method, and only use abort for rewinding to the starting position of the node.
Note that aborting a failed parsing attempt of a token, always rewinds the position back to node's start. Right now, that seems to work pretty good but I'm not sure how good that works with @bcoe's new recursive body parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that calling abort would cause any issues, let's hold off on adding an additional method until we know we need to differentiate between an abort and failure.
| } | ||
|
|
||
| abort (node, expectedTokens) { | ||
| const position = `${this.pos.line}:${this.pos.column}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also use unist-util-stringify-position instead of the ${pos.line}:${pos.column}, because we have the attempted node information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there will be tooling value in having this as as well { line: <line>, column: <column> } on the error. Maybe before line 80 we could add error.position = this.pos?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another helpful pattern I like to follow (used in node, which is good precedence I think) is to have distinct .code properties for these errors. If we were tooling around errors, checking err.code === 'EOF_ERROR is much better than err.message.startsWith('unexpected token EOF at').
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree there would be value in adding additional context to the error object. I also wonder if we should consider returning an error rather than ever throwing? or perhaps have a "best effort" mode, and a mode that throws?
| expect(() => { | ||
| parser('feat add support for scopes') | ||
| }).to.throw("unexpected token ' ' at position 1:5 valid tokens [(, !, :]") | ||
| }).to.throw("unexpected token ' ' at 1:5, valid tokens [(, !, :]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Position itself is a pretty distinct format, keeping this short and descriptive is important. But, feel free to roll this rewording back!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm supportive of this rewording.
wesleytodd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| } | ||
|
|
||
| abort (node, expectedTokens) { | ||
| const position = `${this.pos.line}:${this.pos.column}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there will be tooling value in having this as as well { line: <line>, column: <column> } on the error. Maybe before line 80 we could add error.position = this.pos?
| } | ||
|
|
||
| abort (node, expectedTokens) { | ||
| const position = `${this.pos.line}:${this.pos.column}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another helpful pattern I like to follow (used in node, which is good precedence I think) is to have distinct .code properties for these errors. If we were tooling around errors, checking err.code === 'EOF_ERROR is much better than err.message.startsWith('unexpected token EOF at').
bcoe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really solid.
| node.children.push(bodyFooter(scanner)) | ||
| node.position = { start, end: scanner.position() } | ||
| return node | ||
| return scanner.exit(node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a much nicer API 😄
| return node | ||
| } | ||
|
|
||
| abort (node, expectedTokens) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that calling abort would cause any issues, let's hold off on adding an additional method until we know we need to differentiate between an abort and failure.
| } | ||
|
|
||
| abort (node, expectedTokens) { | ||
| const position = `${this.pos.line}:${this.pos.column}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree there would be value in adding additional context to the error object. I also wonder if we should consider returning an error rather than ever throwing? or perhaps have a "best effort" mode, and a mode that throws?
| expect(() => { | ||
| parser('feat add support for scopes') | ||
| }).to.throw("unexpected token ' ' at position 1:5 valid tokens [(, !, :]") | ||
| }).to.throw("unexpected token ' ' at 1:5, valid tokens [(, !, :]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm supportive of this rewording.
f4aff15 to
ad0ae52
Compare
fixes #7
This adds a really simple node builder to the scanner.
How it works
scanner.enter(<type>, <content>): <node>Creates a node, using content as leading factor to determine if this is a
literalorparentnode. It adds the current position asstartto the node.scanner.exit(<node>): <node>Adds the current position as
endand returns the node for single line usage likereturn scanner.exit(node). It mutates the node, so you can exit the node and return it later.scanner.abort(<node>, <expected-tokens>?): <Error>This is a replacement for our
invalidTokenmethod fromindex.js. It creates an error to return or throw. It also rewinds the scanner position to thestartof the node, to support optional tokens.Other thoughts
I removed some of the expected tokens from the stateful methods. They were containing the actual node type that was expected, not the token characters. For replacement, I added a fallback to
scanner.abortthat uses<type>notation as expected/valid token.I had another idea to create a dedicated node builder instance. The reason main reason I didn't go for this route is related to abstraction of the node content. If we do this, we have to provide an (unnecessary) abstraction for adding children or value. I really like the flexibility that we have right now, e.g.
node = scanner.enter(); node.customProp = '('; return node.We could split up the
entermethod intoenterLiteral/enterParent, that makes it more explicit. But, looking atunist-builder, they provide a similar interface to what we have now. (with an exception for void nodes, but I don't think we will use them)We can also move these methods to other helpers. But, because we were doing a lot of
invalidToken(scanner, ...statements, I think it's better to have them in the same context.Adding debug statements are also pretty easy, we could add
debug(type)('enter')anddebug(type)('exit')to these helpers to "visualize" the codepath of the parser. But for now, I don't think this is necessary.