Add small validator utility for PEG grammars #23519
Merged
One common gotcha of PEG grammars is a rule with two alternatives in which the one that appears first is contained in the one that appears later. In this scenario the second alternative will never match: any input text that matches it also matches the first alternative, and the first one is tried first.
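To make the gotcha concrete, here is a minimal sketch of ordered choice over literal tokens. It is not the parser generator's code; `ordered_choice` and the list-of-tokens representation are hypothetical names made up for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def ordered_choice(
    alts: Sequence[Sequence[str]], tokens: Sequence[str]
) -> Optional[Tuple[int, int]]:
    """Try each alternative in order and commit to the first that matches.

    Returns (index of the chosen alternative, number of tokens consumed),
    or None if no alternative matches the start of the input.
    """
    for index, alt in enumerate(alts):
        if list(tokens[: len(alt)]) == list(alt):
            return index, len(alt)
    return None


# The first alternative "a" is contained in the second one "a b",
# so the second alternative is never selected.
shadowed = ordered_choice([["a"], ["a", "b"]], ["a", "b"])

# Putting the longer alternative first lets it match.
reordered = ordered_choice([["a", "b"], ["a"]], ["a", "b"])
```

With the shorter alternative first, the choice commits to it and leaves `b` unconsumed; swapping the order is the usual fix.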
To avoid this problem, this PR creates an initial version of a small utility that detects these cases and raises an error to alert the user.
We don't need to detect all cases, only the ones that are easy enough to implement. The current form is just an initial prototype that does the matching based on the string representation of the alternatives. To make this better, we need to turn the matching algorithm into a visitor so that it can check rules that contain options. For example:
One "straightforward enough" way to do this is to generate all possible string representations of each alternative (with and without each optional) and do substring matching on them, but at this point a visitor that visits both alternatives at the same time and advances accordingly is probably better.
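The "generate all representations" idea could look roughly like the sketch below. It rests on assumptions: `Item`, `expansions`, `is_prefix` and `find_shadowed` are hypothetical names, alternatives are modelled as flat lists of named items, and the comparison is a prefix check on token tuples rather than a raw substring check (so that `a` is not treated as contained in `ab`). It is a heuristic that flags suspicious pairs, not a proof that the later alternative is unreachable.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    """One element of an alternative; optional=True models `[name]`."""
    name: str
    optional: bool = False


def expansions(alt: Sequence[Item]) -> List[Tuple[str, ...]]:
    """All forms of an alternative, with and without each optional item."""
    choices = [
        ((), (item.name,)) if item.optional else ((item.name,),)
        for item in alt
    ]
    return [
        tuple(name for part in combo for name in part)
        for combo in product(*choices)
    ]


def is_prefix(short: Sequence[str], long: Sequence[str]) -> bool:
    """True if `short` is a leading run of `long`."""
    return len(short) <= len(long) and tuple(long[: len(short)]) == tuple(short)


def find_shadowed(alts: Sequence[Sequence[Item]]) -> List[Tuple[int, int]]:
    """Pairs (i, j), i < j, where some form of alternative i is a prefix
    of some form of alternative j, so j may never get a chance to match."""
    problems = []
    for i, first in enumerate(alts):
        for j in range(i + 1, len(alts)):
            if any(
                is_prefix(f, l)
                for f in expansions(first)
                for l in expansions(alts[j])
            ):
                problems.append((i, j))
    return problems


# rule: a [b] | a b c   (the first alternative shadows the second)
suspicious = find_shadowed(
    [[Item("a"), Item("b", optional=True)], [Item("a"), Item("b"), Item("c")]]
)
```

The number of expansions doubles with every optional item, which is one reason a visitor that walks both alternatives in lockstep is likely the better long-term design.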
We could also support some rule expansion, so that we also detect something like this: