-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extending grammar integration tests #6644
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very neat!
1b98885
to
e875161
Compare
Okay, finally got around to implementing your suggestions, @ochafik. How does this look to you?
I took this suggestion to heart, and this is now the output one gets from running this test. Is this closer to what you were envisioning?
In the event of a failure, it will show up like this:
Edit: Amended to add quotes around test strings to explicitly show whitespace. Rebased on latest master. Might even be good to expand these tests to include some stress tests, and put some timing information in there -- but I don't want to load down the regular tests too much. Maybe we do that in a separate test file that is only run optionally...? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great!
tests/test-grammar-integration.cpp
Outdated
|
||
auto grammar = build_grammar(grammar_str); | ||
|
||
bool matched = match_string("123+456", grammar); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot this one I think :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could still call test_grammar from this one I think
tests/test-grammar-integration.cpp
Outdated
break; | ||
} | ||
if (!matched) { | ||
fprintf(stderr, " ❌ Failed to match string: \"%s\"\n", test_string.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 There's nothing like a bit of color to brighten a dreadful grammar debugging session!
Nit: I was wondering if it's possible to write the string before testing it (flushing the output after writing its name). In my clumsy and limited testing I got a few non-catchable segfaults and I struggled to find which strings where causing the failure.
Maybe something like:
⚪️ Testing grammar:
root ::= expression
expression ::= term ws (("+"|"-") ws term)*
term ::= factor ws (("*"|"/") ws factor)*
factor ::= number | variable | "(" expression ")" | function-call
number ::= [0-9]+
variable ::= [a-zA-Z_][a-zA-Z0-9_]*
function-call ::= variable ws "(" (expression ("," ws expression)*)? ")"
ws ::= [ \t\n\r]?
Checking valid strings:
"42" ✅︎
"1*2*3*4*5" ✅︎
"4" ❌ (erroneously rejected)
Checking *invalid* strings:
"+" ✅︎
"/ 3x" ✅︎
"11111" ❌ (erroneously matched)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea about pre-printing the strings and flushing buffers. I'll implement that! I intend to revisit #6492 soon, so better output on segfaults will come in handy I think.
Thanks for the nits!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what the full test output looks like now:
Running grammar integration tests...
⚪ Testing grammar:
root ::= expression
expression ::= term ws (("+"|"-") ws term)*
term ::= factor ws (("*"|"/") ws factor)*
factor ::= number | variable | "(" expression ")" | function-call
number ::= [0-9]+
variable ::= [a-zA-Z_][a-zA-Z0-9_]*
function-call ::= variable ws "(" (expression ("," ws expression)*)? ")"
ws ::= [ \t\n\r]?
Checking valid strings:
"42" ✅︎
"1*2*3*4*5" ✅︎
"x" ✅︎
"x+10" ✅︎
"x1+y2" ✅︎
"(a+b)*(c-d)" ✅︎
"func()" ✅︎
"func(x,y+2)" ✅︎
"a*(b+c)-d/e" ✅︎
"f(g(x),h(y,z))" ✅︎
"x + 10" ✅︎
"x1 + y2" ✅︎
"(a + b) * (c - d)" ✅︎
"func()" ✅︎
"func(x, y + 2)" ✅︎
"a * (b + c) - d / e" ✅︎
"f(g(x), h(y, z))" ✅︎
"123+456" ✅︎
"123*456*789-123/456+789*123" ✅︎
"123+456*789-123/456+789*123-456/789+123*456-789/123+456*789-123/456+789*123-456" ✅︎
Checking invalid strings:
"+" ✅︎
"/ 3x" ✅︎
"x + + y" ✅︎
"a * / b" ✅︎
"func(,)" ✅︎
"func(x y)" ✅︎
"(a + b" ✅︎
"x + y)" ✅︎
"a + b * (c - d" ✅︎
"42 +" ✅︎
"x +" ✅︎
"x + 10 +" ✅︎
"(a + b) * (c - d" ✅︎
"func(" ✅︎
"func(x, y + 2" ✅︎
"a * (b + c) - d /" ✅︎
"f(g(x), h(y, z)" ✅︎
"123+456*789-123/456+789*123-456/789+123*456-789/123+456*789-123/456+789*123-456/" ✅︎
⚪ Testing grammar: root ::= "a"*
Checking valid strings:
"" ✅︎
"a" ✅︎
"aaaaa" ✅︎
"aaaaaaaaaaaaaaaaaa" ✅︎
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" ✅︎
Checking invalid strings:
"b" ✅︎
"ab" ✅︎
"aab" ✅︎
"ba" ✅︎
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab" ✅︎
⚪ Testing grammar: root ::= "a"+
Checking valid strings:
"a" ✅︎
"aaaaa" ✅︎
"aaaaaaaaaaaaaaaaaa" ✅︎
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" ✅︎
Checking invalid strings:
"" ✅︎
"b" ✅︎
"ab" ✅︎
"aab" ✅︎
"ba" ✅︎
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab" ✅︎
⚪ Testing grammar: root ::= "a"?
Checking valid strings:
"" ✅︎
"a" ✅︎
Checking invalid strings:
"b" ✅︎
"ab" ✅︎
"aa" ✅︎
"ba" ✅︎
⚪ Testing grammar:
root ::= cons+ vowel* cons? (vowel cons)*
vowel ::= [aeiouy]
cons ::= [bcdfghjklmnpqrstvwxyz]
Checking valid strings:
"yes" ✅︎
"no" ✅︎
"noyes" ✅︎
"crwth" ✅︎
"four" ✅︎
"bryyyy" ✅︎
Checking invalid strings:
"yess" ✅︎
"yesno" ✅︎
"forty" ✅︎
"catyyy" ✅︎
🟢 Testing for missing root node:
✅︎ Passed
🟢 Testing for missing reference node:
Expected error: parse: error parsing grammar: Undefined rule identifier 'numero'
End of expected error.
✅︎ Passed
All tests passed.
And a failure-case:
Running grammar integration tests...
⚪ Testing grammar:
root ::= expression
expression ::= term ws (("+"|"-") ws term)*
term ::= factor ws (("*"|"/") ws factor)*
factor ::= number | variable | "(" expression ")" | function-call
number ::= [0-9]+
variable ::= [a-zA-Z_][a-zA-Z0-9_]*
function-call ::= variable ws "(" (expression ("," ws expression)*)? ")"
ws ::= [ \t\n\r]?
Checking valid strings:
"42" ✅︎
"1*2*3*4*5" ✅︎
"x" ✅︎
"x+10" ✅︎
"x1+y2" ✅︎
"x + + y" ❌ (failed to match)
Assertion failed: (matched), function test_grammar, file test-grammar-integration.cpp, line 79.
[1] 54975 abort ./tests/test-grammar-integration
Or:
Running grammar integration tests...
⚪ Testing grammar:
root ::= expression
expression ::= term ws (("+"|"-") ws term)*
term ::= factor ws (("*"|"/") ws factor)*
factor ::= number | variable | "(" expression ")" | function-call
number ::= [0-9]+
variable ::= [a-zA-Z_][a-zA-Z0-9_]*
function-call ::= variable ws "(" (expression ("," ws expression)*)? ")"
ws ::= [ \t\n\r]?
Checking valid strings:
"42" ✅︎
"1*2*3*4*5" ✅︎
"x" ✅︎
"x+10" ✅︎
"x1+y2" ✅︎
"(a+b)*(c-d)" ✅︎
"func()" ✅︎
"func(x,y+2)" ✅︎
"a*(b+c)-d/e" ✅︎
"f(g(x),h(y,z))" ✅︎
"x + 10" ✅︎
"x1 + y2" ✅︎
"(a + b) * (c - d)" ✅︎
"func()" ✅︎
"func(x, y + 2)" ✅︎
"a * (b + c) - d / e" ✅︎
"f(g(x), h(y, z))" ✅︎
"123+456" ✅︎
"123*456*789-123/456+789*123" ✅︎
"123+456*789-123/456+789*123-456/789+123*456-789/123+456*789-123/456+789*123-456" ✅︎
Checking invalid strings:
"+" ✅︎
"/ 3x" ✅︎
"x1+y2" ❌ (incorrectly matched)
Assertion failed: (!matched), function test_grammar, file test-grammar-integration.cpp, line 99.
[1] 55489 abort ./tests/test-grammar-integration
… simpler to add new tests.
… with quantifiers.
…dding verbose messages to give visibility for what is being tested.
…ow print and flush before tests to aid in debugging segfaults and whatnot.
5d7a459
to
7fe2fb3
Compare
Rebased and ready for review again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks @HanClinto !
tests/test-grammar-integration.cpp
Outdated
|
||
auto grammar = build_grammar(grammar_str); | ||
|
||
bool matched = match_string("123+456", grammar); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could still call test_grammar from this one I think
…ess. Added comments for more verbose descriptions of what each test is accomplishing.
Good call on modifying I also missed a couple of green circles for white circles, and cleaned up a bit more language -- this is now what it looks like. It's still not as visually pleasing to the eyes as I would like, but I don't know how to make it better. Visually I don't like the checkmarks being at the end of the lines, but I'm not sure I want to put them on a newline, and I like having them afterwards in order to aid in debugging.
|
@HanClinto Looks great and very functional, ship it ;-) (looking forward to using it!!) |
Thanks for all the help and feedback -- it's much better as a result!
Yeah!! I'm really looking forward to you using this as well! Can hardly wait to wrap |
* Cleaning up integration tests to share code between tests and make it simpler to add new tests. * Add tests around quantifiers to ensure both matching and non-matching compliance. * Add slightly more complex grammar with quantifiers to test references with quantifiers. * Fixing build when C++17 is not present. * Separating test calls to give more helpful stack traces on failure. Adding verbose messages to give visibility for what is being tested. * Adding quotes around strings to explicitly show whitespace * Removing trailing whitespace. * Implementing suggestions from @ochafik -- grammars and test strings now print and flush before tests to aid in debugging segfaults and whatnot. * Cleaning up forgotten symbols. Modifying simple test to use test harness. Added comments for more verbose descriptions of what each test is accomplishing. * Unicode symbol modifications to hopefully make log easier to parse visually.
As part of the discussion around #6640, I wanted to expand the integration tests to help add end-to-end confirmation of rework around grammar quantifiers (both in that PR and others in the works).
There are two sets of commits here -- the first commit refactors our tests and de-duplicates code.
The later commits extends our tests by adding some new grammars focused on exercising the quantifiers, and provides a structure to (hopefully) make this simple to extend in the future.
The diff is a mess, but hopefully breaking it up into two commits like this can help simplify what's going on.