Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idea regarding the exhaustive integration testing of the parser #8

Open
lagodiuk opened this issue May 15, 2016 · 2 comments
Open

Idea regarding the exhaustive integration testing of the parser #8

lagodiuk opened this issue May 15, 2016 · 2 comments

Comments

@lagodiuk
Copy link
Owner

As far as parser could handle potentially all possible context free grammars (CFG), and can parse all possible valid statements of the corresponding CFG, it was a bit unclear how to find the proper and exhaustive approach for testing the implementation of the parser.

Few days ago I've came up with an idea regarding the exhaustive testing of the parser.
Of course, as far as there is an infinite amount of CFGs, the idea of testing is based on randomization principle.

Hence, the algorithm:

Repeat multiple times:
1. Generate random CFG
1.1. Using generated CFG - produce the random valid statement, which belongs to the given CFG. It can be done using the recursive top-down approach (also, meanwhile, the top-down approach - helps to generate the correct parse tree as well).
1.2. Parse generated statement using the implementation of the parsing algorithm. Assert, that expected parse tree is present inside the generated parse forest (because, in case of ambiguous CFG - there might be generated multiple parse trees).
1.3. go to 1.1.

@lagodiuk lagodiuk changed the title Idea regarding the Unit Testing of the Parser Idea regarding the exhaustive Unit Testing of the Parser May 15, 2016
@gitmh
Copy link

gitmh commented May 15, 2016

We tried something similar just without generating random CFG. We started with a set of grammars at step 1.1. which was not so easy. Let's take a real world example for a calculator grammar like this:

S -> Input
Input      -> Expression | Input Expression 
Expression -> Expression * Expression 
            | Expression + Expression 
            | Expression - Expression 
            | Expression / Expression 
            | Number
Number     -> Digit | Number Digit 
Digit      -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 

We used a weighted random generator for generating valid statements. So that it will be more likely to choose a path not yet chosen and so that the algorithm prefers rules that do not contain them self (Expression -> Number over Expression -> Expression ...). It is still not easy to configure the algorithm so that we get real world input statements (not too short, not too long and no infinit runtime).

As soon as we find a grammar not working with your parser implementation I'll post it here :-)

@lagodiuk lagodiuk changed the title Idea regarding the exhaustive Unit Testing of the Parser Idea regarding the exhaustive integration testing of the parser May 15, 2016
@lagodiuk
Copy link
Owner Author

@gitmh thank you for sharing your experience regarding the randomized testing, and thank you for collaboration!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants