This repository contains a suite of test cases and implementations for parsing text using Python. Both speed and memory usage are compared for each implementation. Each implementation must be:
- Correct -- it implements the specification exactly
  - Syntax -- it parses all valid inputs and rejects invalid ones
  - Semantics -- it produces the expected data structures
- Plausible -- it only uses generally known parser features or parsing strategies
Validation tests can ensure the correctness of an implementation, but its plausibility is harder to pin down. In short, grammars that require knowledge of a parser's internals or undocumented features to increase performance are discouraged. The grammar should be understandable by someone with access to the parser's documentation and examples.
This table lists the tasks performed and the libraries benchmarked.
| Implementation | Algorithm         | JSON | Arithmetic | INI |
| -------------- | ----------------- | ---- | ---------- | --- |
| stdlib         | handwritten       | yes  | yes        | yes |
| Lark           | LALR              | yes  | yes        | yes |
| parsimonious   | Recursive Descent | yes  |            |     |
| pe             | Recursive Descent | yes  | yes        | yes |
| pyparsing      | Recursive Descent | yes  |            |     |
| SLY            | LALR              | yes  |            |     |
| textX          | Recursive Descent | yes  |            |     |
The following bar chart shows the time in milliseconds to parse a ~5MB JSON file using both CPython and PyPy.
```
[cpython] stdlib     : ▏ 109 ms
[cpython] pe         : ▇▇ 819 ms
[cpython] sly        : ▇▇▇▇▇▇▇▇▇▇▇▇ 3716 ms
[cpython] lark       : ▇▇▇▇▇▇▇▇▇▇▇▇▇ 3967 ms
[cpython] parsimonious: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 8150 ms
[cpython] pyparsing  : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 11518 ms
[cpython] textx      : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 14738 ms
[pypy] stdlib        : ▇ 338 ms
[pypy] pe            : ▇▇ 599 ms
[pypy] sly           : ▇▇ 763 ms
[pypy] lark          : ▇▇ 796 ms
[pypy] pyparsing     : ▇▇▇▇▇▇▇ 2181 ms
[pypy] parsimonious  : ▇▇▇▇▇▇▇▇▇ 2669 ms
[pypy] textx         : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5242 ms
```
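For a rough sense of what the stdlib bar measures, a one-off timing of `json.loads` can be done with `time.perf_counter` (this is only an illustration; the repository itself uses pytest-benchmark for repeated, statistically sound measurements):

```python
import json
import time

# Build a JSON document of a nontrivial size and time a single parse.
# Quick illustration only; pytest-benchmark runs many iterations.
doc = json.dumps({"items": list(range(100_000)), "ok": True})

start = time.perf_counter()
parsed = json.loads(doc)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"stdlib json: {elapsed_ms:.1f} ms for ~{len(doc) / 1e6:.2f} MB")
```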
Here are the results for parsing 10,000 arithmetic expressions that are complicated from a parsing point of view:
```
[cpython] stdlib: ▇▇ 101 ms
[cpython] pe    : ▇▇▇▇▇▇▇▇▇▇▇▇▇ 551 ms
[cpython] lark  : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2109 ms
[pypy] lark     : ▇▇▇▇▇ 253 ms
[pypy] stdlib   : ▇▇▇▇▇▇ 260 ms
[pypy] pe       : ▇▇▇▇▇▇▇▇ 344 ms
```
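The "handwritten" stdlib row refers to plain recursive descent: one function per grammar rule, with operator precedence encoded in the call structure. As a sketch of the technique (not the repository's actual implementation), a minimal evaluator for such an arithmetic grammar might look like:

```python
import re

# Illustrative recursive-descent arithmetic parser; the repository's
# stdlib implementation may differ in details.
def tokenize(text):
    tokens = re.findall(r"\d+|[()+\-*/]|\S", text)
    for tok in tokens:
        if not re.fullmatch(r"\d+|[()+\-*/]", tok):
            raise SyntaxError(f"unexpected character {tok!r}")
    return tokens

def parse(text):
    tokens = tokenize(text)
    i = 0  # cursor into the token list

    def peek():
        return tokens[i] if i < len(tokens) else None

    def expr():  # expr := term (('+' | '-') term)*
        nonlocal i
        value = term()
        while peek() in ("+", "-"):
            op = tokens[i]
            i += 1
            value = value + term() if op == "+" else value - term()
        return value

    def term():  # term := atom (('*' | '/') atom)*
        nonlocal i
        value = atom()
        while peek() in ("*", "/"):
            op = tokens[i]
            i += 1
            value = value * atom() if op == "*" else value / atom()
        return value

    def atom():  # atom := NUMBER | '(' expr ')'
        nonlocal i
        tok = peek()
        if tok == "(":
            i += 1
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            i += 1
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        i += 1
        return int(tok)

    result = expr()
    if i != len(tokens):
        raise SyntaxError("trailing input")
    return result
```

Because `expr` calls `term` and `term` calls `atom`, multiplication binds tighter than addition without any explicit precedence table.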
And here are the results for parsing an INI file with 1000 sections:
```
[cpython] stdlib: ▇▇▇▇▇▇▇▇▇▇▇▇ 91 ms
[cpython] pe    : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 125 ms
[cpython] lark  : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 352 ms
[pypy] stdlib   : ▇▇▇▇ 34 ms
[pypy] pe       : ▇▇▇▇▇▇ 44 ms
[pypy] lark     : ▇▇▇▇▇▇▇ 53 ms
```
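The stdlib INI case corresponds to the standard library's `configparser` module. A small illustration in the spirit of the benchmark input (the actual 1000-section test file lives in the repository; here only three sections are generated):

```python
import configparser

# Generate a tiny INI document and parse it with configparser.
sections = 3
text = "\n".join(f"[section{i}]\nkey = value{i}\n" for i in range(sections))

config = configparser.ConfigParser()
config.read_string(text)

print(config.sections())          # ['section0', 'section1', 'section2']
print(config["section1"]["key"])  # value1
```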
Charts produced with termgraph
These benchmarks were run on a Lenovo ThinkPad with an Intel Core i5-6200U and 8GB of memory, running Pop!_OS Linux 21.04. The millisecond values will vary across systems, but the relative performance should be similar (I'd be interested in hearing if you find otherwise!).
Python 3.6+ is required.
First create a virtual environment (recommended) and install the package and requirements:
```
$ python3 -m venv cpy
$ source cpy/bin/activate
(cpy) $ pip install -r requirements.txt
```
You can also use PyPy by creating its own virtual environment:
```
$ pypy3 -m venv pypy
$ source pypy/bin/activate
(pypy) $ pip install -r requirements.txt
```
From here on it's assumed you're in one of the `(cpy)` or `(pypy)` environments.
The benchmarks are implemented using pytest and pytest-benchmark, so you can run the tests with `pytest` if you adjust `PYTHONPATH`:

```
$ PYTHONPATH=. pytest
```
But it might be easier to just run the included `validate.py` and `benchmark.py` scripts, which pass the appropriate options on to `pytest`:

```
$ python validate.py   # skip performance tests (they take a while)
$ python benchmark.py  # skip validity tests
```
You can give specific library names to limit the tests that are run:
```
$ python validate.py pe stdlib  # only run validation for pe and stdlib
```
Some tests print diagnostic information to stdout that can be useful, such as the recursion limit in JSON parsing. Use the following option to see that information:
```
$ python validate.py -rP  # print stdout for each test
```
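The recursion-limit diagnostic matters because deeply nested input stresses recursive parsers: even the stdlib `json` module tracks recursion depth and raises `RecursionError` on pathological nesting rather than overflowing the stack. For example:

```python
import json

# A document nested far beyond the interpreter's recursion limit.
depth = 100_000
doc = "[" * depth + "]" * depth
try:
    json.loads(doc)
    print("parsed")
except RecursionError:
    print("hit the recursion limit")
```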
Performance benchmarks are not the only criterion one should use when choosing a parsing library, and this repository is not meant to declare a winner. On the one hand, there are many other valid criteria (ease of use, stability, security, availability, support, etc.); on the other hand, we can't discuss relative performance without numbers, so here we are.