rexpy is a regular expression parser and matcher written in Python, primarily as an exercise.
In the interest of simplicity, I only support the following operations:
- kleene star, denoted by '*'
- union, denoted by '|'
- concatenation
- parenthesization
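To illustrate how these four operations fit together, here is a minimal recursive-descent sketch that turns a pattern into a nested-tuple syntax tree. It is not rexpy's actual parser: the function name `parse`, the tuple node shapes, and the precedence (star binds tightest, then concatenation, then union) are assumptions made for this example.

```python
# Hypothetical sketch, not rexpy's actual parser. Assumed grammar:
#   union  := concat ('|' concat)*
#   concat := star*
#   star   := atom '*'*
#   atom   := literal | '(' union ')'

def parse(pattern):
    """Parse a pattern into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        # Next unread character, or None at end of input.
        return pattern[pos] if pos < len(pattern) else None

    def union():
        nonlocal pos
        node = concat()
        while peek() == "|":
            pos += 1
            node = ("union", node, concat())
        return node

    def concat():
        node = None
        # Keep reading items until a '|' or ')' or the end of input.
        while peek() is not None and peek() not in "|)":
            item = star()
            node = item if node is None else ("concat", node, item)
        return node if node is not None else ("empty",)

    def star():
        nonlocal pos
        node = atom()
        while peek() == "*":
            pos += 1
            node = ("star", node)
        return node

    def atom():
        nonlocal pos
        ch = peek()
        if ch == "(":
            pos += 1
            node = union()
            if peek() != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1
            return node
        if ch is None or ch in "|)*":
            raise ValueError("unexpected %r at position %d" % (ch, pos))
        pos += 1
        return ("char", ch)

    tree = union()
    if pos != len(pattern):
        raise ValueError("unexpected %r at position %d" % (pattern[pos], pos))
    return tree
```

With these assumptions, `parse("(a|bc)d")` yields a concatenation whose left child is the union of `a` and `bc` and whose right child is `d`.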
I was originally inspired to build this while reading about regular languages and finite automata in Michael Sipser's excellent Introduction to the Theory of Computation. This book also inspired me to build pycc.
Regular expressions are compiled to nondeterministic finite automata (NFA) and matching is then implemented as a breadth-first search through the NFA while sequentially reading the string. For more on this technique, see here.
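As a rough illustration of the matching step, the sketch below simulates an NFA by tracking the set of all states reachable after each character, which is the breadth-first idea described above. The NFA for "(a|bc)d" is written out by hand here; its state numbering, the `TRANSITIONS` table, and the names `epsilon_closure` and `nfa_match` are assumptions for this example, not rexpy's internals.

```python
# Hypothetical sketch: this NFA is built by hand, and its layout is an
# assumption rather than the structure rexpy generates.
EPSILON = None  # label used for transitions that consume no input

# TRANSITIONS[state][symbol] -> set of next states, for "(a|bc)d"
TRANSITIONS = {
    0: {EPSILON: {1, 3}},  # branch into the two sides of the union
    1: {"a": {2}},
    2: {EPSILON: {6}},
    3: {"b": {4}},
    4: {"c": {5}},
    5: {EPSILON: {6}},
    6: {"d": {7}},
    7: {},
}
START, ACCEPT = 0, 7


def epsilon_closure(states):
    """Return every state reachable from `states` via epsilon moves only."""
    stack = list(states)
    closure = set(states)
    while stack:
        state = stack.pop()
        for nxt in TRANSITIONS[state].get(EPSILON, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def nfa_match(text):
    """Advance the whole frontier of live states one character at a time."""
    current = epsilon_closure({START})
    for ch in text:
        moved = set()
        for state in current:
            moved |= TRANSITIONS[state].get(ch, set())
        current = epsilon_closure(moved)
        if not current:
            return False  # no live states left, so the match cannot succeed
    return ACCEPT in current
```

Because every live state advances together, the string is read exactly once and no backtracking is needed.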
The main entry point is the rexpy.regex.match method.
Example usage:
> from rexpy import regex
> regex.match("(a|bc)d", "bcd")
True
> regex.match("(a|bc)d", "acd")
False
TODO:
- (optional) convert nested calls in parse code to sequential calls
- add escape characters functionality
- add capture functionality