The KMC Kleenex compiler

This is a compiler that can:

compile regular expressions into fast, streaming parsers (see http://www.diku.dk/kmc/documents/GHR14-0-paper.pdf), and
compile programs written in the regular expression-based language Kleenex into fast, streaming string transformations (see http://www.diku.dk/kmc/documents/ghrst2016-0-paper.pdf).

Download VM for testing

To get a quick sense of what Kleenex is, download a ready-to-use VirtualBox image from http://kleenexlang.org.

Build

To clone the repository together with its submodules, run:

> git clone --recursive https://github.com/diku-kmc/kleenexlang.git

Because some dependencies are not on Hackage, it is easiest to build in a sandbox. After cloning, cd into the project directory and run:

> cabal sandbox init && cabal sandbox add-source regexps-syntax

Then pull in the dependencies by running:

> cabal install --dependencies-only

To build, run:

> cabal configure && cabal build

This places the binary in dist/build/kexc/kexc.

Use

First write a Kleenex program:

> cat add-commas.kex
main := (num /[^0-9]/ | other)*
num := digit{1,3} ("," digit{3})*
digit := /[0-9]/
other := /./

Next compile a transducer using the kexc executable:

> kexc compile add-commas.kex --out add-commas

Finally, pipe input to the transducer:

> echo "2016" | ./add-commas
2,016
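
For readers new to Kleenex, the effect of add-commas.kex can be approximated in Python. This is only a sketch of the intended behavior, not a description of how kexc or the generated transducer works. The function names add_commas and group_thousands are made up for illustration, and the sketch assumes that a digit run is rewritten only when a non-digit character follows it (such as the newline that echo appends), which is how the rule num /[^0-9]/ reads.

```python
import re


def group_thousands(digits: str) -> str:
    """Split a digit string into groups of three from the right, joined by commas."""
    head = len(digits) % 3 or 3  # size of the leading group: 1, 2 or 3 digits
    groups = [digits[:head]]
    groups += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return ",".join(groups)


def add_commas(text: str) -> str:
    """Insert thousands separators into every digit run that is followed by a non-digit."""
    # The lookahead mirrors the /[^0-9]/ that must follow num in the Kleenex program
    # (an assumption about the program's meaning, see the paragraph above).
    return re.sub(r"[0-9]+(?=[^0-9])", lambda m: group_thousands(m.group()), text)


# Same input as the echo example above, including the trailing newline.
print(add_commas("2016\n"), end="")  # prints 2,016
```

Leading zeros are kept as they are, since the digits are regrouped as text rather than parsed as a number.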

Test suite

A number of test suites are included.

To run the unit tests: cabal test.
To test the C runtime: cd crt_test && make. Note that this uses the Valgrind tool.
To run the end-to-end blackbox tests: cd test/test_compiled && make.

Benchmark suite

The repository includes a benchmark suite that compares the performance of string transformation programs written in Kleenex with equivalent programs written using other regular expression-based libraries and tools.

To run the benchmarks and generate the plots, first cd bench and then:

generate the test data: make generate-test-data
install the external benchmark dependencies (libraries, etc.): make install-benchmark-dependencies
build the benchmark programs (not Kleenex programs): make build-benchmark-programs
(optional) check that the benchmark programs are equivalent: make -k equality-check
build the Kleenex programs: ./compiletime.sh -f
run all the benchmark programs N times with M warm-up rounds: ./runningtime.sh -r <N> -w <M> -f
generate the plots: ./mkplots.py
The plots are placed in bench/plots unless a different location is specified to mkplots.py.
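
To make the terms in the runningtime.sh step concrete, here is a minimal Python sketch of what "N runs with M warm-up rounds" usually means: the command is first run M times with the timings discarded, then N times with the timings recorded. It does not reproduce runningtime.sh, whose internals are not described here. The function name time_command and the example command are assumptions made for illustration.

```python
import subprocess
import sys
import time


def time_command(cmd, runs, warmups, stdin_data=b""):
    """Return wall-clock durations in seconds for `runs` executions of `cmd`,
    measured after `warmups` untimed executions."""

    def run_once():
        start = time.perf_counter()
        # Output is discarded; a non-zero exit status raises CalledProcessError.
        subprocess.run(cmd, input=stdin_data, stdout=subprocess.DEVNULL, check=True)
        return time.perf_counter() - start

    for _ in range(warmups):
        run_once()  # warm-up rounds: executed but not recorded
    return [run_once() for _ in range(runs)]


# Hypothetical stand-in command; a real run would use a benchmark program instead.
durations = time_command([sys.executable, "-c", "pass"], runs=3, warmups=1)
print(f"median: {sorted(durations)[len(durations) // 2]:.4f} s")
```

Discarding the warm-up rounds keeps one-off costs such as cold file caches out of the recorded timings.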
