TokenizeAnything

A re-implementation of redpony/cdec's tokenize-anything.pl script in Python.

samples/ contains text pulled from Wikipedia in several languages. tok/ holds the same data, as tokenized by the original tokenize-anything.pl.
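
Because tok/ records what the original Perl script produced for each file in samples/, the two directories can serve as a regression check for the Python port. The sketch below is one possible way to do that, not part of this repository: `find_mismatches` and its `tokenize` parameter are hypothetical names, and it assumes that each file in tok/ has the same file name as its counterpart in samples/ and that tokenization works line by line.

```python
from pathlib import Path
from typing import Callable, List


def read_lines(path: Path) -> List[str]:
    """Read a UTF-8 text file and return its lines without trailing newlines."""
    with path.open(encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def find_mismatches(
    samples_dir: Path,
    tok_dir: Path,
    tokenize: Callable[[str], str],
) -> List[str]:
    """Return the names of sample files whose output differs from the reference.

    `tokenize` is a hypothetical stand-in for whatever function the Python
    port exposes; it takes one raw line and returns one tokenized line.
    """
    mismatches = []
    for sample in sorted(samples_dir.iterdir()):
        if not sample.is_file():
            continue
        # Assumption: the reference file in tok/ shares the sample's file name.
        reference = tok_dir / sample.name
        if not reference.is_file():
            continue  # nothing to compare against
        produced = [tokenize(line) for line in read_lines(sample)]
        if produced != read_lines(reference):
            mismatches.append(sample.name)
    return mismatches
```

Calling `find_mismatches(Path("samples"), Path("tok"), tokenize)` with the port's tokenizer passed as `tokenize` returns an empty list when every compared file matches the original script's output. Files without a counterpart in tok/ are skipped rather than reported.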