Some fun weekend explorations to learn more about Earley parsing and constrained decoding with LLMs. Early-stage exploration and prototyping.
`earley.py` implements the Earley chart-filling algorithm, with extensions to support incremental parsing and backtracking. Example usage is in `example_incremental_parsing.py` (using the `try_accept_tokens` function).
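For a feel of what the incremental interface enables, here is a rough usage sketch. The parser class name, grammar format, and the exact signature and return value of `try_accept_tokens` are assumptions for illustration; `example_incremental_parsing.py` shows the real interface.

```python
# Hypothetical usage sketch -- the class name, grammar format, and the
# return value of try_accept_tokens are assumptions, not the actual API.
from earley import EarleyParser  # assumed entry point

# Toy grammar for balanced parentheses (format is assumed).
parser = EarleyParser({"S": [["(", "S", ")"], []]}, start="S")

# Feed candidate tokens one at a time. The useful property is that a failed
# attempt does not corrupt parser state, so the caller can backtrack and try
# a different continuation.
for tok in ["(", "(", ")", ")"]:
    assert parser.try_accept_tokens([tok])

# An extra closing paren should be rejected while leaving the chart intact.
assert not parser.try_accept_tokens([")"])
```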
`logits_processor.py` implements a Hugging Face `transformers` `LogitsProcessor` that forces sampling/decoding to choose the highest-probability token permitted by the grammar. Example usage is in `example_constrained_decoding.py`.
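To make the decoding-time mechanics concrete, below is a rough sketch of the grammar-masking idea on top of the real `transformers` `LogitsProcessor` interface. The class name and the `grammar_allows` callback are placeholders (in this repo the check would be backed by the incremental Earley parser), and `logits_processor.py` may be structured differently.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList


class GrammarLogitsProcessor(LogitsProcessor):
    """Masks tokens the grammar forbids so decoding can only pick permitted ones."""

    def __init__(self, tokenizer, grammar_allows):
        self.tokenizer = tokenizer
        # grammar_allows(prefix_text, next_text) -> bool is a placeholder for a
        # check backed by the incremental Earley parser (try_accept_tokens).
        self.grammar_allows = grammar_allows

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # Assumes batch size 1 for simplicity.
        prefix = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        mask = torch.full_like(scores, float("-inf"))
        # Naive loop over the whole vocabulary (this is the slowness noted in the to-do list).
        for token_id in range(scores.shape[-1]):
            if self.grammar_allows(prefix, self.tokenizer.decode([token_id])):
                mask[0, token_id] = 0.0
        return scores + mask


# Usage with generate() (model/tokenizer setup omitted):
#   out = model.generate(**inputs,
#                        logits_processor=LogitsProcessorList([GrammarLogitsProcessor(tokenizer, check)]),
#                        max_new_tokens=64)
```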
- Developed with Python 3.12, though Python 3.10+ will likely work.
- Create an environment for this project. One way is using conda: `conda create -n earley-decoding python=3.12`
- Activate your environment: `conda activate earley-decoding`
- Install dependencies in your environment: `pip install -r requirements.txt`
- Leo optimizations and support for nullable grammars.
- Token healing in constrained decoding, similar to the guidance library. Currently we do not try alternate token sequences with the same prefix when sampling the next tokens; without this, generation often seems degenerate. (A conceptual sketch follows this list.)
- The logits processor implementation is very naive and slow right now -- there must be ways to optimize it (e.g. caching prefixes). We can probably use Microsoft's aici library for more low-level control.
- ...
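For reference, a conceptual sketch of the token-healing idea (as popularized by guidance), since it is not implemented here: drop the last prompt token and restrict the first generated token to those whose text starts with the dropped text. The helper below is hypothetical; a real implementation would also need to handle multi-token boundaries and interact with the grammar constraints.

```python
# Conceptual sketch only -- token healing is on the to-do list, not implemented.
def heal_prompt(tokenizer, prompt: str):
    """Back up the last prompt token and compute which first tokens may 're-complete' it."""
    ids = tokenizer.encode(prompt)
    healed_ids, removed_text = ids[:-1], tokenizer.decode(ids[-1:])
    # First-token candidates whose text starts with the removed suffix, so the
    # model can re-split the boundary instead of being locked into the prompt's
    # (possibly unnatural) tokenization.
    allowed_first = [
        t for t in range(len(tokenizer))
        if tokenizer.decode([t]).startswith(removed_text)
    ]
    return healed_ids, allowed_first
```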