- Migrate project to scikit-build-core #489
- Simplify dev / local setup using pyproject.toml and makefile #490
- Use numpy arrays for LogitsProcessors and StopCriteria to avoid copies / allocations #491
- Llama2 #488
- Configurable chat templates #492
- Add support for OpenAI-style functions #494
- Expose `scores` and `input_ids` in `Llama` model #493
- Add batched inference #771
- Speculative sampling #675