Closed
Description
🚀 The feature, motivation and pitch
While the existing Outlines state machine provides state-of-the-art performance, it trades that off against a one-off compile time for each schema. For endpoint products running models as a service, with customers supplying many different schemas, that cost might not be acceptable. In that case, we should integrate with lm-format-enforcer from @noamgat.
We already have an existing logits processor interface, and guided decoding is tested. It should be quite straightforward to add an integration for it. In the end, it should be a flag for choosing the backend: --guided-decoding-backend=...
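A minimal sketch of how such a flag could dispatch to a backend, assuming each backend exposes a factory that turns a schema into a logits processor. The backend names ("outlines", "lm-format-enforcer"), the `BACKENDS` registry, and the helper functions are hypothetical placeholders for illustration, not the actual vLLM or lm-format-enforcer API; the processors here are pass-through stubs.

```python
import argparse
from typing import Callable, Dict, List

# A logits processor maps (token ids generated so far, raw logits) to adjusted logits.
LogitsProcessor = Callable[[List[int], List[float]], List[float]]
# A factory builds one processor for one schema.
ProcessorFactory = Callable[[str], LogitsProcessor]


def _outlines_factory(schema: str) -> LogitsProcessor:
    # Hypothetical stub: a real backend would compile the schema into a
    # state machine here, which is the one-off cost discussed above.
    def process(token_ids: List[int], logits: List[float]) -> List[float]:
        return logits  # pass-through placeholder, no masking applied
    return process


def _lmfe_factory(schema: str) -> LogitsProcessor:
    # Hypothetical stub: a real backend would wrap lm-format-enforcer here.
    def process(token_ids: List[int], logits: List[float]) -> List[float]:
        return logits  # pass-through placeholder, no masking applied
    return process


# Hypothetical registry; the accepted flag values are assumptions.
BACKENDS: Dict[str, ProcessorFactory] = {
    "outlines": _outlines_factory,
    "lm-format-enforcer": _lmfe_factory,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--guided-decoding-backend",
        choices=sorted(BACKENDS),
        default="outlines",
        help="Which guided decoding backend builds the logits processor.",
    )
    return parser


def get_logits_processor(backend: str, schema: str) -> LogitsProcessor:
    # Look up the factory for the requested backend and build a processor.
    try:
        factory = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unknown guided decoding backend: {backend!r}")
    return factory(schema)


if __name__ == "__main__":
    args = build_parser().parse_args()
    processor = get_logits_processor(args.guided_decoding_backend, "{}")
    print(f"Using backend: {args.guided_decoding_backend}")
```

Keeping the choice behind a single registry means the rest of the serving code only depends on the shared logits processor signature, so adding another backend would be one more entry in the mapping.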
Alternatives
No response
Additional context
No response