[Feature Request] Output Restricted by Context Free Grammar or JSON Schema #758

## 🚀 Feature

Allow callers to pass in a context-free grammar or JSON schema that restricts which tokens can be generated. This would make data extraction, and potentially tool use, simpler to implement.
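
To illustrate the idea, here is a minimal sketch of grammar-constrained decoding by masking logits. Everything in it is hypothetical and for illustration only: the toy vocabulary, the `ALLOWED` state table standing in for a grammar, and `fake_logits` standing in for a model forward pass. It is not MLC's API, and it is not necessarily how any of the projects listed under prior art implement this.

```python
import json
import math

# Toy vocabulary (assumption); a real model has tens of thousands of tokens.
VOCAB = ["{", "}", '"name"', ":", '"Ada"', '"Bob"', "hello", "<eos>"]

# Hypothetical grammar for {"name": <string>}, flattened into a list of
# steps: entry i is the set of tokens the grammar permits at step i.
ALLOWED = [
    {"{"},
    {'"name"'},
    {":"},
    {'"Ada"', '"Bob"'},
    {"}"},
    {"<eos>"},
]


def fake_logits(prefix):
    """Stand-in for a model forward pass; it always prefers 'hello'."""
    scores = {tok: 0.0 for tok in VOCAB}
    scores["hello"] = 5.0   # unconstrained, the model would emit junk
    scores['"Bob"'] = 1.0
    return scores


def constrained_decode(logits_fn, allowed):
    """Greedy decoding where tokens the grammar forbids are masked out."""
    out = []
    for step_allowed in allowed:
        logits = logits_fn(out)
        # Set the score of every disallowed token to -inf before choosing.
        masked = {
            tok: (score if tok in step_allowed else -math.inf)
            for tok, score in logits.items()
        }
        tok = max(masked, key=masked.get)
        if tok == "<eos>":
            break
        out.append(tok)
    return "".join(out)


result = constrained_decode(fake_logits, ALLOWED)
print(result)
print(json.loads(result))
```

Even though the stand-in model always scores `hello` highest, the mask forces each step onto a token the grammar allows, so the result parses as JSON. A real implementation would track grammar state with a parser rather than a fixed list of steps.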

## Motivation

Having the ability to constrain responses to a specified grammar or a JSON schema would unlock data extraction and function calling use cases.

## Alternatives

Prompt engineering alone isn't sufficient, and prompts aren't transferable between models.
Fine-tuning would be a much heavier lift compared to a grammar that could drive output.

## Additional context

Some prior art to consider:
https://github.com/ggerganov/llama.cpp/pull/1773
https://huggingface.co/spaces/mishig/jsonformer
https://github.com/normal-computing/outlines
https://github.com/r2d4/rellm

Great project! I was able to get fast, high-quality responses with a couple of hours of effort. Huge kudos to the MLC team, and to Simon Willison for this project, which got me started via his llm library:

https://github.com/simonw/llm-mlc