UPSTREAM PR #17136: common : introduce composable PEG parser combinators for chat parsing #359
Open: loci-dev wants to merge 219 commits into main from upstream-PR17136-branch_aldehir-parser-combinators
Mirrored from ggml-org/llama.cpp#17136
Supporting new models requires implementing several features:
- […] `tool_choice = auto`)
- […] `response_format` (reasoning models)
For reasoning models, the grammar must include reasoning or performance degrades significantly.
The real challenge is that each model uses a different output format:
- […] `[get_weather(location="..."), ...]`)
Currently, the grammar and parsing exist as separate functions, which works but feels a bit fragile. I believe we can unify the two by using parser combinators to compose a PEG parser. That way the grammar definition becomes the parser.
Proposed Solution
This PR introduces a generic PEG (Parsing Expression Grammar) parser to the common library, along with chat-specific extensions and a complete reference implementation for Qwen3-Coder.
I've noticed there's often a lag between when a model is supported by llama.cpp and when proper tool calling is fully implemented. This parser aims to close that gap by letting you define the grammar and parser at the same time, making it easier to add full tool calling support for new models.
Parsing Expression Grammars (PEG)
PEG parsers are straightforward to implement as recursive descent parsers. While recursive descent parsers are known for backtracking, the majority of model output can be parsed with minimal backtracking, making them practical for this use case.
Parser combinators allow us to compose complex parsers from simple, reusable building blocks. This creates a DSL that closely mimics the grammar itself.
Rather than defining both a grammar and a parsing function, we can build a single PEG parser that both generates a compatible GBNF grammar (with exceptions) and parses model output.
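To make the "grammar definition becomes the parser" idea concrete, here is a minimal sketch in which one combinator tree is used both to parse input and to emit a GBNF-style rule body. Everything in it (`Node`, `lit`, `seq`, `alt`, `parse`, `to_gbnf`) is a hypothetical name chosen for illustration and is not the API introduced by this PR; escaping of special characters in literals is deliberately omitted.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for illustration only; not the PR's common_peg_* types.
struct Node {
    enum Kind { Literal, Sequence, Choice } kind;
    std::string text;            // used by Literal
    std::vector<Node> children;  // used by Sequence and Choice
};

// Combinators: small building blocks that compose into larger parsers.
Node lit(std::string s)       { return Node{Node::Literal, std::move(s), {}}; }
Node seq(std::vector<Node> c) { return Node{Node::Sequence, "", std::move(c)}; }
Node alt(std::vector<Node> c) { return Node{Node::Choice, "", std::move(c)}; }

// Recursive descent: returns the position after the match, or nullopt on failure.
std::optional<std::size_t> parse(const Node & n, const std::string & in, std::size_t pos) {
    switch (n.kind) {
        case Node::Literal:
            if (in.compare(pos, n.text.size(), n.text) == 0) return pos + n.text.size();
            return std::nullopt;
        case Node::Sequence:
            for (const Node & c : n.children) {
                auto next = parse(c, in, pos);
                if (!next) return std::nullopt;
                pos = *next;
            }
            return pos;
        case Node::Choice:
            // PEG ordered choice: the first alternative that matches wins.
            for (const Node & c : n.children) {
                if (auto next = parse(c, in, pos)) return next;
            }
            return std::nullopt;
    }
    return std::nullopt;
}

// Emit a GBNF-style expression from the same tree (no escaping of literals).
std::string to_gbnf(const Node & n) {
    if (n.kind == Node::Literal) return "\"" + n.text + "\"";
    const std::string sep = n.kind == Node::Sequence ? " " : " | ";
    std::string out = "(";
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        if (i) out += sep;
        out += to_gbnf(n.children[i]);
    }
    return out + ")";
}
```

With `seq({lit("<tool>"), alt({lit("get_weather"), lit("get_time")}), lit("</tool>")})`, `to_gbnf` yields `("<tool>" ("get_weather" | "get_time") "</tool>")`, and `parse` accepts `<tool>get_time</tool>` from the same definition.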
Features
- `simple` - Content with optional reasoning
- `native` - Tool arguments as JSON objects
- `constructed` - Tool arguments as separate entities (XML or pseudo-functions)
Examples
Parser for models that emit tool arguments as JSON
Parser for models that emit XML tags for each argument
Grammar generation
Implementation Details
The PEG parsers are implemented using `std::variant` rather than traditional inheritance. This reduces boilerplate and leverages `std::visit` for type-safety. I initially had an OOP implementation, but it started becoming quite cumbersome and this seems like the lesser evil of the two.

    using common_peg_parser_variant = std::variant<
        common_peg_epsilon_parser,
        common_peg_start_parser,
        common_peg_end_parser,
        common_peg_literal_parser,
        common_peg_sequence_parser,
        common_peg_choice_parser,
        common_peg_repetition_parser,
        common_peg_and_parser,
        common_peg_not_parser,
        common_peg_any_parser,
        common_peg_space_parser,
        common_peg_chars_parser,
        common_peg_json_string_parser,
        common_peg_until_parser,
        common_peg_schema_parser,
        common_peg_rule_parser,
        common_peg_ref_parser,
        common_peg_atomic_parser,
        common_peg_tag_parser
    >;

Both parsers and AST nodes are allocated in arena structures to minimize memory allocations.
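As a rough illustration of dispatching over a `std::variant` with `std::visit`, the sketch below uses three simplified stand-in types. The names `literal_parser`, `any_parser`, `end_parser`, `parser_variant`, and `run` are assumptions made for this example; the PR's actual types are the `common_peg_*` structs listed above, and their members are not shown here.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-ins for a few parser kinds (illustration only).
struct literal_parser { std::string text; };  // matches an exact string
struct any_parser {};                         // matches any single character
struct end_parser {};                         // matches only at end of input

using parser_variant = std::variant<literal_parser, any_parser, end_parser>;

// Returns the position after the match, or std::string::npos on failure.
// Precondition: pos <= in.size().
std::size_t run(const parser_variant & p, const std::string & in, std::size_t pos) {
    return std::visit([&](const auto & node) -> std::size_t {
        using T = std::decay_t<decltype(node)>;
        if constexpr (std::is_same_v<T, literal_parser>) {
            return in.compare(pos, node.text.size(), node.text) == 0
                ? pos + node.text.size()
                : std::string::npos;
        } else if constexpr (std::is_same_v<T, any_parser>) {
            return pos < in.size() ? pos + 1 : std::string::npos;
        } else {
            // Adding a new alternative to the variant without handling it
            // here becomes a compile-time error instead of a silent gap.
            static_assert(std::is_same_v<T, end_parser>, "unhandled parser type");
            return pos == in.size() ? pos : std::string::npos;
        }
    }, p);
}
```

The `static_assert` in the final branch is one way to get the type-safety benefit: every alternative has to be handled explicitly, with no base class or virtual functions involved.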
Each parser variant is wrapped in a `common_peg_parser` value type to produce a DSL for composing parser combinators.
Parsers can return results `FAIL`, `SUCCESS`, or `NEED_MORE_INPUT`. This is how the partial parsing is implemented. It does not raise an exception on partial parse like `common/chat-parser.cpp`, because partial parses are still valid for streaming.
Additional Changes
- […] `common_chat_peg_parse()` to `common/chat.cpp` and chat formats `COMMON_CHAT_FORMAT_PEG_(SIMPLE|NATIVE|CONSTRUCTED)` to support models parsed by a PEG parser.
- […] `common_chat_syntax.parser`. I'm not a fan, but this seems the least intrusive method to integrate. I'll implement any alternative mechanisms if desired.
- […] `common/unicode.{cpp,h}` derived from `src/unicode.{cpp,h}`. As I understand, we should not include headers from `src/`, so I had to copy the implementation. It does deviate by returning a result rather than raising an exception.
More comprehensive documentation is added in `docs/development/parsing.md`. The tests are also fairly comprehensive: `tests/test-chat-peg-parser.cpp`.
I know this is a big PR. I tried to minimize the implementation, while keeping enough to demonstrate value. #15703 shows community desire for something like this, although it doesn't have to be this implementation.
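The three result states mentioned under Implementation Details (`FAIL`, `SUCCESS`, `NEED_MORE_INPUT`) can be illustrated with a single literal matcher run against streamed input. This is only a sketch under assumptions: the enum name `parse_status`, the function `match_literal`, and the `is_partial` flag are hypothetical and may not match how the PR represents or detects partial input.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical result type mirroring the three states named in the description.
enum class parse_status { FAIL, SUCCESS, NEED_MORE_INPUT };

// Match `lit` at `pos` in `in`. When `is_partial` is true, more input may
// still arrive, so running out of input on a matching prefix is not a failure.
parse_status match_literal(const std::string & lit, const std::string & in,
                           std::size_t pos, bool is_partial) {
    if (pos > in.size()) return parse_status::FAIL;

    // Compare only as many characters as are currently available.
    const std::size_t n = std::min(lit.size(), in.size() - pos);
    if (in.compare(pos, n, lit, 0, n) != 0) return parse_status::FAIL;

    if (n == lit.size()) return parse_status::SUCCESS;

    // Input ended in the middle of the literal.
    return is_partial ? parse_status::NEED_MORE_INPUT : parse_status::FAIL;
}
```

For a stream that has so far produced `<tool`, matching `<tool_call>` reports `NEED_MORE_INPUT` rather than throwing, so the caller can keep whatever was parsed up to that point and retry once more tokens arrive.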