common : introduce composable PEG parser combinators for chat parsing #17136

aldehir · 2025-11-10T03:28:32Z

Supporting new models requires implementing several features:

Lazy grammar for tool calling (tool_choice = auto)
Full grammar for forced tool calls and response_format (reasoning models)
Parallel tool calls support
Parsing of reasoning and tool call outputs

For reasoning models, the grammar must include reasoning or performance degrades significantly.

The real challenge is that each model uses a different output format:

Harmony response output (gpt-oss)
XML with typed parameters (Qwen3-Coder, MiniMax M2)
- These models expect string arguments as raw content rather than JSON, which requires type awareness at parse time.
Pseudo-function call (LFM2 e.g. [get_weather(location="..."), ...])

Currently, the grammar and parsing exist as separate functions, which works but feels a bit fragile. I believe we can unify the two by using parser combinators to compose a PEG parser. That way the grammar definition becomes the parser.

Proposed Solution

This PR introduces a generic PEG (Parsing Expression Grammar) parser to the common library, along with chat-specific extensions and a complete reference implementation for Qwen3-Coder.

I've noticed there's often a lag between when a model is supported by llama.cpp and when proper tool calling is fully implemented. This parser aims to close that gap by letting you define the grammar and parser at the same time, making it easier to add full tool calling support for new models.

Parsing Expression Grammars (PEG)

PEG parsers are straightforward to implement as recursive descent parsers. While recursive descent parsers are known for backtracking, the majority of model output can be parsed with minimal backtracking, making them practical for this use case.

Parser combinators allow us to compose complex parsers from simple, reusable building blocks. This creates a DSL that closely mimics the grammar itself.
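As a rough illustration of the idea, here is a minimal, self-contained sketch of three combinators (literal, sequence, ordered choice). The names `toy_parser`, `toy_literal`, `toy_sequence`, `toy_choice` and `toy_greeting` are hypothetical and are not the types introduced by this PR; the sketch assumes C++17.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy types for illustration only.
// A parser takes the input and a start offset and returns the offset just
// past the match, or std::nullopt on failure.
using toy_parser = std::function<std::optional<size_t>(const std::string &, size_t)>;

// Match an exact string at the current offset.
inline toy_parser toy_literal(std::string text) {
    return [text](const std::string & in, size_t pos) -> std::optional<size_t> {
        if (in.compare(pos, text.size(), text) == 0) {
            return pos + text.size();
        }
        return std::nullopt;
    };
}

// Match every part in order; fail as soon as one part fails.
inline toy_parser toy_sequence(std::vector<toy_parser> parts) {
    return [parts](const std::string & in, size_t pos) -> std::optional<size_t> {
        for (const auto & part : parts) {
            auto next = part(in, pos);
            if (!next) {
                return std::nullopt;
            }
            pos = *next;
        }
        return pos;
    };
}

// PEG ordered choice: try alternatives left to right, first match wins.
// Every alternative restarts at the same offset, which is the backtracking.
inline toy_parser toy_choice(std::vector<toy_parser> alts) {
    return [alts](const std::string & in, size_t pos) -> std::optional<size_t> {
        for (const auto & alt : alts) {
            if (auto next = alt(in, pos)) {
                return next;
            }
        }
        return std::nullopt;
    };
}

// greeting <- ("hello" / "hi") " world"
inline toy_parser toy_greeting() {
    return toy_sequence({
        toy_choice({ toy_literal("hello"), toy_literal("hi") }),
        toy_literal(" world"),
    });
}
```

Calling `toy_greeting()("hi world", 0)` tries `"hello"` first, backtracks to offset 0, matches `"hi"`, and then continues with `" world"`. The composed C++ expression reads much like the grammar rule in the comment above it.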

Rather than defining both a grammar and parsing function, we can build a PEG parser that generates a compatible GBNF grammar (with exceptions) and parses model output.
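A minimal sketch of "one definition, two uses", assuming a toy node tree with hypothetical names (`toy_node`, `lit`, `seq`, `alt`, `parse`, `to_gbnf`, `toy_grammar`, `toy_yes_no`). It is not the PR's implementation and it ignores literal escaping and the exceptions mentioned above; it only shows how the same tree can both match input and print a GBNF rule.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy tree; the real parser set in the PR is much larger.
struct toy_node;
using toy_ptr = std::shared_ptr<toy_node>;

struct toy_node {
    enum kind_t { LITERAL, SEQUENCE, CHOICE };
    kind_t               kind;
    std::string          text;      // used by LITERAL
    std::vector<toy_ptr> children;  // used by SEQUENCE and CHOICE
};

inline toy_ptr lit(std::string s)          { return std::make_shared<toy_node>(toy_node{toy_node::LITERAL,  std::move(s), {}}); }
inline toy_ptr seq(std::vector<toy_ptr> c) { return std::make_shared<toy_node>(toy_node{toy_node::SEQUENCE, "", std::move(c)}); }
inline toy_ptr alt(std::vector<toy_ptr> c) { return std::make_shared<toy_node>(toy_node{toy_node::CHOICE,   "", std::move(c)}); }

// Use 1: parse. Returns the end offset of the match, or nullopt on failure.
inline std::optional<size_t> parse(const toy_node & n, const std::string & in, size_t pos) {
    switch (n.kind) {
        case toy_node::LITERAL:
            if (in.compare(pos, n.text.size(), n.text) == 0) {
                return pos + n.text.size();
            }
            return std::nullopt;
        case toy_node::SEQUENCE:
            for (const auto & c : n.children) {
                auto next = parse(*c, in, pos);
                if (!next) {
                    return std::nullopt;
                }
                pos = *next;
            }
            return pos;
        case toy_node::CHOICE:
            for (const auto & c : n.children) {
                if (auto next = parse(*c, in, pos)) {
                    return next;
                }
            }
            return std::nullopt;
    }
    return std::nullopt;
}

// Use 2: print the same tree as a GBNF expression.
// No escaping is done: the toy literals must avoid quotes and backslashes.
inline std::string to_gbnf(const toy_node & n) {
    if (n.kind == toy_node::LITERAL) {
        return "\"" + n.text + "\"";
    }
    const bool is_choice = n.kind == toy_node::CHOICE;
    std::string out = is_choice ? "(" : "";
    for (size_t i = 0; i < n.children.size(); i++) {
        if (i > 0) {
            out += is_choice ? " | " : " ";
        }
        out += to_gbnf(*n.children[i]);
    }
    return out + (is_choice ? ")" : "");
}

inline std::string toy_grammar(const toy_node & root) {
    return "root ::= " + to_gbnf(root);
}

// answer <- ("yes" / "no") "!"
inline toy_ptr toy_yes_no() {
    return seq({ alt({ lit("yes"), lit("no") }), lit("!") });
}
```

Here `toy_grammar(*toy_yes_no())` yields `root ::= ("yes" | "no") "!"`, while `parse(*toy_yes_no(), "no!", 0)` consumes the same three characters the grammar would allow the model to emit.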

Features

Partial parsing for streaming input
Built-in JSON parsers for common patterns
Grammar generation that produces compatible GBNF grammars
AST generation with semantic tags for structured extraction
Three common AST shapes covering most model formats:
- simple - Content with optional reasoning
- native - Tool arguments as JSON objects
- constructed - Tool arguments as separate entities (XML or pseudo-functions)
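To make "semantic tags for structured extraction" concrete, here is a small sketch of the simple shape (content with optional reasoning). It assumes a flat list of tagged spans; the struct names, the tag strings `"reasoning"` and `"content"`, and the `<think>` markup are all hypothetical and only stand in for whatever the real AST records.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tagged span: a tag plus the [start, end) range it covers.
struct toy_ast_node {
    std::string tag;
    size_t      start;
    size_t      end;
};

// Hypothetical result for the "simple" shape.
struct toy_message {
    std::string reasoning;
    std::string content;
};

// Walk the tagged spans and copy each one into the matching field.
inline toy_message toy_map(const std::string & input, const std::vector<toy_ast_node> & nodes) {
    toy_message msg;
    for (const auto & n : nodes) {
        const std::string text = input.substr(n.start, n.end - n.start);
        if (n.tag == "reasoning") {
            msg.reasoning += text;
        } else if (n.tag == "content") {
            msg.content += text;
        }
    }
    return msg;
}

inline toy_message toy_simple_example() {
    const std::string input = "<think>add 2+2</think>4";
    // Offsets a parser would have recorded while matching the input.
    return toy_map(input, { {"reasoning", 7, 14}, {"content", 22, 23} });
}
```

The point of the sketch is the separation of concerns: the parser only records where tagged spans begin and end, and a separate mapping step decides what each tag means for the final message.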

Examples

Parser for models that emit tool arguments as JSON

auto parser = build_chat_peg_native_parser([&](common_chat_peg_native_builder & p) {
    // Build choice of available tools
    auto tool_choice = p.choice();
    for (const auto & tool : tools) {
        const auto & function = tool.at("function");
        std::string name = function.at("name");
        const auto & schema = function.at("parameters");

        auto tool_name = p.json_member("name", "\"" + p.literal(name) + "\"");
        auto tool_args = p.json_member("arguments", p.schema(p.json(), "tool-" + name + "-schema", schema));

        tool_choice |= p.rule("tool-" + name, "{" << tool_name << "," << tool_args << "}");
    }

    // Define tool call structure
    auto tool_call = p.trigger_rule("tool-call",
        p.sequence({
            p.literal("<tool_call>["),
            tool_choice,
            p.literal("]</tool_call>")
        })
    );

    return p.sequence({
        p.content(p.until("<tool_call>")),
        p.optional(tool_call),
        p.end()
    });
});

Parser for models that emit XML tags for each argument

auto parser = build_chat_peg_constructed_parser([&](common_chat_peg_constructed_builder & p) {
    auto location_arg = p.tool_arg(
        p.tool_arg_open("<parameter name=\"" + p.tool_arg_name(p.literal("location")) + "\">"),
        p.tool_arg_string_value(p.until("</parameter>")),
        p.tool_arg_close(p.literal("</parameter>"))
    );

    auto get_weather_tool = p.tool(p.sequence({
        p.tool_open("<function name=\"" + p.tool_name(p.literal("get_weather")) + "\">"),
        location_arg,
        p.tool_close(p.literal("</function>"))
    }));

    return p.sequence({
        p.content(p.until("<tool_call>")),
        p.literal("<tool_call>"),
        get_weather_tool,
        p.literal("</tool_call>"),
        p.end()
    });
});

Grammar generation

data.grammar = build_grammar([&](const common_grammar_builder & builder) {
    foreach_function(params.tools, [&](const json & fn) {
        builder.resolve_refs(fn.at("parameters"));
    });
    parser.build_grammar(builder, data.grammar_lazy);
});

Implementation Details

The PEG parsers are implemented using std::variant rather than traditional inheritance. This reduces boilerplate and leverages std::visit for type-safety. I initially had an OOP implementation, but it started becoming quite cumbersome and this seems like the lesser evil of the two.

using common_peg_parser_variant = std::variant<
    common_peg_epsilon_parser,
    common_peg_start_parser,
    common_peg_end_parser,
    common_peg_literal_parser,
    common_peg_sequence_parser,
    common_peg_choice_parser,
    common_peg_repetition_parser,
    common_peg_and_parser,
    common_peg_not_parser,
    common_peg_any_parser,
    common_peg_space_parser,
    common_peg_chars_parser,
    common_peg_json_string_parser,
    common_peg_until_parser,
    common_peg_schema_parser,
    common_peg_rule_parser,
    common_peg_ref_parser,
    common_peg_atomic_parser,
    common_peg_tag_parser
>;
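For readers unfamiliar with the pattern, the following sketch shows variant-based dispatch with a two-alternative variant. The type and function names are hypothetical stand-ins rather than the PR's code. The `static_assert` in the final branch mirrors the "handle every branch" idea noted in the commit log: adding a new alternative without handling it breaks the build.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical two-alternative variant; the real list above is much longer.
struct toy_literal_parser { std::string text; };
struct toy_any_parser {};  // matches exactly one character

using toy_parser_variant = std::variant<toy_literal_parser, toy_any_parser>;

// Dependent false, so the static_assert only fires for unhandled types.
template <class> inline constexpr bool toy_always_false = false;

// Returns the number of bytes consumed, or -1 when the parser does not match.
inline long toy_match(const toy_parser_variant & parser, const std::string & in, size_t pos) {
    return std::visit([&](const auto & p) -> long {
        using T = std::decay_t<decltype(p)>;
        if constexpr (std::is_same_v<T, toy_literal_parser>) {
            return in.compare(pos, p.text.size(), p.text) == 0 ? (long) p.text.size() : -1;
        } else if constexpr (std::is_same_v<T, toy_any_parser>) {
            return pos < in.size() ? 1 : -1;
        } else {
            static_assert(toy_always_false<T>, "unhandled parser type");
        }
    }, parser);
}
```

Compared with a virtual `parse()` method on a base class, each parser here is a plain struct with value semantics, and the dispatch logic for one operation lives in one place.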

Both parsers and AST nodes are allocated in arena structures to minimize memory allocations.

class common_peg_arena {
    std::vector<common_peg_parser_variant> parsers_;
    std::unordered_map<std::string, common_peg_parser_id> rules_;
    common_peg_parser_id root_ = COMMON_PEG_INVALID_PARSER_ID;
    ...

class common_peg_ast_arena {
    std::vector<common_peg_ast_node> nodes_;
    ...
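A minimal sketch of the arena idea, under the assumption that parsers refer to each other by index rather than by pointer. The names (`toy_arena`, `toy_id`, `toy_run`, and so on) are hypothetical; only the general layout of a vector of variants plus a rule-name map follows the snippet above.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

using toy_id = size_t;  // index into the arena

struct toy_lit { std::string text; };
struct toy_seq { std::vector<toy_id> children; };  // children are arena indices
using toy_variant = std::variant<toy_lit, toy_seq>;

class toy_arena {
    std::vector<toy_variant> parsers_;
    std::unordered_map<std::string, toy_id> rules_;
  public:
    // Store a parser and hand back its index.
    toy_id add(toy_variant p) { parsers_.push_back(std::move(p)); return parsers_.size() - 1; }
    void   name(const std::string & rule, toy_id id) { rules_[rule] = id; }
    toy_id rule(const std::string & rule) const { return rules_.at(rule); }
    const toy_variant & get(toy_id id) const { return parsers_.at(id); }
    size_t size() const { return parsers_.size(); }
};

// Returns the offset after the match, or std::string::npos on failure.
inline size_t toy_run(const toy_arena & arena, toy_id id, const std::string & in, size_t pos) {
    const toy_variant & p = arena.get(id);
    if (const auto * lit = std::get_if<toy_lit>(&p)) {
        return in.compare(pos, lit->text.size(), lit->text) == 0 ? pos + lit->text.size() : std::string::npos;
    }
    for (toy_id child : std::get<toy_seq>(p).children) {
        pos = toy_run(arena, child, in, pos);
        if (pos == std::string::npos) {
            return pos;
        }
    }
    return pos;
}

// root <- "<a>" "</a>"
inline toy_arena toy_example_arena() {
    toy_arena arena;
    toy_id open  = arena.add(toy_lit{"<a>"});
    toy_id close = arena.add(toy_lit{"</a>"});
    arena.name("root", arena.add(toy_seq{{open, close}}));
    return arena;
}
```

Because children are plain indices, the whole parser graph lives in one contiguous vector, copying the arena copies the grammar, and recursive rules need no owning pointers between nodes.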

Each parser variant is wrapped in a common_peg_parser value type to produce a DSL for composing parser combinators.

Parsers return one of three results: FAIL, SUCCESS, or NEED_MORE_INPUT. This is how partial parsing is implemented. Unlike common/chat-parser.cpp, it does not raise an exception on a partial parse, because partial parses are still valid for streaming.
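The three-state result can be sketched for a single literal as follows. This is an illustration under assumptions, not the PR's code: `toy_result`, `toy_match_literal` and the `is_partial` flag (true when more input may still arrive) are hypothetical names.

```cpp
#include <cstddef>
#include <string>

enum class toy_result { FAIL, SUCCESS, NEED_MORE_INPUT };

// Match `text` at offset `pos`. `is_partial` says the input may still grow.
inline toy_result toy_match_literal(const std::string & in, size_t pos,
                                    const std::string & text, bool is_partial) {
    if (pos > in.size()) {
        return toy_result::FAIL;
    }
    const size_t avail = in.size() - pos;
    const size_t n     = avail < text.size() ? avail : text.size();

    // Compare only the bytes that are actually available.
    if (in.compare(pos, n, text, 0, n) != 0) {
        return toy_result::FAIL;  // a real mismatch, more input cannot fix it
    }
    if (n == text.size()) {
        return toy_result::SUCCESS;
    }
    // The input ended in the middle of the literal.
    return is_partial ? toy_result::NEED_MORE_INPUT : toy_result::FAIL;
}
```

With `"<tool_"` as the input so far and `"<tool_call>"` as the literal, the function reports NEED_MORE_INPUT while streaming and FAIL once the input is known to be complete, so no exception is needed to signal "not yet".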

Additional Changes

Added common_chat_peg_parse() to common/chat.cpp and chat formats COMMON_CHAT_FORMAT_PEG_(SIMPLE|NATIVE|CONSTRUCTED) to support models parsed by a PEG parser.
- The parser must be passed from chat param initialization to the parse function. To do this, I currently serialize the parser to JSON and then deserialize to common_chat_syntax.parser. I'm not a fan, but this seems the least intrusive method to integrate. I'll implement any alternative mechanisms if desired.
Added common/unicode.{cpp,h} derived from src/unicode.{cpp,h}. As I understand, we should not include headers from src/, so I had to copy the implementation. It does deviate by returning a result rather than raising an exception.

More comprehensive documentation is added in docs/development/parsing.md. The tests in tests/test-chat-peg-parser.cpp are also fairly comprehensive.

I know this is a big PR. I tried to minimize the implementation, while keeping enough to demonstrate value. #15703 shows community desire for something like this, although it doesn't have to be this implementation.

Note

I wrote the majority of the parsing myself, as well as the complete examples in test-chat-peg-parser.cpp. AI assistance was used in some portions, notably:

Prototyping the interface to get an understanding for what works well.
Writing some of the unit tests, such as the basic ones and unicode.
The serialization/deserialization procedures were done mostly with AI.

pwilkin · 2025-11-10T12:00:28Z

Yes! This is exactly what I was thinking about :) can you give me push rights to your repo so I can contribute without doing PRs to PRs?

aldehir · 2025-11-10T15:47:05Z

Yes! This is exactly what I was thinking about :) can you give me push rights to your repo so I can contribute without doing PRs to PRs?

Sure. I've never managed permissions on a GitHub repo, but let me know if you can't push.

The interface isn't solidified, so hammer away. I do want to clean up the header and move stuff into the source file. Figured I'd handle that as I get further along.

The partial parsing works, but does require careful attention if editing. The idea is to "succeed" if the parse tree is partially traversed and the input is marked as incomplete. With some caveats: if a literal is partially matched, it will propagate a result indicating we need more input. I intend to add a regex parser that uses the builtin partial regex matching support, which should do the same thing. This allows us to collect the results when sending a streaming response.
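One consequence for streaming is that text which might be the start of a delimiter has to be held back until more input arrives. A minimal sketch of that idea, assuming a single known delimiter and a hypothetical helper name `toy_streamable_content` (this is not how the PR implements it):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Return the part of `in` that can safely be streamed as content when a
// delimiter such as "<tool_call>" may still be arriving.
inline std::string toy_streamable_content(const std::string & in, const std::string & delim) {
    // Delimiter fully present: everything before it is content.
    // (An empty delimiter is found at offset 0, so nothing is emitted.)
    const size_t hit = in.find(delim);
    if (hit != std::string::npos) {
        return in.substr(0, hit);
    }
    // Otherwise hold back the longest suffix of `in` that is a proper
    // prefix of the delimiter, since the next chunk may complete it.
    const size_t max_k = std::min(in.size(), delim.size() - 1);
    for (size_t k = max_k; k > 0; k--) {
        if (in.compare(in.size() - k, k, delim, 0, k) == 0) {
            return in.substr(0, in.size() - k);
        }
    }
    return in;  // no suffix looks like the start of the delimiter
}
```

For the input `"Hello <tool"` only `"Hello "` is emitted; the trailing `"<tool"` stays buffered until the next chunk shows whether it becomes `"<tool_call>"` or ordinary text.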

I need to clean up the caching. Initially, I thought, maybe we could reuse the cache as we get more and more input. I'm finding it very difficult to find the correct time to cache. So I'm thinking about nixing that idea and just provide a cache per parsing run--as the packrat algorithm originally intended. Then we can profile if caching is beneficial or not on a real example. I suspect there shouldn't be a whole lot of backtracking, so the memory cost might not be worth it if the gains are minuscule.
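For reference, a per-run packrat cache can be sketched like this: the memo table is keyed on (rule id, start offset) and lives only as long as one parse. The grammar and the class `toy_packrat` are hypothetical, chosen so that the second alternative re-requests a rule at an offset the first alternative already tried.

```cpp
#include <cstddef>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical grammar:
//   S <- A "x" / A "y"
//   A <- "a"+
class toy_packrat {
  public:
    explicit toy_packrat(std::string input) : input_(std::move(input)) {}

    // Returns the end offset of S, or nullopt if S does not match at 0.
    std::optional<size_t> parse_s() {
        for (char tail : {'x', 'y'}) {
            auto end = parse_a(0);  // both alternatives start with A at offset 0
            if (end && *end < input_.size() && input_[*end] == tail) {
                return *end + 1;
            }
        }
        return std::nullopt;
    }

    // How many times the body of A actually ran (cache misses).
    size_t a_evaluations() const { return a_evaluations_; }

  private:
    std::optional<size_t> parse_a(size_t pos) {
        const auto key = std::make_pair(RULE_A, pos);
        if (auto it = memo_.find(key); it != memo_.end()) {
            return it->second;  // cache hit: reuse the earlier result
        }
        a_evaluations_++;
        size_t end = pos;
        while (end < input_.size() && input_[end] == 'a') {
            end++;
        }
        std::optional<size_t> result = end > pos ? std::optional<size_t>(end) : std::nullopt;
        memo_[key] = result;
        return result;
    }

    static constexpr int RULE_A = 0;
    std::string input_;
    std::map<std::pair<int, size_t>, std::optional<size_t>> memo_;  // dies with the run
    size_t a_evaluations_ = 0;
};

// Helper: parse once and report how often A was really evaluated.
inline size_t toy_count_a_runs(const std::string & in) {
    toy_packrat p(in);
    p.parse_s();
    return p.a_evaluations();
}
```

On `"aaay"` the first alternative fails at the `"x"`, and the second alternative gets A's result from the table instead of rescanning the run of `a` characters. Whether that saving outweighs the memory cost on real model output is exactly the profiling question raised above.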

pwilkin · 2025-11-10T17:58:14Z

Aight, let me bounce my original idea - what if we just created a GBNF parser builder and used that to parse the messages? Then we have both problems (tool call / reasoning and compatibility with normal parsing) done in one go. Unless (haven't looked into it) it would just be too inefficient for normal content parsing?

Because right now it feels like we're adding another intermediate abstraction while GBNF is already implemented in GGML - so maybe just use a builder as an abstraction layer to create all the needed objects and add any missing partial parse support?

This is just an idea, not very fixated on it, just thought I'd share it. Regarding memory costs and the packrat parser, I think O(n) with typical LLM inputs is negligible, even with super long contexts we're looking at like a few MB overhead at most.

aldehir · 2025-11-10T18:52:49Z

Sounds like you're thinking of a parser generator. Something like yacc, bison, or ANTLR. The problem I see with those solutions is they require building a parse table upfront, which is less intuitive than building a parse tree such as in this PR. You could create a recursive descent parser but that would have to be done at compile time. If you did it at runtime, I think the solution would look a lot like this!

I haven't examined the GBNF code with a scalpel, but taking a brief look it seems like it uses a pushdown automaton and may be challenging to extract content from. Not that we would want to, since it is part of the core and not common. I believe there is a desire to keep the chat parsing isolated in common.

I also think you lose the expressiveness of being able to define the grammar in C++. For example, with this solution we could add an execute() parser that takes a user lambda and runs it when the parse subtree succeeds. You could define prune() that removes parts of the tree on a condition, such as when no tools are provided. Not saying we want to do that, just to demonstrate the flexibility offered.
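To illustrate what such an execute() combinator could look like, here is a purely hypothetical sketch. Nothing in it is part of the PR: `toy_parser`, `toy_until`, `toy_execute` and `toy_collect_words` are made-up names, and the callback simply receives the matched text.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical parser type: returns the end offset of the match, or nullopt.
using toy_parser = std::function<std::optional<size_t>(const std::string &, size_t)>;

// Consume everything up to (not including) `stop`, or to the end of input.
inline toy_parser toy_until(char stop) {
    return [stop](const std::string & in, size_t pos) -> std::optional<size_t> {
        const size_t end = in.find(stop, pos);
        return end == std::string::npos ? in.size() : end;
    };
}

// Wrap a parser so a user callback runs with the matched text on success.
inline toy_parser toy_execute(toy_parser inner, std::function<void(const std::string &)> on_match) {
    return [inner, on_match](const std::string & in, size_t pos) -> std::optional<size_t> {
        auto end = inner(in, pos);
        if (end) {
            on_match(in.substr(pos, *end - pos));
        }
        return end;
    };
}

// Usage: collect comma-separated names through the callback.
inline std::vector<std::string> toy_collect_words(const std::string & in) {
    std::vector<std::string> seen;
    auto word = toy_execute(toy_until(','), [&seen](const std::string & m) { seen.push_back(m); });
    size_t pos = 0;
    while (pos <= in.size()) {
        auto end = word(in, pos);
        if (!end) {
            break;
        }
        pos = *end + 1;  // step over the comma
    }
    return seen;
}
```

Because the action is an ordinary C++ lambda, it can capture local state directly, which is the kind of flexibility a separate grammar language would need its own embedding syntax for.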

The solutions I mentioned above do this by defining their own language to insert code--not pretty in my experience.

That said, I am open to ideas. If you have a clearer picture of what that looks like, I'm happy to review. I understand inserting a new abstraction is a tough ask. I wanted to roll out a PoC to hopefully show value.

pwilkin · 2025-11-10T20:40:23Z

@aldehir Nah, you're probably right. I looked at the GBNF code and in fact it would take too much effort to extract the parsed content from there. We're better off just doing it your way. I'll try to code some of the missing pieces.

aldehir · 2025-11-10T22:13:59Z

@pwilkin great! If you have any questions, feel free to ask.

pwilkin · 2025-11-12T00:08:32Z

Aight, I'm done with the hybrid ops and convert_hf_to_gguf refactoring cleanup, so I'll probably finally look at this tomorrow :>

aldehir · 2025-11-12T06:07:16Z

No rush. I am getting closer to a set of parsing functions that I'm happy with. The unfortunate part is I had to roll specialized parsers to maintain comparable performance with the existing parsing. A lexer would likely help, but optimized parsers for certain use cases are enough for now.

I added a benchmark in the test that implements the Command R7B parser, and compares it to the existing one. It seemed like a good one to illustrate.

// Benchmarks are over 100 iterations
Reasoning + Content:
   New parser avg: 23 us
Legacy parser avg: 450 us

Reasoning + Tool Call:
   New parser avg: 263 us
Legacy parser avg: 151 us

The existing parsing has a leg up with JSON. That said, it's still a fraction of a millisecond for a full prompt. I think most of the cost will go into the constrained decoding anyway. I'll have to benchmark larger JSON documents. Worst case, we can fall back to the implementation in json-partial.cpp. The intent here is to better support streaming JSON.
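For anyone who wants to reproduce this kind of comparison, an average-per-iteration timing loop can be sketched as below. The helper name `toy_avg_us` is hypothetical and this is not the benchmark code from the PR's tests; it only shows the measurement pattern of timing a fixed number of iterations and dividing.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <ratio>

// Run `fn` `iterations` times and return the mean wall time in microseconds.
inline double toy_avg_us(size_t iterations, const std::function<void()> & fn) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (size_t i = 0; i < iterations; i++) {
        fn();
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    return iterations ? elapsed.count() / iterations : 0.0;
}
```

A call such as `toy_avg_us(100, [&] { /* parse one sample */ })` measures one parser; running it once per parser on the same sample gives a like-for-like comparison. Numbers at this scale are sensitive to compiler flags and to whatever else the machine is doing.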

aldehir · 2025-11-28T06:32:59Z

I have pushed the Qwen3-Coder implementation to a separate branch in my repo: https://github.com/aldehir/llama.cpp/tree/qwen3-coder-peg-parser

The helpers and any testing recommendations should probably go in a separate PR with the first usage of this new parsing mechanism (if approved). Those tests would most likely impact tests/test-chat.cpp. You're free to hack in that branch, unless you're adamant we need to prescribe a solution here.

I don't view the existing mechanism as a buggy mess, but I do believe there is an opportunity for improvement by aligning the grammar and parsing.

aldehir · 2025-11-28T17:08:48Z

@ggerganov @CISC @ngxson Please take a look. Curious whether you think something like this would be a good fit.

pwilkin · 2025-11-28T17:13:28Z

@aldehir Nah, I agree that this has become large enough as it is that any implementations and helpers are best saved for another PR. If you don't mind, I can actually open it after this is merged, will be easier to show you my vision that way ;)

ngxson · 2025-11-28T19:38:32Z

I really like the idea overall. Actually, before minja was introduced to the project, I envisioned a parser with a syntax like this:

parser << "<" << name << "=" << value << ">";

Which is pretty close to the example in your documentation:

tool_choice |= p.rule("tool-" + name, "{" << tool_name << "," << tool_args << "}");

Honestly this system is more readable and more robust than the regex-based approach currently used by most parsers in chat.cpp

One question though: Do you think we can potentially replace the recently-merged XML parser with this system? IMO the current XML parser is over-complicated because (unfortunately) it has to rely on the GBNF grammar system

aldehir · 2025-11-28T20:17:22Z

If you don't mind, I can actually open it after this is merged,

Sure thing.

One question though: Do you think we can potentially replace the recently-merged XML parser with this system? IMO the current XML parser is over-complicated because (unfortunately) it has to rely on the GBNF grammar system

The XML parsing is what inspired this PR. As model outputs become more complex and different, I think we'll need more sophisticated parsing than regex can provide. So, yes, I envision this replacing the recently merged XML parsing.

This system still relies on GBNF grammar for constrained decoding, but it produces compatible GBNF grammars from the parser instead of handwriting one. I am exploring an alternative sampler that uses this parser implementation during sampling instead (albeit with a more optimized underlying implementation for incremental parsing). I'll need to roll out a prototype to better understand its performance characteristics, but it feels like a viable alternative.

pwilkin · 2025-11-28T20:24:39Z

@ngxson Yeah, that's what I'm trying to convince @aldehir of: if we replace the current approach with this PEG parser + helpers, we can hopefully not only have an easier way of adding new chat formats, but also a unified test framework for all supported templates to go with it.

ngxson · 2025-11-28T20:37:29Z

@ngxson Yeah, that's what I'm trying to convince @aldehir of: if we replace the current approach with this PEG parser + helpers, we can hopefully not only have an easier way of adding new chat formats, but also a unified test framework for all supported templates to go with it.

Replacing all regex-based parsers with PEG seems to be a logical next step, what's currently holding it back?

aldehir · 2025-11-28T21:13:58Z

what's currently holding it back?

Nothing! I have already demonstrated use with Qwen3-Coder and it works really well. Where @pwilkin and I have differing opinions is in what should be a utility function, but we'll iron that out as we go.

There are a few edge cases in the json-schema-to-grammar that do need addressing, and I am actively working on those (as a separate PR).

ggerganov

@aldehir Looks like nice work!

tests/peg-parser/simple-tokenize.h

aldehir · 2025-12-03T00:52:45Z

@aldehir Looks like nice work!

Thank you!

I added myself to CODEOWNERS for the related files. I also updated the description in light of the recent AI disclosure policy.

ggerganov

Great! Let's merge this if there are no additional comments @pwilkin @ngxson

ngxson

LGTM, good job @aldehir !

pwilkin · 2025-12-03T10:43:38Z

Let's merge so we can start building upon it :) Kudos to @aldehir for all his hard work here.

…ggml-org#17136) * common : implement parser combinators to simplify chat parsing * add virtual destructor to parser_base * fix memory leak from circular references of rules * implement gbnf grammar building * remove unused private variable * create a base visitor and implement id assignment as a visitor * fix const ref for grammar builder * clean up types, friend classes, and class declarations * remove builder usage from until_parser * Use a counter class to help assign rule ids * cache everything * add short description for each parser * create a type for the root parser * implement repetition parser * Make optional, one_or_more, and zero_or_more subclasses of repetition * improve context constructor * improve until parsing and add benchmarks * remove cached() pattern, cache in parser_base with specialized parsing functions for each parser * improve json parsing performance to better match legacy parsing * fix const auto * it for windows * move id assignment to classes instead of using a visitor * create named rules in the command r7b example * use '.' 
for any in GBNF * fix parens around choices in gbnf grammar * add convenience operators to turn strings to literals * add free-form operators for const char * to simplify defining literals * simplify test case parser * implement semantic actions * remove groups in favor of actions and a scratchpad * add built in actions for common operations * add actions to command r7b example * use std::default_searcher for platforms that don't have bm * improve parser_type handling and add cast helper * add partial result type to better control when to run actions * fix bug in until() * run actions on partial results by default * use common_chat_msg for result * add qwen3 example wip * trash partial idea and simplify * move action arguments to a struct * implement aho-corasick matcher for until_parser and to build exclusion grammars * use std::string for input, since std::string_view is incompatible with std::regex * Refactor tests * improve qwen3 example * implement sax-style parsing and refactor * fix json string in test * rename classes to use common_chat_ prefix * remove is_ suffix from functions * rename from id_counter to just counter * Final refactored tests * Fix executable name and editorconfig-checker * Third time's the charm... 
* add trigger parser to begin lazy grammar rule generation * working lazy grammar * refactor json rules now that we check for reachability * reduce pointer usage * print out grammars in example * rename to chat-peg-parser* and common_chat_peg_parser* * Revert unrelated changes * New macros for CMakeLists to enable multi-file compilations * starting unicode support * add unicode support to char_parser * use unparsed args as additional sources * Refactor tests to new harness * Fix CMakeLists * fix rate calculation * add unicode tests * fix trailing whitespace and line endings skip-checks: true * Helpers + rewrite qwen3 with helpers * Fix whitespace * extract unicode functions to separate file * refactor parse unicode function * fix compiler error * improve construction of sequence/choice parsers * be less clever * add make_parser helper function * expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source * lower bench iterations * add unicode support to until_parser * add unicode support to json_string_parser * clean up unicode tests * reduce unicode details to match src/unicode.cpp * simplify even further * remove unused functions * fix type * reformat char class parsing * clean up json string parser * clean up + fix diagnostics * reorder includes * compact builder functions * replace action_parser with capture_parser, rename env to semantics * rename env to semantics * clean up common_chat_parse_context * move type() to below constant * use default constructor for common_chat_peg_parser * make all operators functions for consistency * fix compilation errors in test-optional.cpp * simplify result values * rename json_string_unquoted to json_string_content * Move helper to separate class, add separate explicit and helper classes * Whitespace * Change + to append() * Reformat * Add extra helpers, tests and Minimax example * Add some extra optional debugging prints + real example of how to use them * fix bug in repetitions when 
min_count = 0 reports failures * dump rule in debug * fix token accumulation and assert parsing never fails * indent debug by depth * use LOG_* in tests so logs sync up with test logs * - Add selective testing - Refactor all messaging to use LOG_ERR - Fix lack of argument / tool name capturing - Temporary fix for double event capture * refactor rule() and introduce ref() * clean up visitor * clean up indirection in root parser w.r.t rules * store shared ptr directly in parser classes * replace aho-corasick automation with a simple trie * Reset prev for qwen3 helper example variant * refactor to use value semantics with std::variant/std::visit * simplify trie_matcher result * fix linting issues * add annotations to rules * revert test workaround * implement serializing the parser * remove redundant parsers * remove tests * gbnf generation fixes * remove LOG_* use in tests * update gbnf tests to test entire grammar * clean up gbnf generation and fix a few bugs * fix typo in test output * remove implicit conversion rules * improve test output * rename trie_matcher to trie * simplify trie to just know if a node is the end of a word * remove common_chat_ prefix and ensure a common_peg_ prefix to all types * rename chat-peg-parser -> peg-parser * promote chat-peg-parser-helper to chat-peg-parser * checkpoint * use a static_assert to ensure we handle every branch * inline trivial peg parser builders * use json strings for now * implement basic and native chat peg parser builders/extractors * resolve refs to their rules * remove packrat caching (for now) * update tests * compare parsers with incremental input * benchmark both complete and incremental parsing * add raw string generation from json schema * add support for string schemas in gbnf generation * fix qwen example to include \n * tidy up example * rename extractor to mapper * rename ast_arena to ast * place basic tests into one * use gbnf_format_literal from json-schema-to-grammar * integrate parser with 
common/chat and server * clean up schema and serialization * add json-schema raw string tests * clean up json creation and remove capture parser * trim spaces from reasoning and content * clean up redundant rules and comments * rename input_is_complete to is_partial to match rest of project * simplify json rules * remove extraneous file * remove comment * implement += and |= operators * add comments to qwen3 implementation * reorder arguments to common_chat_peg_parse * remove commented outdated tests * add explicit copy constructor * fix operators and constness * wip: update test-chat for qwen3-coder * bring json parser closer to json-schema-to-grammar rules * trim trailing space for most things * fix qwen3 coder rules w.r.t. trailing spaces * group rules * do not trim trailing space from string args * tweak spacing of qwen3 grammar * update qwen3-coder tests * qwen3-coder small fixes * place parser in common_chat_syntax to simplify invocation * use std::set to collect rules to keep order predictable for tests * initialize parser to make certain platforms happy * revert back to std::unordered_set, sort rule names at the end instead * uncomment rest of chat tests * define explicit default constructor * improve arena init and server integration * fix chat test * add json_member() * add a comprehensive native example * clean up example qwen test and add response_format example to native test * make build_peg_parser accept std::function instead of template * change peg parser parameters into const ref * push tool call on tool open for constructed parser * add parsing documentation * clean up some comments * add json schema support to qwen3-coder * add id initializer in tests * remove grammar debug line from qwen3-coder * refactor qwen3-coder to use sequence over operators * only call common_chat_peg_parse if appropriate format * simplify qwen3-coder space handling * revert qwen3-coder implementation * revert json-schema-to-grammar changes * remove unnecessary forward 
declaration * small adjustment to until_parser * rename C/C++ files to use dashes * codeowners : add aldehir to peg-parser and related files --------- Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>

common : implement parser combinators to simplify chat parsing

c822e73

github-actions bot added the testing Everything test related label Nov 10, 2025

DajanaV mentioned this pull request Nov 10, 2025

UPSTREAM PR #17136: common : implement parser combinators for chat parsing [WIP] auroralabs-loci/llama.cpp#153

Closed

8 tasks

aldehir added 4 commits November 9, 2025 22:34

add virtual destructor to parser_base

e6153bb

fix memory leak from circular references of rules

4ced999

implement gbnf grammar building

2a9a13d

remove unused private variable

2286532

aldehir added 11 commits November 10, 2025 20:17

create a base visitor and implement id assignment as a visitor

3e6662f

fix const ref for grammar builder

76cf0b5

clean up types, friend classes, and class declarations

9c7b3e8

remove builder usage from until_parser

f02e2b0

Use a counter class to help assign rule ids

66cf038

cache everything

2b3caef

add short description for each parser

adac6ba

create a type for the root parser

0be2a93

implement repetition parser

31b386f

Make optional, one_or_more, and zero_or_more subclasses of repetition

ffb7a6f

improve context constructor

085404a

aldehir added 3 commits November 11, 2025 22:22

improve until parsing and add benchmarks

6bd9a95

remove cached() pattern, cache in parser_base with specialized parsing functions for each parser

62656db

improve json parsing performance to better match legacy parsing

18557f3

fix const auto * it for windows

f6aa608

Merge branch 'master' into parser-combinators

a460437

aldehir marked this pull request as ready for review November 28, 2025 17:04

aldehir requested review from ggerganov and ngxson as code owners November 28, 2025 17:04

aldehir added 2 commits November 28, 2025 17:10

small adjustment to until_parser

6479e73

Merge branch 'master' into parser-combinators

b94d4aa

loci-dev mentioned this pull request Nov 29, 2025

UPSTREAM PR #17136: common : introduce composable PEG parser combinators for chat parsing auroralabs-loci/llama.cpp#359

Open

ggerganov reviewed Nov 29, 2025

tests/peg-parser/simple-tokenize.h (resolved)

aldehir added 3 commits November 29, 2025 16:08

rename C/C++ files to use dashes

b69e9dd

Merge branch 'master' into parser-combinators

c49a57d

codeowners : add aldehir to peg-parser and related files

32a6ed9

aldehir mentioned this pull request Dec 3, 2025

common : add parser for ministral/mistral large 3/devstral 2 #17713

Merged

ggerganov approved these changes Dec 3, 2025

ngxson approved these changes Dec 3, 2025

pwilkin approved these changes Dec 3, 2025

ggerganov merged commit 0a8026e into ggml-org:master Dec 3, 2025
73 of 79 checks passed

gabe-l-hart mentioned this pull request Dec 10, 2025

feat: llama.cpp bump (17f7f4) for SSM performance improvements ollama/ollama#13408

Merged

ochafik mentioned this pull request Dec 24, 2025

[WIP] tool-call: experimental migration of all parsers to peg-parser infra (w/ better test coverage) #18353

Draft

12 tasks
