
Conversation

@mattt
Collaborator

@mattt mattt commented Sep 30, 2025

Resolves #34

The {% generation %} … {% endgeneration %} block is a nonstandard Jinja extension provided by Hugging Face Transformers. Its role is to mark assistant-generated segments in a chat template so that methods such as tokenizer.apply_chat_template(..., return_assistant_tokens_mask=True) can produce a mask identifying those tokens.
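For illustration, here is a minimal sketch of how a chat template might use these tags. This fragment is hypothetical, not taken from any real model's template; the point is only that the tags wrap the assistant-produced content:

```jinja
{%- for message in messages -%}
{%- if message.role == 'assistant' -%}
{% generation %}{{ message.content }}{% endgeneration %}
{%- else -%}
{{ message.content }}
{%- endif -%}
{%- endfor -%}
```

When a tokenizer renders a template like this with return_assistant_tokens_mask=True, tokens produced inside the generation block are flagged in the returned mask, and everything else is masked out.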

The 2.0 release of Jinja throws an error when parsing unknown tokens. There are a few ways we could support generation tags, but I think the simplest is what this PR does: add new AST tokens for generation and endgeneration. (Though I'd welcome alternative perspectives, as I don't have as much context on how these are used in Transformers, or how Swift API consumers expect this to work.)

@johnmai-dev
Collaborator

huggingface/transformers#30650

It seems that {% generation %} is specifically intended for fine-tuning. For inference, it doesn't matter whether you use this tag or not.
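The inference-neutrality point above can be sketched in a few lines: since the tags only mark spans and contribute no output, stripping them yields a template that renders identically. This helper is a hypothetical illustration (not part of the PR or of Transformers), using a regex that assumes the tags appear in the plain or whitespace-trimming forms:

```python
import re

# Matches {% generation %}, {% endgeneration %}, and their
# whitespace-control variants like {%- generation -%}.
GENERATION_TAG = re.compile(r"\{%-?\s*(?:end)?generation\s*-?%\}")

def strip_generation_tags(template: str) -> str:
    """Remove the nonstandard generation tags, leaving a template
    that standard Jinja can parse and that renders the same output."""
    return GENERATION_TAG.sub("", template)

template = (
    "{% for m in messages %}"
    "{% if m.role == 'assistant' %}"
    "{% generation %}{{ m.content }}{% endgeneration %}"
    "{% else %}{{ m.content }}{% endif %}"
    "{% endfor %}"
)
stripped = strip_generation_tags(template)
```

A test along these lines could also express the equivalence directly: render a template both with and without the tags and assert the outputs match.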

Member

@pcuenca pcuenca left a comment


generation is indeed used to extract assistant responses during fine-tuning. This solution looks fine to me (I assume it's the same thing the previous Jinja version did, is that right?)

We could potentially make it explicit in a test that rendering with or without these enclosure tags yields the same results, but I think the current tests are good as they are.

@mattt
Collaborator Author

mattt commented Oct 1, 2025

We could potentially make it explicit in a test that rendering with or without these enclosure tags yields the same results, but I think the current tests are good as they are.

Good suggestion. I just added that with 32cdf99. And I just fixed a test failure caused by hard-coding today's date in the test expectation with f244c82 🙃

@johnmai-dev
Collaborator

Adding new AST tokens is definitely the simplest approach for now. In the future, though, I think implementing proper Jinja Extensions might be a better solution: https://jinja.palletsprojects.com/en/stable/extensions/

@mattt mattt force-pushed the mattt/generation branch from f244c82 to a01a236 on October 1, 2025 at 12:04
@mattt
Collaborator Author

mattt commented Oct 1, 2025

In the future, though, I think implementing proper Jinja Extensions might be a better solution. https://jinja.palletsprojects.com/en/stable/extensions/

My 2 cents: I actively dislike Jinja's extension system. To me, it's a classic example of the inner-platform effect. While it can be used to extend functionality, I'd prefer a solution that's more composable. And looking at real-world usage over time, it seems to be a niche feature that has primarily served as a way to incubate new Jinja features (for example, do and with started as extensions, but were made built-ins in the 2.9 release).

The pattern I'd advocate for is for projects that need more functionality to fork / vendor this package and make changes directly to the implementation.

If there's a strong demand for an equivalent extension system, we can definitely look into that. But for now, I'm happy to take a kitchen sink approach, like we do here, and support everything that folks need as built-ins.

@mattt mattt merged commit c1ef596 into main Oct 1, 2025
3 checks passed
@pcuenca pcuenca deleted the mattt/generation branch October 1, 2025 13:13

Development

Successfully merging this pull request may close these issues.

Possible regression since 2.0.0 for {% generation %} in SmolLM3 template

4 participants