
Conversation

@albertvillanova
Member

@albertvillanova albertvillanova commented Oct 21, 2025

Fix logic error in prepare_inputs_for_generation cache slicing condition:

TypeError: 'NoneType' object is not subscriptable

Background

I think PR #41505 introduced a logic error: the condition for calling _cache_dependant_input_preparation uses is None instead of is not None, causing crashes when prepare_inputs_for_generation is called with past_key_values=None and use_cache=False.

Bug

PR #41505 introduced this condition:

if past_key_values is None or use_cache:
    inputs_embeds, input_ids = self._cache_dependant_input_preparation(...)

The condition past_key_values is None or use_cache means:

  • Do cache-dependent slicing if we DON'T have a cache OR if caching is enabled

This triggers the function even when:

  • past_key_values=None (no cache)
  • use_cache=False (caching disabled)
  • cache_position=None (no position information)

This combination is invalid for cache-dependent preparation and causes a crash when accessing cache_position[-1] (line 456).

Note that during normal generation, it works fine because use_cache=True, making the buggy past_key_values is None part irrelevant.
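
To make the failure mode concrete, here is a minimal, self-contained sketch of the buggy path. This is not the actual transformers implementation: the helper cache_dependant_slice and the toy slicing are made-up stand-ins, and only the boolean condition mirrors the code from PR #41505.

```python
# Hypothetical, stripped-down sketch; the real _cache_dependant_input_preparation
# does much more than this.

def cache_dependant_slice(input_ids, cache_position):
    # Indexing cache_position[-1] is the crash site when cache_position is None.
    return input_ids[cache_position[-1]:]

def prepare_inputs_old(input_ids, past_key_values=None, use_cache=False, cache_position=None):
    # Buggy condition from PR #41505: fires even with no cache and use_cache=False.
    if past_key_values is None or use_cache:
        input_ids = cache_dependant_slice(input_ids, cache_position)
    return input_ids

# past_key_values=None, use_cache=False, cache_position=None -> the branch is still taken:
prepare_inputs_old([1, 2, 3], past_key_values=None, use_cache=False, cache_position=None)
# TypeError: 'NoneType' object is not subscriptable
```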

Fix

This PR changes the condition to:

- if past_key_values is None or use_cache:
+ if past_key_values is not None or use_cache:

The condition past_key_values is not None or use_cache means:

  • Do cache-dependent slicing if we DO have a cache OR if caching is enabled

This is semantically correct and matches the intent described in this comment on PR #41505:

stateful models like recurrent_gemma assume that slicing happens, but don't have a Cache

The use_cache part handles stateful models, while past_key_values is not None handles normal cached models.
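
For illustration only (not code from this PR), the snippet below enumerates the relevant combinations and shows when each condition would trigger the cache-dependent slicing:

```python
# Illustrative comparison of the old and new conditions; "cache" stands in for
# any non-None past_key_values object.
cases = [
    (None, False),     # no cache, caching disabled: must NOT slice (the crashing case)
    (None, True),      # stateful models (e.g. recurrent_gemma) with no Cache: slice
    ("cache", True),   # normal cached generation: slice
    ("cache", False),  # cache object present, use_cache=False
]

for past_key_values, use_cache in cases:
    old = past_key_values is None or use_cache
    new = past_key_values is not None or use_cache
    print(f"past_key_values={past_key_values!r:8} use_cache={use_cache!s:5} "
          f"old={old!s:5} new={new!s:5}")
```

Under the old condition, the first case wrongly triggers slicing even though no cache_position is available; under the new condition it does not, while the stateful-model and cached-generation cases are still covered.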

Testing

This PR fixes the downstream failing test in TRL:

tests/test_modeling_geometric_mixture_wrapper.py::TestGeometricMixtureWrapper::test_prepare_inputs_for_generation

See the associated issue:

Related

CC:

@zucchini-nlp
Member

I had the same question, so I think it's better to ping @gante, who has a better understanding of the situation. IIRC it had something to do with the Reformer model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante
Copy link
Contributor

gante commented Nov 7, 2025

Hi @albertvillanova @zucchini-nlp 👋

Have a look at this slack thread, I think it contains the full discussion related to this bug (assuming it is the same issue, I think it is) and how it interacts with TRL: https://huggingface.slack.com/archives/C01N44FJDHT/p1760725474636279

TL;DR: I think there is a fix we can implement in TRL right now (see thread) to work around it; the transformers-side fix is trickier.

Member

@zucchini-nlp zucchini-nlp left a comment


Actually, I just looked at the special RecurrentGemma model, which has no past_key_values, and it should not hit an edge case if we check the value of use_cache. Other special models like Mamba have their own input preparation method.

So I don't think we needed to check past_key_values is None, unless the edge case mentioned by Joao isn't tested. Let's merge it.

@zucchini-nlp
Member

I will just rebase on main to check that there are no new test failures :)

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) November 11, 2025 10:08
@zucchini-nlp zucchini-nlp merged commit f30c225 into huggingface:main Nov 11, 2025
23 checks passed


Development

Successfully merging this pull request may close these issues.

CI fails with dev dependencies: TypeError: 'NoneType' object is not subscriptable
