Description
With:
On:
- CPU
- NVIDIA A10
The test "static cache works with torch.export()" fails when run as:

```shell
RUN_SLOW=1 python3 -m pytest --pspec -vv -k CacheTest tests/utils/test_cache_utils.py
```
```
RuntimeError: cannot mutate tensors with frozen storage

While executing %index_copy_ : [num_users=0] = call_method[target=index_copy_](args = (%k_out, 2, %l_input_pos_, %k_embed), kwargs = {})
Original traceback:
  File "/home/dvrogozh/git/huggingface/transformers/tests/utils/test_cache_utils.py", line 210, in forward
    outs = self.model(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 1076, in forward
    outputs = self.model(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 889, in forward
    layer_outputs = decoder_layer(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 611, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 521, in forward
    key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/cache_utils.py", line 1101, in update
    k_out.index_copy_(2, cache_position, key_states)
```
I observe that adding a `.clone()` to the following two tensors does fix the issue. This solution was suggested in pytorch/pytorch#127571 (comment), but I am not sure whether it is the correct fix. See draft PR #33178 with this change.
transformers/src/transformers/cache_utils.py, lines 1090 to 1091 at 5c1027b:

```python
k_out = self.key_cache[layer_idx]
v_out = self.value_cache[layer_idx]
```
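As a minimal sketch of the proposed workaround, here is a hypothetical, simplified stand-in for `StaticCache.update` (named `TinyStaticCache` for illustration; this is not the actual transformers code) showing where the `.clone()` calls would go — the clone gives `index_copy_` a fresh buffer to mutate instead of storage frozen by `torch.export()`:

```python
import torch

# Hypothetical minimal stand-in for StaticCache.update (illustration only,
# not the actual transformers implementation).
class TinyStaticCache:
    def __init__(self, num_layers, shape):
        # Pre-allocated static buffers: (batch, num_heads, max_seq_len, head_dim)
        self.key_cache = [torch.zeros(shape) for _ in range(num_layers)]
        self.value_cache = [torch.zeros(shape) for _ in range(num_layers)]

    def update(self, key_states, value_states, layer_idx, cache_position):
        # The proposed change: .clone() so the in-place index_copy_ writes
        # into a fresh tensor rather than the export-frozen storage.
        k_out = self.key_cache[layer_idx].clone()
        v_out = self.value_cache[layer_idx].clone()
        k_out.index_copy_(2, cache_position, key_states)
        v_out.index_copy_(2, cache_position, value_states)
        return k_out, v_out

cache = TinyStaticCache(num_layers=1, shape=(1, 1, 4, 2))
new_k = torch.ones(1, 1, 2, 2)
new_v = torch.ones(1, 1, 2, 2)
k_out, v_out = cache.update(new_k, new_v, 0, torch.tensor([0, 1]))
```

Note that with the clone in place, the pre-allocated buffers themselves are no longer mutated by `update`, which may be one reason to doubt that this is the correct fix.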