System Info
- `transformers` version: 4.56.0.dev0
- Platform: Linux-5.15.0-143-generic-x86_64-with-glibc2.35
- Python version: 3.11.10
- Huggingface_hub version: 0.34.3
- Safetensors version: 0.4.5
- Accelerate version: 1.0.1
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.7.1+cu126 (CUDA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: no
- Using GPU in script?: yes
- GPU type: NVIDIA H100 80GB HBM3
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("pfnet/plamo-2-1b", trust_remote_code=True, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-2-1b", trust_remote_code=True)

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model.generate(**encoded_input)
```
which produces:
```
Traceback (most recent call last):
  File "/home/user/vllm/tests/test_hf.py", line 8, in <module>
    output = model.generate(**encoded_input)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/transformers/src/transformers/generation/utils.py", line 2522, in generate
    result = self._sample(
             ^^^^^^^^^^^^^
  File "/home/user/transformers/src/transformers/generation/utils.py", line 3503, in _sample
    outputs = self(**model_inputs, return_dict=True)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/storage149/mnt/md0/user/modules/transformers_modules/pfnet/plamo-2-1b/a99ff56aee4f73b4e36e376c83130050d05dc178/modeling_plamo.py", line 1576, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/storage149/mnt/md0/user/modules/transformers_modules/pfnet/plamo-2-1b/a99ff56aee4f73b4e36e376c83130050d05dc178/modeling_plamo.py", line 1454, in forward
    out = self.layers(
          ^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/storage149/mnt/md0/user/modules/transformers_modules/pfnet/plamo-2-1b/a99ff56aee4f73b4e36e376c83130050d05dc178/modeling_plamo.py", line 1281, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/storage149/mnt/md0/user/modules/transformers_modules/pfnet/plamo-2-1b/a99ff56aee4f73b4e36e376c83130050d05dc178/modeling_plamo.py", line 1206, in forward
    hidden_states_sa, present_key_value = self.mixer(
                                          ^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniforge3/envs/dev-env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/storage149/mnt/md0/user/modules/transformers_modules/pfnet/plamo-2-1b/a99ff56aee4f73b4e36e376c83130050d05dc178/modeling_plamo.py", line 906, in forward
    elif past_states[self.layer_idx] is None:
         ~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/home/user/transformers/src/transformers/cache_utils.py", line 939, in __getitem__
    raise KeyError(
KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'
```
Expected behavior
`model.generate` returns generated token ids instead of raising a `KeyError` :)
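Reading the last two frames: the remote `modeling_plamo.py` indexes the cache with `past_states[self.layer_idx]` before any layer has been written, expecting `None` for an unseen layer, while `Cache.__getitem__` in the installed `transformers` raises `KeyError` when the index exceeds the number of stored layers. A minimal self-contained sketch of that mismatch, where `TinyCache` is a hypothetical stand-in for the real `DynamicCache` (assumption, not the actual class):

```python
class TinyCache:
    """Hypothetical stand-in mimicking the indexing behaviour seen in the traceback."""
    def __init__(self):
        self.layers = []  # no KV entries stored yet

    def __len__(self):
        return len(self.layers)

    def __getitem__(self, idx):
        if idx >= len(self.layers):
            # Mirrors the error raised by cache_utils.py in the traceback
            raise KeyError(
                f"Cache only has {len(self.layers)} layers, "
                f"attempted to access layer with index {idx}"
            )
        return self.layers[idx]


cache = TinyCache()

# What modeling_plamo.py line 906 effectively does on the first forward pass:
try:
    cache[0]
    crashed = False
except KeyError:
    crashed = True

# A guarded access pattern (a sketch of one possible fix in the remote code,
# not the upstream solution): treat a missing layer as "no past state".
layer_idx = 0
past = cache[layer_idx] if len(cache) > layer_idx else None

print(crashed, past)  # True None
```

This suggests the remote code was written against an older `Cache` contract and needs updating (or `transformers` needs to tolerate the old indexing pattern); the sketch above only illustrates the mismatch.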