Description
System Info
- transformers version: 4.48.0
- Platform: Linux-5.4.0-174-generic-x86_64-with-glibc2.31
- Python version: 3.9.20
- Huggingface_hub version: 0.26.2
- Safetensors version: 0.4.5
- Accelerate version: 1.0.1
- Accelerate config: not found
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
- Using GPU in script?:
- GPU type: NVIDIA A10
Who can help?
@zucchini-nlp @aymeric-roucher
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
>>> import numpy as np
>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("rhymes-ai/Aria")
>>> processor(text="<|img|>", images=[np.zeros((3, 224, 224))])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/cyrus/miniconda3/envs/vllm/lib/python3.9/site-packages/transformers/models/aria/processing_aria.py", line 129, in __call__
sample = sample.replace(self.tokenizer.image_token, self.tokenizer.image_token * num_crops)
File "/home/cyrus/miniconda3/envs/vllm/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1108, in __getattr__
raise AttributeError(f"{self.__class__.__name__} has no attribute {key}")
AttributeError: LlamaTokenizerFast has no attribute image_token
The processor for the Aria model doesn't work because the tokenizer has no image_token attribute, so the lookup in processing_aria.py raises.
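As a possible stop-gap on 4.48.0 (untested, and assuming nothing else in the call path needs the attribute), setting the missing attribute on the tokenizer instance before calling the processor avoids the AttributeError:

>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("rhymes-ai/Aria")
>>> # Hypothetical workaround: expose the image token on the tokenizer instance so the
>>> # `self.tokenizer.image_token` lookup in processing_aria.py no longer raises.
>>> processor.tokenizer.image_token = "<|img|>"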
Expected behavior
As with other models, the image_token attribute should be added directly to the processor and used there, instead of referring to tokenizer.image_token.
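For illustration, a minimal, self-contained sketch of that direction (the class and method names here are hypothetical and simplified, not the actual processing_aria.py code):

# Sketch: keep the image token on the processor instead of looking it up on the tokenizer.
class AriaProcessorSketch:
    def __init__(self, tokenizer, image_token="<|img|>"):
        self.tokenizer = tokenizer
        self.image_token = image_token  # processor-level attribute, no tokenizer lookup needed

    def expand_image_tokens(self, text, num_crops):
        # Replace each image placeholder with num_crops copies, mirroring the line
        # that currently fails in processing_aria.py.
        return text.replace(self.image_token, self.image_token * num_crops)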