Unable to pass images to Aya Vision processor #38350

Closed
@DarkLight1337

System Info

transformers 4.52

Who can help?

@zucchini-nlp @hmellor

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

In the new version of transformers (4.52), AyaVisionProcessor calls _check_special_mm_tokens. That check assumes the image placeholder <image> can be represented as a single token. This does not hold for Aya Vision 8B, whose tokenizer encodes <image> as three tokens: [35, 6504, 37]. As a result, the validation fails whenever an image is passed.

I discovered this issue while updating the transformers version in vLLM: vllm-project/vllm#18678

Error logs:
https://buildkite.com/vllm/fastcheck/builds/25098/steps?sid=019706c6-1a33-4922-9358-d72dfc525fe2
https://buildkite.com/vllm/fastcheck/builds/25098/steps?sid=019706c6-1a35-46ac-aa2b-8d6d811109fd

Expected behavior

_check_special_mm_tokens should handle the case where the modality text takes up multiple tokens.
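
To illustrate the expected behavior, here is a minimal sketch of a placeholder check that tolerates multi-token placeholders. It is not the transformers implementation: count_subsequence and check_placeholder_counts are hypothetical helper names, and the surrounding token ids (5 and 9) are made up. Only the encoding [35, 6504, 37] for <image> comes from the report above. The idea is to count contiguous runs of the placeholder's token ids rather than occurrences of a single id.

```python
from typing import Sequence


def count_subsequence(ids: Sequence[int], pattern: Sequence[int]) -> int:
    """Count non-overlapping contiguous occurrences of `pattern` in `ids`."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    pattern = list(pattern)
    count, i, n = 0, 0, len(pattern)
    while i + n <= len(ids):
        if list(ids[i:i + n]) == pattern:
            count += 1
            i += n  # skip past the match so occurrences do not overlap
        else:
            i += 1
    return count


def check_placeholder_counts(
    texts: Sequence[str],
    batch_input_ids: Sequence[Sequence[int]],
    placeholder: str,
    placeholder_ids: Sequence[int],
) -> None:
    """Raise if the number of placeholders in the text and in the ids differ.

    Hypothetical helper: `placeholder_ids` is whatever the tokenizer produces
    for `placeholder`, which may be one id or several.
    """
    for text, ids in zip(texts, batch_input_ids):
        expected = text.count(placeholder)
        found = count_subsequence(ids, placeholder_ids)
        if expected != found:
            raise ValueError(
                f"Found {expected} {placeholder!r} placeholders in the text "
                f"but {found} in the token ids."
            )


# Token ids for "<image>" as reported for Aya Vision 8B in this issue.
IMAGE_IDS = [35, 6504, 37]

# Made-up surrounding ids (5 and 9) around a single image placeholder.
check_placeholder_counts(
    texts=["describe <image> please"],
    batch_input_ids=[[5, 35, 6504, 37, 9]],
    placeholder="<image>",
    placeholder_ids=IMAGE_IDS,
)
```

With a single-id placeholder the same function reduces to a plain count, so one code path could cover both cases. The sketch assumes the placeholder tokenizes to the same ids regardless of neighboring text, which may not hold for every tokenizer.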
