Bug in Paligemma usage docs for v4.50.3 #37181

@EricCousineau-TRI

Description

The following doc version has a minor bug in its usage:
https://github.com/huggingface/transformers/blob/v4.50.3/docs/source/en/model_doc/paligemma.md#single-image-inference
At the time of writing, this is the default version of the docs people come across via
https://huggingface.co/docs/transformers/en/model_doc/paligemma

This has a bug where it slices the decoded string using the tokenized input length (a token count, not a character count):

print(processor.decode(output[0], skip_special_tokens=True)[inputs.input_ids.shape[1]: ])

but it should actually be the text input length:

print(processor.decode(output[0], skip_special_tokens=True)[len(prompt): ])

Found this out by cross-referencing the HF spaces example:
https://huggingface.co/spaces/big-vision/paligemma-hf/blob/d914d44/app.py#L38
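To make the mismatch concrete, here is a minimal sketch using a toy whitespace tokenizer. The `encode`/`decode` helpers are hypothetical stand-ins, not the PaliGemma processor, and the prompt and generated text are made up for illustration. It only shows that indexing a decoded string by a token count cuts at the wrong place, while indexing by `len(prompt)` lines up with the prompt text. The third variant (slicing the token sequence before decoding) is an alternative that is not from the docs or the Spaces example above.

```python
# Toy stand-in for a tokenizer: one token per whitespace-separated word.
# This is NOT the real PaliGemma processor; it only illustrates the indexing mismatch.
def encode(text: str) -> list[str]:
    return text.split()


def decode(tokens: list[str]) -> str:
    return " ".join(tokens)


prompt = "caption en"                      # 10 characters, 2 toy tokens
prompt_tokens = encode(prompt)
generated_tokens = encode("a cat on a sofa")

# Mimics generate()-style output: prompt tokens followed by newly generated tokens.
output = prompt_tokens + generated_tokens
decoded = decode(output)

# Buggy pattern: a token count used as a character offset into the decoded string.
by_token_count = decoded[len(prompt_tokens):]

# Pattern proposed in this issue: a character offset taken from the prompt text.
by_prompt_chars = decoded[len(prompt):]

# Alternative (assumption, not from the docs): drop prompt tokens, then decode.
by_token_slice = decode(output[len(prompt_tokens):])

print(repr(by_token_count))   # 'ption en a cat on a sofa'
print(repr(by_prompt_chars))  # ' a cat on a sofa'
print(repr(by_token_slice))   # 'a cat on a sofa'
```

In this toy case the token count is smaller than the prompt's character length, so the buggy slice leaves part of the prompt in place. With a real processor the token count can differ from the character length in either direction, so the exact failure will vary.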

Not sure if it's worth patching the existing docs, cutting a new minor release, or just closing this out as a note for others.
