Bug in Paligemma usage docs for v4.50.3

The following doc version has a minor bug in its usage:
https://github.com/huggingface/transformers/blob/v4.50.3/docs/source/en/model_doc/paligemma.md#single-image-inference
At the time of writing, this is the default version of the docs people come across via
https://huggingface.co/docs/transformers/en/model_doc/paligemma

The example has a bug: it slices the decoded string (characters) using the tokenized input length (a token count):
```py
print(processor.decode(output[0], skip_special_tokens=True)[inputs.input_ids.shape[1]: ])
```
but it should actually slice by the length of the text prompt, in characters:
```py
print(processor.decode(output[0], skip_special_tokens=True)[len(prompt): ])
```
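To make the mismatch concrete, here is a minimal sketch using a toy whitespace tokenizer. `toy_encode` and `toy_decode` are hypothetical helpers, not the PaliGemma processor; they only illustrate why a token count is the wrong index into a decoded string. The last variant, slicing the token ids before decoding, is an alternative I'm assuming would also work with the real processor, not something the docs or the Space use.

```python
# Toy stand-in for a tokenizer: one token per whitespace-separated word.
# This is NOT the PaliGemma processor; it only illustrates the indexing mismatch.
def toy_encode(text):
    return text.split()


def toy_decode(tokens):
    return " ".join(tokens)


prompt = "caption the image"
prompt_tokens = toy_encode(prompt)  # 3 tokens, but 17 characters

# Pretend the model echoes the prompt and then appends its answer.
output_tokens = prompt_tokens + toy_encode("a cat on a sofa")
decoded = toy_decode(output_tokens)

# Buggy pattern: a token count used as a character offset into the string.
wrong = decoded[len(prompt_tokens):]

# Pattern from the Space: a character count used as a character offset.
by_chars = decoded[len(prompt):]

# Alternative: drop the prompt tokens first, then decode only the new tokens.
by_tokens = toy_decode(output_tokens[len(prompt_tokens):])

print(repr(wrong))      # 'tion the image a cat on a sofa'
print(repr(by_chars))   # ' a cat on a sofa'
print(repr(by_tokens))  # 'a cat on a sofa'
```

In the toy case the token count is smaller than the character count, so too little is cut. With PaliGemma I'd expect the opposite, since the input ids appear to include image placeholder tokens as well as the text, so the slice can cut into or past the generated answer.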

Found this out by cross-referencing the HF spaces example:
https://huggingface.co/spaces/big-vision/paligemma-hf/blob/d914d44/app.py#L38

Not sure if it's worth patching the existing docs, cutting a new minor release, or just closing this out as a note for others.

Bug in Paligemma usage docs for v4.50.3 #37181