Fix false-positive right-padding warning for decoder-only models in pipeline #44021
Conversation
Two changes to fix the spurious 'right-padding was detected' warning that fires for Qwen3 and other models during batched pipeline inference:

1. `TextGenerationPipeline`: set `padding_side='left'` automatically for decoder-only models. The default tokenizer `padding_side` is `'right'`, which causes incorrect padding for batched generation; the pipeline now overrides this to `'left'` on initialization.
2. `GenerationMixin.generate`: improve right-padding detection by using the `attention_mask` when available, instead of only checking whether the last token equals `pad_token_id`. The old heuristic produced false positives when `pad_token_id == eos_token_id` or `bos_token_id` (as is the case for Qwen3, where both are token 151643).

Fixes huggingface#43906
Related to huggingface#38071
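The padding-side issue behind the first change can be illustrated with a toy sketch (plain Python lists stand in for tensors; the prompt token ids are made up, while 151643 is Qwen3's `<|endoftext|>`):

```python
# Decoder-only models predict the next token from the final position of the
# input, so batched prompts must be padded on the left to keep each real
# last token in that final slot.
PAD = 151643  # Qwen3: pad_token_id == bos_token_id == <|endoftext|>
prompt = [10, 11]  # a short prompt batched with longer ones, padded to length 4

right_padded = prompt + [PAD, PAD]  # tokenizer default padding_side='right'
left_padded = [PAD, PAD] + prompt   # what batched generation needs

print(right_padded[-1])  # 151643 -> the model would "continue" from a pad token
print(left_padded[-1])   # 11     -> the model continues from the real last token
```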
zucchini-nlp
left a comment
Thanks, can you check the failing Whisper pipeline tests?
…isperForCausalLM) Only set `tokenizer.padding_side='left'` when no `feature_extractor` exists, to avoid a `ValueError` from `pad_collate_fn` when they disagree.
Thanks for flagging! The Whisper pipeline tests were failing because the feature extractor and the tokenizer disagreed on the padding side, which made `pad_collate_fn` raise a `ValueError`. Fixed in 3c63b39: now we only override `tokenizer.padding_side` when no `feature_extractor` exists.
Thanks, I pushed a fix so we can merge soon. We don't want to check …
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Fixes #43906 (related to #38071)
Problem
When using `pipeline('text-generation')` with batched inference on Qwen3 (and other models where `pad_token_id == bos_token_id`), a spurious 'right-padding was detected' warning is emitted. This happens for two reasons:

1. `TextGenerationPipeline` doesn't set `padding_side='left'` for decoder-only models, so the default `'right'` padding is used during batch collation
2. `generate()` only checks whether the last token equals `pad_token_id`, which can produce false positives when `pad_token_id` equals other special tokens

Fix
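A minimal sketch of the pipeline-side fix, as a hypothetical simplification rather than the actual transformers source (`DummyTokenizer` and `TextGenerationPipelineSketch` are invented names for illustration):

```python
class DummyTokenizer:
    """Stand-in for a Hugging Face tokenizer (hypothetical)."""
    padding_side = "right"  # library default

class TextGenerationPipelineSketch:
    """Simplified model of the change: force left padding for causal LMs,
    but only when no feature_extractor is attached (the Whisper case),
    since the collation step raises when the two disagree on padding side."""

    def __init__(self, tokenizer, feature_extractor=None):
        self.tokenizer = tokenizer
        self.feature_extractor = feature_extractor
        if feature_extractor is None:
            tokenizer.padding_side = "left"

tok = DummyTokenizer()
TextGenerationPipelineSketch(tok)
print(tok.padding_side)  # left

whisper_tok = DummyTokenizer()
TextGenerationPipelineSketch(whisper_tok, feature_extractor=object())
print(whisper_tok.padding_side)  # right (left untouched)
```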
1. `TextGenerationPipeline.__init__`: automatically set `tokenizer.padding_side = 'left'` for decoder-only models (since `TextGenerationPipeline` is exclusively for causal LMs)
2. `GenerationMixin.generate`: when an `attention_mask` is available, use it to detect right-padding (check whether the last position has mask = 0) instead of relying solely on the token-id heuristic. Falls back to the original check when no attention mask is provided.

Root Cause Analysis
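To make the false positive concrete, here is a toy version of the two heuristics (plain Python lists stand in for tensors, and the function names are invented for this sketch):

```python
PAD = 151643  # Qwen3: pad_token_id == bos_token_id == <|endoftext|>

# A correctly left-padded batch whose real content happens to end in id 151643.
batch = [
    [PAD, PAD, 7, PAD],  # last token is a *real* occurrence of the id
    [1, 2, 3, 4],
]
attention_mask = [
    [0, 0, 1, 1],  # 1 marks real tokens: the last position is genuine
    [1, 1, 1, 1],
]

def old_right_padding_check(batch):
    # Old heuristic: flag if any sequence ends with the pad token id.
    return any(seq[-1] == PAD for seq in batch)

def new_right_padding_check(attention_mask):
    # New heuristic: flag only if the last position is masked out.
    return any(mask[-1] == 0 for mask in attention_mask)

print(old_right_padding_check(batch))           # True  -> spurious warning
print(new_right_padding_check(attention_mask))  # False -> no warning
```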
For Qwen3, `pad_token_id = bos_token_id = 151643` (`<|endoftext|>`). The tokenizer's default `padding_side='right'` means shorter sequences in a batch get right-padded. The existing check `inputs_tensor[:, -1] == pad_token_tensor` then correctly detects this, but the real issue is that the pipeline should have been left-padding all along.

Even after fixing the pipeline, the attention-mask-based detection is more robust for cases where users call `model.generate()` directly with properly left-padded inputs whose content happens to end with the pad token.

Who can review?
@gante @ArthurZucker