[generate] beam search -- fix output cropping #37080
Conversation
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the
  dct = tok(ARTICLE, return_tensors="pt")
  generated_ids = hf.generate(**dct, num_beams=4)
- result = tok.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ result = tok.batch_decode(generated_ids)[0]
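The diff above stops skipping special tokens when decoding, so that stray pad tokens become visible to the test. A minimal stand-alone sketch (a toy vocabulary and decode helper, not the real transformers API) of why this matters:

```python
# Toy illustration: decoding with skip_special_tokens=True hides trailing
# pad tokens, so a test using it cannot catch an incorrectly cropped output.
PAD, EOS = 0, 2
vocab = {0: "<pad>", 1: "hello", 2: "</s>", 3: "world"}

def decode(ids, skip_special_tokens=False):
    """Minimal stand-in for tokenizer.decode."""
    special = {PAD, EOS}
    tokens = [vocab[i] for i in ids if not (skip_special_tokens and i in special)]
    return " ".join(tokens)

# An incorrectly cropped output carries extra right padding:
bad_output = [1, 3, 2, 0, 0]   # "hello world </s> <pad> <pad>"
good_output = [1, 3, 2]        # "hello world </s>"

# With skip_special_tokens=True both decode identically, so the bug is invisible.
assert decode(bad_output, skip_special_tokens=True) == decode(good_output, skip_special_tokens=True)

# Without skipping, the extra pad tokens show up and the comparison fails.
assert decode(bad_output) != decode(good_output)
```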
Tests: update beam search tests to also print special tokens
e.g. this updated test fails on main because it returns extra pad tokens, due to the incorrect crop
LGTM, thanks for digging in and fixing this quickly!
* handle jagged beams
* better comment
* bart -- beam search tests print special tokens
* more bart test updates
* more tests!
* better comment
What does this PR do?
vLLM is seeing some output differences in their CI when beam search is used. The difference can be traced to the beam search refactor (#35802).
Inspecting the outputs, we can see that there are a few additional pad tokens on the right. This is because the output was not being cropped correctly when the selected beam is shorter than the generation length (i.e. when the highest-scoring beam is NOT from the latest decoding iteration, but rather some previously completed beam).
After #35802: output length = input length + number of decoding iterations
Before #35802 and in this PR: output length = length of the longest selected beam
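The fix described above can be sketched in plain Python (this is an illustrative stand-in, not the actual transformers implementation): selected beams may be shorter than the number of decoding iterations when a previously completed beam wins, so the output should be padded only up to the longest *selected* beam rather than to the iteration count.

```python
PAD = 0

def crop_to_longest_beam(selected_beams, pad_id=PAD):
    """Pad each selected beam only up to the longest selected beam,
    not up to the total number of decoding iterations."""
    max_len = max(len(b) for b in selected_beams)
    return [b + [pad_id] * (max_len - len(b)) for b in selected_beams]

# Suppose 6 decoding iterations ran, but the winning beams finished earlier:
beams = [[5, 6, 7, 2], [5, 6, 2]]          # lengths 4 and 3
out = crop_to_longest_beam(beams)
assert all(len(row) == 4 for row in out)   # longest selected beam, not 6
assert out[1] == [5, 6, 2, 0]              # shorter beam padded to length 4
```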
This PR also changes a few beam search tests to check their special tokens, which would have caught this bug.