
[generate] beam search -- fix output cropping #37080


Merged: ArthurZucker merged 6 commits into huggingface:main from the beam_search_jagged_beams branch on Mar 28, 2025

Conversation

@gante (Member) commented on Mar 28, 2025

What does this PR do?

vLLM is seeing output differences in its CI when beam search is used. The difference can be traced back to the beam search refactor (#35802).

Inspecting the outputs, we can see a few additional pad tokens on the right. This happens because the output was not cropped correctly when the selected beam is shorter than the generation length (i.e. when the highest-scoring beam is NOT from the latest decoding iteration, but rather some previously completed beam).

After #35802: output length = input length + number of decoding iterations
Before #35802 and in this PR: output length = length of the longest selected beam
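
As a rough sketch of the cropping idea (the function and variable names below are made up for illustration, not the actual transformers internals): once the best beams are selected, the returned tensor should be trimmed to the length of the longest selected beam rather than kept at input length + number of decoding iterations.

    # Illustrative sketch only; names are hypothetical, not the transformers code.
    import torch

    def crop_to_selected_beams(sequences: torch.Tensor, beam_lengths: torch.Tensor) -> torch.Tensor:
        """Trim right-padded beam search outputs to the longest *selected* beam.

        sequences:    (batch_size, input_len + num_decoding_iterations), right-padded
        beam_lengths: (batch_size,) true length of each selected beam, prompt included
        """
        max_selected_length = int(beam_lengths.max())
        return sequences[:, :max_selected_length]

    # A beam that finished early would otherwise keep trailing pad columns.
    pad = 0
    sequences = torch.tensor([[5, 6, 7, 8, pad, pad],   # selected beam finished after 4 tokens
                              [5, 6, 7, 8, 9, pad]])    # selected beam finished after 5 tokens
    beam_lengths = torch.tensor([4, 5])
    print(crop_to_selected_beams(sequences, beam_lengths).shape)  # torch.Size([2, 5])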

This PR also updates a few beam search tests to check their special tokens, a check that would have caught this bug.

@github-actions github-actions bot marked this pull request as draft March 28, 2025 15:21
github-actions (bot) commented:

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

Review thread on the updated beam search test:

    dct = tok(ARTICLE, return_tensors="pt")
    generated_ids = hf.generate(**dct, num_beams=4)
-   result = tok.batch_decode(generated_ids, skip_special_tokens=True)[0]
+   result = tok.batch_decode(generated_ids)[0]
@gante (Member, Author) commented on Mar 28, 2025:

Tests: update beam search tests to also print special tokens

e.g. this updated test fails on main because it returns extra pad tokens, due to the incorrect crop
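
As a rough illustration of what the updated tests exercise (the checkpoint name and input text below are placeholders, not the exact test fixtures): decoding without skip_special_tokens keeps the </s> and <pad> markers in the string, so an exact-match assertion catches the extra pads produced by the bad crop.

    # Rough illustration only; checkpoint and input text are placeholders,
    # not the exact fixtures used in the transformers test suite.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tok = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

    inputs = tok("Some long article text ...", return_tensors="pt")
    generated_ids = model.generate(**inputs, num_beams=4)

    with_special = tok.batch_decode(generated_ids)[0]  # keeps </s> and <pad>
    without_special = tok.batch_decode(generated_ids, skip_special_tokens=True)[0]

    # With the cropping bug, with_special ends in extra "<pad>" tokens that
    # without_special silently hides, so only the first form catches the regression.
    print(with_special)
    print(without_special)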

@gante changed the title from "[generate] beam search -- handle jagged beams" to "[generate] beam search -- handle jagged output length" Mar 28, 2025
@gante changed the title from "[generate] beam search -- handle jagged output length" to "[generate] beam search -- fix output cropping" Mar 28, 2025
@gante gante marked this pull request as ready for review March 28, 2025 16:23
@gante gante requested a review from zucchini-nlp March 28, 2025 16:23
@gante added the "for patch" label (Tag issues / labels that should be included in the next patch) Mar 28, 2025
@ArthurZucker (Collaborator) left a comment:

LGTM thanks for digging and fixing this quickly!

@ArthurZucker ArthurZucker merged commit 9fd9476 into huggingface:main Mar 28, 2025
20 of 21 checks passed
ArthurZucker pushed a commit that referenced this pull request Mar 28, 2025
* handle jagged beams

* better comment

* bart -- beam search tests print special tokens

* more bart test updates

* more tests!

* better comment
@gante gante deleted the beam_search_jagged_beams branch March 28, 2025 18:44
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025