Conversation

@gante gante commented Nov 28, 2023

What does this PR do?

A common trend is starting to pop up: people are experimenting with new strategies to generate candidate sequences, which are then verified with an assisted-generation-like loop. A key example is the new technique in #27722, which is identical to assisted_decoding except for the candidate generation part. That technique achieves nice speedups in some settings and doesn't need an assistant model -- a model-free speedup!

To facilitate experimentation and the addition of new candidate generation techniques, this PR abstracts the candidate generation part of assisted_decoding into a new class with a stable API. This was inspired by classes like LogitsProcessor or StoppingCriteria -- components of generate that can easily be replaced. All these changes are backwards compatible! 🤗
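As a rough sketch of the shape this abstraction could take (the method names and signatures below are illustrative assumptions, not necessarily the exact API in candidate.py):

```python
import torch


class CandidateGenerator:
    """Abstract base class: proposes candidate continuations for assisted decoding."""

    def get_candidates(self, input_ids: torch.LongTensor) -> torch.LongTensor:
        # Return `input_ids` extended with candidate tokens, to be verified by
        # the main model in a single forward pass.
        raise NotImplementedError

    def update_candidate_strategy(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, num_matches: int):
        # Optional hook: adapt internal state (e.g. how many tokens to draft next
        # time) based on how many candidates the main model accepted.
        raise NotImplementedError
```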

Suggested review order:

  1. utils.py, to see the shape of assisted_decoding under the abstracted API
  2. candidate.py, to see the structure of the new base class (and the specific case of the original assisted generation)

The following tests are passing:

  1. RUN_SLOW=1 py.test tests/models/whisper/ -k speculative
  2. py.test tests/ -k test_assisted (which catches mixin and integration tests associated with assisted generation)

Happy to add more tests if needed :)

gante (PR author) commented on the diff:

Note: these functions were moved here to avoid circular imports

A collaborator commented on the diff:

Might be good to use the new cache format, no?

gante (PR author) replied:

Soon! (We still need to maintain backward compatibility.)

gante (PR author) commented on the diff:

This function will be expanded as we add more CandidateGenerator classes :)

apoorvumang (Contributor) commented on the diff:

So will the logic here be something like:

  • check params in generation_config (some if/else condition)
  • based on those params, set candidate_generator

gante (PR author) replied:

@apoorvumang exactly!
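To make that dispatch concrete, here is a minimal sketch of what such a helper could look like (the function name, the prompt_lookup_num_tokens flag, and the two generator classes are hypothetical stand-ins, not the merged implementation):

```python
class AssistedCandidateGenerator:
    """Placeholder stub: candidates drafted by a smaller assistant model."""
    def __init__(self, **kwargs): ...


class PromptLookupCandidateGenerator:
    """Placeholder stub: model-free candidates from n-gram matches in the prompt (see #27722)."""
    def __init__(self, **kwargs): ...


def get_candidate_generator(generation_config, input_ids=None, assistant_model=None):
    # Check params in generation_config and pick the matching CandidateGenerator.
    if getattr(generation_config, "prompt_lookup_num_tokens", None) is not None:
        return PromptLookupCandidateGenerator(num_output_tokens=generation_config.prompt_lookup_num_tokens)
    if assistant_model is not None:
        return AssistedCandidateGenerator(input_ids=input_ids, assistant_model=assistant_model)
    raise ValueError("No candidate generation strategy was configured.")
```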

apoorvumang (Contributor) replied:

got it, so I'll write a PromptLookupCandidateGenerator that implements CandidateGenerator, and then wire it up in this function
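For illustration, the core of the prompt-lookup idea from #27722 could look roughly like this (a sketch under simplifying assumptions, not apoorvumang's actual implementation): match the trailing n-gram of the sequence against earlier positions in the prompt and propose the tokens that followed the match as candidates.

```python
import torch


def prompt_lookup_candidates(input_ids: torch.LongTensor, max_ngram: int = 3, num_candidates: int = 10) -> torch.LongTensor:
    """Sketch: propose candidate tokens by matching the trailing n-gram against the prompt."""
    seq = input_ids[0].tolist()
    for n in range(max_ngram, 0, -1):
        tail = seq[-n:]
        # Walk backwards to find the most recent earlier occurrence of the trailing n-gram.
        for start in range(len(seq) - n - 1, -1, -1):
            if seq[start : start + n] == tail:
                continuation = seq[start + n : start + n + num_candidates]
                if continuation:
                    # Candidates are simply the tokens that followed the match in the prompt.
                    return torch.tensor([seq + continuation], dtype=input_ids.dtype, device=input_ids.device)
    # No match: propose nothing and return the input unchanged.
    return input_ids
```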

ArthurZucker (Collaborator) commented on the diff:

So the plan is to have checks similar to the ones we have for the supported logits processors, I guess?

gante (PR author) replied:

@ArthurZucker precisely, we will have flags to control which candidate generation strategies we have in place. I suspect that, because some candidate generation strategies are so cheap (like the one proposed in #27722), assisted generation may become mainstream!

@HuggingFaceDocBuilderDev (bot) commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@apoorvumang apoorvumang (Contributor) commented:

One more place needs to change, I think: generation_mode is currently set using _get_generation_mode, where this is the logic:

    if assistant_model is not None:
        if generation_mode in ("greedy_search", "sample"):
            generation_mode = GenerationMode.ASSISTED_GENERATION
        else:
            raise ValueError(
                "You've set `assistant_model`, which triggers assisted generate. Currently, assisted generate "
                "is only supported with Greedy Search and Sample."
            )

How do you suggest this should change to support prompt lookup decoding? @gante

gante commented Nov 30, 2023

@apoorvumang I'd add an or clause to the if assistant_model is not None check.
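Something along these lines, for example, where prompt_lookup_num_tokens is just a hypothetical flag name for enabling prompt lookup (a sketch of the changed condition, not the final code):

```python
wants_candidate_decoding = assistant_model is not None or (
    getattr(generation_config, "prompt_lookup_num_tokens", None) is not None
)
if wants_candidate_decoding:
    if generation_mode in ("greedy_search", "sample"):
        generation_mode = GenerationMode.ASSISTED_GENERATION
    else:
        raise ValueError(
            "Assisted generation (including prompt lookup) is only supported with Greedy Search and Sample."
        )
```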

@ArthurZucker ArthurZucker (Collaborator) left a comment:

🔥 would maybe rename the file to candidate_generators, like we have logits_processor, but otherwise great!

@gante gante force-pushed the arbitrary_candidate_fn branch from 77a8b67 to a57367b on December 11, 2023 at 19:57
@gante gante merged commit 4b759da into huggingface:main Dec 12, 2023
@gante gante deleted the arbitrary_candidate_fn branch December 12, 2023 09:26
iantbutler01 pushed a commit to BismuthCloud/transformers that referenced this pull request Dec 16, 2023
…ors (huggingface#27750)

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>