### Feature request

The recently proposed prompt lookup decoding method replaces the draft model in assisted generation with simple string matching over the prompt.

Code: https://github.com/apoorvumang/prompt-lookup-decoding

### Motivation

- The method gives significant speedups (2x-4x) on input-grounded tasks
- Applicable to all decoder models, and it supports sampling
- Easy to implement: we can modify assisted generation to also accept a candidate-generating function in place of an assistant model (rather than requiring an LLM); see the sketch at the end of this issue

### Your contribution

I have a not-so-well-written implementation [here](https://github.com/apoorvumang/prompt-lookup-decoding/blob/main/demo-pld.ipynb) (Python notebook). I can contribute to improving it, but I will need help since it's my first time.
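
For reference, here is a minimal sketch of the core candidate-generation step, assuming plain Python lists of token ids. The function name (`find_candidate_tokens`), the parameters (`max_ngram_size`, `num_pred_tokens`), and the backwards scan for the most recent match are illustrative choices, not taken verbatim from the linked notebook:

```python
def find_candidate_tokens(input_ids, max_ngram_size=3, num_pred_tokens=10):
    """Return up to num_pred_tokens draft tokens by matching the trailing
    n-gram of input_ids against an earlier occurrence in the same sequence."""
    for ngram_size in range(max_ngram_size, 0, -1):
        ngram = input_ids[-ngram_size:]
        # Scan earlier positions for an occurrence of this n-gram
        # (most recent match first; scanning direction is an assumption).
        for start in range(len(input_ids) - ngram_size - 1, -1, -1):
            if input_ids[start:start + ngram_size] == ngram:
                # The tokens that followed the match become the draft candidates,
                # to be verified by the main model as in assisted generation.
                candidates = input_ids[start + ngram_size:
                                       start + ngram_size + num_pred_tokens]
                if candidates:
                    return candidates
    return []  # no match found; fall back to normal decoding


# Example: the trailing n-gram [5, 6] also appears earlier, so the tokens
# that followed it ([7, 8, 9]) are proposed as draft candidates.
print(find_candidate_tokens([1, 5, 6, 7, 8, 9, 2, 5, 6], num_pred_tokens=3))
# -> [7, 8, 9]
```

The key point is that this function has the same "propose candidate tokens" role as the assistant model in assisted generation, which is why exposing a function hook there should be enough to support it.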