llama : lookahead decoding example #4157

Closed
@wsxiaoys

Description

Lookahead decoding claims to provide a 1.5~2x decoding speedup without requiring a speculative (draft) model.

Blog post: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
Twitter thread: https://twitter.com/lmsysorg/status/1727056892671950887
Reference implementation: https://github.com/hao-ai-lab/LookaheadDecoding/tree/main

Labels

enhancement (New feature or request)