Add an example implementing the "Prompt Lookup Decoding" technique:
https://github.com/apoorvumang/prompt-lookup-decoding
This should be a great exercise for people looking to become familiar with llama.cpp's KV cache management and batched decoding API. Looking for contributions.
The following examples can be used as starting points:
- speculative
- lookahead
- batched
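
For anyone picking this up, here is a minimal sketch of the lookup step only, assuming the draft is taken from the tokens that followed the most recent earlier occurrence of the trailing n-gram. The names `token_id`, `find_draft_tokens`, `max_ngram` and `n_draft` are hypothetical and not part of the llama.cpp API, and the matching order may differ from the reference implementation linked above. Verifying the draft with a batched decode and dropping rejected tokens from the KV cache is left out on purpose.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the token type used by the real example.
using token_id = std::int32_t;

// Returns up to n_draft tokens that followed the most recent earlier
// occurrence of the trailing n-gram of `tokens`. N-gram sizes are tried
// from max_ngram down to 1. Returns an empty vector when nothing matches.
std::vector<token_id> find_draft_tokens(const std::vector<token_id> & tokens,
                                        std::size_t max_ngram,
                                        std::size_t n_draft) {
    const std::size_t n = tokens.size();
    const token_id * p = tokens.data();

    // The n-gram must leave room for at least one earlier token window,
    // so its size is capped at n - 1.
    const std::size_t ng_start = std::min(max_ngram, n > 0 ? n - 1 : 0);

    for (std::size_t ng = ng_start; ng >= 1; --ng) {
        const token_id * tail = p + (n - ng); // trailing n-gram

        // Scan candidate windows from the most recent to the oldest.
        // i runs from n - ng - 1 down to 0, so the tail itself is skipped.
        for (std::size_t i = n - ng; i-- > 0; ) {
            if (std::equal(p + i, p + i + ng, tail)) {
                const std::size_t first = i + ng;
                const std::size_t last  = std::min(first + n_draft, n);
                return std::vector<token_id>(p + first, p + last);
            }
        }
    }
    return {};
}
```

The caller would pass the prompt plus everything generated so far, for example `find_draft_tokens(tokens, 3, 10)`, then submit the returned draft to the model in a single batch and keep only the prefix the model agrees with.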