Fix prompt cache saving and chat-persistent rollover #1678
Merged
Fixes #1670 by reworking the original fix for #1585 from #1609.

The original fix examined `embd` to determine whether the prompt had been evaluated, but `embd` is limited to the batch size. In addition, that fix left `session_tokens` in its original state (i.e., the longer, cached prompt), while normal session evaluation truncates it at the first eval. This combination meant that any prompt with a cache hit on just the first batch (512 tokens by default) would begin eval-ing roughly from the second batch, and all of that eval would get appended to the end of the full, original cached prompt. This had the downstream effect of diverging the cache from the prompt and overrunning the context size in the cache, as seen in #1670.

For the fix, I opted to move the re-eval logic to main's initialization rather than the eval stage. There, it truncates `session_tokens` such that it will only match (prompt - 1) tokens, forcing at least the last prompt token to be re-evaluated.

Testing:
- With `--prompt-cache`, re-ran the case where the cached prompt is longer than the new one (#1585), applied the Z/joke test, and got a joke that did not start with "Z".