Fix prompt cache saving and chat-persistent rollover #1678

Merged
merged 2 commits into from
Jun 3, 2023

Collaborator

@ejones ejones commented Jun 3, 2023

Fixes #1670, by reworking the original fix for #1585 from #1609.

The original fix examined embd to determine whether the prompt had been evaluated, but embd is limited to the batch size. In addition, that fix left session_tokens in its original state (i.e., the longer, cached prompt), whereas normal session evaluation truncates it at the first eval. As a result, any prompt whose cache hit was detected on only the first batch (512 tokens by default) would begin evaluating at roughly the second batch, and all of the newly evaluated tokens were appended to the end of the full, original cached prompt. Downstream, this caused the cache to diverge from the prompt and to overrun the context size, as seen in #1670.
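To make the size arithmetic concrete, here is a toy model of the behavior described above. It tracks token counts only, not real tokens, and it is not the code from main.cpp: CacheModel and session_size_after_prompt are hypothetical names introduced for illustration, and the assumption that exactly the first batch is reused from the cache is a simplification of "~from the second batch".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model (hypothetical, sizes only): not the actual main.cpp logic.
struct CacheModel {
    std::size_t n_cached;  // tokens stored in the saved session (the cached prompt)
    std::size_t n_prompt;  // tokens in the current prompt
    std::size_t n_batch;   // batch size (512 by default)
};

// Number of tokens held by the session once the prompt has been processed.
// truncate_at_first_eval == true  models normal session evaluation.
// truncate_at_first_eval == false models the behavior described above, where
// the cached tokens are left in their original, longer state.
std::size_t session_size_after_prompt(const CacheModel & m, bool truncate_at_first_eval) {
    assert(m.n_batch > 0);
    // Assumption: the cache hit is recognized on the first batch only.
    const std::size_t n_reused    = std::min(m.n_prompt, m.n_batch);
    // Everything after the first batch is evaluated again.
    const std::size_t n_evaluated = m.n_prompt - n_reused;
    // Newly evaluated tokens are appended to whatever the session holds.
    const std::size_t n_base = truncate_at_first_eval ? n_reused : m.n_cached;
    return n_base + n_evaluated;
}
```

Under this model, a fully cached 1200-token prompt with a batch size of 512 stays at 1200 tokens when the session is truncated, but grows to 1200 + 688 = 1888 tokens when it is not, which is the kind of growth that can overrun the context size.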

For the fix, I opted to move the re-eval logic into main's initialization instead of leaving it in the eval stage. There, it transforms session_tokens so that it matches only (prompt - 1) tokens.
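A minimal sketch of that idea, assuming the transformation is a simple truncation performed when the cache covers the entire prompt. The helper names (count_matching_prefix, trim_session_for_reeval) and the token alias are hypothetical and the merged code in examples/main/main.cpp may use different conditions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using token = int;  // stand-in for the real token type (assumption)

// Length of the common prefix shared by the cached session and the prompt.
std::size_t count_matching_prefix(const std::vector<token> & session_tokens,
                                  const std::vector<token> & prompt) {
    std::size_t n = 0;
    while (n < session_tokens.size() && n < prompt.size() &&
           session_tokens[n] == prompt[n]) {
        ++n;
    }
    return n;
}

// If the cache would cover the whole prompt, cut it back so that only
// (prompt - 1) tokens match. The last prompt token is then no longer a
// cache hit and has to be evaluated again.
void trim_session_for_reeval(std::vector<token> & session_tokens,
                             const std::vector<token> & prompt) {
    const std::size_t n_matching = count_matching_prefix(session_tokens, prompt);
    assert(n_matching <= prompt.size());
    if (!prompt.empty() && n_matching == prompt.size()) {
        session_tokens.resize(prompt.size() - 1);
    }
}
```

Because the trimming happens once, before any batch is evaluated, it does not depend on how many tokens fit into a single batch, and the session is already shortened by the time evaluation starts appending to it.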

Testing:

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

examples/main/main.cpp (outdated, resolved)
Contributor

@DannyDaemonic DannyDaemonic left a comment

This is a clever fix. Feel free to merge after the suggested size() to !empty() fix.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@ejones ejones merged commit 136476e into ggerganov:master Jun 3, 2023
Collaborator Author

ejones commented Jun 3, 2023

Thanks!

Development

Successfully merging this pull request may close these issues.

chat-persistent.sh not rotating cache files correctly