Workaround for recalculating logits in cached prompts (Fixes #1585) #1609

Merged
3 commits merged into ggerganov:master from DannyDaemonic:cached-logits-bandaid
May 29, 2023

Conversation

@DannyDaemonic (Contributor)

This change checks whether we've used up the entire given prompt but still have leftover tokens from the cache. In that case, it leaves the last token in `embd` so it will be evaluated and the logits recalculated.
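
Conceptually, the check looks something like this (a sketch, not necessarily the exact diff; it assumes the session-cache variables from examples/main/main.cpp, where `embd_inp` holds the prompt tokens, `session_tokens` the cached tokens, and `n_matching_session_tokens` the number of cached tokens that matched the prompt):

    // If the cache covers the entire prompt, shrink the matched region by one
    // token; the final prompt token then falls through to embd and is passed
    // to llama_eval(), which recomputes the logits needed for sampling.
    if (!embd_inp.empty() && n_matching_session_tokens == embd_inp.size()
            && session_tokens.size() > embd_inp.size()) {
        session_tokens.resize(embd_inp.size() - 1);
    }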

Evaluating one token isn't the end of the world, but it would be nice if it were faster. What I really wanted to do was something like this:

    llama_eval(ctx, embd.data(), 0, n_past - 1, params.n_threads);

But you can't eval 0 tokens.

I tried to edit llama_eval_internal to skip the input calculations when the token count (N) is 0, but it's far more tightly integrated than I expected. Perhaps there was something to #1281 after all. I think it would be nicer if llama_eval simply supported 0 tokens, but it would also be nice to have a separate API call that evaluates the logits at a given position `n_past`.

This API call could also be used to implement the mockup in #1608, where you can click on a token and see alternatives.
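
For illustration only, such a call might look like this (hypothetical; `llama_eval_logits_at` is not an existing llama.cpp function):

    // Hypothetical API sketch: recompute the logits for the token already
    // stored in the KV cache at position n_past - 1, without evaluating
    // any new tokens. Returns 0 on success, like llama_eval().
    LLAMA_API int llama_eval_logits_at(
            struct llama_context * ctx,
            int                    n_past,
            int                    n_threads);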

@github-actions (bot) left a comment

clang-tidy made some suggestions

examples/main/main.cpp (suggestions outdated, resolved)
@KerfuffleV2 (Collaborator)

I can no longer reproduce the problem using the steps described here: #1550 (comment)

Also, as far as I can see, prompt caching still works.

I don't know enough about the code to properly review this pull request, but I can confirm that, from a user's perspective, it seems to fix the problem and doesn't break anything else.

@SlyEcho (Collaborator) commented May 29, 2023

Yeah, I discovered the same "fix" for #1570.

@SlyEcho (Collaborator) commented May 29, 2023

Merge in the latest master to fix the build errors.

@DannyDaemonic (Contributor, Author)

@SlyEcho Thanks, that fixed the CI errors. For some reason I thought I could just wait until things were fixed and then rerun the checks.

@DannyDaemonic merged commit 2483676 into ggerganov:master on May 29, 2023
@DannyDaemonic deleted the cached-logits-bandaid branch on May 29, 2023 at 12:13