
HellaSwag: speed up by parallelizing log-prob evaluation #5020

Merged · 1 commit · Jan 18, 2024

Conversation

ikawrakow (Contributor)

After PR #5017, which significantly improved the performance of HellaSwag evaluation via batching, the fraction of time spent evaluating token log-probabilities in single-threaded mode has become significant.

With this PR, this part of the calculation is parallelized.

For Mistral-7B and fp16, the time on my system (32-core Ryzen 5975WX + RTX 4080) goes down from 536 seconds after PR #5017 to 423 seconds for the full evaluation dataset (10042 tasks).

For reference, evaluation time before #5017 was 1285 seconds.
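For illustration only, here is a minimal sketch of the kind of parallelization described above, not the actual llama.cpp implementation: the per-token log-softmax work is split across `std::thread` workers. The helper names (`log_prob_of`, `compute_logprobs`) and the flat logits layout are assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// log-softmax of one row of logits, returning the log-probability of token `tok`
static float log_prob_of(const float * logits, int n_vocab, int tok) {
    float max_logit = logits[0];
    for (int i = 1; i < n_vocab; ++i) max_logit = std::max(max_logit, logits[i]);
    double sum = 0.0;
    for (int i = 0; i < n_vocab; ++i) sum += std::exp(logits[i] - max_logit);
    return logits[tok] - max_logit - (float) std::log(sum);
}

// Evaluate log-probabilities for n_tokens positions, split across n_threads workers.
// `logits` is assumed to be a flat [n_tokens x n_vocab] buffer.
static void compute_logprobs(const float * logits, int n_vocab, const int * tokens,
                             int n_tokens, int n_threads, float * out) {
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([=]() {
            // each worker handles every n_threads-th position (strided partition)
            for (int i = t; i < n_tokens; i += n_threads) {
                out[i] = log_prob_of(logits + (std::size_t) i * n_vocab, n_vocab, tokens[i]);
            }
        });
    }
    for (auto & w : workers) {
        w.join();
    }
}
```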

ggerganov merged commit 3e945cc into master on Jan 18, 2024
39 of 47 checks passed
cebtenzzre added a commit that referenced this pull request Jan 19, 2024
ggerganov pushed a commit that referenced this pull request Jan 20, 2024
* perplexity : fix MSVC build after #5020

* try a different fix
crasm pushed a commit that referenced this pull request Jan 23, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024