Issues: EleutherAI/lm-evaluation-harness
Issues list
OOM Issues in MMLU Evaluation with lm_eval Using vllm as Backend
#2490
opened Nov 14, 2024 by
wchen61
Cannot reproduce LLaMA 3 8B on hendrycks_math
validation
For validation of task implementations.
#2479
opened Nov 11, 2024 by
liuxiaozhu01
Why Different Versions Make a Big Difference in HellaSwag zero-shot
validation
For validation of task implementations.
#2478
opened Nov 11, 2024 by
cquxl
Issue with Perplexity Score Using max_length > 1024 in lm-evaluation-harness
#2467
opened Nov 7, 2024 by
fnusid
Evaluate local tasks failed when using lm-eval --tasks <local-folder>
#2463
opened Nov 7, 2024 by
ChenXiaoTemp
auto-detect batchsize finding too large a batchsize to fit in VRAM at the end when used with multi-gpu
bug
Something isn't working.
#2458
opened Nov 5, 2024 by
SmerkyG
Why is using vLLM via lm-eval-harness slower than using vLLM directly?
asking questions
For asking for clarification / support on library usage.
#2445
opened Oct 30, 2024 by
WuXnkris
Wrong format of the few-shot examples in mgsm_direct tasks
good first issue
Good for newcomers
validation
For validation of task implementations.
#2444
opened Oct 30, 2024 by
zxcvuser
Improve preprocessing for paws-x and xnli tasks
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#2442
opened Oct 30, 2024 by
zxcvuser
vllm with tensor_parallel_mode is not working at all because of multiprocessing problem
#2431
opened Oct 28, 2024 by
95jinchul
Task winogrande does not work in 0-shot setting together with --apply_chat_template
#2430
opened Oct 27, 2024 by
ArtemBiliksin
Llama3.1-8B-Instruct evaluation fails
asking questions
For asking for clarification / support on library usage.
#2428
opened Oct 25, 2024 by
Isaaclgz
test speculative decode accuracy
asking questions
For asking for clarification / support on library usage.
#2424
opened Oct 24, 2024 by
baoqianmagik
Question related to how to use the validation and training splits.
asking questions
For asking for clarification / support on library usage.
#2423
opened Oct 24, 2024 by
sorobedio
bbh_zeroshot fails due to a custom filter issue.
bug
Something isn't working.
#2422
opened Oct 23, 2024 by
shamanez