Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

[Add Dataset Update] KBL 2025
#3000 opened May 20, 2025 by abzb1
Enable text-only evals for VLM models
#2999 opened May 19, 2025 by ysulsky
add Mbpp instruct
#2995 opened May 19, 2025 by baberabb
Output path fix
#2993 opened May 19, 2025 by Niccolo-Ajroldi
[Fix] Update resolve_hf_chat_template arguments
#2992 opened May 19, 2025 by fxmarty-amd
[longbench] fix metric calculation
#2983 opened May 14, 2025 by baberabb
use images with api models
#2981 opened May 14, 2025 by baberabb
Add mlqa tag to run all variants.
#2977 opened May 12, 2025 by ivanbaldo
add image hashing
#2973 opened May 11, 2025 by artemorloff
Adding resize images support
#2958 opened May 6, 2025 by artemorloff
add ultravox models support for audio tasks
#2957 opened May 6, 2025 by artemorloff
Final putnam axiom bm
#2946 opened May 1, 2025 by brando90
feat: Add LIBRA benchmark for long-context evaluation
#2943 opened Apr 30, 2025 by karimovaSvetlana
4 tasks done
Fix gsm8k task to enhance accuracy
#2924 opened Apr 21, 2025 by hfadzxy
Added selection filter: take_last
#2923 opened Apr 18, 2025 by JamesClarke7283
3 tasks done
enable few-shots for multimodal tasks
#2912 opened Apr 15, 2025 by artemorloff
Adapting multimodal tasks for phi3.5 vision
#2909 opened Apr 14, 2025 by artemorloff
Audio modality: add openbmb/MiniCPM-o-2_6 model
#2908 opened Apr 14, 2025 by artemorloff
enable evaluation from yaml config file
#2893 opened Apr 8, 2025 by artemorloff
Added AIME Support
#2892 opened Apr 8, 2025 by Zephyr271828
2 of 3 tasks
Fix GPQA CoT n shot
#2888 opened Apr 7, 2025 by anmarques