Bump LightEval to enable DP>1 #629

Merged: 4 commits, Apr 30, 2025

@lewtun lewtun commented Apr 28, 2025

This PR bumps lighteval to re-enable data parallelism (DP>1), since lighteval is now compatible with vllm v0.8.4. See: huggingface/lighteval#670 (comment)

Waiting for evals to run, but code can be reviewed as is.

Update: the evals have finished. Scores differ by up to a few percentage points, with the largest differences on AIME24, which is rather noisy due to its small sample size:

[Image: output-46]
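
To make the sample-size point concrete, here is a rough sketch of the binomial standard error of an accuracy score. It assumes each problem is scored once and independently, which may not match how the benchmarks are actually aggregated. The 50% accuracy is an illustrative worst case, not a measured result, and `accuracy_standard_error` is a hypothetical helper name.

```python
import math


def accuracy_standard_error(accuracy: float, num_problems: int) -> float:
    """Binomial standard error of an accuracy measured on num_problems items."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    if num_problems <= 0:
        raise ValueError("num_problems must be positive")
    return math.sqrt(accuracy * (1.0 - accuracy) / num_problems)


# AIME 2024 has 30 problems and MATH-500 has 500.
# 0.5 is an illustrative accuracy (the worst case for variance), not a result.
for name, n in [("aime24", 30), ("math_500", 500)]:
    se = accuracy_standard_error(0.5, n)
    print(f"{name}: n={n}, standard error = {100 * se:.1f} points")
# aime24: n=30, standard error = 9.1 points
# math_500: n=500, standard error = 2.2 points
```

Under these assumptions, a swing of a few points on AIME24 sits well within one standard error, while the same swing on MATH-500 would be more notable.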

Code snippet to run all evals:

#!/bin/bash

MODELS=(
"DeepSeek-R1-Distill-Qwen-1.5B"
"DeepSeek-R1-Distill-Qwen-7B"
"DeepSeek-R1-Distill-Qwen-14B"
"DeepSeek-R1-Distill-Qwen-32B"
"DeepSeek-R1-Distill-Llama-8B"
"DeepSeek-R1-Distill-Llama-70B"
)

for M in "${MODELS[@]}"; do
  echo "Running benchmark for model: $M"
  python scripts/run_benchmarks.py --model-id deepseek-ai/$M --benchmarks aime24 math_500 gpqa lcb
done

TODO

  • Re-run evals to check score variance
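
One possible way to summarize the re-runs, sketched under the assumption that each run yields a single score per benchmark. `summarize_runs` is a hypothetical helper and the scores below are placeholder values, not real results.

```python
import statistics


def summarize_runs(scores: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of one benchmark's scores across re-runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    return statistics.mean(scores), statistics.stdev(scores)


# Placeholder scores for illustration only; these are NOT results from this PR.
placeholder_scores = [50.0, 52.0, 54.0]
mean, std = summarize_runs(placeholder_scores)
print(f"mean={mean:.1f}, std={std:.1f}")  # mean=52.0, std=2.0
```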

@lewtun lewtun requested a review from edbeeching April 28, 2025 13:51
@@ -40,7 +40,7 @@ evaluate:
fi \
),))
$(if $(filter tensor,$(PARALLEL)),export VLLM_WORKER_MULTIPROC_METHOD=spawn &&,) \
MODEL_ARGS="pretrained=$(MODEL),dtype=bfloat16,$(PARALLEL_ARGS),max_model_length=32768,max_num_batched_tokens=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}" && \
max_num_batched_tokens no longer needs to be included, so I've removed it for simplicity.
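
For illustration, a small sketch of how the argument string might look once max_num_batched_tokens is dropped. The keys and values are copied from the Makefile line above. `build_model_args` is a hypothetical helper, and `data_parallel_size=8` is only an assumed stand-in for whatever `$(PARALLEL_ARGS)` expands to.

```python
def build_model_args(model: str, parallel_args: str) -> str:
    """Assemble the comma-separated model args, without max_num_batched_tokens."""
    generation_parameters = "{max_new_tokens:32768,temperature:0.6,top_p:0.95}"
    parts = [
        f"pretrained={model}",
        "dtype=bfloat16",
        parallel_args,  # stands in for $(PARALLEL_ARGS) from the Makefile
        "max_model_length=32768",
        "gpu_memory_utilization=0.8",
        f"generation_parameters={generation_parameters}",
    ]
    return ",".join(parts)


# "data_parallel_size=8" is an assumed example value, not taken from the PR.
args = build_model_args(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", "data_parallel_size=8"
)
print(args)
```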

@lewtun lewtun merged commit 75c3999 into main Apr 30, 2025
1 check passed
@lewtun lewtun deleted the bump-light branch April 30, 2025 20:02