enhance auto-round eval with vllm backend #815
Conversation
Pull Request Overview
This PR enhances the auto-round evaluation functionality by adding support for the vLLM backend for model evaluation. The change allows users to leverage vLLM's optimized inference capabilities when running evaluations on CUDA and HPU devices (a routing sketch follows the change list below).
- Added comprehensive vLLM argument support for model evaluation
- Integrated vLLM backend option into the CLI evaluation workflow
- Added test coverage for the new vLLM evaluation functionality
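For context, a minimal sketch of how the flag and routing described above might look. Only the `--vllm` flag and the `eval_with_vllm` name are taken from the PR description; the other argument names and the function signatures are assumptions, not the actual implementation.

```python
# Sketch only: a CLI flag plus routing to the vLLM evaluation path.
# The real code lives in the CLI entry point and auto_round/script/llm.py;
# helper names and signatures here are illustrative.
import argparse


def build_eval_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser("auto-round eval (sketch)")
    parser.add_argument("--model", type=str, required=True)
    parser.add_argument("--eval", action="store_true", help="run evaluation only")
    parser.add_argument("--vllm", action="store_true", help="evaluate with the vLLM backend")
    parser.add_argument("--tasks", type=str, default="lambada_openai")
    parser.add_argument("--limit", type=int, default=None)
    return parser


def run_eval(args: argparse.Namespace) -> None:
    if args.vllm:
        # Added in this PR; exact signature assumed.
        from auto_round.script.llm import eval_with_vllm

        eval_with_vllm(args)
    else:
        # Existing HF-backend evaluation path (name assumed).
        from auto_round.script.llm import eval as hf_eval

        hf_eval(args)
```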
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| auto_round/script/llm.py | Added vLLM-specific arguments and eval_with_vllm function for vLLM backend evaluation |
| auto_round/main.py | Added vLLM backend routing in the evaluation entry point |
| test/test_cuda/test_vllm.py | Added integration test for vLLM evaluation functionality |
| docs/step_by_step.md | Added documentation note about using --vllm flag |
Signed-off-by: xinhe3 <xinhe3@habana.ai>
Force-pushed from 53633f0 to 04fdb94
How about supporting quantize & eval with vLLM mode at the same time? For example:
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Hi @WeiweiZhang1, loading a model object for evaluation is not supported; only a string is allowed. Please refer to the lm_eval requirement here.
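As background, a minimal sketch of the lm_eval call this constraint comes from: with the vLLM backend, the model is identified by a string inside `model_args`, so a pre-loaded model object (e.g. a GGUF wrapper) cannot be passed. The specific argument values below are illustrative only.

```python
# Sketch: lm_eval resolves the vLLM model from a string, not from a model object.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=facebook/opt-125m,dtype=auto,gpu_memory_utilization=0.8",
    tasks=["lambada_openai"],
    limit=10,
)
print(results["results"])
```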
It's a little difficult to infer the meaning of --vllm; I'd suggest adding an arg lm_eval_backend="hf"/"vllm"/"sglang" (TODO).
I agree; we can update it like that when more backends are added.
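A hypothetical sketch of what that could look like once more backends are added; `lm_eval_backend` and the `sglang` choice are a suggestion from this thread, not part of the PR.

```python
# Hypothetical follow-up: replace the boolean --vllm flag with a backend selector.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--lm_eval_backend",
    type=str,
    default="hf",
    choices=["hf", "vllm", "sglang"],  # sglang is a TODO, not yet implemented
    help="backend used by lm_eval for model evaluation",
)
```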
* support lm_eval vllm backend
---------
Signed-off-by: xinhe3 <xinhe3@habana.ai>
Co-authored-by: xinhe3 <xinhe3@habana.ai>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
It was tested on HPU and CUDA. GGUF is not supported since vLLM doesn't allow passing a model object; only a model string is allowed.
Command:
VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_SKIP_WARMUP=true auto-round --model facebook/opt-125m --eval --vllm --tasks lambada_openai --limit 10