Commit 0bd5ff5

Fix accuracy test config and add DeepSeek-V2-Lite test (#2261)
### What this PR does / why we need it?

This PR fixes the accuracy test configuration related to #2073. Users can now run accuracy tests on multiple models in a single invocation and generate separate report files by running:

```bash
cd ~/vllm-ascend
pytest -sv ./tests/e2e/models/test_lm_eval_correctness.py \
    --config-list-file ./tests/e2e/models/configs/accuracy.txt
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

<img width="1648" height="511" alt="image" src="https://github.com/user-attachments/assets/1757e3b8-a6b7-44e5-b701-80940dc756cd" />

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@766bc81

---------

Signed-off-by: Icey <1790571317@qq.com>
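
For readers unfamiliar with the list-file pattern, here is a minimal sketch of how a file such as `accuracy.txt` could be expanded into per-model configs and report paths. It is not taken from this commit: the helper names `load_config_list` and `report_path_for`, the one-entry-per-line format, the `#` comment handling, and the `<config stem>.md` report naming are all assumptions made for illustration.

```python
from pathlib import Path


def load_config_list(list_file: Path) -> list[Path]:
    """Return the config paths named in list_file, one per non-empty line.

    Assumptions of this sketch (not confirmed by the commit): blank lines and
    lines starting with '#' are skipped, and relative entries are resolved
    against the directory that contains the list file.
    """
    base = list_file.parent
    configs = []
    for raw in list_file.read_text(encoding="utf-8").splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue
        configs.append(base / entry)
    return configs


def report_path_for(config: Path, report_dir: Path) -> Path:
    """Derive a per-model report path from the config file name (hypothetical scheme)."""
    return report_dir / f"{config.stem}.md"
```

With a list file that names `Qwen3-8B-Base.yaml` and `DeepSeek-V2-Lite.yaml`, each config would map to its own markdown report under the chosen report directory, which is one way a single run could produce different report files per model.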
1 parent ad10837 commit 0bd5ff5

13 files changed: 46 additions & 418 deletions
.github/workflows/accuracy_test.yaml

Lines changed: 5 additions & 4 deletions
@@ -70,6 +70,8 @@ jobs:
  runner: linux-aarch64-a2-1
  - model_name: Qwen3-30B-A3B
  runner: linux-aarch64-a2-2
+ - model_name: DeepSeek-V2-Lite
+ runner: linux-aarch64-a2-2
  fail-fast: false

  name: ${{ matrix.model_name }} accuracy
@@ -200,9 +202,8 @@ jobs:
  markdown_name="${model_base_name}"
  echo "markdown_name=$markdown_name" >> $GITHUB_OUTPUT
  mkdir -p ./benchmarks/accuracy
- pytest -sv ./tests/e2e/singlecard/models/test_lm_eval_correctness.py \
- --config ./tests/e2e/singlecard/models/configs/${{ matrix.model_name }}.yaml \
- --report_output ./benchmarks/accuracy/${model_base_name}.md
+ pytest -sv ./tests/e2e/models/test_lm_eval_correctness.py \
+ --config ./tests/e2e/models/configs/${{ matrix.model_name }}.yaml

  - name: Generate step summary
  if: ${{ always() }}
@@ -312,7 +313,7 @@ jobs:
  head: `vllm-ascend-ci:${{ env.BRANCH_NAME }}`,
  base: '${{ github.event.inputs.vllm-ascend-version }}',
  title: `[Doc] Update accuracy reports for ${{ github.event.inputs.vllm-ascend-version }}`,
- body: `The accuracy results running on NPU Altlas A2 have changed, updating reports for: All models (Qwen/Qwen3-30B-A3B, Qwen2.5-VL-7B-Instruct, Qwen3-8B-Base)
+ body: `The accuracy results running on NPU Altlas A2 have changed, updating reports for: All models (Qwen3-30B-A3B, Qwen2.5-VL-7B-Instruct, Qwen3-8B-Base, DeepSeek-V2-Lite)

  - [Workflow run][1]

.github/workflows/vllm_ascend_test.yaml

Lines changed: 1 addition & 2 deletions
@@ -211,8 +211,7 @@ jobs:
  --ignore=tests/e2e/singlecard/test_embedding.py \
  --ignore=tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py \
  --ignore=tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py \
- --ignore=tests/e2e/singlecard/test_offline_inference_310p.py \
- --ignore=tests/e2e/singlecard/models/test_lm_eval_correctness.py
+ --ignore=tests/e2e/singlecard/test_offline_inference_310p.py
  e2e-2-cards:
  needs: [e2e]
  if: ${{ needs.e2e.result == 'success' }}

.github/workflows/vllm_ascend_test_long_term.yaml

Lines changed: 0 additions & 102 deletions
This file was deleted.

tests/e2e/long_term/accuracy/accuracy_multicard.py

Lines changed: 0 additions & 167 deletions
This file was deleted.
