[Bug] Fix AssertionError: Do not capture `num_reqs > max_num_reqs` for uniform batch #25498

yewentao256 · 2025-09-23T18:11:36Z

Purpose

Test Plan

============ Serving Benchmark Result ============
Successful requests:                     1         
Benchmark duration (s):                  0.90      
Total input tokens:                      129999    
Total generated tokens:                  1         
Request throughput (req/s):              1.11      
Output token throughput (tok/s):         1.11      
Peak output token throughput (tok/s):    1.00      
Peak concurrent requests:                1.00      
Total Token throughput (tok/s):          144353.62 
---------------Time to First Token----------------
Mean TTFT (ms):                          899.48    
Median TTFT (ms):                        899.48    
P99 TTFT (ms):                           899.48    
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          0.00      
Median TPOT (ms):                        0.00      
P99 TPOT (ms):                           0.00      
---------------Inter-token Latency----------------
Mean ITL (ms):                           0.00      
Median ITL (ms):                         0.00      
P99 ITL (ms):                            0.00      
==================================================

Signed-off-by: yewentao256 <zhyanwentao@126.com>

gemini-code-assist

Code Review

This pull request attempts to fix an `AssertionError` related to `num_reqs > max_num_reqs` in uniform batches by modifying a call to `_dummy_run`. While the intention is correct, the proposed change introduces a new `ZeroDivisionError` under the same conditions that caused the original error. The root cause appears to be in how `_dummy_run` handles cases where the maximum number of requests is zero, which is not addressed by this change. A more robust solution would involve modifying `_dummy_run` to gracefully handle this edge case for all its call paths.
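
The two failure modes named in the review can be illustrated with a toy sketch. This is not vLLM's `_dummy_run`: the function `split_dummy_batch`, its parameters, and the guard are hypothetical, written only to show how a uniform-batch assertion and a division by a zero request count could arise in code that splits tokens across dummy requests, and what an early-return guard might look like.

```python
from typing import List, Optional


def split_dummy_batch(
    num_tokens: int,
    max_num_reqs: int,
    uniform_query_len: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper (not vLLM code): tokens per dummy request."""
    if uniform_query_len is not None:
        # Uniform batch: every request carries the same query length,
        # so the request count is the ceiling of num_tokens / query length.
        num_reqs = -(-num_tokens // uniform_query_len)
        assert num_reqs <= max_num_reqs, (
            "Do not capture `num_reqs > max_num_reqs` for uniform batch"
        )
        sizes = [uniform_query_len] * num_reqs
        if num_tokens % uniform_query_len != 0:
            # The last request takes the leftover tokens.
            sizes[-1] = num_tokens % uniform_query_len
        return sizes

    num_reqs = min(num_tokens, max_num_reqs)
    if num_reqs == 0:
        # Assumed guard: without it, the divmod below would raise
        # ZeroDivisionError when no requests are available.
        return []
    per_req, remainder = divmod(num_tokens, num_reqs)
    sizes = [per_req] * num_reqs
    sizes[-1] += remainder
    return sizes
```

In this sketch, placing the zero check inside the helper covers every caller at once, which is the design direction the review suggests, rather than patching a single call site.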

vllm/v1/worker/gpu_worker.py

mgoin · 2025-09-24T00:00:42Z

Superseded by #25505

revert _dummy_run back

75657c8

Signed-off-by: yewentao256 <zhyanwentao@126.com>

yewentao256 requested review from WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners September 23, 2025 18:11

mergify bot added the v1 label Sep 23, 2025

gemini-code-assist bot reviewed Sep 23, 2025

vllm/v1/worker/gpu_worker.py

yewentao256 added the ready label Sep 23, 2025

ywang96 approved these changes Sep 23, 2025

ywang96 mentioned this pull request Sep 23, 2025

[BugFix] AssertionError: Do not capture num_reqs > max_num_reqs for uniform batch #25505

Merged

mgoin closed this Sep 24, 2025
