@jiangkuaixue123 jiangkuaixue123 commented Dec 5, 2025

Purpose

As mentioned in #30105, this PR extends DBO to XBO in the codebase. I have added a command-line argument, --num-of-microbatches, to be used in conjunction with --enable-dbo.
The core code changes for the DBO→XBO extension are complete. Tests with 3 microbatches are still in progress, and the results will be attached to this PR as soon as they are available.
I am submitting this PR early mainly to get reviewers' feedback on whether the current approach is reasonable.

cc @SageMoore @LucasWilkinson

Usage

I have added the command-line argument --ubatch-size, with a default value of 0. When --enable-dbo is set, --ubatch-size has no effect. When --enable-dbo is not set and --ubatch-size is greater than 1, microbatching is enabled with that number of ubatches.
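The flag interaction described above can be sketched as a small resolver. This is only an illustration of the stated rules, not the code in this PR: the function name `resolve_num_ubatches` is hypothetical, and the value 2 for the DBO case is an assumption based on DBO meaning "dual" batch overlap.

```python
def resolve_num_ubatches(enable_dbo: bool, ubatch_size: int = 0) -> int:
    """Return how many microbatches a step would be split into.

    Hypothetical helper that mirrors the rules in the Usage section;
    it is not part of vLLM.
    """
    if ubatch_size < 0:
        raise ValueError("ubatch_size must be non-negative")
    if enable_dbo:
        # Assumption: DBO always means two microbatches, so the
        # --ubatch-size value is ignored in this branch.
        return 2
    if ubatch_size > 1:
        # DBO is off and an explicit count was requested.
        return ubatch_size
    # ubatch_size of 0 (the default) or 1: no microbatching.
    return 1
```

With these rules, `resolve_num_ubatches(False, 3)` selects three microbatches, while `resolve_num_ubatches(True, 3)` still selects two.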

Test Result On gsm8k

Without ubatches

vllm serve "/home/dyvm6xra/dyvm6xrauser08/jcz/deepseek-v2-lite"  --data_parallel_size=2 --enable_expert_parallel
|Tasks|Version|Filter          |n-shot|Metric     |Value |Stderr  |
|-----|------:|----------------|-----:|-----------|-----:|--------|
|gsm8k|      3|flexible-extract|     5|exact_match|0.3682|± 0.0188|
|     |       |strict-match    |     5|exact_match|0.3667|± 0.0188|

With 2 ubatches

deepep_high_throughput:

vllm serve "/home/dyvm6xra/dyvm6xrauser08/jcz/deepseek-v2-lite"  --data_parallel_size=2 --enable_expert_parallel \
        --enable-dbo --dbo-prefill-token-threshold 12 --dbo-decode-token-threshold 12 \
        --all2all-backend="deepep_high_throughput"
|Tasks|Version|Filter          |n-shot|Metric     |Value |Stderr  |
|-----|------:|----------------|-----:|-----------|-----:|--------|
|gsm8k|      3|flexible-extract|     5|exact_match|0.3712|± 0.0188|
|     |       |strict-match    |     5|exact_match|0.3697|± 0.0188|

deepep_low_latency:

vllm serve "/home/dyvm6xra/dyvm6xrauser08/jcz/deepseek-v2-lite"  --data_parallel_size=2 --enable_expert_parallel \
        --enable-dbo --dbo-prefill-token-threshold 12 --dbo-decode-token-threshold 12 \
        --all2all-backend="deepep_low_latency"
|Tasks|Version|Filter          |n-shot|Metric     |Value |Stderr  |
|-----|------:|----------------|-----:|-----------|-----:|--------|
|gsm8k|      3|flexible-extract|     5|exact_match|0.3758|± 0.0189|
|     |       |strict-match    |     5|exact_match|0.3727|± 0.0188|

The results above indicate that the original DBO path still works correctly with these changes. Splitting into 3 or 4 microbatches has been verified with PR #29772.
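As a rough sanity check on the gsm8k numbers reported in this description, the flexible-extract scores of the two DBO runs can be compared with the baseline relative to the reported standard errors. This is a sketch only: it treats the runs as independent and uses a normal approximation, which is a simplification because all runs share the same test set.

```python
import math


def z_score(base_val: float, base_se: float, val: float, se: float) -> float:
    """Difference between two scores in units of their combined stderr."""
    return (val - base_val) / math.sqrt(base_se**2 + se**2)


# flexible-extract exact_match values copied from the results above
baseline = (0.3682, 0.0188)  # without ubatches
dbo_runs = {
    "deepep_high_throughput": (0.3712, 0.0188),
    "deepep_low_latency": (0.3758, 0.0189),
}

z_scores = {
    name: z_score(*baseline, val, se) for name, (val, se) in dbo_runs.items()
}

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
```

Both z-scores come out well below 1, so the differences are within the reported noise, which is consistent with the claim that enabling DBO does not change accuracy.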

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request extends the Dual Batch Overlap (DBO) functionality to support a configurable number of microbatches (XBO). The changes primarily involve generalizing hardcoded values of 2 to a variable number of microbatches, which includes dynamically resizing lists for handles and buffers. The overall approach is sound, but there are a few areas for improvement regarding code clarity and robustness. Specifically, I've identified commented-out code that should be removed, debug logging statements that need to be cleaned up, and a fragile implementation using global variables that could be refactored for better maintainability.
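The generalization described in this review (replacing a hardcoded 2 with a configurable count, and sizing per-microbatch lists accordingly) might look roughly like the sketch below. The names `split_tokens` and `UBatchState` are hypothetical and do not come from the PR. The sketch also assumes a plain token-count split, whereas a real implementation may need to respect request boundaries, and it keeps per-microbatch state in an object, which is one possible alternative to the global variables the review flags.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


def split_tokens(num_tokens: int, num_ubatches: int) -> list[slice]:
    """Split the range [0, num_tokens) into contiguous, near-equal slices."""
    if num_ubatches < 1:
        raise ValueError("num_ubatches must be at least 1")
    base, extra = divmod(num_tokens, num_ubatches)
    slices = []
    start = 0
    for i in range(num_ubatches):
        # The first `extra` microbatches take one additional token.
        size = base + (1 if i < extra else 0)
        slices.append(slice(start, start + size))
        start += size
    return slices


@dataclass
class UBatchState:
    """Per-microbatch bookkeeping sized by num_ubatches rather than a fixed 2."""

    num_ubatches: int
    handles: list[Optional[Any]] = field(init=False)

    def __post_init__(self) -> None:
        # One slot per microbatch for in-flight communication handles.
        self.handles = [None] * self.num_ubatches
```

Note that `split_tokens` produces empty slices when there are fewer tokens than microbatches, so a caller would have to decide whether to skip microbatching in that case.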

Signed-off-by: jiangkuaixue123 <jiangxiaozhou111@163.com>

mergify bot commented Dec 5, 2025

Documentation preview: https://vllm--30120.org.readthedocs.build/en/30120/


mergify bot commented Dec 8, 2025

Hi @jiangkuaixue123, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Signed-off-by: jiangkuaixue123 <jiangxiaozhou111@163.com>

@jiangkuaixue123 jiangkuaixue123 force-pushed the TBO branch 3 times, most recently from 6f9ca26 to 752260f Compare December 8, 2025 08:54

@jiangkuaixue123
Author

@LucasWilkinson This PR is ready for review. Could you please help take a look? Thank you!

@jiangkuaixue123 jiangkuaixue123 force-pushed the TBO branch 3 times, most recently from 3a70348 to 4fd5301 Compare December 8, 2025 11:40

@hmellor
Member

hmellor commented Dec 8, 2025

PRs with failing pre-commit are not merged.

@jiangkuaixue123
Author

> PRs with failing pre-commit are not merged.

I will fix it soon.

@jiangkuaixue123 jiangkuaixue123 force-pushed the TBO branch 3 times, most recently from 6dfea50 to b47f798 Compare December 8, 2025 12:53
Signed-off-by: jiangkuaixue123 <jiangxiaozhou111@163.com>

mergify bot commented Dec 8, 2025

Hi @jiangkuaixue123, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: jiangkuaixue123 <jiangxiaozhou111@163.com>

Labels

documentation, v1
