Description
Currently, the Dual Batch Overlap (DBO) implementation in vLLM hardcodes the number of microbatches to 2, which limits its applicability to only dual-batch overlap scenarios (e.g., wide EP use cases).
The core microbatch slicing capabilities (e.g., UBatchSlice, UBatchWrapper) underlying DBO are generic and reusable for multi-microbatch overlap scenarios (beyond 2 microbatches). For example:
- AF (Attention/Feed-forward) Disaggregation scenarios (see "[RFC]: ATTN-FFN Disaggregation for MoE Models" #22799 and "[Feature] AFD basic implemetation" #29772) may require 3, 4, or more microbatches to be sliced and overlapped.
- Other emerging multi-batch overlap use cases would benefit from a generalized microbatch slicing framework.
Extending DBO to a flexible Multi-Batch Overlap (XBO) design (where "X" represents a configurable number of microbatches) will:
- Unlock support for multi-microbatch overlap scenarios
- Decouple microbatch slicing logic from hardcoded batch counts
- Make the core slicing capabilities a shared, reusable component in the codebase
Proposed Solution
I plan to refactor the existing DBO code to:
- Replace hardcoded references to "2 microbatches" with a configurable parameter (e.g., num_microbatches)
- Generalize UBatchSlice, UBatchWrapper, and related components to support arbitrary numbers of microbatches
- Maintain backward compatibility with existing DBO use cases (default to 2 microbatches)
- Validate the configured microbatch count (e.g., it must be an integer of at least 2)
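To make the plan above concrete, here is a minimal sketch of count validation plus N-way slicing. It is not vLLM's implementation: `MicrobatchSlice`, `validate_num_microbatches`, and `split_into_microbatches` are hypothetical names invented for illustration, and the even-split policy (earlier microbatches absorb the remainder) is an assumption rather than a description of how UBatchSlice is built today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicrobatchSlice:
    """Hypothetical stand-in for a microbatch slice (not vLLM's UBatchSlice)."""
    token_slice: slice


def validate_num_microbatches(num_microbatches: int) -> int:
    """Reject anything that is not an integer >= 2."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(num_microbatches, bool) or not isinstance(num_microbatches, int):
        raise TypeError("num_microbatches must be an integer")
    if num_microbatches < 2:
        raise ValueError("num_microbatches must be at least 2")
    return num_microbatches


def split_into_microbatches(
    num_tokens: int, num_microbatches: int = 2
) -> list[MicrobatchSlice]:
    """Split [0, num_tokens) into contiguous, near-equal token ranges.

    The default of 2 mirrors the existing dual-batch behaviour.
    """
    validate_num_microbatches(num_microbatches)
    if num_tokens < num_microbatches:
        raise ValueError("need at least one token per microbatch")

    # Assumed policy: the first `extra` microbatches get one extra token.
    base, extra = divmod(num_tokens, num_microbatches)
    slices = []
    start = 0
    for i in range(num_microbatches):
        size = base + (1 if i < extra else 0)
        slices.append(MicrobatchSlice(slice(start, start + size)))
        start += size
    return slices
```

Calling `split_into_microbatches(10, 3)` yields token ranges of sizes 4, 3, and 3, while omitting the second argument keeps the two-way split that existing DBO users would expect.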
Timeline
I will submit a PR implementing this refactor within the next few days.
Questions/Feedback
If anyone has questions about the design approach, edge cases to consider, or other requirements for multi-microbatch overlap support, please feel free to comment on this issue or reach out directly.