[Bugfix] Fix DeepEP config for DP4TP4 #23619
Code Review
This pull request fixes an assertion error in DeepEP by using the dispatcher count instead of the data-parallel size for the combine configuration. The change appears correct based on the issue description. However, I've identified a critical issue where the check for supported rank configurations remains inconsistent with the new value, which could lead to a runtime crash. I've provided a suggestion to correct this.
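The inconsistency described above can be avoided by validating the same value that is later used for the lookup. The following is a minimal sketch, not the code in this PR: `COMBINE_CONFIGS` and `combine_config_or_none` are hypothetical names, and the table contents are placeholders rather than DeepEP's real configurations.

```python
from typing import Optional

# Hypothetical table keyed by rank count; the values are placeholders,
# not DeepEP's actual tuned configurations.
COMBINE_CONFIGS = {2: "cfg-2", 4: "cfg-4", 8: "cfg-8", 16: "cfg-16"}


def combine_config_or_none(num_dispatchers: int) -> Optional[str]:
    """Return the config for num_dispatchers, or None if it is unsupported."""
    # The support check and the lookup read the same variable, so a change
    # from DP size to dispatcher count cannot update one and miss the other.
    if num_dispatchers not in COMBINE_CONFIGS:
        return None
    return COMBINE_CONFIGS[num_dispatchers]
```

Keeping the check and the lookup in one function means a caller that switches to the dispatcher count gets both behaviours updated together.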
        
          
vllm/model_executor/layers/fused_moe/deepep_ht_prepare_finalize.py
cc @tlrmchlsmth
Do we need to change _get_dispatch_config too?
It looks like both of these should be passing in the ep_size. Could you update the PR to pass that in, @minosfuture?
@tlrmchlsmth I didn't hit the assertion failure for dispatch. But yeah, let me update that and test.
b59b7e0 to f91bc9c
f91bc9c to 805017e
Signed-off-by: Ming Yang <minos.future@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Purpose
To fix the following assertion error, the rank count should be the dispatcher count (EP count) instead of the DP count.
Test Plan
Test DP4TP4EP16
Test Result
The DP4TP4EP16 configuration runs.
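For the DP4TP4EP16 layout in the test plan, the difference between the two rank counts can be worked out directly. This is an illustrative sketch only: `ParallelLayout` is a hypothetical helper, and it assumes the EP group spans every DP x TP rank, which matches the EP16 figure above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical description of a data-parallel / tensor-parallel layout."""
    dp_size: int
    tp_size: int

    @property
    def ep_size(self) -> int:
        # Assumption: with expert parallelism enabled, each DP x TP rank
        # is one dispatcher, so the EP size is the product of the two.
        return self.dp_size * self.tp_size


layout = ParallelLayout(dp_size=4, tp_size=4)
print(f"DP size: {layout.dp_size}, EP size: {layout.ep_size}")
```

Here the DP size is 4 while the dispatcher count is 16, so a config selected by DP size would describe a quarter of the ranks that actually take part in dispatch and combine.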