enable autoTP for MPT #3861
Conversation
@yao-matrix @delock please help review
Hi @tjruwase, is @molly-smith the owner of AutoTP-related changes? There are contributions enabling AutoTP for different LLM models; should we invite Molly to review?
@sywangyi can you also add MPT to the supported models in the docs?
done
@delock, sure, I will ask Molly for help. Thanks!
@delock, July 4th is this week, so there might be some delay with reviews as team members take some time off. Apologies for the inconvenience.
Got it, thanks for reminding me!
@tjruwase @sywangyi @delock AutoTP creates a dictionary of which model layers require all-reduce and is implemented in deepspeed/module_inject/auto_tp.py. That part is actually already functional for MPT. However, I assume the lower-level implementation of tensor parallelism is not functional for MPT. I am not the code owner for that part of TP; I think @RezaYazdaniAminabadi is the right person to review these changes.
@molly-smith, thanks for the clarification. @RezaYazdaniAminabadi, can you please help review this?
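The dictionary-building step described above can be sketched roughly as follows. This is an illustrative toy, not DeepSpeed's actual implementation (the real logic lives in deepspeed/module_inject/auto_tp.py): under tensor parallelism, row-parallel layers such as attention output and MLP down projections produce partial sums that must be combined with an all-reduce, while column-parallel layers (e.g. QKV and up projections) need none. The layer-name suffixes below are assumptions based on typical MPT naming.

```python
# Illustrative sketch only; not DeepSpeed's real AutoTP code.
# Row-parallel output projections need an all-reduce after the matmul,
# because each rank holds only a slice of the input dimension.

# Assumed MPT-style layer suffixes (model specific in practice).
ALL_REDUCE_SUFFIXES = ("attn.out_proj", "ffn.down_proj")

def classify_layers(layer_names):
    """Return {layer_name: needs_all_reduce} for a flat list of names."""
    return {
        name: name.endswith(ALL_REDUCE_SUFFIXES)
        for name in layer_names
    }

layers = [
    "transformer.blocks.0.attn.Wqkv",       # column-parallel: no reduce
    "transformer.blocks.0.attn.out_proj",   # row-parallel: all-reduce
    "transformer.blocks.0.ffn.up_proj",     # column-parallel: no reduce
    "transformer.blocks.0.ffn.down_proj",   # row-parallel: all-reduce
]
plan = classify_layers(layers)
```

A mapping like `plan` is the kind of artifact the comment refers to: it tells the injection machinery where to place communication collectives without any kernel replacement.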
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Hi @sywangyi, I was trying out DeepSpeed for MPT-30B as well and saw you had added these changes. I tried them by building DeepSpeed from source, cloning the repository with the changes and doing
I am getting the following error when replace_with_kernel_inject is False,
and without it I am getting OOM on a 4xA100 40GB machine. Is there anything I should be doing differently?
I could run MPT-30B with 4-way TP successfully on 4x RTX 8000 using AutoTP. It seems the attn_bias is incorrect on your side; could you check whether https://github.com/microsoft/DeepSpeed/pull/3861/files#diff-27be33d45da8a29a59628046c212ecbdb630e85a9bd987a98431272f1472a3fbR63 is being called correctly?
I made a mistake in my setup with your changes. I can confirm it works now. Thanks.
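The attn_bias issue discussed above comes down to head partitioning: MPT's per-head attention bias must be sliced to the heads owned by the local tensor-parallel rank. The sketch below shows the partitioning arithmetic with hypothetical helper names; the PR's actual helper in auto_tp_model_utils.py operates on real tensors and the distributed runtime.

```python
# Illustrative sketch of slicing a per-head attention bias for one
# tensor-parallel rank. Helper names are hypothetical; the PR's real
# code lives in deepspeed/module_inject/auto_tp_model_utils.py.

def head_range(num_heads, world_size, rank):
    """Half-open range [start, end) of heads owned by `rank`."""
    assert num_heads % world_size == 0, "heads must divide evenly across ranks"
    per_rank = num_heads // world_size
    start = rank * per_rank
    return start, start + per_rank

def slice_bias(attn_bias, num_heads, world_size, rank):
    """attn_bias: per-head bias rows; return only this rank's rows."""
    lo, hi = head_range(num_heads, world_size, rank)
    return attn_bias[lo:hi]

# Toy example: 8 heads split across 4 ranks; rank 1 owns heads 2 and 3.
bias = [[h] for h in range(8)]
local = slice_bias(bias, num_heads=8, world_size=4, rank=1)
```

If every rank instead received the full, unsliced bias, the per-rank attention heads would pick up the wrong bias rows, which is consistent with the "attn_bias is incorrect" symptom mentioned above.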
@mrwyattii I have moved the model-specific ops to a dedicated file. Could you review the change?
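One common way to structure the refactor described above is a registry that maps a model type to its model-specific helper, so auto_tp.py stays free of per-model branches. This is a hedged sketch with placeholder functions and names; the actual contents of auto_tp_model_utils.py in the PR may be organized differently.

```python
# Illustrative sketch of a model-specific op registry (hypothetical
# names; not the PR's exact structure in auto_tp_model_utils.py).

def build_mpt_bias(*args):
    """Placeholder for the MPT-specific attention-bias helper."""
    return "mpt-bias"

def build_bloom_alibi(*args):
    """Placeholder for another model's position-bias helper."""
    return "bloom-alibi"

# One lookup table instead of scattered if/elif model checks.
MODEL_SPECIFIC_OPS = {
    "mpt": build_mpt_bias,
    "bloom": build_bloom_alibi,
}

def get_op(model_type):
    """Return the model's helper, or None if it needs no special op."""
    return MODEL_SPECIFIC_OPS.get(model_type)
```

Keeping these helpers in one module makes adding AutoTP support for a new model a matter of registering one function rather than editing the core injection logic.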
* enable autoTP for MPT
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add model specific func to auto_tp_model_utils.py
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>