support tensor parallel #1599

Merged
merged 4 commits into from
Mar 5, 2024

minhthuc2502
Collaborator

@minhthuc2502 minhthuc2502 commented Jan 11, 2024

WIP for the tensor parallel feature. Some points still need investigation:

  • Make a new version of the converter that writes the number of heads ahead of the self-attention weights and biases, so that grouped-query attention can be handled.
  • Packaging the Python wrapper: how to handle MPI and NCCL when packaging.

Update:

  • Tensor parallel mode is now fully supported. The model can be split across GPUs on the same machine, or across different machines with additional network and NFS configuration.
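
For readers new to the idea, here is a minimal sketch of what splitting a model across devices means for a single linear layer. It is an illustration only and not code from this pull request: the helper names (`split_rows`, `parallel_matvec`) are hypothetical, the "ranks" are simulated in one process, and the final concatenation stands in for the all-gather step that a real setup would perform with a communication library such as NCCL.

```python
# Illustrative sketch only: NOT code from this pull request.
# Each "rank" owns a slice of the weight rows, computes its part of the
# output locally, and the parts are concatenated (a stand-in for all-gather).

def matvec(weight, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def split_rows(weight, world_size):
    """Split the output rows evenly across ranks (hypothetical helper)."""
    if len(weight) % world_size != 0:
        raise ValueError("number of rows must be divisible by world_size")
    shard = len(weight) // world_size
    return [weight[r * shard:(r + 1) * shard] for r in range(world_size)]

def parallel_matvec(weight, x, world_size):
    """Simulate tensor parallel execution of one linear layer."""
    shards = split_rows(weight, world_size)
    partial = [matvec(shard, x) for shard in shards]  # one result per rank
    return [y for part in partial for y in part]      # simulated all-gather

weight = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
full = matvec(weight, x)
sharded = parallel_matvec(weight, x, world_size=2)
print(full == sharded)
```

The sharded result matches the unsharded one because each output row depends only on its own weight row. Attention layers are typically split along the head dimension in the same spirit, which is presumably why the converter needs to know the number of heads before it reads the weights.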

@duydq12

duydq12 commented Jan 12, 2024

LGTM. It helps me a lot. I'm looking forward to seeing the full release version.

@minhthuc2502 minhthuc2502 marked this pull request as draft February 15, 2024 13:39
@minhthuc2502 minhthuc2502 force-pushed the dev/tensor_parallel_nvcc branch 2 times, most recently from 71b440c to 5e9dc28 February 28, 2024 15:21
@minhthuc2502 minhthuc2502 marked this pull request as ready for review March 1, 2024 15:32
@minhthuc2502 minhthuc2502 changed the title from "tensor parallel by nccl + mpi" to "support tensor parallel" Mar 2, 2024
@minhthuc2502 minhthuc2502 merged commit cec65f1 into OpenNMT:master Mar 5, 2024
17 checks passed
3 participants