Support MUSA (Moore Threads GPU) backend in accelerate #2917
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@muellerzr @SunMarc Hi, buddies! Can you take a look at this PR, please?
Thanks for the PR @fmo-mt! The integration looks very clean! Nice to see a new backend 🔥 Can you have a second look @muellerzr? Also, I'm not sure if you are on the team working on torch_musa, but if that's the case, it would be great to spin up some runners on your side to make sure that we don't have failing accelerate tests on MUSA hardware.
Same comment as Marc, very nice PR @fmo-mt !
For the quality check to pass, please do
Yes, I'm working on torch_musa currently, and we have trained/fine-tuned some models such as BERT and Mistral.
@muellerzr @SunMarc Oh, I fixed a typo and force-pushed with a rebase, which cleaned up the change history, but it seems that the CI workflow needs to be approved by you guys 🥲
No issues! I'm merging!
What does this PR do?
To train 🤗 Transformers models using MUSA (Moore Threads GPU), support must be added to Accelerate first; it will then come to the Trainer for free.
This PR adds MUSA support to accelerate, following the same approach used when MLU support (#2552) was merged.
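As a rough illustration of the kind of change such a backend integration involves, here is a minimal sketch of a device-availability check modeled on the pattern accelerate uses for other accelerators. The function name and the `torch.musa` attribute are assumptions based on how `torch_musa` registers its device with PyTorch; they are not taken verbatim from this PR's diff.

```python
# Hedged sketch of a MUSA availability check, modeled on the pattern
# used for other accelerate backends. Names are illustrative, not the
# exact implementation in this PR.
import importlib.util


def is_musa_available() -> bool:
    # torch_musa is the out-of-tree package that registers the "musa"
    # device with PyTorch when imported; without it, MUSA cannot be used.
    if importlib.util.find_spec("torch_musa") is None:
        return False
    try:
        import torch
        import torch_musa  # noqa: F401  (import registers the backend)

        # Assumed to mirror torch.cuda.is_available() for MUSA devices.
        return torch.musa.is_available()
    except (ImportError, AttributeError):
        return False
```

Downstream code can then branch on this check (e.g. to pick `"musa"` as the device string), just as accelerate does for CUDA, MLU, and other backends.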
Below are the output logs: