Support Mistral Model Inference with transformers-neuronx #3153
Conversation
@DAIZHENWEI Thank you for your contribution. I think the overall changes look good. I left some comments on undoing unnecessary changes.
Thanks for addressing the comments. The format.sh script here would help fix the format issue.
@liangfu The format issue has been fixed. Ready to merge.
Thanks for addressing the comments. The changes look good to me.
@liangfu @WoosukKwon ready to merge
LGTM! Thanks for submitting the PR!
This PR enables Mistral model inference on Inferentia with the transformers-neuronx backend.
To demonstrate offline inference with transformers-neuronx, run
python3 examples/offline_inference_neuron.py
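For context, an offline-inference script of this shape typically looks like the following minimal sketch using vLLM's `LLM` entry point with `device="neuron"`. This is illustrative, not the exact contents of `examples/offline_inference_neuron.py`; the model name, sequence limits, and `tensor_parallel_size` are assumptions that would need to match your compiled model and NeuronCore setup, and running it requires an Inferentia instance with transformers-neuronx installed.

```python
from vllm import LLM, SamplingParams

# A few sample prompts to batch together.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Illustrative configuration (assumption): model id, max_model_len,
# block_size, and tensor_parallel_size must fit your Neuron environment.
llm = LLM(
    model="mistralai/Mistral-7B-v0.1",
    max_num_seqs=8,
    max_model_len=128,
    block_size=128,
    device="neuron",
    tensor_parallel_size=2,
)

# Generate completions for all prompts in one batch.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```

On first run the Neuron compiler traces and compiles the model, so expect a significant one-time startup cost before generation begins.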