Support Mistral Model Inference with transformers-neuronx #3153
Conversation
@DAIZHENWEI Thank you for your contribution. I think the overall changes look good. I left some comments on undoing unnecessary changes.
Thanks for addressing the comments. The format.sh script here would help fix the format issue.
@liangfu The format issue has been fixed. Ready to merge.
Thanks for addressing the comments. The changes look good to me.
@liangfu @WoosukKwon ready to merge
LGTM! Thanks for submitting the PR!
This PR enables Mistral model inference on Inferentia with the transformers-neuronx backend.
To demonstrate offline inference with transformers-neuronx, run
python3 examples/offline_inference_neuron.py
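For context, an offline-inference script of this shape typically looks like the following minimal sketch using vLLM's `LLM` entry point with `device="neuron"`. This is illustrative, not the exact contents of `examples/offline_inference_neuron.py`; the model name, sequence limits, and `tensor_parallel_size` are assumptions that would need to match your compiled model and NeuronCore setup, and running it requires an Inferentia instance with transformers-neuronx installed.

```python
from vllm import LLM, SamplingParams

# A few sample prompts to batch together.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Illustrative configuration (assumption): model id, max_model_len,
# block_size, and tensor_parallel_size must fit your Neuron environment.
llm = LLM(
    model="mistralai/Mistral-7B-v0.1",
    max_num_seqs=8,
    max_model_len=128,
    block_size=128,
    device="neuron",
    tensor_parallel_size=2,
)

# Generate completions for all prompts in one batch.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```

On first run the Neuron compiler traces and compiles the model, so expect a significant one-time startup cost before generation begins.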