[Neuron] Add an option to build with neuron #2065
Conversation
Hi @liangfu, apologies for the late review and thanks for the PR! I like this PR in that you didn't submit a big PR at once but instead split it into small parts. :)
Overall, I think moving the import statements is not a good idea. Considering the architecture you showed last time, I think we can just skip loading the modules that try to import custom ops. WDYT?
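The suggestion above can be sketched as a guarded-import helper: rather than relocating import statements, modules backed by custom ops are simply skipped when the compiled extension is unavailable on the current platform (e.g. Neuron). This is an illustrative sketch, not vLLM's actual code; the module name `vllm._C` below is an assumption.

```python
import importlib
import importlib.util


def try_import(module_name):
    """Return the imported module, or None if it cannot be loaded
    (e.g. the compiled custom-op extension was not built here)."""
    try:
        if importlib.util.find_spec(module_name) is None:
            return None
        return importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return None


# On platforms without the CUDA custom ops, this is None and callers
# can fall back to (or register) a platform-specific implementation.
custom_ops = try_import("vllm._C")  # hypothetical extension module name
```

Callers then branch on `None` instead of letting an `ImportError` propagate at import time.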
@liangfu Apologies for the late review and thanks for addressing my comments! Left some very minor comments on styles. Looking forward to the next PRs!
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
This PR adds an option to set up vLLM to build with the Neuron toolchain (including neuronx-cc and transformers-neuronx).
This would help us build a Neuron-tagged package, where the neuron version is derived from the compiler version (neuronx-cc 2.12).
This is part of the effort to add support for accelerating LLM inference with Trainium/Inferentia (see #1866).
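As a rough illustration of the build option described above, a setup script could probe for the Neuron compiler and derive a version suffix from its reported version. This is a hedged sketch under assumptions: the function name, the exact `neuronx-cc --version` output format, and the resulting tag are illustrative, not vLLM's actual setup.py.

```python
import re
import subprocess


def get_neuron_version_tag(output=None):
    """Derive a short tag like 'neuron212' from the neuronx-cc
    compiler version (assumed parseable as 'major.minor...')."""
    if output is None:
        # Assumes neuronx-cc is on PATH and prints its version.
        output = subprocess.check_output(
            ["neuronx-cc", "--version"],
            text=True,
            stderr=subprocess.STDOUT,
        )
    match = re.search(r"(\d+)\.(\d+)", output)
    if match is None:
        raise RuntimeError("could not parse neuronx-cc version")
    major, minor = match.groups()
    return f"neuron{major}{minor}"
```

Such a tag could then be appended to the package version as a local version identifier so the wheel records which Neuron toolchain it was built against.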