[RFC]: Add Ascend NPU as a new backend #7692

Open
wangshuai09 opened this issue Aug 20, 2024 · 0 comments

wangshuai09 commented Aug 20, 2024

Motivation.

vLLM provides an easy-to-use mechanism for integrating backends, and many backends have already been integrated.
As shown in #6368, #6728, and #6066, many users want to use vLLM on Ascend NPU.
The main purpose of this RFC is to follow the existing backend integration mechanism and make Ascend NPU available in vLLM.

Proposed Change.

[Image 1]

We introduce Ascend Executor/Worker(s), based on the GPU Executor/Worker(s), to handle Ascend runtime management and execution on the NPU. We also add an Ascend Backend that replaces the attention layer; the PagedAttention/FlashAttention ops are implemented there.
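A rough sketch of the "based on GPU Executor/Worker(s)" idea follows. The classes here are hypothetical stand-ins written for illustration, not vLLM's real executor or worker classes, and the method names are assumptions. The point is only that the Ascend classes inherit the GPU logic and override the device-specific parts.

```python
from typing import List


class GPUWorker:
    """Hypothetical stand-in for an existing GPU worker (not vLLM's real class)."""

    device_type = "cuda"

    def __init__(self, rank: int) -> None:
        self.rank = rank

    def device_name(self) -> str:
        return f"{self.device_type}:{self.rank}"

    def execute_model(self, token_ids: List[int]) -> str:
        # A real worker would run the model; this sketch only reports where.
        return f"ran {len(token_ids)} tokens on {self.device_name()}"


class AscendWorker(GPUWorker):
    """Reuses the GPU worker logic and overrides only the device type."""

    device_type = "npu"


class GPUExecutor:
    """Hypothetical stand-in for an existing GPU executor."""

    worker_cls = GPUWorker

    def __init__(self, num_workers: int) -> None:
        self.workers = [self.worker_cls(rank) for rank in range(num_workers)]

    def execute_model(self, token_ids: List[int]) -> List[str]:
        return [worker.execute_model(token_ids) for worker in self.workers]


class AscendExecutor(GPUExecutor):
    """Same orchestration as the GPU executor, but creates Ascend workers."""

    worker_cls = AscendWorker
```

With this shape, `AscendExecutor(num_workers=2).execute_model([1, 2, 3])` goes through the same code path as the GPU executor and only the device strings differ.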

[Image 2]

Because torch_npu has natively supported torch since 2.1.0, we should keep the Ascend code consistent with the GPU code and make as few code changes as possible in our implementation.
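One way to keep the code paths shared is to parameterize on the device type instead of hard-coding it. The helper names below (`pick_device_type`, `device_string`) are hypothetical and not part of vLLM or torch_npu; the sketch only checks whether the `torch_npu` package is installed and assumes the device is addressed as `npu:<rank>`.

```python
import importlib.util
from typing import Callable


def _module_installed(name: str) -> bool:
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(name) is not None


def pick_device_type(
    fallback: str = "cuda",
    is_installed: Callable[[str], bool] = _module_installed,
) -> str:
    """Return "npu" when torch_npu is installed, otherwise the fallback."""
    return "npu" if is_installed("torch_npu") else fallback


def device_string(rank: int, device_type: str) -> str:
    """Build a device string such as "npu:0" or "cuda:0"."""
    return f"{device_type}:{rank}"
```

Code that builds its device from `device_string(rank, pick_device_type())` does not need a separate Ascend branch, which is the kind of minimal change this RFC aims for.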

Feedback Period.

A month

CC List.

@mgoin
@WoosukKwon

Any Other Things.

Background

Ascend NPU is a family of AI processors built around a Neural Processing Unit. It efficiently handles matrix-matrix multiplication, dot products, and scalar operations. Many projects already support Ascend NPU, such as onnxruntime, deepspeed, and llama.cpp.

MindIE is the Ascend inference engine, a high-performance deep learning inference framework designed for Ascend hardware.

Roadmap

The initial version will include the following:

  • Ascend Executor
  • Ascend Worker
  • Ascend Model Runner
  • Ascend MindIE Backend
  • Ascend SingleOps Backend
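
Since the list above names two backends (MindIE and SingleOps), something has to choose between them at startup. A minimal sketch of such a selection step is shown here; the enum, the string values, and `select_backend` are all hypothetical names for illustration and are not an existing vLLM API.

```python
from enum import Enum


class AscendAttentionBackend(Enum):
    """Hypothetical identifiers for the two backends listed in the roadmap."""

    MINDIE = "mindie"
    SINGLE_OPS = "singleops"


def select_backend(name: str) -> AscendAttentionBackend:
    """Map a user-supplied name to a backend, ignoring case and outer spaces."""
    normalized = name.strip().lower()
    for backend in AscendAttentionBackend:
        if backend.value == normalized:
            return backend
    valid = ", ".join(b.value for b in AscendAttentionBackend)
    raise ValueError(f"unknown Ascend backend {name!r}; expected one of: {valid}")
```

Failing loudly on an unknown name, rather than silently falling back, makes a misconfigured deployment easier to spot.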