Feature Request: [CANN] backend supports Ascend 310P #10160

Closed
@leo-pony

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

The CANN backend should support the Ascend 310P inference accelerator card. llama.cpp already supports the Ascend 910B. However, some APIs differ between the 910B and the 310P, so the CANN backend implementation needs to be adapted.

Motivation

Compared to the Ascend 910, the Ascend 310 focuses on power-efficient inference on edge devices. The basic information is as follows:
- Inference-oriented: The 310P is optimized for inference, prioritizing efficient, low-power operation over the computational intensity required for training.
- Lower throughput: Although it supports similar operators for inference (e.g., convolution, activation functions, pooling), it is less heavily optimized for large-scale parallelism and training. The 310P focuses on executing pre-trained models with lower computational demands.

Possible Implementation

Adapt the CANN backend to the 310P while maintaining compatibility with the 910B.
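
One way to keep both targets working is to classify the SoC once and branch only where the APIs differ. The following is a minimal sketch under stated assumptions, not the backend's actual code: `SocFamily`, `classify_soc`, and `select_kernel_path` are hypothetical names, and the SoC name strings (such as "Ascend310P3" or "Ascend910B1") are assumed to come from the runtime or from a build-time setting; the issue does not say which.

```cpp
#include <string>
#include <string_view>

// Hypothetical enum for illustration; not part of llama.cpp or CANN.
enum class SocFamily { Ascend310P, Ascend910B, Unknown };

// Classify a SoC name by prefix.
// Assumption: names look like "Ascend310P3" or "Ascend910B1".
SocFamily classify_soc(std::string_view soc_name) {
    if (soc_name.rfind("Ascend310P", 0) == 0) return SocFamily::Ascend310P;
    if (soc_name.rfind("Ascend910B", 0) == 0) return SocFamily::Ascend910B;
    return SocFamily::Unknown;
}

// Hypothetical dispatcher: returns a label for the code path to use.
// A real backend would call the family-specific operator implementation here.
const char * select_kernel_path(SocFamily family) {
    switch (family) {
        case SocFamily::Ascend310P: return "310p-path";
        case SocFamily::Ascend910B: return "910b-path";
        default:                    return "unsupported";
    }
}
```

Keeping the family check in one place means the existing 910B path stays untouched, and 310P-specific handling is added only for the operators whose APIs differ.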

Labels: Ascend NPU, enhancement
