Feature Request: [CANN] backend supports Ascend 310P #10160

### Prerequisites

- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md).
- [X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [X] I reviewed the [Discussions](https://github.com/ggerganov/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

Add support for the Ascend 310P inference accelerator card to the CANN backend. llama.cpp already supports the Ascend 910B, but some Ascend 910B APIs differ from those of the 310P, so the CANN backend implementation needs to be adapted.

### Motivation

Compared to the Ascend 910, the Ascend 310 focuses on power-efficient inference on edge devices. The basic information is as follows:

- **Inference-Oriented**: The 310P is optimized for **inference tasks**, prioritizing efficient, low-power operation over the computational intensity required for training.
    - **Lower Throughput**: While it supports similar operators for inference tasks (e.g., convolution, activation functions, pooling), it is not as heavily optimized for **large-scale parallelism** and training tasks. The 310P focuses on executing pre-trained models with lower computational demands.

### Possible Implementation

Adapt the CANN backend to the 310P while maintaining compatibility with the 910B.
