Let's play LLM on LoongArch!
The project aims to port and optimize llama.cpp, a C++ LLM inference framework, on LoongArch. In particular, we want to tackle the following challenges:
- Potential problems when porting the code to the LoongArch platform.
- Inference performance optimization via SIMD, initially targeting the 3A6000 platform.
- LLM evaluation on LoongArch platform.
- Interesting applications with accompanying presentations.
Based on the above challenges, the project is divided into the following four stages:
1. Task: Port llama.cpp to the LoongArch platform.
   - Objective: Compile and run llama.cpp on the 3A6000.
2. Task: Optimize the efficiency of llama.cpp on LoongArch, focusing on the CPU (see the LSX sketch after this list).
   - Objective: Apply programming optimization techniques and document the improvements.
3. Task: Benchmark LLMs of various sizes.
   - Objective: Deliver a technical report.
4. Task: Deploy usable LLM applications on LoongArch platforms.
   - Objective: Deliver well-written deployment documents and visual demos.
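To give a flavor of the stage-2 work, below is a minimal sketch of a vectorized f32 dot product, the kind of hot loop that dominates CPU inference, written with LoongArch's 128-bit LSX intrinsics. The function name `dot_f32_lsx` is ours for illustration; the intrinsics are those declared in GCC's `lsxintrin.h` (GCC 13+ with `-mlsx`), and the kernels we eventually land in llama.cpp may differ.

```c
/* A minimal sketch, not a final kernel: 4-wide f32 dot product using
 * LoongArch LSX (128-bit SIMD). Build with: gcc -O2 -mlsx dot.c */
#include <lsxintrin.h>

float dot_f32_lsx(float *a, float *b, int n) {
    __m128 acc = (__m128)__lsx_vldi(0);           /* accumulator = {0,0,0,0} */
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = (__m128)__lsx_vld(a + i, 0);  /* load 4 floats from a */
        __m128 vb = (__m128)__lsx_vld(b + i, 0);  /* load 4 floats from b */
        acc = __lsx_vfmadd_s(va, vb, acc);        /* acc += va * vb (fused) */
    }
    float lanes[4];
    __lsx_vst((__m128i)acc, lanes, 0);            /* spill lanes, reduce on scalar side */
    float sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) sum += a[i] * b[i];        /* scalar tail for n % 4 */
    return sum;
}
```

The same pattern should extend to the 3A6000's 256-bit LASX extension (`lasxintrin.h`, `-mlasx`), doubling the lane count; cross-compiled binaries can be smoke-tested under the QEMU environment from the checklist below before running on hardware.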
- We develop based on release `b2430` of the original repo.
- [x] Compile original llama.cpp on x86 CPU.
- [ ] Run LLM on x86 CPU.
- [x] Set up QEMU environment for LoongArch.
- [x] Set up cross-compilation tools for LoongArch on x86.