Let's play LLM on LoongArch!
The project aims to port and optimize llama.cpp, a C++ LLM inference framework, on LoongArch. In particular, we want to tackle the following challenges:
- Potential problems when porting the code to the LoongArch platform.
- Inference performance optimization via SIMD, initially targeting the 3A6000 platform.
- LLM evaluation on LoongArch platform.
- Interesting applications with accompanying presentations.
Based on the above challenges, the project is divided into the following four stages:
1. Task: Port llama.cpp to the LoongArch platform.
   - Objective: Compile and run llama.cpp on the 3A6000.
2. Task: Optimize the efficiency of llama.cpp on LoongArch, focusing on the CPU (see the LSX sketch after this list).
   - Objective: Apply programming optimization techniques and document the improvements.
3. Task: Benchmark LLMs of various sizes.
   - Objective: Deliver a technical report.
4. Task: Deploy usable LLM applications on LoongArch platforms.
   - Objective: Deliver well-written deployment documents and visual demos.
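To give a flavor of the stage-2 work, below is a minimal sketch of a vectorized f32 dot product, the kind of hot loop that dominates CPU inference, written with LoongArch's 128-bit LSX intrinsics. The function name `dot_f32_lsx` is ours for illustration; the intrinsics are those declared in GCC's `lsxintrin.h` (GCC 13+ with `-mlsx`), and the kernels we eventually land in llama.cpp may differ.

```c
/* A minimal sketch, not a final kernel: 4-wide f32 dot product using
 * LoongArch LSX (128-bit SIMD). Build with: gcc -O2 -mlsx dot.c */
#include <lsxintrin.h>

float dot_f32_lsx(float *a, float *b, int n) {
    __m128 acc = (__m128)__lsx_vldi(0);           /* accumulator = {0,0,0,0} */
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = (__m128)__lsx_vld(a + i, 0);  /* load 4 floats from a */
        __m128 vb = (__m128)__lsx_vld(b + i, 0);  /* load 4 floats from b */
        acc = __lsx_vfmadd_s(va, vb, acc);        /* acc += va * vb (fused) */
    }
    float lanes[4];
    __lsx_vst((__m128i)acc, lanes, 0);            /* spill lanes, reduce on scalar side */
    float sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) sum += a[i] * b[i];        /* scalar tail for n % 4 */
    return sum;
}
```

The same pattern should extend to the 3A6000's 256-bit LASX extension (`lasxintrin.h`, `-mlasx`), doubling the lane count; cross-compiled binaries can be smoke-tested under the QEMU environment from the checklist below before running on hardware.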
- We develop based on release `b2430` of the original repo.
- [x] Compile original llama.cpp on x86 CPU.
- [ ] Run LLM on x86 CPU.
- [x] Set up QEMU environment for LoongArch.
- [x] Set up cross-compilation tools for LoongArch on x86.