
# LA-llama.cpp

Let's play with LLMs on LoongArch!

## Overview

This project aims to port and optimize llama.cpp, a C++ LLM inference framework, on LoongArch. In particular, we want to tackle the following challenges:

- Potential problems when porting the code to the LoongArch platform.
- Inference performance optimization via SIMD, initially targeting the 3A6000 platform.
- LLM evaluation on the LoongArch platform.
- Interesting applications with presentations.

## Plan

Based on the above challenges, the project is divided into the following four stages:

### Porting

- Task: Port llama.cpp to the LoongArch platform.
- Objective: Compile and run llama.cpp on the 3A6000.

### Optimization

- Task: Optimize the efficiency of llama.cpp on LoongArch, focusing on the CPU.
- Objective: Apply optimization techniques such as SIMD vectorization (see the sketch below) and document the improvements.
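
To give a flavor of what this stage involves, here is a minimal, hypothetical sketch of a vectorized dot product using GCC's LoongArch LSX (128-bit SIMD) intrinsics from `<lsxintrin.h>`. The function name and loop structure are illustrative assumptions, not code taken from llama.cpp:

```c
// Hypothetical sketch: dot product with LoongArch LSX (128-bit SIMD).
// Assumes GCC with LSX support, compiled with -mlsx.
#include <lsxintrin.h>

float dot_lsx(const float *a, const float *b, int n) {
    __m128 acc = (__m128)__lsx_vldi(0);           // zeroed accumulator
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = (__m128)__lsx_vld(a + i, 0);  // load 4 floats from a
        __m128 vb = (__m128)__lsx_vld(b + i, 0);  // load 4 floats from b
        acc = __lsx_vfmadd_s(va, vb, acc);        // acc += va * vb (fused multiply-add)
    }
    // Horizontal sum of the 4 lanes (GCC vector subscripting).
    float sum = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < n; ++i) {
        sum += a[i] * b[i];                       // scalar tail
    }
    return sum;
}
```

In llama.cpp, loops of this kind live in the ggml matrix-multiplication kernels; the 3A6000 also implements the 256-bit LASX extension (`<lasxintrin.h>`), which doubles the vector width.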

### Evaluation

- Task: Benchmark various LLMs of different sizes.
- Objective: Produce a technical report.

### Application

- Task: Deploy usable LLM applications on LoongArch platforms.
- Objective: Produce well-written deployment documents and visual demos.

## Miscellaneous

### Progress and TODOs

- [x] Compile original llama.cpp on x86 CPU.
- [ ] Run LLM on x86 CPU.
- [x] Set up QEMU environment for LoongArch.
- [x] Set up cross-compilation tools for LoongArch on x86.