yllama.oc

An on-chain implementation of inference for Llama 3 8b.

Overview

The project aims to create a generic yblock canister that is easily deployable and configurable on the Internet Computer Protocol (ICP). These yblock units serve as foundational components for uploading and executing AI algorithms, with the goal of distributing computation across a network of independent nodes. Consensus mechanisms ensure result accuracy without requiring trust in any individual node.

In this implementation, the workload is distributed across 34 canisters. The code is currently unoptimized; future steps include reducing overhead and leveraging SIMD in WebAssembly.
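The README does not spell out how the 34 canisters divide the work, so the following is only an illustrative sketch: one plausible split is one worker canister per transformer block (Llama 3 8B has 32) plus a couple of canisters for embeddings and orchestration. The `partition_layers` helper below is a hypothetical name, not part of yllama.oc.

```rust
// Hypothetical sketch of partitioning a model's transformer blocks
// across worker canisters. The exact split used by yllama.oc may differ.

/// Assign each of `n_layers` transformer blocks to one of `n_workers`
/// worker canisters, round-robin, as evenly as possible.
fn partition_layers(n_layers: usize, n_workers: usize) -> Vec<Vec<usize>> {
    let mut shards = vec![Vec::new(); n_workers];
    for layer in 0..n_layers {
        shards[layer % n_workers].push(layer);
    }
    shards
}

fn main() {
    // Llama 3 8B: 32 transformer blocks, one worker canister each;
    // the remaining canisters (embeddings, orchestration) would bring
    // the total to 34 under this assumed layout.
    let shards = partition_layers(32, 32);
    assert!(shards.iter().all(|s| s.len() == 1));
    println!("{} worker shards", shards.len());
}
```

During inference, each shard would receive the previous shard's activations, apply its blocks, and forward the result, so per-canister memory stays within ICP limits.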

The core algorithm is implemented here.

Building

To build you will need:

  • This repository, yllama.oc, which depends on yllama.rs.
  • The yllama.rs repository, which depends on tokenizers.
  • A patched version of the Hugging Face tokenizers repository. Crates built on getrandom need modification because ICP executes code deterministically and handles randomness differently.
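To illustrate the determinism constraint mentioned above: on ICP, every replica must produce identical results, so an OS entropy source like the one getrandom normally wraps cannot be used. A seeded PRNG is one way to stand in for it (getrandom 0.2 also offers a `register_custom_getrandom!` hook for plugging in such a backend). The sketch below is purely illustrative and not the actual patch used by this project.

```rust
// Illustrative only: a deterministic PRNG of the kind one might
// substitute for OS randomness on ICP, where canister execution
// must be reproducible across replicas.

/// Tiny xorshift64 generator seeded from a fixed value so that every
/// replica produces the same byte sequence.
struct DeterministicRng {
    state: u64,
}

impl DeterministicRng {
    fn new(seed: u64) -> Self {
        // xorshift state must be non-zero.
        Self { state: seed.max(1) }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Fill a byte buffer, mirroring the shape of getrandom's API.
    fn fill_bytes(&mut self, buf: &mut [u8]) {
        for chunk in buf.chunks_mut(8) {
            let bytes = self.next_u64().to_le_bytes();
            chunk.copy_from_slice(&bytes[..chunk.len()]);
        }
    }
}

fn main() {
    // Two replicas with the same seed see identical "randomness".
    let (mut a, mut b) = (DeterministicRng::new(42), DeterministicRng::new(42));
    let (mut ba, mut bb) = ([0u8; 16], [0u8; 16]);
    a.fill_bytes(&mut ba);
    b.fill_bytes(&mut bb);
    assert_eq!(ba, bb);
    println!("deterministic: {}", ba == bb);
}
```

In a real canister the seed would come from consensus-visible state (for example, subnet-provided randomness fixed at message time) rather than a hard-coded constant.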

Deploying and Running

[To be updated]

Contact

gip.github@gmail.com

About

Fully On-chain Inference for LLMs
