Skip to content

A self-improving agentic loop inspired by QT45: a 45-nucleotide RNA ribozyme capable of self-replication.

License

Notifications You must be signed in to change notification settings

rcurrie/qt45wasm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

qt45wasm

A self-improving agentic loop inspired by QT45, a 45-nucleotide RNA ribozyme capable of self-replication. An LLM synthesizes pure functions as WebAssembly modules, stores them in a persistent library, and reuses them to solve future problems. Each interaction makes the system more capable. The compute migrates from the probabilistic, expensive layer (LLM) to the deterministic, cheap layer (WASM) as the library grows. Just as QT45 catalyzes RNA-templated synthesis from a minimal motif — building complexity from simplicity in eutectic ice — this system starts with a minimal kernel and accumulates capability over time.

Compute Tiers

Functions climb a staircase of compute power as the workload demands:

  Scalar          SIMD (v128)         WebGPU
  ───────         ──────────          ──────
  i32/f64 in,     f32 arrays in       f32 arrays in
  scalar out      linear memory,      linear memory,
                  4 floats/op         Metal compute shader
                  (NEON on Apple      (thousands of
                  Silicon)            parallel threads)

  add(10, 20)     vec_add_f32         gpu_double_f32
       │          [1,2,3,4]           [1,2,3,4,5,6,7,8]
       ▼          [5,6,7,8]                  │
      30               │                     ▼
                       ▼            [2,4,6,8,10,12,14,16]
                  [6,8,10,12]

Each tier has its own code path — scalar functions are never touched when running SIMD or GPU workloads. The LLM generates the right kind of module for each tier: plain WAT for scalar, v128-enabled WAT for SIMD, and WAT with embedded WGSL compute shaders for GPU.

Biology → Code

Biology Code Role
RNA Polymerase LlmClient Synthesizes new functions from descriptions
Ribosome WasmRuntime Compiles and executes WASM modules
DNA Helix FunctionStore Persistent memory of all known functions
Immune memory HybridSearch Recognizes previously seen problems (FTS5 + embeddings)
Nucleotide pool Prompt templates Raw material the polymerase works from
Enzyme Compiled WASM binary Fast, deterministic, reusable computation
Eutectic ice SQLite The substrate that concentrates and preserves everything
Mitochondria GpuContext Offloads heavy computation to the GPU

Architecture

┌─────────────────────────────────────────────────┐
│                     REPL                        │
│  User input → Search → Plan → Execute → Output  │
└────────┬──────────┬──────────┬──────────────────┘
         │          │          │
    ┌────▼───┐ ┌────▼───┐ ┌────▼──────────────┐
    │ Search │ │  LLM   │ │     Runtime       │
    │ (FTS5 +│ │ Client │ │ call() scalar     │
    │ vector)│ │        │ │ call_array() simd │
    │        │ │        │ │ call_gpu() gpu    │
    └────┬───┘ └────┬───┘ └────┬──────────┬───┘
         │          │          │          │
         │          │          │     ┌────▼────┐
         │          │          │     │   GPU   │
         │          │          │     │ (wgpu/  │
         │          │          │     │  Metal) │
         │          │          │     └────┬────┘
    ┌────▼──────────▼──────────▼──────────▼───┐
    │          FunctionStore                  │
    │   (SQLite: functions, tests, shaders,   │
    │    embeddings, compute tiers)           │
    └─────────────────────────────────────────┘

Model Evaluation

We benchmarked five local small quantized ollama models on WAT code generation across all three tiers. TL;DR; CPU and SIMD generation - good, GPU: Going to need a bigger model or one tuned with WAT a bit more.

See FINDINGS.md for full results. Run the benchmark yourself with:

cargo run --bin eval_models

Install

Requires Rust and Ollama. GPU tier requires a Metal-capable Mac (Apple Silicon or discrete AMD GPU); SIMD and scalar work on any platform.

# Install a code-capable model
ollama pull qwen2.5-coder:7b

# Clone and build
git clone https://github.com/rcurrie/qt45wasm.git qt45
cd qt45wasm
cargo build

Test

cargo test

Run

Make sure Ollama is running, then try the built-in demo. The first run synthesizes functions via the LLM across all three compute tiers, verifies each with auto-generated tests, and stores everything in qt45.db:

cargo run -- --demo
Request: add (adds two integers) args: [I32(10), I32(20)]
  [cache] miss — generating via LLM...
  [llm] attempt 1/3...
  [llm] compilation successful
  [store] saved function: 'add' (tier: scalar)
  [test] 3/3 passed — verified
  [wasm] 30

Request: sum (adds two numbers together) args: [I32(7), I32(8)]
  [search] matched 'add' via both (score: 0.0328)
  [wasm] 15

=== SIMD Demo ===

Request [simd]: vec_add_f32 (element-wise addition of two f32 arrays) inputs: 2
  [cache] miss — generating SIMD via LLM...
  [llm] simd attempt 1/3...
  [llm] simd compilation successful
  [store] saved function: 'vec_add_f32' (tier: simd)
  [simd] [11, 22, 33, 44]

=== GPU Demo ===
  [gpu] adapter: Apple M1 Pro

Request [gpu]: gpu_double_f32 (double all elements of an f32 array) inputs: 1
  [cache] miss — generating GPU via LLM...
  [llm] gpu attempt 1/3...
  [llm] gpu compilation successful
  [store] saved function: 'gpu_double_f32' (tier: gpu)
  [gpu] [2, 4, 6, 8, 10, 12, 14, 16]

--- Function Library ---
  add            (i32, i32) -> i32    [calls: 3, tier: scalar, verified: true]
  vec_add_f32    simd                 [calls: 2, tier: simd,   verified: true]
  gpu_double_f32 gpu                  [calls: 2, tier: gpu,    verified: true]
  ...

Run it again — every function is now served from cache with zero LLM calls:

cargo run -- --demo
Request: add (adds two integers) args: [I32(10), I32(20)]
  [cache] found 'add' (calls: 3, verified)
  [wasm] 30
...
Request [simd]: vec_add_f32 ...
  [cache] found 'vec_add_f32' simd (calls: 2, verified)
  [simd] [11, 22, 33, 44]
...
Request [gpu]: gpu_double_f32 ...
  [cache] found 'gpu_double_f32' gpu (calls: 2, verified)
  [gpu] [2, 4, 6, 8, 10, 12, 14, 16]

Interactive REPL

For interactive use, run without --demo:

cargo run

Scalar functions

qt45wasm> add "adds two integers" (i32, i32) -> i32 : 10, 20

Request: add (adds two integers) args: [I32(10), I32(20)]
  [cache] found 'add' (calls: 9, verified)
  [wasm] 30

SIMD array functions

Prefix with simd and pass arrays in brackets:

qt45wasm> simd vec_add_f32 "element-wise add" [1,2,3,4] [5,6,7,8]

Request [simd]: vec_add_f32 (element-wise add) inputs: 2
  [simd] [6, 8, 10, 12]

GPU array functions

Prefix with gpu — the GPU context initializes lazily on first use:

qt45wasm> gpu gpu_double_f32 "double all elements" [1,2,3,4]
  [gpu] initializing GPU context...
  [gpu] adapter: Apple M1 Pro

Request [gpu]: gpu_double_f32 (double all elements) inputs: 1
  [gpu] [2, 4, 6, 8]

Commands

qt45wasm> .list
NAME             SIGNATURE                CALLS  TIER     VERIFIED
--------------------------------------------------------------------
add              (i32, i32) -> i32           10  scalar   yes
vec_add_f32      simd                         2  simd     yes
gpu_double_f32   gpu                          2  gpu      yes

qt45wasm> .source add
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))

qt45wasm> .shader gpu_double_f32
@group(0) @binding(0) var<storage, read_write> data: array<f32>;
@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  if id.x < arrayLength(&data) {
    data[id.x] = data[id.x] * 2.0;
  }
}
Command Description
.list List all stored functions with compute tier
.info <name> Show function details including tier
.source <name> Print WAT source code
.shader <name> Print WGSL shader source (GPU functions)
.tests <name> Show test cases
.bench <name> <n> Benchmark a SIMD/GPU function with n elements
.delete <name> Delete a function
.help Show commands and syntax
.quit Exit

Use --model to select a different Ollama model:

cargo run -- --model qwen2.5-coder:7b

Storage

All state persists locally in the project directory:

  • qt45.db — functions, tests, shaders, and vector embeddings (SQLite)
  • cache/ — the BGE-small-en-v1.5 ONNX embedding model (~50MB, downloaded on first run)

To reset, delete qt45.db. To re-download the embedding model, delete cache/.

Key Behaviors

  • Generate: new functions are synthesized as WAT (scalar and SIMD) or WAT+WGSL (GPU), compiled to WASM, and verified with LLM-generated test cases
  • Cache: subsequent calls to the same function load the compiled binary instantly — no LLM needed
  • Search: requesting "sum" finds the existing "add" function via hybrid FTS5 + vector similarity search, avoiding redundant LLM generation
  • Tiered compute: scalar functions use plain WASM values, SIMD functions operate on arrays in linear memory with v128 instructions, GPU functions dispatch WGSL compute shaders via Metal
  • GPU bridge: GPU functions use a host-function import pattern — the WAT module calls gpu.alloc, gpu.write_buffer, gpu.dispatch_shader, and gpu.read_buffer which the Rust host implements via wgpu

About

A self-improving agentic loop inspired by QT45: a 45-nucleotide RNA ribozyme capable of self-replication.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages