jiusanzhou/nanobox

🔥 NanoBox

Firecracker microVM sandbox — boot any OCI image in ~1 second

Rust · License: MIT

Architecture · Quick Start · API · Docs


What is NanoBox?

NanoBox turns any Docker/OCI container image into an isolated Firecracker microVM sandbox. Submit an image name and get a running VM with full kernel-level isolation, in about one second once the image has been warmed up.

POST /api/v1/templates/warmup
{ "name": "node22", "image": "node:22-alpine" }

# ⏳ First time: pull + convert (~20s), then cached

POST /api/v1/sandboxes
{ "template": "node22" }

# ⚡ ~1s later:
# { "id": "sb-01jq...", "ip": "10.100.0.42", "boot_time_ms": 890 }

Why not just containers?

| | Containers | NanoBox |
|---|---|---|
| Isolation | Shared kernel (namespace) | Separate kernel (KVM) |
| Escape risk | Container breakout possible | VM-level boundary |
| Boot time | ~100ms (warm) | ~1s (cold), <5ms (snapshot, planned) |
| Density | ~1000/node | ~2000/node |
| Use case | Trusted workloads | Untrusted code execution |

NanoBox is built for running untrusted agent code — AI coding agents, user-submitted scripts, tool-use sandboxes — where container isolation isn't enough.

Key Features

  • 🚀 ~1s cold boot from any OCI image, <5ms from snapshot (planned)
  • 🔒 VM-level isolation — each sandbox runs its own Linux kernel via KVM
  • 📦 OCI image warmup: crane pull → layer-by-layer extraction → ext4 rootfs, cached by digest
  • 🏊 Warm pool — pre-created VMs ready for instant allocation
  • 📸 Snapshot/restore — pause a VM, resume later with full state
  • 🌐 Network control — isolated / internal-only / internet with rate limiting
  • ☸️ K8s native — DaemonSet on bare-metal nodes with /dev/kvm
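
To make the warm pool idea concrete, here is a minimal sketch of a top-up policy. It assumes the pool keeps between min_ready and max_ready idle VMs per template (those setting names appear in the Configuration section); the function name and the policy itself are hypothetical and not taken from nanobox-pool.

```python
def pool_adjustment(ready: int, min_ready: int, max_ready: int) -> int:
    """Return how many VMs to create (>0) or retire (<0) for one template.

    Hypothetical policy: top up to min_ready when the pool runs low,
    trim down to max_ready when it is overfull, otherwise do nothing.
    """
    if min_ready > max_ready:
        raise ValueError("min_ready must not exceed max_ready")
    if ready < min_ready:
        return min_ready - ready
    if ready > max_ready:
        return max_ready - ready
    return 0
```

With the sample node22 settings (min_ready = 3, max_ready = 10), a pool holding one idle VM would ask for two more, and a pool holding five would be left alone.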

Architecture

                        ┌──────────────────┐
                        │   Your Platform   │
                        │ (ABox, API, CLI)  │
                        └────────┬─────────┘
                                 │ REST / gRPC
                                 ▼
┌─────────────────────────────────────────────────────────┐
│                  NanoBox Control Plane                   │
│                                                         │
│   nanobox-api          Scheduler         Template Reg   │
│   (axum REST)       (node selection)   (OCI warmup)     │
└────────────────────────┬────────────────────────────────┘
                         │ gRPC
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
   ┌──────────┐    ┌──────────┐    ┌──────────┐
   │  Node 1  │    │  Node 2  │    │  Node N  │
   │  Agent   │    │  Agent   │    │  Agent   │
   │          │    │          │    │          │
   │ Pool Mgr │    │ Pool Mgr │    │ Pool Mgr │
   │ OCI Cache│    │ OCI Cache│    │ OCI Cache│
   │          │    │          │    │          │
   │ ┌──────┐ │    │ ┌──────┐ │    │ ┌──────┐ │
   │ │FC  VM│ │    │ │FC  VM│ │    │ │FC  VM│ │
   │ │FC  VM│ │    │ │FC  VM│ │    │ │FC  VM│ │
   │ └──────┘ │    │ └──────┘ │    │ └──────┘ │
   └──────────┘    └──────────┘    └──────────┘
     bare-metal      bare-metal      bare-metal
     + /dev/kvm      + /dev/kvm      + /dev/kvm

OCI Warmup Pipeline

The key idea: images are pre-warmed into a bootable rootfs ahead of time and cached, rather than converted on demand when a sandbox is created.

crane pull node:22-alpine          # OCI tarball with layers
        │
        ▼
    manifest.json                   # Layer ordering
    ├── layer-1.tar.gz (alpine base, 300+ busybox symlinks)
    ├── layer-2.tar.gz (node + npm/npx/yarn symlinks)
    ├── layer-3.tar.gz (yarn + gpg)
    └── layer-4.tar.gz (entrypoint)
        │
        ▼ extract in order, preserve symlinks, handle whiteouts
        │
    ext4 rootfs (auto-sized)
        │
        ▼ inject nanobox-init + container ENV
        │
    cached at /var/lib/nanobox/oci-cache/rootfs/node22.ext4
        │
        ▼ CoW clone per instance (cp --reflink=auto)
        │
    Firecracker boot → NANOBOX_READY (~1s)

Why crane pull and not crane export? crane export flattens all layers into a single tar and destroys symlinks. Alpine images have 300+ busybox symlinks, and Node.js ships npm/npx/yarn as symlinks. crane pull preserves the individual layers, which NanoBox extracts in order.
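
The "extract in order, preserve symlinks, handle whiteouts" step can be sketched as follows. This is an illustration in Python, not the nanobox-oci implementation: it merges layer tarballs into an in-memory listing instead of writing an ext4 image, and the function names are hypothetical. The whiteout conventions (a `.wh.<name>` entry deletes `<name>` from lower layers, `.wh..wh..opq` hides a directory's lower-layer contents) come from the OCI image layer specification.

```python
import io
import posixpath
import tarfile

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def _drop_children(tree, directory):
    """Remove every entry located under `directory` ("" means the root)."""
    prefix = directory + "/" if directory else ""
    for path in [p for p in tree if p.startswith(prefix)]:
        del tree[path]


def flatten_layers(layers):
    """Merge layer tarballs (bytes, base layer first) into {path: kind}."""
    tree = {}
    for blob in layers:
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
            members = tar.getmembers()
        named = [(posixpath.normpath(m.name).lstrip("/"), m) for m in members]
        # Pass 1: whiteouts only affect what lower layers contributed.
        for path, _ in named:
            parent, base = posixpath.split(path)
            if base == OPAQUE_MARKER:
                _drop_children(tree, parent)
            elif base.startswith(WHITEOUT_PREFIX):
                target = posixpath.join(parent, base[len(WHITEOUT_PREFIX):])
                tree.pop(target, None)
                _drop_children(tree, target)
        # Pass 2: add this layer's own entries, keeping symlinks as symlinks.
        for path, member in named:
            base = posixpath.basename(path)
            if path == "." or base.startswith(WHITEOUT_PREFIX):
                continue
            if member.issym():
                tree[path] = "symlink -> " + member.linkname
            elif member.isdir():
                tree[path] = "dir"
            else:
                tree[path] = "file"
    return tree
```

Processing whiteouts before the layer's own entries keeps a file added in the same layer as an opaque marker from being deleted by it.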

Project Structure

nanobox/
├── crates/
│   ├── nanobox-api/          # REST API server (axum)
│   ├── nanobox-agent/        # Node agent (Firecracker lifecycle)
│   ├── nanobox-core/         # Shared types, config, errors
│   ├── nanobox-oci/          # OCI image pull + ext4 rootfs builder
│   ├── nanobox-pool/         # Warm VM pool management
│   ├── nanobox-snap/         # Snapshot store
│   ├── nanobox-net/          # Network (TAP/bridge/iptables)
│   ├── nanobox-template/     # Template builder (OCI warmup)
│   └── nanobox-guest/        # Guest agent protocol
├── guest/
│   ├── init.c                # PID 1 init for guest VMs
│   └── Makefile              # Build: gcc -static -O2
├── proto/                    # gRPC protobuf definitions
├── docs/                     # Design documents
│   ├── architecture.md       # System architecture
│   ├── api-spec.md           # API specification
│   ├── deployment.md         # Deployment guide
│   └── TODO.md               # Roadmap
└── rootfs/                   # Legacy rootfs build scripts

Quick Start

Prerequisites

  • Linux with KVM (/dev/kvm)
  • Rust 1.75+
  • Firecracker v1.10+
  • crane (for OCI image pulling)

Build

# Build all crates
cargo build --release

# Build guest init (requires gcc with static linking on Linux)
cd guest && make

Run

# 1. Start the API server
cargo run --bin nanobox-api -- --config nanobox.toml

# 2. Start the node agent (on a KVM-enabled machine)
cargo run --bin nanobox-agent -- --config nanobox.toml

# 3. Warm up an image
curl -X POST http://localhost:8080/api/v1/templates/warmup \
  -H "Content-Type: application/json" \
  -d '{"name": "node22", "image": "node:22-alpine"}'

# 4. Create a sandbox
curl -X POST http://localhost:8080/api/v1/sandboxes \
  -H "Content-Type: application/json" \
  -d '{"template": "node22", "resources": {"vcpu": 2, "memory_mb": 512}}'

# 5. Execute code (replace {id} with the sandbox id returned by step 4)
curl -X POST http://localhost:8080/api/v1/sandboxes/{id}/exec \
  -H "Content-Type: application/json" \
  -d '{"command": "node", "args": ["-e", "console.log(JSON.stringify({ok:true}))"]}'
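
The same flow can be scripted. The sketch below uses only the Python standard library and the endpoints shown in the curl commands; the helper names are hypothetical. It assumes the sandbox response carries an "id" field (as in the example output earlier) and that the template is already warmed up before a sandbox is created. Because warmup is submitted as a task, a real client would likely poll the template status endpoint first; the shape of that response is not documented here, so it is left out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # address used in the curl examples


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request against the NanoBox REST API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def call(method, path, body=None):
    """Send a request and decode the JSON response (None if the body is empty)."""
    with urllib.request.urlopen(build_request(method, path, body)) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def run_snippet(code, template="node22"):
    """Create a sandbox from a warmed template, run `code` with node, clean up."""
    sandbox = call("POST", "/api/v1/sandboxes", {"template": template})
    sandbox_id = sandbox["id"]  # assumption: field name as in the sample response
    try:
        return call("POST", f"/api/v1/sandboxes/{sandbox_id}/exec",
                    {"command": "node", "args": ["-e", code]})
    finally:
        # Always destroy the sandbox, even if exec fails.
        call("DELETE", f"/api/v1/sandboxes/{sandbox_id}")
```

For example, `run_snippet("console.log(JSON.stringify({ok:true}))")` mirrors steps 4 and 5 above and then destroys the sandbox.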

Configuration

# nanobox.toml

[api]
listen_addr = "0.0.0.0:8080"

[agent]
node_id = "node-01"
firecracker_bin = "/usr/local/bin/firecracker"
rootfs_dir = "/var/lib/nanobox/rootfs"
snapshot_dir = "/var/lib/nanobox/snapshots"

[firecracker]
kernel = "/var/lib/nanobox/kernels/vmlinux-fc"
boot_args = "console=ttyS0 reboot=k panic=1 pci=off init=/sbin/nanobox-init"

[network]
bridge_name = "nanobox-br0"
subnet = "10.100.0.0/16"
host_iface = "eth0"

[oci]
cache_dir = "/var/lib/nanobox/oci-cache"
init_binary = "/usr/local/lib/nanobox/nanobox-init"
dns_server = "10.100.0.1"

[pool]
replenish_interval_sec = 5
max_idle_sec = 600

[[pool.templates]]
name = "node22"
min_ready = 3
max_ready = 10
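
A few of these settings depend on each other, and a small pre-flight check can catch mismatches before the agent starts. The sketch below is hypothetical and not part of NanoBox: it takes the parsed configuration as a dictionary and checks two assumed constraints, that dns_server lies inside the sandbox subnet (as the sample values suggest) and that each pool template has min_ready no greater than max_ready.

```python
import ipaddress


def check_config(cfg):
    """Return a list of human-readable problems found in a parsed nanobox.toml."""
    problems = []
    subnet = ipaddress.ip_network(cfg["network"]["subnet"])
    dns = ipaddress.ip_address(cfg["oci"]["dns_server"])
    # Assumption: DNS is served from inside the sandbox subnet.
    if dns not in subnet:
        problems.append(f"dns_server {dns} is outside subnet {subnet}")
    for tpl in cfg.get("pool", {}).get("templates", []):
        if tpl["min_ready"] > tpl["max_ready"]:
            problems.append(f"template {tpl['name']}: min_ready > max_ready")
    return problems
```

On Python 3.11 or newer the dictionary can be obtained with `tomllib.load(open("nanobox.toml", "rb"))`. Drop the DNS check if guests are meant to use an external resolver.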

API

Sandbox Lifecycle

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/sandboxes | Create sandbox |
| GET | /api/v1/sandboxes/{id} | Get sandbox info |
| GET | /api/v1/sandboxes | List sandboxes |
| DELETE | /api/v1/sandboxes/{id} | Destroy sandbox |
| POST | /api/v1/sandboxes/{id}/exec | Execute command |
| POST | /api/v1/sandboxes/{id}/keepalive | Extend timeout |
| POST | /api/v1/sandboxes/{id}/pause | Pause (snapshot) |
| POST | /api/v1/sandboxes/{id}/resume | Resume from snapshot |

Template Management

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/templates/warmup | Submit warmup task |
| GET | /api/v1/templates | List templates |
| GET | /api/v1/templates/{name}/status | Warmup status |
| DELETE | /api/v1/templates/{name} | Remove template |

Cluster Management

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/nodes | List nodes |
| GET | /api/v1/stats | Cluster statistics |

Full API spec → docs/api-spec.md

Guest Init

The guest init (guest/init.c) runs as PID 1 inside each microVM:

  1. Mount proc, sys, devtmpfs, devpts
  2. Seed entropy via RNDADDENTROPY ioctl (critical — without this, sshd blocks forever on kernel 4.14)
  3. Start sshd (if available in the image)
  4. Serial command loop — read commands from stdin, execute via sh -c, pipe output to serial, write NANOBOX_DONE marker
  5. Signal ready — write NANOBOX_READY to /dev/ttyS0

Container environment variables are sourced from /etc/nanobox/env.sh before each command.
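
Seen from the host, the serial protocol described above could be driven roughly as follows. This is a sketch under assumptions, not the nanobox-agent code: it treats the serial console as a pair of text streams and assumes NANOBOX_READY and NANOBOX_DONE each appear on a line of their own. The README does not specify the exact framing (for instance whether the done marker carries an exit code), and the function names are hypothetical.

```python
READY_MARKER = "NANOBOX_READY"
DONE_MARKER = "NANOBOX_DONE"


def wait_ready(reader):
    """Consume boot output until the guest announces readiness."""
    for line in reader:
        if READY_MARKER in line:
            return True
    return False  # stream ended before the guest became ready


def run_command(writer, reader, command):
    """Send one command line and collect its output up to the done marker."""
    writer.write(command + "\n")
    writer.flush()
    output = []
    for line in reader:
        if DONE_MARKER in line:
            return "".join(output)
        output.append(line)
    raise EOFError("serial stream closed before " + DONE_MARKER)
```

Keeping the reader and writer separate makes the helpers easy to exercise with in-memory streams before pointing them at a real Firecracker serial console.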

Performance

Measured on the ai-infra-staging cluster (33 cores, 251 GiB RAM):

| Metric | Value |
|---|---|
| OCI pull (node:22-alpine, 158MB) | ~15s (network bound) |
| OCI → ext4 conversion | ~5s |
| Firecracker cold boot | ~1s |
| Cached image → VM ready | ~1s |
| Snapshot restore (planned) | <5ms |
| Memory per VM | Configurable (128MB–8GB) |
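
These figures explain the "~20s first time, ~1s afterwards" behaviour: the pull and conversion are paid once per image, the boot once per sandbox. A small worked calculation, using the approximate values from the table as constants (the function name is illustrative only):

```python
PULL_S = 15.0     # OCI pull, network bound (approximate)
CONVERT_S = 5.0   # OCI -> ext4 conversion (approximate)
BOOT_S = 1.0      # cached image -> VM ready (approximate)


def mean_latency_s(sandboxes: int) -> float:
    """Average time to a ready VM when one warmup is shared by `sandboxes` boots."""
    if sandboxes < 1:
        raise ValueError("need at least one sandbox")
    return (PULL_S + CONVERT_S) / sandboxes + BOOT_S
```

A single sandbox on a cold image pays the full 21 seconds, while twenty sandboxes sharing one warmup average about 2 seconds each.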

Roadmap

  • Firecracker VM lifecycle (create/exec/destroy/pause/resume)
  • OCI image warmup pipeline
  • Guest init with serial command loop
  • TAP networking with NAT + DNS forwarding
  • Warm pool management
  • Multi-VM concurrent instances (dynamic IP allocation)
  • Snapshot-based fast restore (<5ms)
  • ZeroBoot CoW fork engine integration (<1ms for short-lived execution)
  • WebSocket terminal
  • K8s CRD controller
  • Production hardening (jailer, seccomp)

Docs

| Document | Description |
|---|---|
| Architecture | System design, components, data flow |
| API Spec | REST API reference |
| Deployment | K8s deployment guide |
| Roadmap | TODO and future plans |

Relationship with ABox

NanoBox is the sandbox execution layer for the ABox Agent Platform. It owns VM lifecycle, isolation, and resource enforcement. It does NOT handle users, agents, billing, or workflow orchestration.

ABox (Go) ─── REST/gRPC ───► NanoBox (Rust) ───► Firecracker microVM

License

MIT
