19 commits
46209e7
docs: add Day 1 of 5 Days of OSS blog post for Lume 0.2
sarinali Jan 29, 2026
57c3504
docs: add Day 2 of 5 Days of OSS blog post for QEMU Sandboxes
sarinali Jan 29, 2026
e58e7aa
docs: add Day 3 of 5 Days of OSS blog post for Android Sandboxes
sarinali Jan 29, 2026
5099da6
docs: add Day 4 of 5 Days of OSS blog post for Cua-Bench
sarinali Jan 29, 2026
ef68097
docs: add Day 5 of 5 Days of OSS blog post for Human-Taught Skills
sarinali Jan 29, 2026
db66634
docs: add blog post on building launch video with Remotion + Claude Code
sarinali Jan 29, 2026
617de35
docs: add blog post on running Clawdbot in Lume macOS VMs
sarinali Jan 29, 2026
bc9cd5d
docs: add blog post on Clawdbot computer-use with cua-computer-server
sarinali Jan 29, 2026
2de29a5
docs: add blog post on computer-use agent history leading to Clawdbot
sarinali Jan 29, 2026
abbf3c2
delete assets from local
sarinali Jan 29, 2026
f869572
Fix images lume
sarinali Jan 29, 2026
edf89bd
Replace HTML image tags with Markdown syntax
sarinali Jan 29, 2026
20a680a
Convert HTML image tags to Markdown syntax
sarinali Jan 29, 2026
b7cca44
Convert HTML image tags to Markdown format
sarinali Jan 29, 2026
705d28f
Replace HTML images with Markdown syntax
sarinali Jan 29, 2026
9b4ca03
Replace images with Markdown syntax in documentation
sarinali Jan 29, 2026
b031a82
Update clawdbot-computer-use-history.md
sarinali Jan 29, 2026
364c6a2
Update title for QEMU sandboxes blog post
f-trycua Jan 29, 2026
4163cf4
fix history of computer use
sarinali Jan 29, 2026
68 changes: 68 additions & 0 deletions blog/5-days-of-oss-day-1-lume.md
@@ -0,0 +1,68 @@
# Day 1 of 5 Days of OSS Releases: Lume 0.2

_Published on January 29, 2026 by the Cua Team_

One year ago, Cua started with an experiment: deploying a computer-use agent on a macOS VM. Ahead of what's coming next, we've revamped the Lume CLI based on what we learned deploying agents for customers.

## Unattended Setup

Apple provides no headless provisioning for macOS. So we built a VNC + OCR system that clicks through Setup Assistant like a human would - language, country, user account, terms, accessibility settings. Go from IPSW to a fully configured VM without touching the keyboard. Write custom YAML configs to set up any macOS version your way.

![lume_1](https://github.com/user-attachments/assets/88b69e52-6185-4bd9-93cc-161cb22cbe5e)

```bash
lume create my-vm --os macos --ipsw latest --unattended tahoe
```

## HTTP API + Daemon

A REST API on port 7777 that runs as a background service. Your scripts and CI pipelines can create, start, stop, and clone VMs programmatically. VMs persist even if your terminal closes.

![lume_2](https://github.com/user-attachments/assets/abe581cf-a5b0-4bed-86f2-1be890d2abad)

```bash
curl -X POST localhost:7777/lume/vms/my-vm/run -d '{"noDisplay": true}'
```
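For scripts that prefer Python over `curl`, the same call can be sketched with the standard library. The endpoint path and the `noDisplay` field come from the command above; the JSON content type, the helper names, and the assumption that the daemon replies with a text body are ours and not confirmed by the Lume docs.

```python
import json
import urllib.request


def build_run_request(vm_name, no_display=True, host="localhost", port=7777):
    """Build (but do not send) the POST request that starts a VM."""
    url = f"http://{host}:{port}/lume/vms/{vm_name}/run"
    body = json.dumps({"noDisplay": no_display}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the daemon accepts a JSON content type.
        headers={"Content-Type": "application/json"},
    )


def send(request, timeout=30):
    """Send the request; needs a Lume daemon listening on the target port."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `send(build_run_request("my-vm"))` would be the equivalent of the `curl` line. Keeping request construction separate from sending makes the URL and payload easy to check in CI without a running daemon.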

## MCP Server

Native integration with Claude Cowork, Claude Code, and other AI coding agents. Add Lume to your Claude config and ask it to "create a sandbox VM and run my tests" - it just works. Perfect for agentic workflows that need isolated macOS environments.

![lume_3](https://github.com/user-attachments/assets/e77819ff-345b-4678-9631-3913bf4b7d93)

See our example cookbooks for spinning up macOS sandboxes in Claude Cowork: [https://cua.ai/docs/lume/examples](https://cua.ai/docs/lume/examples)

## Multi-location Storage

macOS disk space is always tight. Now you can add external SSDs as storage locations and move VMs between them. Keep your working VMs on fast internal storage, archive others to external drives.

![lume_4](https://github.com/user-attachments/assets/b291fb67-4a69-41b5-8965-bad859fd9477)

```bash
lume config storage add external-ssd /Volumes/ExternalSSD/lume
lume clone my-vm backup --dest-storage external-ssd
```

## Registry Support

Push and pull VM images to GHCR or GCS like Docker images. Create a golden image once with all your tools and configs, push it to a registry, and your whole team can pull it in seconds. No more shipping disk images around.

<div align="center">
<img src="./assets/lume_5.jpeg" alt="Lume 0.2 Registry Support" width="600" />
</div>

```bash
lume push my-vm ghcr.io/myorg/my-vm:latest
lume pull ghcr.io/myorg/my-vm:latest
```

![lume_5](https://github.com/user-attachments/assets/e494a7b7-06ce-4f47-b429-5eb761c8ecc9)

---

This is the foundation for what's coming next. Lume handles the VM layer on macOS - tomorrow we ship what runs on top.

**MIT licensed. Apple Silicon only.**

- [GitHub Repository](https://github.com/trycua/cua)
- [Lume Documentation](https://cua.ai/docs/lume)
66 changes: 66 additions & 0 deletions blog/5-days-of-oss-day-2-qemu-sandboxes.md
@@ -0,0 +1,66 @@
# Day 2 of 5 Days of OSS Releases: QEMU Linux and Windows Sandboxes

_Published on January 30, 2026 by the Cua Team_

Today we're releasing Windows 11 and Ubuntu VMs running in Docker via QEMU/KVM, available for self-hosting and MIT-licensed.

## Hardware-virtualized Desktops

Real operating systems booting from disk images. Windows 11 with a full GUI, Ubuntu 22.04 with a desktop environment. GPU passthrough is supported for running local models in the sandbox or GUI apps that need graphics acceleration.

![qemu_1](https://github.com/user-attachments/assets/8282733d-c7e5-4f31-9974-32b414364868)

```bash
docker run -p 8006:8006 -p 5000:5000 ghcr.io/trycua/cua-qemu-windows:latest
```

## Fully Unattended Setup

Windows uses an unattended answer file; Linux uses cloud-init. Go from ISO to a configured VM with a user account, network, and computer server installed, with no manual steps.

![qemu_2](https://github.com/user-attachments/assets/0c529e5e-a288-467e-a4e7-b065088af5b0)


## Computer Server Pre-installed

cua-computer-server runs on port 5000 at boot. Screenshots, mouse, keyboard, all over HTTP. Same API as Lume sandboxes — switch between macOS, Windows, and Linux without changing your agent code. Built-in autoupdater keeps the server current without rebuilding images.


![qemu_3](https://github.com/user-attachments/assets/edcb04c7-2f6d-4006-a593-0399f6eef761)

## noVNC on Port 8006

Browser-based desktop access. Open localhost:8006 to see the desktop, watch your agent run, debug visually. No VNC client install needed.

![qemu_4](https://github.com/user-attachments/assets/fa762996-1188-4119-9a79-0846f9e69bfb)
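A script that starts the container usually has to wait before opening the desktop or calling the computer server. A minimal sketch of such a wait loop follows, assuming the ports mentioned in this post (8006 for noVNC, 5000 for the computer server). The function name and timeouts are hypothetical, and an open TCP port is only a rough readiness signal, since the guest OS may still be booting behind it.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait a little and try again.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8006)` could gate opening the browser, and `wait_for_port("localhost", 5000)` could gate the first agent request.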


## Memory Snapshots

Freeze and restore full VM state via QEMU's snapshot support. Save state mid-task and restore to the exact same point later, including running processes and memory contents.

![qemu_5](https://github.com/user-attachments/assets/8fd12831-c572-43e1-9455-349bb2fa879b)


## Runtime Config

Set RAM, CPU cores, disk size via environment variables. No need to rebuild images for different resource requirements.

![qemu_6](https://github.com/user-attachments/assets/89632439-b8e1-4b7b-80bb-c6453e281625)

```bash
docker run -e RAM_SIZE=16G -e CPU_CORES=8 -e DISK_SIZE=100G ghcr.io/trycua/cua-qemu-windows:latest
```
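When several sandboxes with different sizes are launched from a script, it can help to build the `docker run` argument list in one place. The sketch below only uses the environment variables and ports shown in this post; the function name is hypothetical, and any extra flags your host needs for KVM access are not covered here.

```python
def qemu_run_args(image, ram, cpus, disk, ports=(8006, 5000)):
    """Build a `docker run` argument list from resource settings."""
    args = ["docker", "run"]
    for port in ports:
        # Publish each container port on the same host port.
        args += ["-p", f"{port}:{port}"]
    for name, value in (("RAM_SIZE", ram), ("CPU_CORES", cpus), ("DISK_SIZE", disk)):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


args = qemu_run_args(
    "ghcr.io/trycua/cua-qemu-windows:latest", ram="16G", cpus=8, disk="100G"
)
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine with Docker installed.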

## Benchmark Compatible

Works with OSWorld and Windows Agent Arena. Pre-configured images include the applications used in standard agent evaluation suites.

![qemu_7](https://github.com/user-attachments/assets/a4374703-8516-4edc-88be-cbdb72919b88)

---

**Requires Docker and KVM support.**

- [GitHub Repository](https://github.com/trycua/cua)
- [Desktop Sandbox Documentation](https://cua.ai/docs/cua/guide/get-started/what-is-desktop-sandbox)
108 changes: 108 additions & 0 deletions blog/5-days-of-oss-day-3-android-sandboxes.md
@@ -0,0 +1,108 @@
# Day 3 of 5 Days of OSS Releases: QEMU Android Sandboxes

_Published on January 31, 2026 by the Cua Team_

We're halfway through 5 Days of OSS Releases! Today we're releasing QEMU Android Sandboxes running in Docker, available for self-hosting and MIT-licensed.

## Full Android Emulator

Real Android 11 system with Google APIs running via QEMU/KVM. A complete Android device with touch input, app installation, and system services. Run any app your agent needs to test or automate.

![androidqemu_1](https://github.com/user-attachments/assets/3c9fb76f-5e0b-4031-a152-6ff0f9301dfd)


```bash
docker run -p 8006:8006 -p 8000:8000 ghcr.io/trycua/cua-android:latest
```

## MCP Server Included

Connect Claude Code, Claude Desktop, or any MCP-compatible client directly to the Android device over HTTP. Screenshots, touch, keyboard, shell commands — all exposed as MCP tools. Ask Claude to test your app and watch it happen.

![androidqemu_2](https://github.com/user-attachments/assets/66cdb74f-8c5b-4515-b4f6-8fae29830b26)


```json
{
"mcpServers": {
"cua-android": {
"type": "url",
"url": "http://localhost:8000/mcp"
}
}
}
```
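If you already have an MCP config with other servers in it, the entry above can be merged in programmatically rather than pasted by hand. This is a sketch only: the `other-server` entry is a made-up placeholder, and where the config file lives depends on your client, so no path is assumed here.

```python
import json


def add_mcp_server(config, name, url):
    """Return a copy of config with an extra URL-type MCP server entry."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"type": "url", "url": url}
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server.
existing = {
    "mcpServers": {
        "other-server": {"type": "url", "url": "http://localhost:9000/mcp"}
    }
}
merged = add_mcp_server(existing, "cua-android", "http://localhost:8000/mcp")
print(json.dumps(merged, indent=2))
```

The helper copies the server map instead of mutating it, so the original config stays untouched if you decide not to write the result back.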

## Device Profiles

Emulate real hardware: Samsung Galaxy S6 through S10, Nexus 4, Nexus One, and tablets like Pixel C. Configure via the EMULATOR_DEVICE environment variable.

![androidqemu_3](https://github.com/user-attachments/assets/af8f6588-949a-40ad-94ef-d77c6755392c)


```bash
docker run -e EMULATOR_DEVICE="Samsung Galaxy S10" ghcr.io/trycua/cua-android:latest
```

## ADB Built-in

Full Android Debug Bridge access. Install APKs, run shell commands, capture logcat, fire intents, inspect memory. Everything you'd do with a physical device, all from the container.

![androidqemu_4](https://github.com/user-attachments/assets/9b0b3295-f8c3-4642-9740-4d8b050cd6e5)


```bash
adb shell pm list packages
adb shell am start -a android.intent.action.VIEW -d "https://example.com"
adb shell dumpsys meminfo com.android.chrome
adb shell logcat -d | grep -i error
```
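Agents and test scripts often need the output of these commands as data rather than text. As a small sketch, the `package:<name>` lines printed by `pm list packages` can be turned into a Python list. The helper names are ours, and `list_packages` assumes `adb` is on the PATH and already connected to the emulator.

```python
import subprocess


def parse_package_list(output):
    """Extract package names from `pm list packages` output."""
    packages = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("package:"):
            packages.append(line[len("package:"):])
    return sorted(packages)


def list_packages():
    """Run the adb command and return the installed package names."""
    result = subprocess.run(
        ["adb", "shell", "pm", "list", "packages"],
        capture_output=True, text=True, check=True,
    )
    return parse_package_list(result.stdout)
```

A check such as `"com.android.chrome" in list_packages()` can then confirm an app is installed before the agent tries to launch it.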

## Computer Server Pre-installed

The same HTTP API as the macOS, Windows, and Linux sandboxes, served here on port 8000. Screenshots, touch, keyboard, shell commands: all unified. Write one agent, run it across every platform.

![androidqemu_5](https://github.com/user-attachments/assets/edd56ec5-3832-47e6-b052-985611edd4b4)


## noVNC on Port 8006

Watch your agent navigate Android in the browser. See the screen, observe touch events, debug visually. No VNC client needed.

![androidqemu_6](https://github.com/user-attachments/assets/8c0c539f-7ee9-499b-837c-a4d333f0885b)


## Intent-based Automation

Launch apps, open URLs, send broadcasts via Android intents. More reliable than coordinate-based tapping — let the OS handle navigation.

![androidqemu_7](https://github.com/user-attachments/assets/79ee200d-b9f4-4b06-9b2f-a3d91649c008)


```bash
# Launch Settings app
adb shell am start -a android.settings.SETTINGS

# Open URL in Chrome
adb shell am start -a android.intent.action.VIEW -d "https://example.com" com.android.chrome
```
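When an agent builds these commands from arbitrary URLs, quoting matters: `adb shell` hands its arguments to a shell on the device, so a `&` in a query string would otherwise be interpreted there. A hedged sketch of a command builder that quotes for the device shell is shown below; the function name is hypothetical and only the `am start` form from the examples above is used.

```python
import shlex


def view_url_command(url, package=None):
    """Build an adb command that opens a URL via an ACTION_VIEW intent."""
    remote = ["am", "start", "-a", "android.intent.action.VIEW", "-d", url]
    if package:
        # Optional target package, as in the Chrome example above.
        remote.append(package)
    # Quote for the shell running on the device, then pass as one argument.
    return ["adb", "shell", shlex.join(remote)]


print(view_url_command("https://example.com/?a=1&b=2", "com.android.chrome"))
```

The returned list can be passed to `subprocess.run` without `shell=True`, so the only shell that interprets the URL is the one on the device, and it sees a properly quoted string.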

## Full Diagnostic Access

logcat for real-time logs, dumpsys for system state, pm for package management, top for resource monitoring. Debug apps, track performance, catch crashes automatically.

![androidqemu_8](https://github.com/user-attachments/assets/2f07c93e-48ef-4e4c-b3a7-120a682b2ce3)


## Built on docker-android

Based on budtmo/docker-android with cua-computer-server added. Proven emulator stability with our unified agent API on top.

![androidqemu_9](https://github.com/user-attachments/assets/2254c159-388b-4845-95a2-002237fdd930)

---

**Requires Docker and KVM support.**

- [GitHub Repository](https://github.com/trycua/cua)
- [Desktop Sandbox Documentation](https://cua.ai/docs/cua/guide/get-started/what-is-desktop-sandbox)
95 changes: 95 additions & 0 deletions blog/5-days-of-oss-day-4-cua-bench.md
@@ -0,0 +1,95 @@
# Day 4 of 5 Days of OSS Releases: Cua-Bench

_Published on February 1, 2026 by the Cua Team_

Day 4: the benchmark we use internally, and with customers, to evaluate computer-use agents. Now open-source and MIT-licensed.

## 563 Tasks Across 3 Benchmarks

40 native cua-bench tasks + 369 OSWorld tasks + 154 Windows Agent Arena tasks. One unified harness to evaluate your agent across all of them.


![cuabench_1](https://github.com/user-attachments/assets/ee6de06e-0bc1-40a4-aa1f-79f7da1ebaa5)

```bash
cb run dataset datasets/cua-bench-basic --agent your-agent --model anthropic/claude-sonnet-4-20250514
```

## Parallelized Evaluations

Run tasks across multiple workers simultaneously. Evaluate your agent on 100 tasks in minutes, not hours.


![cuabench_2](https://github.com/user-attachments/assets/3e19d5d6-3721-4ad0-924d-dc5f39aad3a4)

```bash
cb run dataset datasets/cua-bench-basic --max-parallel 8
```
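To get a feel for what `--max-parallel` buys you, here is a back-of-the-envelope estimate. It assumes every task takes the same time and that workers run in full waves, which real runs will not match exactly; the 3 minutes per task is a made-up figure for illustration, not a measured Cua-Bench number.

```python
import math


def estimated_minutes(n_tasks, max_parallel, minutes_per_task):
    """Idealized wall-clock estimate: tasks run in waves of max_parallel."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    waves = math.ceil(n_tasks / max_parallel)
    return waves * minutes_per_task


# Hypothetical 3-minute tasks: 100 tasks sequentially vs. 8 workers.
print(estimated_minutes(100, 1, 3))  # 300
print(estimated_minutes(100, 8, 3))  # 39
```

Under these assumptions, 100 tasks drop from 300 minutes to 39, since 8 workers need 13 waves.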

## Works Everywhere

macOS, Windows, Linux, Android. Managed environments you can self-host. Write one agent, evaluate it across every platform.

![cuabench_3](https://github.com/user-attachments/assets/07602652-51cf-4279-851a-11207f5e917c)


## Interactive Exploration

Step through any task manually before running your agent. Understand what you're evaluating.

![cuabench_4](https://github.com/user-attachments/assets/419b7713-8300-422b-bc04-ca6f68791272)


```bash
cb interact tasks/slack_env --variant-id 0
```

## Task Generation

Generate new evaluation tasks from natural language descriptions. Extend the benchmark with your own scenarios.

![cuabench_5](https://github.com/user-attachments/assets/5f53a0b3-313a-4029-a5f5-abf1e77c9c3c)


```bash
cb task generate "2048 game"
```

## Live Monitoring

Watch your agent's progress in real time. View traces, logs, and screenshots as evaluations run.

![cuabench_6](https://github.com/user-attachments/assets/97af5e09-bc88-48fb-9e82-d23bb6f1a88f)


```bash
cb run watch <run_id>
cb trace view <run_id>
```

## Oracle Validation

Run tasks with reference implementations to verify environments work correctly before agent evaluation.

![cuabench_7](https://github.com/user-attachments/assets/6689cdeb-843e-45db-8729-a1e80fb60b98)


```bash
cb run dataset datasets/cua-bench-basic --oracle
```

## Adapters for Existing Benchmarks

Plug in OSWorld, Windows Agent Arena, AndroidWorld. Unified CLI, consistent evaluation, comparable results.

![cuabench_8](https://github.com/user-attachments/assets/77b0b545-42e1-480b-92d9-22380c8c64ec)


---

![cuabench_9](https://github.com/user-attachments/assets/09ad87ec-0a54-401a-bb33-93e891df5df2)


**MIT licensed. Self-hostable. 563 tasks and counting.**

- [GitHub Repository](https://github.com/trycua/cua)