trycua · sarinali · Jan 29, 2026
diff --git a/blog/5-days-of-oss-day-1-lume.md b/blog/5-days-of-oss-day-1-lume.md
@@ -0,0 +1,68 @@
+# Day 1 of 5 Days of OSS Releases: Lume 0.2
+
+_Published on January 29, 2026 by the Cua Team_
+
+One year ago, Cua started as an experiment: deploying a computer-use agent on a macOS VM. Ahead of what's coming next, we've revamped the Lume CLI based on what we learned deploying agents for customers.
+
+## Unattended Setup
+
+Apple provides no headless provisioning for macOS, so we built a VNC + OCR system that clicks through Setup Assistant the way a human would: language, country, user account, terms, accessibility settings. Go from IPSW to a fully configured VM without touching the keyboard. Write custom YAML configs to set up any macOS version your way.
+
+![lume_1](https://github.com/user-attachments/assets/88b69e52-6185-4bd9-93cc-161cb22cbe5e)
+
+```bash
+lume create my-vm --os macos --ipsw latest --unattended tahoe
+```
+
+## HTTP API + Daemon
+
+A REST API on port 7777 that runs as a background service. Your scripts and CI pipelines can create, start, stop, and clone VMs programmatically. VMs persist even if your terminal closes.
+
+![lume_2](https://github.com/user-attachments/assets/abe581cf-a5b0-4bed-86f2-1be890d2abad)
+
+```bash
+curl -X POST localhost:7777/lume/vms/my-vm/run -d '{"noDisplay": true}'
+```
+
+## MCP Server
+
+Native integration with Claude Cowork, Claude Code, and other AI coding agents. Add Lume to your Claude config, ask it to "create a sandbox VM and run my tests", and it just works. Perfect for agentic workflows that need isolated macOS environments.
+
+![lume_3](https://github.com/user-attachments/assets/e77819ff-345b-4678-9631-3913bf4b7d93)
+
+See our example cookbooks for spinning up macOS sandboxes in Claude Cowork: [https://cua.ai/docs/lume/examples](https://cua.ai/docs/lume/examples)
+
+## Multi-location Storage
+
+Disk space on a Mac is always tight. Now you can add external SSDs as storage locations and move VMs between them: keep your working VMs on fast internal storage and archive the rest to external drives.
+
+![lume_4](https://github.com/user-attachments/assets/b291fb67-4a69-41b5-8965-bad859fd9477)
+
+```bash
+lume config storage add external-ssd /Volumes/ExternalSSD/lume
+lume clone my-vm backup --dest-storage external-ssd
+```
+
+## Registry Support
+
+Push and pull VM images to GHCR or GCS like Docker images. Create a golden image once with all your tools and configs, push it to a registry, and your whole team can pull it in seconds. No more shipping disk images around.
+
+<div align="center">
+  <img src="./assets/lume_5.jpeg" alt="Lume 0.2 Registry Support" width="600" />
+</div>
+
+```bash
+lume push my-vm ghcr.io/myorg/my-vm:latest
+lume pull ghcr.io/myorg/my-vm:latest
+```
+
+![lume_5](https://github.com/user-attachments/assets/e494a7b7-06ce-4f47-b429-5eb761c8ecc9)
+
+---
+
+This is the foundation for what's coming next. Lume handles the VM layer on macOS; tomorrow we ship what runs on top.
+
+**MIT licensed. Apple Silicon only.**
+
+- [GitHub Repository](https://github.com/trycua/cua)
+- [Lume Documentation](https://cua.ai/docs/lume)
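
Note on the "HTTP API + Daemon" section of the Lume post above: the `curl` call translates directly into a script or CI step. The following is a minimal Python sketch, not Lume's own client. The URL, port, and `noDisplay` field come from the `curl` example; the helper name `build_run_request` and the JSON `Content-Type` header are assumptions made for illustration. It only builds the request, so it can be inspected without a running daemon.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example in the post (port 7777, /lume prefix).
LUME_API = "http://localhost:7777/lume"


def build_run_request(vm_name: str, no_display: bool = True) -> urllib.request.Request:
    """Build, without sending, the POST request that starts a VM."""
    body = json.dumps({"noDisplay": no_display}).encode("utf-8")
    return urllib.request.Request(
        url=f"{LUME_API}/vms/{urllib.parse.quote(vm_name, safe='')}/run",
        data=body,
        # Assumption: the daemon accepts a JSON content type; the curl example sets none.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_run_request("my-vm")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # Sending requires a running Lume daemon:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Keeping request construction separate from sending makes the call easy to unit-test in a pipeline before any VM exists.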
diff --git a/blog/5-days-of-oss-day-2-qemu-sandboxes.md b/blog/5-days-of-oss-day-2-qemu-sandboxes.md
@@ -0,0 +1,66 @@
+# Day 2 of 5 Days of OSS Releases: QEMU Linux and Windows Sandboxes
+
+_Published on January 30, 2026 by the Cua Team_
+
+Today we're releasing Windows 11 and Ubuntu VMs running in Docker via QEMU/KVM, available for self-hosting and MIT-licensed.
+
+## Hardware-virtualized Desktops
+
+Real operating systems booting from disk images. Windows 11 with a full GUI, Ubuntu 22.04 with a desktop environment. GPU passthrough is supported for running local models in the sandbox or GUI apps that need graphics acceleration.
+
+![qemu_1](https://github.com/user-attachments/assets/8282733d-c7e5-4f31-9974-32b414364868)
+
+```bash
+docker run -p 8006:8006 -p 5000:5000 ghcr.io/trycua/cua-qemu-windows:latest
+```
+
+## Fully Unattended Setup
+
+Windows uses an unattended answer file; Linux uses cloud-init. Go from ISO to a configured VM with user account, network, and computer server installed, with no manual steps.
+
+![qemu_2](https://github.com/user-attachments/assets/0c529e5e-a288-467e-a4e7-b065088af5b0)
+
+
+## Computer Server Pre-installed
+
+cua-computer-server runs on port 5000 at boot. Screenshots, mouse, keyboard, all over HTTP. Same API as Lume sandboxes — switch between macOS, Windows, and Linux without changing your agent code. Built-in autoupdater keeps the server current without rebuilding images.
+
+
+![qemu_3](https://github.com/user-attachments/assets/edcb04c7-2f6d-4006-a593-0399f6eef761)
+
+## noVNC on Port 8006
+
+Browser-based desktop access. Open localhost:8006 to see the desktop, watch your agent run, debug visually. No VNC client install needed.
+
+![qemu_4](https://github.com/user-attachments/assets/fa762996-1188-4119-9a79-0846f9e69bfb)
+
+
+## Memory Snapshots
+
+Freeze and restore full VM state via QEMU's snapshot support. Save state mid-task and restore to the exact same point later: running processes, memory contents, everything.
+
+![qemu_5](https://github.com/user-attachments/assets/8fd12831-c572-43e1-9455-349bb2fa879b)
+
+
+## Runtime Config
+
+Set RAM, CPU cores, and disk size via environment variables. No need to rebuild images for different resource requirements.
+
+![qemu_6](https://github.com/user-attachments/assets/89632439-b8e1-4b7b-80bb-c6453e281625)
+
+```bash
+docker run -e RAM_SIZE=16G -e CPU_CORES=8 -e DISK_SIZE=100G ghcr.io/trycua/cua-qemu-windows:latest
+```
+
+## Benchmark Compatible
+
+Works with OSWorld and Windows Agent Arena. Pre-configured images include the applications used in standard agent evaluation suites.
+
+![qemu_7](https://github.com/user-attachments/assets/a4374703-8516-4edc-88be-cbdb72919b88)
+
+---
+
+**Requires Docker and KVM support.**
+
+- [GitHub Repository](https://github.com/trycua/cua)
+- [Desktop Sandbox Documentation](https://cua.ai/docs/cua/guide/get-started/what-is-desktop-sandbox)
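
Note on the "Runtime Config" section of the QEMU post above: because resources are plain environment variables, a launcher script can assemble the `docker run` command from parameters. This is a rough sketch under stated assumptions. The variable names (`RAM_SIZE`, `CPU_CORES`, `DISK_SIZE`), ports, and image tag are copied from the post's examples; the function `docker_run_args` is a hypothetical helper, and any extra flags your host needs for KVM access are deliberately left out because the post does not specify them.

```python
import shlex

# Image tag and ports as shown in the post's docker examples.
IMAGE = "ghcr.io/trycua/cua-qemu-windows:latest"


def docker_run_args(ram="16G", cpu_cores=8, disk="100G", ports=(8006, 5000)):
    """Return the argv list for a `docker run` with the given VM resources."""
    args = ["docker", "run"]
    for port in ports:
        # Publish each container port on the same host port.
        args += ["-p", f"{port}:{port}"]
    for key, value in (("RAM_SIZE", ram), ("CPU_CORES", cpu_cores), ("DISK_SIZE", disk)):
        args += ["-e", f"{key}={value}"]
    args.append(IMAGE)
    return args


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) on a host that has Docker and KVM available.
    print(shlex.join(docker_run_args()))
```

Returning an argv list rather than a single string avoids shell-quoting mistakes when the values come from CI variables.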
diff --git a/blog/5-days-of-oss-day-3-android-sandboxes.md b/blog/5-days-of-oss-day-3-android-sandboxes.md
@@ -0,0 +1,108 @@
+# Day 3 of 5 Days of OSS Releases: QEMU Android Sandboxes
+
+_Published on January 31, 2026 by the Cua Team_
+
+We're halfway through 5 Days of OSS Releases! Today we're releasing QEMU Android Sandboxes running in Docker, available for self-hosting and MIT-licensed.
+
+## Full Android Emulator
+
+Real Android 11 system with Google APIs running via QEMU/KVM. A complete Android device with touch input, app installation, and system services. Run any app your agent needs to test or automate.
+
+![androidqemu_1](https://github.com/user-attachments/assets/3c9fb76f-5e0b-4031-a152-6ff0f9301dfd)
+
+
+```bash
+docker run -p 8006:8006 -p 8000:8000 ghcr.io/trycua/cua-android:latest
+```
+
+## MCP Server Included
+
+Connect Claude Code, Claude Desktop, or any MCP-compatible client directly to the Android device over HTTP. Screenshots, touch, keyboard, shell commands — all exposed as MCP tools. Ask Claude to test your app and watch it happen.
+
+![androidqemu_2](https://github.com/user-attachments/assets/66cdb74f-8c5b-4515-b4f6-8fae29830b26)
+
+
+```json
+{
+  "mcpServers": {
+    "cua-android": {
+      "type": "url",
+      "url": "http://localhost:8000/mcp"
+    }
+  }
+}
+```
+
+## Device Profiles
+
+Emulate real hardware: Samsung Galaxy S6 through S10, Nexus 4, Nexus One, and tablets like the Pixel C. Configure via the `EMULATOR_DEVICE` environment variable.
+
+![androidqemu_3](https://github.com/user-attachments/assets/af8f6588-949a-40ad-94ef-d77c6755392c)
+
+
+```bash
+docker run -e EMULATOR_DEVICE="Samsung Galaxy S10" ghcr.io/trycua/cua-android:latest
+```
+
+## ADB Built-in
+
+Full Android Debug Bridge access. Install APKs, run shell commands, capture logcat, fire intents, inspect memory. Everything you'd do with a physical device, all from the container.
+
+![androidqemu_4](https://github.com/user-attachments/assets/9b0b3295-f8c3-4642-9740-4d8b050cd6e5)
+
+
+```bash
+adb shell pm list packages
+adb shell am start -a android.intent.action.VIEW -d "https://example.com"
+adb shell dumpsys meminfo com.android.chrome
+adb shell logcat -d | grep -i error
+```
+
+## Computer Server Pre-installed
+
+Same HTTP API on port 8000 as macOS, Windows, and Linux sandboxes. Screenshots, touch, keyboard, shell commands — all unified. Write one agent, run it across every platform.
+
+![androidqemu_5](https://github.com/user-attachments/assets/edd56ec5-3832-47e6-b052-985611edd4b4)
+
+
+## noVNC on Port 8006
+
+Watch your agent navigate Android in the browser. See the screen, observe touch events, debug visually. No VNC client needed.
+
+![androidqemu_6](https://github.com/user-attachments/assets/8c0c539f-7ee9-499b-837c-a4d333f0885b)
+
+
+## Intent-based Automation
+
+Launch apps, open URLs, send broadcasts via Android intents. More reliable than coordinate-based tapping — let the OS handle navigation.
+
+![androidqemu_7](https://github.com/user-attachments/assets/79ee200d-b9f4-4b06-9b2f-a3d91649c008)
+
+
+```bash
+# Launch Settings app
+adb shell am start -a android.settings.SETTINGS
+
+# Open URL in Chrome
+adb shell am start -a android.intent.action.VIEW -d "https://example.com" com.android.chrome
+```
+
+## Full Diagnostic Access
+
+logcat for real-time logs, dumpsys for system state, pm for package management, top for resource monitoring. Debug apps, track performance, catch crashes automatically.
+
+![androidqemu_8](https://github.com/user-attachments/assets/2f07c93e-48ef-4e4c-b3a7-120a682b2ce3)
+
+
+## Built on docker-android
+
+Based on budtmo/docker-android with cua-computer-server added. Proven emulator stability with our unified agent API on top.
+
+![androidqemu_9](https://github.com/user-attachments/assets/2254c159-388b-4845-95a2-002237fdd930)
+
+---
+
+**Requires Docker and KVM support.**
+
+- [GitHub Repository](https://github.com/trycua/cua)
+- [Desktop Sandbox Documentation](https://cua.ai/docs/cua/guide/get-started/what-is-desktop-sandbox)
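
Note on the "Intent-based Automation" section of the Android post above: the two `am start` examples follow one pattern (action, optional data URI, optional package), which an agent harness could wrap in a small helper. The sketch below is illustrative only; `am_start` is a hypothetical function name, and it assumes `adb` is on the PATH and already connected to the emulator. Since `adb shell` joins its arguments into a single command line for the device shell, the URI is quoted for that shell.

```python
import shlex


def am_start(action, data_uri=None, package=None):
    """Return the adb argv for `am start` with an action, optional URI and package."""
    args = ["adb", "shell", "am", "start", "-a", action]
    if data_uri is not None:
        # Quote for the device-side shell so characters like & and ? survive.
        args += ["-d", shlex.quote(data_uri)]
    if package is not None:
        args.append(package)
    return args


if __name__ == "__main__":
    # Same two intents as in the post; run the lists with subprocess.run(...)
    # on a machine where adb can reach the emulator.
    print(am_start("android.settings.SETTINGS"))
    print(am_start("android.intent.action.VIEW", "https://example.com", "com.android.chrome"))
```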
diff --git a/blog/5-days-of-oss-day-4-cua-bench.md b/blog/5-days-of-oss-day-4-cua-bench.md
@@ -0,0 +1,95 @@
+# Day 4 of 5 Days of OSS Releases: Cua-Bench
+
+_Published on February 1, 2026 by the Cua Team_
+
+Day 4 is the benchmark we use internally and with customers to evaluate computer-use agents, now open source and MIT-licensed.
+
+## 563 Tasks Across 3 Benchmarks
+
+40 native cua-bench tasks + 369 OSWorld tasks + 154 Windows Agent Arena tasks. One unified harness to evaluate your agent across all of them.
+
+
+![cuabench_1](https://github.com/user-attachments/assets/ee6de06e-0bc1-40a4-aa1f-79f7da1ebaa5)
+
+```bash
+cb run dataset datasets/cua-bench-basic --agent your-agent --model anthropic/claude-sonnet-4-20250514
+```
+
+## Parallelized Evaluations
+
+Run tasks across multiple workers simultaneously. Evaluate your agent on 100 tasks in minutes, not hours.
+
+
+![cuabench_2](https://github.com/user-attachments/assets/3e19d5d6-3721-4ad0-924d-dc5f39aad3a4)
+
+```bash
+cb run dataset datasets/cua-bench-basic --max-parallel 8
+```
+
+## Works Everywhere
+
+macOS, Windows, Linux, Android. Managed environments you can self-host. Write one agent, evaluate it across every platform.
+
+![cuabench_3](https://github.com/user-attachments/assets/07602652-51cf-4279-851a-11207f5e917c)
+
+
+## Interactive Exploration
+
+Step through any task manually before running your agent. Understand what you're evaluating.
+
+![cuabench_4](https://github.com/user-attachments/assets/419b7713-8300-422b-bc04-ca6f68791272)
+
+
+```bash
+cb interact tasks/slack_env --variant-id 0
+```
+
+## Task Generation
+
+Generate new evaluation tasks from natural language descriptions. Extend the benchmark with your own scenarios.
+
+![cuabench_5](https://github.com/user-attachments/assets/5f53a0b3-313a-4029-a5f5-abf1e77c9c3c)
+
+
+```bash
+cb task generate "2048 game"
+```
+
+## Live Monitoring
+
+Watch your agent's progress in real-time. View traces, logs, and screenshots as evaluations run.
+
+![cuabench_6](https://github.com/user-attachments/assets/97af5e09-bc88-48fb-9e82-d23bb6f1a88f)
+
+
+```bash
+cb run watch <run_id>
+cb trace view <run_id>
+```
+
+## Oracle Validation
+
+Run tasks with reference implementations to verify environments work correctly before agent evaluation.
+
+![cuabench_7](https://github.com/user-attachments/assets/6689cdeb-843e-45db-8729-a1e80fb60b98)
+
+
+```bash
+cb run dataset datasets/cua-bench-basic --oracle
+```
+
+## Adapters for Existing Benchmarks
+
+Plug in OSWorld, Windows Agent Arena, and AndroidWorld. Unified CLI, consistent evaluation, comparable results.
+
+![cuabench_8](https://github.com/user-attachments/assets/77b0b545-42e1-480b-92d9-22380c8c64ec)
+
+
+---
+
+![cuabench_9](https://github.com/user-attachments/assets/09ad87ec-0a54-401a-bb33-93e891df5df2)
+
+
+**MIT licensed. Self-hostable. 563 tasks and counting.**
+
+- [GitHub Repository](https://github.com/trycua/cua)
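
Note on the task counts and `--max-parallel` in the Cua-Bench post above: the headline figure is the sum of the three suites, and the parallelism setting bounds how many rounds a full run needs. The sketch below checks that arithmetic. The per-suite counts are from the post; the assumption that each task occupies exactly one worker slot, and the helper names `total_tasks` and `min_rounds`, are mine and say nothing about how `cb` actually schedules work.

```python
import math

# Per-suite task counts as stated in the post.
TASKS = {"cua-bench native": 40, "OSWorld": 369, "Windows Agent Arena": 154}


def total_tasks(tasks=TASKS):
    """Sum the task counts across all suites."""
    return sum(tasks.values())


def min_rounds(n_tasks, max_parallel):
    """Fewest sequential rounds if at most max_parallel tasks run at once."""
    return math.ceil(n_tasks / max_parallel)


if __name__ == "__main__":
    n = total_tasks()
    print(n, "tasks;", min_rounds(n, 8), "rounds at --max-parallel 8")
```

Under that one-task-per-slot assumption, wall-clock time is roughly the number of rounds times the slowest task in each round, which is why raising `--max-parallel` helps most on large suites.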