10 changes: 9 additions & 1 deletion docs/content/docs/cuabench/guide/fundamentals/meta.json
Original file line number Diff line number Diff line change
@@ -2,5 +2,13 @@
   "title": "Fundamentals",
   "description": "Core concepts of tasks and environments",
   "icon": "Lightbulb",
-  "pages": ["tasks", "app-helpers", "universal-gui", "simulated-desktop", "agent-traces", "adapters", "registry"]
+  "pages": [
+    "tasks",
+    "app-helpers",
+    "universal-gui",
+    "simulated-desktop",
+    "agent-traces",
+    "adapters",
+    "registry"
+  ]
 }
16 changes: 2 additions & 14 deletions docs/src/components/custom-header.tsx
@@ -267,20 +267,8 @@ export function CustomHeader() {
             className="hidden sm:inline-flex h-9 w-9 items-center justify-center rounded-md text-fd-muted-foreground transition-colors hover:bg-fd-accent hover:text-fd-foreground"
             title="Vibe Coding MCP"
           >
-            <Image
-              src={McpBlack}
-              alt="MCP"
-              width={20}
-              height={20}
-              className="block dark:hidden"
-            />
-            <Image
-              src={McpWhite}
-              alt="MCP"
-              width={20}
-              height={20}
-              className="hidden dark:block"
-            />
+            <Image src={McpBlack} alt="MCP" width={20} height={20} className="block dark:hidden" />
+            <Image src={McpWhite} alt="MCP" width={20} height={20} className="hidden dark:block" />
           </Link>

           <ThemeToggle />
14 changes: 7 additions & 7 deletions libs/cua-bench/datasets/cua-bench-workflows/README.md
@@ -46,10 +46,10 @@ cb interact datasets/cua-bench-workflows/photoshop-tasks --variant-id 2

 ### photoshop-tasks

-| Variant | Description |
-|---------|-------------|
-| 0 | Create document with "Hello World" text layer |
-| 1 | Create document with "Welcome to CUA" text layer |
-| 2 | Open PSD and count layers |
-| 3 | Open PSD and describe layers |
-| 4 | Create and save document as specific filename |
+| Variant | Description                                      |
+| ------- | ------------------------------------------------ |
+| 0       | Create document with "Hello World" text layer    |
+| 1       | Create document with "Welcome to CUA" text layer |
+| 2       | Open PSD and count layers                        |
+| 3       | Open PSD and describe layers                     |
+| 4       | Create and save document as specific filename    |
4 changes: 2 additions & 2 deletions libs/lume/resources/unattended-sequoia.yml
@@ -77,7 +77,7 @@ boot_commands:
   - "<click 'Full Name'>"
   # Type Full Name
   - "<type 'lume'>"
-  - "<tab>"
+  - "<tab>"
   # Skip Account Name (auto-filled from Full Name)
   - "<tab>"
   # Type Password
@@ -200,7 +200,7 @@ boot_commands:

   # Open Spotlight with Cmd+Space
   - "<cmd+space>"
-  - "<delay 2>"
+  - "<delay 2>"

   # Type "Terminal" to search for Terminal app
   - "<type 'Terminal'>"
4 changes: 2 additions & 2 deletions libs/lume/src/Resources/unattended-presets/sequoia.yml
@@ -76,7 +76,7 @@ boot_commands:
   - "<delay 1>"
   # Type Full Name
   - "<type 'lume'>"
-  - "<tab>"
+  - "<tab>"
   # Skip Account Name (auto-filled from Full Name)
   - "<tab>"
   # Type Password
@@ -199,7 +199,7 @@ boot_commands:

   # Open Spotlight with Cmd+Space
   - "<cmd+space>"
-  - "<delay 2>"
+  - "<delay 2>"

   # Type "Terminal" to search for Terminal app
   - "<type 'Terminal'>"
2 changes: 1 addition & 1 deletion libs/python/agent/agent/callbacks/__init__.py
@@ -6,8 +6,8 @@
 from .budget_manager import BudgetManagerCallback
 from .image_retention import ImageRetentionCallback
 from .logging import LoggingCallback
-from .otel import OtelCallback, OtelErrorCallback
 from .operator_validator import OperatorNormalizerCallback
+from .otel import OtelCallback, OtelErrorCallback
 from .prompt_instructions import PromptInstructionsCallback
 from .telemetry import TelemetryCallback
 from .trajectory_saver import TrajectorySaverCallback
145 changes: 145 additions & 0 deletions libs/python/agent/tests/test_telemetry_events.py
@@ -0,0 +1,145 @@
"""
Test script to verify telemetry events are emitted correctly.
"""

from unittest.mock import AsyncMock, MagicMock, patch

import pytest


class TestAgentTelemetryEvents:
    """Test telemetry events emitted by ComputerAgent."""

    @patch("agent.agent.record_event")
    @patch("agent.agent.is_telemetry_enabled", return_value=True)
    def test_agent_init_event(self, mock_telemetry_enabled, mock_record_event):
        """Test that agent_init event is emitted with correct args_provided."""
        from agent.agent import ComputerAgent

        # Create agent with various args
        agent = ComputerAgent(
            model="anthropic/claude-sonnet-4-5-20250929",
            instructions="Test instructions",
            max_retries=5,  # non-default
            trajectory_dir="/tmp/test",
        )

        # Find the agent_init call
        agent_init_calls = [
            call for call in mock_record_event.call_args_list if call[0][0] == "agent_init"
        ]

        assert len(agent_init_calls) == 1, "agent_init should be called once"

        event_name, event_data = agent_init_calls[0][0]
        assert event_name == "agent_init"
        assert event_data["model"] == "anthropic/claude-sonnet-4-5-20250929"
        assert "instructions" in event_data["args_provided"]
        assert "max_retries" in event_data["args_provided"]
        assert "trajectory_dir" in event_data["args_provided"]

    @patch("agent.agent.record_event")
    @patch("agent.agent.is_telemetry_enabled", return_value=True)
    def test_agent_init_minimal_args(self, mock_telemetry_enabled, mock_record_event):
        """Test agent_init with minimal args (defaults)."""
        from agent.agent import ComputerAgent

        agent = ComputerAgent(model="anthropic/claude-sonnet-4-5-20250929")

        agent_init_calls = [
            call for call in mock_record_event.call_args_list if call[0][0] == "agent_init"
        ]

        assert len(agent_init_calls) == 1
        event_name, event_data = agent_init_calls[0][0]

        # With defaults, only model-related things should be tracked
        # instructions, trajectory_dir, etc. should NOT be in args_provided
        assert "instructions" not in event_data["args_provided"]
        assert "trajectory_dir" not in event_data["args_provided"]
        assert "max_retries" not in event_data["args_provided"]  # default is 3

    @patch("agent.agent.record_event")
    @patch("agent.agent.is_telemetry_enabled", return_value=False)
    def test_no_events_when_telemetry_disabled(self, mock_telemetry_enabled, mock_record_event):
        """Test that no events are emitted when telemetry is disabled."""
        from agent.agent import ComputerAgent

        agent = ComputerAgent(
            model="anthropic/claude-sonnet-4-5-20250929",
            telemetry_enabled=False,
        )

        # No agent_init should be called (telemetry disabled)
        agent_init_calls = [
            call for call in mock_record_event.call_args_list if call[0][0] == "agent_init"
        ]

        assert len(agent_init_calls) == 0


class TestActionTelemetryEvents:
    """Test telemetry events for computer actions."""

    @pytest.mark.asyncio
    @patch("agent.agent.record_event")
    @patch("agent.agent.is_telemetry_enabled", return_value=True)
    async def test_computer_action_executed_event(self, mock_telemetry_enabled, mock_record_event):
        """Test that computer_action_executed is emitted for computer calls."""
        from agent.agent import ComputerAgent

        agent = ComputerAgent(model="anthropic/claude-sonnet-4-5-20250929")
        agent.telemetry_enabled = True

        # Mock computer handler
        mock_computer = MagicMock()
        mock_computer.click = AsyncMock(return_value=None)
        mock_computer.screenshot = AsyncMock(return_value="base64screenshot")

        # Create a mock computer_call item
        item = {
            "type": "computer_call",
            "call_id": "test-call-id",
            "action": {
                "type": "click",
                "x": 100,
                "y": 200,
            },
        }

        # Process the item (this would normally happen in the agent loop)
        # Note: We can't easily test this without running the full agent loop
        # This is more of an integration test

        # For unit testing, we verify the event structure
        expected_event = {
            "action_type": "click",
        }

        # Verify event structure is correct
        assert "action_type" in expected_event


class TestToolExecutedEvents:
    """Test telemetry events for tool execution."""

    def test_event_structure(self):
        """Test that agent_tool_executed event has correct structure."""
        expected_computer_tool_event = {
            "tool_type": "computer",
            "tool_name": "click",
        }

        expected_function_tool_event = {
            "tool_type": "function",
            "tool_name": "my_custom_function",
        }

        # Verify expected structure
        assert "tool_type" in expected_computer_tool_event
        assert "tool_name" in expected_computer_tool_event
        assert expected_computer_tool_event["tool_type"] in ["computer", "function"]


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
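
The last two test classes above assert only on dictionaries they construct themselves, so they pass regardless of what the agent emits. A minimal sketch of how the same `call_args_list` filtering used in `TestAgentTelemetryEvents` could check a real payload is given here. Both `emit_tool_event` and `events_named` are hypothetical helpers written for illustration and are not part of the `agent` package; the event name `agent_tool_executed` and the `tool_type`/`tool_name` keys are taken from the test file itself.

```python
from unittest.mock import MagicMock


def emit_tool_event(record_event, tool_type, tool_name):
    """Hypothetical emitter: forwards one agent_tool_executed event to record_event."""
    record_event("agent_tool_executed", {"tool_type": tool_type, "tool_name": tool_name})


def events_named(mock_record_event, name):
    """Return the payload of every recorded call whose first positional arg is `name`."""
    return [
        call.args[1]
        for call in mock_record_event.call_args_list
        if call.args and call.args[0] == name
    ]


# Stand-in for the patched `agent.agent.record_event`.
record_event = MagicMock()

emit_tool_event(record_event, "computer", "click")
emit_tool_event(record_event, "function", "my_custom_function")
record_event("agent_init", {"model": "example"})  # unrelated event, filtered out below

tool_events = events_named(record_event, "agent_tool_executed")
```

Asserting on `tool_events` rather than on a hand-built dictionary means the test fails if the emitter stops sending the event or changes its keys.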
10 changes: 10 additions & 0 deletions libs/python/computer-server/README.md
@@ -35,6 +35,7 @@ python -m computer_server --width 1512 --height 982
```

This provides:

- HTTP API at `/ws`, `/cmd`, `/status` endpoints
- MCP server at `/mcp` endpoint (requires `fastmcp` package)

@@ -53,11 +54,13 @@ This ensures the AI model sees consistent coordinates between screenshots and mo
#### Claude Code Integration

1. Start the server (or run as a service/LaunchAgent):

```bash
python -m computer_server --port 8000
```

2. Add the MCP server URL to Claude Code:

```bash
claude mcp add cua-computer-server --transport http http://localhost:8000/mcp
```
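
The endpoint paths listed above (`/ws`, `/cmd`, `/status`, `/mcp`) and the port from step 1 can be combined into full URLs. The following is a small sketch under stated assumptions: `endpoint_urls` is a hypothetical helper, not part of `computer_server`, and it uses the `http` scheme for every path because that is the only scheme the README shows.

```python
from urllib.parse import urlunsplit

# Paths as listed in the README; the dictionary keys are illustrative names only.
ENDPOINT_PATHS = {"ws": "/ws", "cmd": "/cmd", "status": "/status", "mcp": "/mcp"}


def endpoint_urls(host="localhost", port=8000):
    """Build one URL per documented endpoint for the given host and port."""
    netloc = f"{host}:{port}"
    return {
        name: urlunsplit(("http", netloc, path, "", ""))
        for name, path in ENDPOINT_PATHS.items()
    }


urls = endpoint_urls()
```

With the defaults, `urls["mcp"]` is the same address passed to `claude mcp add` in step 2.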
@@ -67,6 +70,7 @@ claude mcp add cua-computer-server --transport http http://localhost:8000/mcp
The MCP interface exposes 40+ tools for computer control:

### Screen & Mouse

- `computer_screenshot` - Capture current screen
- `computer_click` - Click at coordinates
- `computer_double_click` - Double-click
@@ -77,26 +81,31 @@ The MCP interface exposes 40+ tools for computer control:
- `computer_get_cursor_position` - Get cursor position

### Keyboard

- `computer_type` - Type text
- `computer_press_key` - Press a single key
- `computer_hotkey` - Press key combination (e.g., Ctrl+C)
- `computer_key_down` / `computer_key_up` - Hold/release keys

### Clipboard

- `computer_clipboard_get` - Get clipboard content
- `computer_clipboard_set` - Set clipboard content

### Shell

- `computer_run_command` - Execute shell command

### File System

- `computer_file_read` / `computer_file_write` - Read/write files
- `computer_file_exists` / `computer_directory_exists` - Check existence
- `computer_list_directory` - List directory contents
- `computer_create_directory` - Create directory
- `computer_delete_file` / `computer_delete_directory` - Delete files/directories

### Window Management

- `computer_open` - Open file or URL
- `computer_launch_app` - Launch application
- `computer_get_active_window` - Get active window
@@ -105,5 +114,6 @@ The MCP interface exposes 40+ tools for computer control:
- `computer_close_window` - Close window

### Accessibility

- `computer_get_accessibility_tree` - Get UI element tree
- `computer_find_element` - Find UI element by role/title
Original file line number Diff line number Diff line change
@@ -73,7 +73,7 @@

     # Trigger screen recording prompt on macOS
     try:
-        from PIL import ImageGrab, Image
+        from PIL import Image, ImageGrab

         ImageGrab.grab()
     except Exception as e: