30 changes: 30 additions & 0 deletions agent_docs/common-issues/element-not-found.md
@@ -0,0 +1,30 @@
# Click/Input Fails - Element Not Found

> Related: [ui-operations.md](../operations/ui-operations.md), [data-extraction.md](../operations/data-extraction.md)

**Symptom:** "Element not found" or click has no effect

**Diagnose:** Use `return_html` to see actual DOM:
```json
{"type": "return_html"}
```

Or use `js_evaluate` to check:
```javascript
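(function() {
  const el = document.querySelector('#my-button');
  console.log('Found:', !!el);
  console.log('Tag:', el?.tagName);
  // offsetParent is null for display:none elements (and for position: fixed ones).
  // Check that el exists first, otherwise a missing element would report as visible.
  console.log('Visible:', !!el && el.offsetParent !== null);
  return el?.outerHTML;
})()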
```

**Solutions:**

| Problem | Fix |
|---------|-----|
| Element not loaded yet | Add `sleep` before interaction |
| Wrong selector | Use browser DevTools to verify selector |
| Element in iframe | Not supported - use fetch instead |
| Dynamic ID | Use attribute selector: `[data-testid="submit"]` |
42 changes: 42 additions & 0 deletions agent_docs/common-issues/fetch-returns-html.md
@@ -0,0 +1,42 @@
# Fetch Returns HTML Instead of JSON

> Related: [fetch.md](../operations/fetch.md), [js-evaluate.md](../operations/js-evaluate.md)

**Symptom:** Response is HTML error page, not API data

**Causes:**
- Wrong URL (redirected to login page)
- Missing auth (see [unauthenticated.md](unauthenticated.md))
- CORS blocked

**Diagnose:** Use `js_evaluate` to inspect what you got:
```javascript
(function() {
  const raw = sessionStorage.getItem('my_fetch_result');
  console.log('Response type:', typeof raw);
  console.log('First 500 chars:', raw?.substring(0, 500));
  // Case-insensitive so that "<!doctype html>" is detected too; tolerates a missing value
  console.log('Looks like HTML:', /<html|<!doctype/i.test(raw ?? ''));
  return raw;
})()
```

**Fix:** If you're getting HTML, parse it with JS to extract the data you need:
```javascript
(function() {
  const html = sessionStorage.getItem('my_fetch_result');
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, 'text/html');

  // Extract what you need from the HTML
  return {
    title: doc.querySelector('title')?.textContent,
    data: Array.from(doc.querySelectorAll('table tr')).map(row => ({
      cells: Array.from(row.querySelectorAll('td')).map(td => td.textContent.trim())
    })),
    links: Array.from(doc.querySelectorAll('a')).map(a => ({
      text: a.textContent.trim(),
      href: a.href
    }))
  };
})()
```
92 changes: 92 additions & 0 deletions agent_docs/common-issues/js-evaluate-issues.md
@@ -0,0 +1,92 @@
# js_evaluate Issues

> Related: [js-evaluate.md](../operations/js-evaluate.md), [fetch.md](../operations/fetch.md)

---

## Returns undefined

**Symptom:** `session_storage_key` contains `null` or nothing

**Causes & Fixes:**

| Cause | Fix |
|-------|-----|
| Missing IIFE wrapper | Wrap in `(function() { ... })()` |
| No return statement | Add `return` before the value |
| Async without await | Use `async function` and `await` |

```javascript
// WRONG - bare expression, no IIFE wrapper
document.title

// WRONG - IIFE but no return
(function() { const x = document.title; })()

// RIGHT - IIFE with an explicit return
(function() { return document.title; })()
```

---

## Can't Access Fetch Data

**Symptom:** `JSON.parse()` fails, `data.items` is undefined

**Cause:** Fetch stores results as **strings**. You must parse them.

**Important:** Data may be **doubly stringified**. You may need to call `JSON.parse()` twice!

```javascript
// WRONG - raw is a string!
(function() {
  const raw = sessionStorage.getItem('api_response');
  return raw.items; // undefined
})()

// RIGHT - parse first
(function() {
  const raw = sessionStorage.getItem('api_response');
  const data = JSON.parse(raw);
  return data.items;
})()

// If still a string, parse again!
(function() {
  const raw = sessionStorage.getItem('api_response');
  let data = JSON.parse(raw);
  if (typeof data === 'string') {
    data = JSON.parse(data); // Double parse
  }
  return data.items;
})()
```

**Always debug with console.log:**
```javascript
(function() {
  const raw = sessionStorage.getItem('api_response');
  console.log('Raw type:', typeof raw);
  console.log('Raw value:', raw?.substring(0, 200));

  let data = JSON.parse(raw);
  console.log('After first parse, type:', typeof data);

  if (typeof data === 'string') {
    console.log('Still a string! Parsing again...');
    data = JSON.parse(data);
  }

  // JSON.parse(null) yields null, so guard before listing keys
  console.log('Final data keys:', Object.keys(data ?? {}));
  return data;
})()
```
Check `console_logs` in operation metadata.
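
If the stored value is post-processed on the Python side after the routine returns it (an assumption; the snippets above run in the browser), the same parse-until-not-a-string logic could be sketched as follows. `parse_stored_json` is a hypothetical helper, not part of the project:

```python
import json


def parse_stored_json(raw, max_depth=3):
    """Parse a JSON string that may have been stringified more than once.

    Hypothetical helper: keeps parsing while the value is still a string,
    up to max_depth times, and stops at the first non-JSON string.
    """
    data = raw
    for _ in range(max_depth):
        if not isinstance(data, str):
            break  # already a dict, list, number, bool, or None
        try:
            data = json.loads(data)
        except json.JSONDecodeError:
            break  # a plain string that is not JSON: keep it as-is
    return data


once = json.dumps({"items": [1, 2]})
twice = json.dumps(once)  # doubly stringified, as described above
parsed = parse_stored_json(twice)
```

Capping the number of passes avoids looping on unexpected input, and returning non-JSON strings unchanged makes it easy to notice when the stored value was actually an HTML page.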

---

## Blocked Pattern Error

**Symptom:** Error about blocked JavaScript pattern

**Cause:** Security restrictions block certain APIs.

**Blocked patterns:**
- `fetch()` → Use `fetch` operation instead
- `eval()`, `Function()` → Rewrite without dynamic code
- `addEventListener()` → Not supported
- `location`, `history` → Use `navigate` operation
16 changes: 16 additions & 0 deletions agent_docs/common-issues/page-not-loaded.md
@@ -0,0 +1,16 @@
# Page Not Fully Loaded

> Related: [navigation.md](../operations/navigation.md), [ui-operations.md](../operations/ui-operations.md)

**Symptom:** Element not found, fetch returns unexpected data, click fails

**Solution:** Add more sleep time after navigation

```json
{"type": "navigate", "url": "https://example.com", "sleep_after_navigation_seconds": 5.0}
```

Or add explicit sleep:
```json
{"type": "sleep", "timeout_seconds": 3.0}
```
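
When a routine is assembled in Python as a list of operation dicts before being loaded (an assumption about the workflow), the wait can be raised for every `navigate` step at once. `with_navigation_delay` is a hypothetical helper; only the `type` and `sleep_after_navigation_seconds` fields come from the examples above:

```python
def with_navigation_delay(operations, seconds=5.0):
    """Return a copy of operations where every navigate waits at least `seconds`."""
    patched = []
    for op in operations:
        op = dict(op)  # shallow copy so the caller's dicts are not mutated
        if op.get("type") == "navigate":
            current = op.get("sleep_after_navigation_seconds", 0.0)
            op["sleep_after_navigation_seconds"] = max(current, seconds)
        patched.append(op)
    return patched


operations = [
    {"type": "navigate", "url": "https://example.com"},
    {"type": "sleep", "timeout_seconds": 3.0},
]
patched = with_navigation_delay(operations, seconds=5.0)
```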
21 changes: 21 additions & 0 deletions agent_docs/common-issues/placeholder-not-resolved.md
@@ -0,0 +1,21 @@
# Placeholder Not Resolved

> Related: [placeholders.md](../core/placeholders.md), [fetch.md](../operations/fetch.md)

**Symptom:** Literal `{{param}}` appears in request, or value is empty

**Causes & Fixes:**

| Cause | Fix |
|-------|-----|
| String not escape-quoted | Use `"\"{{param}}\""` not `"{{param}}"` |
| Storage placeholder in navigate | Not supported - only user params work in URLs |
| Storage placeholder in js_evaluate | Access directly: `sessionStorage.getItem('key')` |
| Wrong path | Check exact key name and nesting |

**Check what's in storage:**
```javascript
(function() {
  return JSON.parse(sessionStorage.getItem('my_key'));
})()
```
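
To spot the first symptom (a literal `{{param}}` left in the request), a serialized request can be scanned for leftover double-brace tokens. This is a minimal sketch; `find_unresolved_placeholders` and the regular expression are illustrative assumptions, not the project's own resolver:

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def find_unresolved_placeholders(text):
    """Return the names of any {{...}} tokens still present in text."""
    return [match.group(1) for match in PLACEHOLDER.finditer(text)]


leftover = find_unresolved_placeholders('{"q": "{{param}}", "page": 1}')
```

An empty list means no literal placeholders remain; it does not prove the substituted values are correct, so an empty value still needs the storage check above.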
46 changes: 46 additions & 0 deletions agent_docs/common-issues/unauthenticated.md
@@ -0,0 +1,46 @@
# Fetch Returns 401/403 (Unauthenticated)

> Related: [fetch.md](../operations/fetch.md), [js-evaluate.md](../operations/js-evaluate.md)

**Symptom:** API returns unauthorized error

**Diagnose:** Use `js_evaluate` to inspect what auth exists:
```javascript
(function() {
  return {
    cookies: document.cookie,
    sessionStorage: Object.fromEntries(
      Object.keys(sessionStorage).map(k => [k, sessionStorage.getItem(k)])
    ),
    localStorage: Object.fromEntries(
      Object.keys(localStorage).map(k => [k, localStorage.getItem(k)])
    ),
    windowConfig: window.__CONFIG__ || window.__INITIAL_STATE__ || null
  };
})()
```

**Solutions:**

| Problem | Fix |
|---------|-----|
| Cookies not sent | Set `"credentials": "include"` |
| Wrong origin | Navigate to API origin first |
| Token in JS variable | Extract via `js_evaluate`, use in header |
| HttpOnly cookie needed | Use `get_cookies` operation |

**Example: Navigate first, then fetch with cookies**
```json
[
  {"type": "navigate", "url": "https://example.com"},
  {"type": "sleep", "timeout_seconds": 2.0},
  {
    "type": "fetch",
    "endpoint": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "credentials": "include"
    }
  }
]
```
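
The "navigate to the API origin first" fix can be generated from the API URL itself. The sketch below assumes operations are built as plain dicts in Python; `navigate_then_fetch` is a hypothetical helper, and the field names are copied from the JSON example above:

```python
from urllib.parse import urlsplit


def navigate_then_fetch(api_url, settle_seconds=2.0):
    """Build navigate -> sleep -> fetch operations targeting api_url's origin."""
    parts = urlsplit(api_url)
    origin = f"{parts.scheme}://{parts.netloc}"  # e.g. https://example.com
    return [
        {"type": "navigate", "url": origin},
        {"type": "sleep", "timeout_seconds": settle_seconds},
        {
            "type": "fetch",
            "endpoint": {
                "url": api_url,
                "method": "GET",
                "credentials": "include",  # send cookies with the request
            },
        },
    ]


operations = navigate_then_fetch("https://example.com/api/data")
```

Deriving the origin from the fetch URL keeps the two in sync, which targets the "Wrong origin" row in the table.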
125 changes: 125 additions & 0 deletions agent_docs/core/execution.md
@@ -0,0 +1,125 @@
# Execution

> How routines are executed via Chrome DevTools Protocol (CDP).

**Code:** [routine.py](web_hacker/data_models/routine/routine.py) (`Routine.execute()`), [execution.py](web_hacker/data_models/routine/execution.py)

## Execution Flow

1. **Create/attach browser tab** - New incognito tab or attach to existing `tab_id`
2. **Enable CDP domains** - Page, Runtime, Network, DOM
3. **Execute operations sequentially** - Each operation interpolates parameters, resolves placeholders, executes via CDP
4. **Collect result** - Final data from `return` or `return_html` operation
5. **Cleanup** - Close tab (unless `close_tab_when_done=False`)

## RoutineExecutionResult

```python
class RoutineExecutionResult(BaseModel):
    ok: bool = True  # Success/failure
    error: str | None = None  # Error message if failed
    warnings: list[str] = []  # Non-fatal warnings
    data: dict | list | str | None = None  # Final result data
    operations_metadata: list[OperationExecutionMetadata] = []  # Per-operation timing/details
    placeholder_resolution: dict[str, str | None] = {}  # Resolved placeholder values
    is_base64: bool = False  # True if data is base64-encoded binary
    content_type: str | None = None  # MIME type (for downloads)
    filename: str | None = None  # Suggested filename (for downloads)
```

## RoutineExecutionContext

A **mutable context** passed to each operation. Operations read from it and write results back:

```python
class RoutineExecutionContext(BaseModel):
    # CDP connection
    session_id: str  # CDP session ID
    ws: WebSocket  # WebSocket connection
    send_cmd: Callable  # CDP command sender
    recv_until: Callable  # CDP response receiver

    # Input
    parameters_dict: dict = {}  # User-provided parameters
    timeout: float = 180.0  # Operation timeout

    # Mutable state (operations update these)
    current_url: str = "about:blank"  # Updated by navigate operations
    result: RoutineExecutionResult  # Final result - operations set result.data
    current_operation_metadata: OperationExecutionMetadata | None  # Current op metadata
```

**How operations mutate context:**
- `navigate` → updates `current_url`
- `fetch` → stores response in sessionStorage, updates `result.placeholder_resolution`
- `return` / `return_html` → sets `result.data`
- `download` → sets `result.data`, `result.is_base64`, `result.filename`
- All operations → append to `result.operations_metadata`
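
The mutation pattern above can be illustrated with small stand-in classes. This is a sketch only: `Result`, `Context`, `run_navigate`, and `run_return` are hypothetical simplifications and do not talk to CDP or mirror the real models' full field sets:

```python
from dataclasses import dataclass, field


@dataclass
class Result:  # stand-in for RoutineExecutionResult
    data: object = None
    operations_metadata: list = field(default_factory=list)


@dataclass
class Context:  # stand-in for RoutineExecutionContext
    current_url: str = "about:blank"
    result: Result = field(default_factory=Result)


def run_navigate(context, url):
    """navigate: updates current_url and records its metadata."""
    context.current_url = url
    context.result.operations_metadata.append({"type": "navigate"})


def run_return(context, value):
    """return: sets result.data and records its metadata."""
    context.result.data = value
    context.result.operations_metadata.append({"type": "return"})


context = Context()
run_navigate(context, "https://example.com")
run_return(context, {"title": "Example"})
```

Because every operation receives the same context object, later operations see the state left by earlier ones, and the final result is whatever the last `return`-style operation wrote.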

## Operation Metadata

Every operation automatically records execution metadata:

```python
class OperationExecutionMetadata(BaseModel):
    type: str  # Operation type (e.g., "fetch", "click")
    duration_seconds: float  # Execution time
    details: dict = {}  # Operation-specific data
    error: str | None  # Error if operation failed
```

**What gets stored in `details`:**
- `fetch` → `request`, `response` (method, url, status, headers)
- `click` → `selector`, `element` (tag, id, classes), `click_coordinates`
- `input_text` → `selector`, `text_length`, `element`
- `js_evaluate` → `console_logs`, `execution_error`, `storage_error`

Access after execution:
```python
result = routine.execute(params)
for op_meta in result.operations_metadata:
    print(f"{op_meta.type}: {op_meta.duration_seconds:.2f}s")
    if op_meta.details.get("response"):
        print(f"  Status: {op_meta.details['response']['status']}")
```
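
The same loop can pull `console_logs` out of `js_evaluate` steps when debugging. The sketch below assumes `console_logs` is a list of log entries (its exact shape is not specified here) and uses `SimpleNamespace` objects as stand-ins for real metadata; `collect_console_logs` is a hypothetical helper:

```python
from types import SimpleNamespace


def collect_console_logs(operations_metadata):
    """Return (operation_index, log_entry) pairs for every js_evaluate step."""
    logs = []
    for index, op_meta in enumerate(operations_metadata):
        if op_meta.type != "js_evaluate":
            continue
        # Assumption: console_logs is a list; default to empty if absent.
        for entry in op_meta.details.get("console_logs", []):
            logs.append((index, entry))
    return logs


# Stand-in metadata for illustration only.
sample_metadata = [
    SimpleNamespace(type="navigate", details={}),
    SimpleNamespace(type="js_evaluate", details={"console_logs": ["Raw type: string"]}),
]
logs = collect_console_logs(sample_metadata)
```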

## Operation Execution

Each operation's `execute()` method:

1. Creates `OperationExecutionMetadata` with `type`
2. Calls `_execute_operation()` (operation-specific logic)
3. Operation mutates `context.result` and adds to `details`
4. Records `duration_seconds` and any `error`
5. Appends metadata to `context.result.operations_metadata`

```python
# Simplified from RoutineOperation.execute()
def execute(self, context):
    context.current_operation_metadata = OperationExecutionMetadata(type=self.type)
    start = time.perf_counter()
    try:
        self._execute_operation(context)  # Subclass implements this
    except Exception as e:
        context.current_operation_metadata.error = str(e)
    finally:
        context.current_operation_metadata.duration_seconds = time.perf_counter() - start
        context.result.operations_metadata.append(context.current_operation_metadata)
```

## Error Handling

- **CDP errors** - Connection/protocol failures → `result.ok = False`, `result.error` set
- **Operation errors** - JS/fetch failures → Raised as `RuntimeError`, caught at routine level
- **Placeholder warnings** - Unresolved placeholders → Added to `result.warnings`
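
A caller might fold these three signals into one report. This is a sketch under assumptions: `summarize_result` is a hypothetical helper, a `None` value in `placeholder_resolution` is taken to mean "unresolved", and the sample object is a stand-in rather than a real `RoutineExecutionResult`:

```python
from types import SimpleNamespace


def summarize_result(result):
    """Return human-readable lines describing status, warnings, and placeholders."""
    lines = ["ok" if result.ok else f"failed: {result.error}"]
    lines += [f"warning: {warning}" for warning in result.warnings]
    lines += [
        f"unresolved placeholder: {name}"
        for name, value in sorted(result.placeholder_resolution.items())
        if value is None  # assumption: None marks an unresolved placeholder
    ]
    return lines


# Stand-in result for illustration only.
sample = SimpleNamespace(
    ok=False,
    error="CDP connection closed",
    warnings=["Unresolved placeholder: {{token}}"],
    placeholder_resolution={"token": None, "user": "alice"},
)
summary = summarize_result(sample)
```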

## Download Results

For `download` operations, result contains base64 data:

```python
import base64

if result.is_base64 and result.filename:
    with open(result.filename, "wb") as f:
        f.write(base64.b64decode(result.data))
```
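
Since `filename` is only a suggested name, it may be worth stripping any directory components before writing to disk. A minimal sketch, assuming the result fields described above; `save_download` and the `download.bin` fallback are hypothetical:

```python
import base64
import os
import tempfile


def save_download(data, filename, directory="."):
    """Decode base64 data and write it under directory, ignoring any path in filename."""
    safe_name = os.path.basename(filename) or "download.bin"  # drop ../ and folders
    path = os.path.join(directory, safe_name)
    with open(path, "wb") as f:
        f.write(base64.b64decode(data))
    return path


# Demonstration with an in-memory payload and a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    encoded = base64.b64encode(b"hello").decode("ascii")
    saved_path = save_download(encoded, "../report.txt", tmp)
    saved_name = os.path.basename(saved_path)
    with open(saved_path, "rb") as f:
        saved_bytes = f.read()
```

With a real result this would be called as `save_download(result.data, result.filename)` after checking `result.is_base64`.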