forked from SawyerHood/dev-browser
LLM USAGE GUIDE:
Write small, focused scripts. Each script should do ONE thing: navigate, click, fill, or check.
End each script by logging the state you need for the next decision.
Use descriptive page names like "login", "checkout", or "results" instead of "page1".
Named pages from browser.getPage("name") persist between script runs, so you usually do not need to re-navigate.
Inside page.evaluate(...), write plain JavaScript only - no TypeScript syntax in the browser context.
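As one way to follow the advice above about ending each script with the state needed for the next decision, the sketch below builds a single JSON report. The helper name stateReport and the "step" field are hypothetical and not part of dev-browser; only JSON.stringify is standard JavaScript.

```javascript
// Hypothetical helper (not part of dev-browser): build the JSON text that a
// script logs as its final step, so the next script can decide what to do.
function stateReport(step, fields) {
  return JSON.stringify({ step, ...fields }, null, 2);
}

// Intended use inside a dev-browser script, where `browser` is provided:
//   const page = await browser.getPage("login");
//   console.log(stateReport("submitted-login", {
//     url: page.url(),
//     title: await page.title(),
//   }));
```

Keeping the report to one JSON object per script makes the output easy to parse before choosing the next action.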
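On Windows/PowerShell, pipe multiline scripts with a single-quoted here-string, which keeps $ and backticks literal (a double-quoted @"..."@ here-string would expand them):
@'
const page = await browser.getPage("main");
console.log(await page.title());
'@ | dev-browser --connect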
Quick inspection:
dev-browser --connect <<'EOF'
const tabs = await browser.listPages();
console.log(JSON.stringify(tabs, null, 2));
EOF
dev-browser --connect <<'EOF'
// Replace TARGET_ID_HERE with an id from the listing printed by the previous script.
const page = await browser.getPage("TARGET_ID_HERE");
console.log(JSON.stringify({
  url: page.url(),
  title: await page.title(),
}, null, 2));
EOF
AI snapshots for element discovery:
dev-browser <<'EOF'
const page = await browser.getPage("main");
const result = await page.snapshotForAI();
console.log(result.full);
// Returns { full: string, incremental?: string }.
// Optional args: { track?: string, depth?: number, timeout?: number }.
// Read result.full to identify the right element.
// Then interact with it using Playwright:
// await page.getByRole("button", { name: "Continue" }).click();
// Re-run page.snapshotForAI({ track: "main" }) after the page changes.
EOF
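When result.full is long, it can help to print only the lines that mention what you are looking for. The sketch below is a hypothetical helper (filterSnapshot is not a dev-browser or Playwright function). It assumes only that result.full is a newline-separated string and does not rely on any particular snapshot syntax.

```javascript
// Hypothetical helper (not part of dev-browser or Playwright): keep only the
// snapshot lines that contain a keyword, together with 1-based line numbers.
function filterSnapshot(full, keyword) {
  const needle = keyword.toLowerCase();
  return full
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter((entry) => entry.text.toLowerCase().includes(needle));
}

// Intended use inside a dev-browser script:
//   const result = await page.snapshotForAI();
//   console.log(JSON.stringify(filterSnapshot(result.full, "button"), null, 2));
```

The match is a case-insensitive substring test, so it may return unrelated lines that happen to contain the keyword; read the full snapshot when the filtered view is ambiguous.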
Choosing your approach:
Unknown pages: use page.snapshotForAI() first to discover the page, then interact based on what you find.
Known pages/selectors: skip the snapshot and call Playwright methods such as page.click(), page.fill(), or page.locator() with the selectors you already know; this is faster and more reliable.
Screenshots for visual state:
dev-browser <<'EOF'
const page = await browser.getPage("main");
const buf = await page.screenshot();
const path = await saveScreenshot(buf, "debug.png");
console.log(path);
EOF
Waiting patterns:
dev-browser <<'EOF'
const page = await browser.getPage("search-results");
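// Wait until an element matching the selector appears.
await page.waitForSelector(".results");
// Wait until the page URL matches the glob pattern.
await page.waitForURL("**/success");
console.log(JSON.stringify({
  url: page.url(),
  title: await page.title(),
}, null, 2));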
EOF
Dev server navigation:
For local dev servers (Next.js, Vite, etc.), prefer:
await page.goto(url, { waitUntil: "domcontentloaded" });
The default "load" wait can hang on HMR, streaming, or other long-lived dev-server connections.
Use "load" only when you specifically need every subresource to finish loading.
Error recovery:
If a script fails, the page usually stays where it stopped.
Reconnect to the same page name, take a screenshot, and log the URL/title:
dev-browser <<'EOF'
const page = await browser.getPage("checkout");
const path = await saveScreenshot(await page.screenshot(), "debug.png");
console.log(JSON.stringify({
  screenshot: path,
  url: page.url(),
  title: await page.title(),
}, null, 2));
EOF
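The recovery steps above can also be built into the script itself. The following is a minimal sketch, assuming a hypothetical wrapper named runStep (not part of dev-browser) and assuming the page is still open after the failure, so that page.url() and page.title() still answer.

```javascript
// Hypothetical wrapper (not part of dev-browser): run one action and always
// return a report, so a failed step still yields the URL and title.
async function runStep(page, name, action) {
  let error = null;
  try {
    await action(page);
  } catch (err) {
    error = err instanceof Error ? err.message : String(err);
  }
  return {
    step: name,
    ok: error === null,
    error,
    url: page.url(),
    title: await page.title(),
  };
}

// Intended use inside a dev-browser script:
//   const page = await browser.getPage("checkout");
//   const report = await runStep(page, "continue", (p) =>
//     p.getByRole("button", { name: "Continue" }).click());
//   console.log(JSON.stringify(report, null, 2));
```

Because the report is logged whether or not the action threw, the next script can branch on the ok field without a separate diagnostic run.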
Common Playwright Page methods:
page.goto(url, { waitUntil: "domcontentloaded" })
                                Navigate to a URL; prefer this on dev servers
page.title()                    Get the current page title
page.url()                      Get the current URL
page.snapshotForAI(options)     Get an AI-optimized snapshot; returns { full, incremental? }
                                Options: { track?: string, depth?: number, timeout?: number }
page.getByRole(role, { name })  Target elements discovered from the snapshot
page.textContent(selector)      Get the text content of an element
page.innerHTML(selector)        Get the inner HTML of an element
page.fill(selector, value)      Fill an input field
page.click(selector)            Click an element
page.type(selector, text)       Type text character by character
page.press(selector, key)       Press a key such as Enter or Tab
page.waitForSelector(selector)  Wait for an element to appear
page.waitForURL(url)            Wait for navigation to a URL
page.screenshot()               Capture a screenshot buffer; save it with saveScreenshot(...)
page.$$eval(selector, fn)       Run a function on all matching elements
page.$eval(selector, fn)        Run a function on the first matching element
page.evaluate(fn)               Run JavaScript in the page context (plain JS only)
page.locator(selector)          Create a locator for chained actions
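As an illustration of page.$$eval from the list above, here is a sketch of a callback that turns matched anchor elements into plain data. The name extractLinks is hypothetical. It assumes dev-browser forwards the callback the way Playwright does, where the function is serialized and run in the browser with the array of matching elements, so it must be plain JavaScript and cannot read variables from the surrounding script.

```javascript
// Hypothetical callback for page.$$eval("a", extractLinks).
// It uses only its argument, because the function body runs in the page
// context and has no access to variables defined in the calling script.
function extractLinks(anchors) {
  return anchors.map((anchor) => ({
    text: (anchor.textContent || "").trim(),
    href: anchor.getAttribute("href"),
  }));
}

// Intended use inside a dev-browser script:
//   const page = await browser.getPage("results");
//   const links = await page.$$eval("a", extractLinks);
//   console.log(JSON.stringify(links, null, 2));
```

Returning plain objects with strings keeps the result serializable, which matters because DOM elements themselves cannot be passed back out of the page.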
Connecting to a running Chrome instance:
Auto-discover Chrome with debugging enabled:
dev-browser --connect <<'EOF'
const page = await browser.getPage("main");
console.log(await page.title());
EOF
Connect to a specific CDP endpoint:
dev-browser --connect http://localhost:9222 <<'EOF'
const page = await browser.getPage("main");
console.log(await page.title());
EOF
To launch Chrome with debugging enabled:
chrome.exe --remote-debugging-port=9222        (Windows)
google-chrome --remote-debugging-port=9222     (Linux)
Or visit chrome://inspect/#remote-debugging to configure.
Tips:
- Use console.log(JSON.stringify(...)) for structured output.
- Prefer page.snapshotForAI() for structure; use screenshots when visual layout or styling matters.
- Keep page names stable across scripts so you can resume work after failures.
- Each --browser name maps to a separate daemon-managed browser instance.
- Use --connect to attach to an existing browser; omit the URL to auto-discover Chrome with debugging enabled.
- Use short timeouts (--timeout 10) so scripts fail fast instead of hanging on missing elements.
- Add --headless for unattended automation; omit it when you want to watch the browser window.