A high-performance Playwright MCP (Model Context Protocol) server with intelligent DOM compression and content search capabilities for browser automation.
- 🎭 Full Playwright browser automation via MCP
- 🏗️ Client-server architecture with HTTP API
- 📍 Ref-based element identification system ([ref=e1], [ref=e2], etc.)
- 🔍 Powerful regex-based content search using ripgrep
- 💾 Persistent browser profiles with Chrome
- 🚀 91%+ DOM compression with intelligent list folding
- 📄 Semantic HTML snapshots using Playwright's internal APIs
- ⚡ High-performance search with safety limits
# Install globally (for the CLI)
npm install -g better-playwright-mcp3
# Or install locally (for use as an SDK)
npm install better-playwright-mcp3
Prerequisites:
- First, start the HTTP server:
npx better-playwright-mcp3@latest server
- Then use the SDK in your code:
import { PlaywrightClient } from 'better-playwright-mcp3';
async function automateWebPage() {
  // Connect to the HTTP server (must be running)
  const client = new PlaywrightClient('http://localhost:3102');

  // Create a page
  const { pageId, success } = await client.createPage(
    'my-page', // page name
    'Test page', // description
    'https://example.com' // URL
  );

  // Get page structure with intelligent folding
  const outline = await client.getOutline(pageId);
  console.log(outline);
  // Returns compressed outline (~90% reduction) with list folding

  // Search for specific content (regex by default)
  const searchResult = await client.searchSnapshot(pageId, 'Example', { ignoreCase: true });
  console.log(searchResult);

  // Search with regular expressions (default behavior)
  const prices = await client.searchSnapshot(pageId, '\\$[0-9]+\\.\\d{2}', { lineLimit: 10 });

  // Search multiple patterns (OR)
  const links = await client.searchSnapshot(pageId, 'link|button|input', { ignoreCase: true });

  // Interact with the page using ref identifiers
  await client.browserClick(pageId, 'e3'); // Click element
  await client.browserType(pageId, 'e4', 'Hello World'); // Type text
  await client.browserHover(pageId, 'e2'); // Hover over element

  // Navigation
  await client.browserNavigate(pageId, 'https://google.com');
  await client.browserNavigateBack(pageId);
  await client.browserNavigateForward(pageId);

  // Scrolling
  await client.scrollToBottom(pageId);
  await client.scrollToTop(pageId);

  // Waiting
  await client.waitForTimeout(pageId, 2000); // Wait 2 seconds
  await client.waitForSelector(pageId, 'body');

  // Take screenshots
  const screenshot = await client.screenshot(pageId, true); // Full page

  // Clean up
  await client.closePage(pageId);
}
Available Methods:
- Page Management: createPage, closePage, listPages
- Navigation: browserNavigate, browserNavigateBack, browserNavigateForward
- Interaction: browserClick, browserType, browserHover, browserSelectOption, fill
- Advanced Actions: browserPressKey, browserFileUpload, browserHandleDialog
- Page Structure: getOutline - Get intelligently compressed page structure with list folding (NEW in v3.2.0)
- Content Search: searchSnapshot - Search page content with regex patterns (powered by ripgrep)
- Screenshots: screenshot - Capture page as image
- Scrolling: scrollToBottom, scrollToTop
- Waiting: waitForTimeout, waitForSelector
The MCP server requires an HTTP server to be running. You need to start both:
Step 1: Start the HTTP server
npx better-playwright-mcp3@latest server
Step 2: In another terminal, start the MCP server
npx better-playwright-mcp3@latest
The MCP server will:
- Start listening on stdio for MCP protocol messages
- Connect to the HTTP server on port 3102
- Route browser automation commands through the HTTP server
You can run the HTTP server independently:
npx better-playwright-mcp3@latest server
Options:
- -p, --port <number> - Server port (default: 3102)
- --host <string> - Server host (default: localhost)
- --headless - Run browser in headless mode
- --chromium - Use Chromium instead of Chrome
- --no-user-profile - Do not use persistent user profile
- --user-data-dir <path> - User data directory
When used with AI assistants, the following tools are available:
- createPage - Create a new browser page with name and description
- closePage - Close a specific page
- listPages - List all managed pages with titles and URLs
- browserClick - Click an element using its ref identifier
- browserType - Type text into an element
- browserHover - Hover over an element
- browserSelectOption - Select options in a dropdown
- browserPressKey - Press keyboard keys
- browserFileUpload - Upload files to file input
- browserHandleDialog - Handle browser dialogs (alert, confirm, prompt)
- browserNavigate - Navigate to a URL
- browserNavigateBack - Go back to previous page
- browserNavigateForward - Go forward to next page
- scrollToBottom - Scroll to bottom of page/element
- scrollToTop - Scroll to top of page/element
- waitForTimeout - Wait for specified milliseconds
- waitForSelector - Wait for element to appear
- searchSnapshot - Search page content using regex patterns (powered by ripgrep)
- screenshot - Take a screenshot (PNG/JPEG)
The outline generation uses a three-step compression algorithm:
- Unwrap - Remove meaningless generic wrapper nodes
- Text Truncation - Limit text content to 50 characters
- List Folding - Detect and compress repetitive patterns using SimHash
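The first two steps can be illustrated on a simplified tree. This is a minimal sketch, not the project's implementation: the node shape `{ role, text, ref, children }` is hypothetical, and a "meaningless wrapper" is assumed here to be a `generic` node with no text and exactly one child. Whether the real code appends a truncation marker is unknown, so this sketch simply cuts the text at 50 characters.

```javascript
// Hypothetical node shape: { role, text, ref, children }.
const MAX_TEXT = 50; // limit stated in the list above

// Step 1 (assumed rule): a "generic" node with no text and a single child
// adds no information, so it is replaced by that child.
function unwrap(node) {
  const children = (node.children || []).map(unwrap);
  if (node.role === 'generic' && !node.text && children.length === 1) {
    return children[0];
  }
  return { ...node, children };
}

// Step 2: cap text content at MAX_TEXT characters.
function truncateText(node) {
  const text = node.text && node.text.length > MAX_TEXT
    ? node.text.slice(0, MAX_TEXT)
    : node.text;
  return { ...node, text, children: (node.children || []).map(truncateText) };
}

// Two nested wrappers around a paragraph with 80 characters of text.
const tree = {
  role: 'generic',
  children: [{
    role: 'generic',
    children: [{ role: 'paragraph', ref: 'e4', text: 'x'.repeat(80), children: [] }],
  }],
};

const compact = truncateText(unwrap(tree));
console.log(compact.role, compact.ref, compact.text.length); // paragraph e4 50
```

Both wrappers disappear and the paragraph keeps its ref, which is the property the later interaction calls depend on.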
Original DOM (5000+ lines)
↓
[Remove empty wrappers]
↓
[Detect similar patterns]
↓
Compressed Outline (<500 lines, ~91% reduction)
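The "[Detect similar patterns]" stage relies on SimHash. A minimal sketch of the general technique follows, assuming 32-bit fingerprints, an FNV-1a-style token hash, and string tokens that describe a DOM subtree. All three are illustrative assumptions; `dom-simhash.ts` may choose differently.

```javascript
// 32-bit FNV-1a-style hash of a string (a convenient token hash for the sketch).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// SimHash: each token votes +1 or -1 on every bit position; the sign of the
// total decides the corresponding bit of the fingerprint.
function simhash(tokens) {
  const votes = new Array(32).fill(0);
  for (const token of tokens) {
    const h = fnv1a(token);
    for (let bit = 0; bit < 32; bit++) {
      votes[bit] += (h >>> bit) & 1 ? 1 : -1;
    }
  }
  let fingerprint = 0;
  for (let bit = 0; bit < 32; bit++) {
    if (votes[bit] > 0) fingerprint |= 1 << bit;
  }
  return fingerprint >>> 0;
}

// Number of differing bits between two fingerprints.
function hammingDistance(a, b) {
  let x = (a ^ b) >>> 0;
  let count = 0;
  while (x !== 0) {
    count += x & 1;
    x >>>= 1;
  }
  return count;
}

const cardA = simhash(['listitem', 'link', 'img', 'text', 'price']);
const cardB = simhash(['listitem', 'link', 'img', 'text', 'price']);
console.log(hammingDistance(cardA, cardB)); // 0 for identical token lists
```

Subtrees whose fingerprints differ in only a few bits can then be treated as instances of the same pattern. The distance threshold is a tuning choice this sketch does not fix.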
Example compression:
// Before: 48 similar product cards
- listitem [ref=e234]: Product 1 details...
- listitem [ref=e235]: Product 2 details...
- listitem [ref=e236]: Product 3 details...
... (45 more items)
// After: Folded representation
- listitem [ref=e234]: Product 1 details...
- listitem (... and 47 more similar) [refs: e235, e236, ...]
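The folded form above can be reproduced with a simplified folding pass. In this sketch the items are assumed to be already parsed into hypothetical `{ role, ref, text }` objects, and consecutive items with the same role count as "similar", a deliberately crude stand-in for the SimHash comparison. The `minRun` parameter is also an assumption.

```javascript
// Fold runs of consecutive items that share a role (a stand-in for the
// SimHash similarity test). The first item of each run is kept as a sample.
function foldList(items, minRun = 3) {
  const lines = [];
  let i = 0;
  while (i < items.length) {
    let j = i + 1;
    while (j < items.length && items[j].role === items[i].role) j++;
    const run = items.slice(i, j);
    if (run.length >= minRun) {
      const [sample, ...rest] = run;
      const shown = rest.slice(0, 2).map((item) => item.ref).join(', ');
      const more = rest.length > 2 ? ', ...' : '';
      lines.push(`- ${sample.role} [ref=${sample.ref}]: ${sample.text}`);
      lines.push(`- ${sample.role} (... and ${rest.length} more similar) [refs: ${shown}${more}]`);
    } else {
      for (const item of run) lines.push(`- ${item.role} [ref=${item.ref}]: ${item.text}`);
    }
    i = j;
  }
  return lines;
}

// 48 product cards with refs e234..e281, as in the example above.
const cards = Array.from({ length: 48 }, (_, n) => ({
  role: 'listitem',
  ref: `e${234 + n}`,
  text: `Product ${n + 1} details...`,
}));
console.log(foldList(cards).join('\n'));
```

Runs shorter than `minRun` are emitted unchanged, so small lists stay fully visible.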
This project implements a two-tier architecture optimized for minimal token usage:
- MCP Server - Communicates with AI assistants via Model Context Protocol
- HTTP Server - Controls browser instances and provides grep-based search
AI Assistant <--[MCP Protocol]--> MCP Server <--[HTTP]--> HTTP Server <---> Browser
|
v
ripgrep engine
- Minimal Token Usage: Intelligent compression reduces DOM by ~91%
- On-Demand Search: Content retrieved via regex patterns when needed
- Performance: Uses ripgrep for 10x+ faster searching
- Safety: Automatic result limiting to prevent context overflow
Elements in snapshots are identified using ref attributes (e.g., [ref=e1], [ref=e2]). This system:
- Provides stable identifiers for elements
- Works with Playwright's internal aria-ref selectors
- Enables precise element targeting across page changes
Example snapshot:
- generic [ref=e2]:
  - heading "Example Domain" [level=1] [ref=e3]
  - paragraph [ref=e4]: This domain is for use in illustrative examples
  - link "More information..." [ref=e5] [cursor=pointer]
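Snapshot text in this format is easy to post-process. The sketch below pulls ref identifiers out of a snapshot string, assuming refs always look like `e` followed by digits, as in the examples. `extractRefs` and `findLineByRef` are hypothetical helpers, not part of the SDK, and refs listed only inside a folded `[refs: ...]` group are not picked up.

```javascript
// Collect every ref identifier (e1, e2, ...) that appears in snapshot text.
function extractRefs(snapshot) {
  return [...snapshot.matchAll(/\[ref=(e\d+)\]/g)].map((m) => m[1]);
}

// Return the snapshot line that carries a given ref, or null if absent.
function findLineByRef(snapshot, ref) {
  const marker = `[ref=${ref}]`;
  return snapshot.split('\n').find((line) => line.includes(marker)) ?? null;
}

const snapshot = [
  '- generic [ref=e2]:',
  '  - heading "Example Domain" [level=1] [ref=e3]',
  '  - paragraph [ref=e4]: This domain is for use in illustrative examples',
  '  - link "More information..." [ref=e5] [cursor=pointer]',
].join('\n');

console.log(extractRefs(snapshot)); // [ 'e2', 'e3', 'e4', 'e5' ]
console.log(findLineByRef(snapshot, 'e5'));
```

Checking that a ref appears in the current snapshot before acting on it avoids clicking an identifier that no longer exists.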
// Create a page
const { pageId, success } = await client.createPage(
  'shopping',
  'Amazon shopping page',
  'https://amazon.com'
);
// Navigate to another URL
await client.browserNavigate(pageId, 'https://google.com');
// Go back/forward
await client.browserNavigateBack(pageId);
await client.browserNavigateForward(pageId);

// Get intelligently compressed page outline
const outline = await client.getOutline(pageId);
console.log(outline);
// Example output showing list folding:
// Page Outline (473/5257 lines):
// - banner [ref=e1]
// - navigation [ref=e2]
// - list "Products" [ref=e3]
//   - listitem "Product 1" [ref=e4]
//   - listitem (... and 47 more similar) [refs: e5, e6, ...]
//
// Compression: 91% reduction while preserving all refs

// Search for text (case insensitive)
const results = await client.searchSnapshot(pageId, 'product', { ignoreCase: true });
// Search with regular expression (default behavior)
const emails = await client.searchSnapshot(pageId, '[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-z]+');
// Search multiple patterns (OR)
const buttons = await client.searchSnapshot(pageId, 'button|submit|click', { ignoreCase: true });
// Search for prices with dollar sign
const prices = await client.searchSnapshot(pageId, '\\$\\d+\\.\\d{2}');
// Limit number of result lines
const firstTen = await client.searchSnapshot(pageId, 'item', { lineLimit: 10 });
Search Options:
- pattern (required) - Regex pattern to search for
- ignoreCase (optional) - Case insensitive search (default: false)
- lineLimit (optional) - Maximum lines to return (default: 100, max: 100)
Response Format:
- result - Matched text content
- matchCount - Total number of matches found
- truncated - Whether results were truncated due to line limit
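The options and response fields can be modeled locally to see how they interact. This sketch applies a JavaScript `RegExp` line by line instead of calling ripgrep, so it only approximates the server. `searchLines` is a hypothetical name, and counting matching lines for `matchCount` is an assumption about the real semantics.

```javascript
const MAX_LINE_LIMIT = 100; // documented default and maximum

// Local stand-in for searchSnapshot: filter lines by regex, then cap the output.
function searchLines(text, pattern, { ignoreCase = false, lineLimit = MAX_LINE_LIMIT } = {}) {
  const regex = new RegExp(pattern, ignoreCase ? 'i' : '');
  const limit = Math.min(Math.max(lineLimit, 1), MAX_LINE_LIMIT);
  const matches = text.split('\n').filter((line) => regex.test(line));
  return {
    result: matches.slice(0, limit).join('\n'),
    matchCount: matches.length,
    truncated: matches.length > limit,
  };
}

const page = ['- text: $19.99', '- text: Sold out', '- text: $5.00', '- text: $120.50'].join('\n');
const found = searchLines(page, '\\$\\d+\\.\\d{2}', { lineLimit: 2 });
console.log(found);
// { result: '- text: $19.99\n- text: $5.00', matchCount: 3, truncated: true }
```

A `truncated` value of true is the signal to narrow the pattern rather than to raise the limit, since the limit cannot exceed 100.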
// Click on element using its ref identifier
await client.browserClick(pageId, 'e3');
// Type text into input field
await client.browserType(pageId, 'e4', 'search query');
// Hover over element
await client.browserHover(pageId, 'e2');
// Press keyboard key
await client.browserPressKey(pageId, 'Enter');

// Scroll page
await client.scrollToBottom(pageId);
await client.scrollToTop(pageId);
// Wait operations
await client.waitForTimeout(pageId, 2000); // Wait 2 seconds
await client.waitForSelector(pageId, '#my-element');
When using this library with AI assistants, follow this optimized workflow for maximum efficiency:
// Always begin by getting the compressed page structure
const outline = await client.getOutline(pageId);
// Returns intelligently compressed view with ~91% reduction
The outline provides:
- Complete page structure with intelligent list folding
- First element of each pattern preserved as sample
- All ref identifiers for precise element targeting
- Clear indication of repetitive patterns (e.g., "... and 47 more similar")
// Based on outline understanding, perform targeted searches
const searchResults = await client.searchSnapshot(pageId, 'specific term', {
  ignoreCase: true,
  lineLimit: 10
});
// Now you know exactly what to search for and where it might be

// Use ref IDs discovered from outline or grep, not guesswork
await client.browserClick(pageId, 'e42'); // Ref ID confirmed from outline

Token Efficiency: Compressed outline (typically <500 lines) + targeted searches use far fewer tokens than full snapshots (often 5000+ lines)
Accuracy: The outline shows actual page structure, preventing incorrect assumptions about element locations
Smart Compression: The algorithm preserves one sample from each pattern group, so AI understands the structure without seeing all repetitions
- ❌ Don't blindly try random ref IDs without verification
- ❌ Don't request full snapshots that exceed token limits
- ❌ Don't make assumptions about page structure without checking the outline first
- ❌ Don't use generic search patterns when specific ones would be more efficient
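The outline-first workflow can be wrapped in one helper. This is a sketch under assumptions: `clickFirstMatch` is a hypothetical function, the outline is treated as an opaque value, and `searchSnapshot` is assumed to resolve to an object with a `result` string as described under Response Format. A stub client stands in for `PlaywrightClient` so the example runs without a server.

```javascript
// Outline first, then a targeted search, then act only on a ref that was seen.
async function clickFirstMatch(client, pageId, pattern) {
  // 1. Understand the page structure.
  const outline = await client.getOutline(pageId);
  // 2. Run a targeted, limited search.
  const { result } = await client.searchSnapshot(pageId, pattern, {
    ignoreCase: true,
    lineLimit: 10,
  });
  // 3. Act on a ref that actually appeared in the search result.
  const match = /\[ref=(e\d+)\]/.exec(result || '');
  if (!match) return { outline, clicked: null };
  await client.browserClick(pageId, match[1]);
  return { outline, clicked: match[1] };
}

// Stub client (illustrative only); pass a real PlaywrightClient in practice.
const clickedRefs = [];
const stubClient = {
  getOutline: async () => '- list "Products" [ref=e3]',
  searchSnapshot: async () => ({
    result: '- button "Add to cart" [ref=e42]',
    matchCount: 1,
    truncated: false,
  }),
  browserClick: async (pageId, ref) => { clickedRefs.push(ref); },
};

clickFirstMatch(stubClient, 'page-1', 'add to cart').then((outcome) => {
  console.log(outcome.clicked); // e42
});
```

Returning `clicked: null` instead of guessing keeps the helper consistent with the first rule in the list above.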
// GOOD: Outline-first approach
const outline = await client.getOutline(pageId);
// Shows: "- listitem [ref=e234]: [first product]"
// "- listitem (... and 47 more similar) [refs: e235, e236, ...]"
// Now search for specific product attributes
const prices = await client.searchSnapshot(pageId, '\\$\\d+\\.\\d{2}', { lineLimit: 10 });
// BAD: Blind searching without context
const results = await client.searchSnapshot(pageId, 'product', { ignoreCase: true }); // Too generic
await client.browserClick(pageId, 'e1'); // Guessing ref IDs
- Node.js >= 18.0.0
- TypeScript
- Chrome or Chromium browser
# Clone the repository
git clone https://github.com/yourusername/better-playwright-mcp.git
cd better-playwright-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Run in development mode
npm run dev
better-playwright-mcp3/
├── src/
│   ├── index.ts                          # Main export file
│   ├── mcp-server.ts                     # MCP server implementation
│   ├── client/
│   │   └── playwright-client.ts          # HTTP client for browser automation
│   ├── server/
│   │   └── playwright-server.ts          # HTTP server controlling browsers
│   └── utils/
│       ├── smart-outline-simple.ts       # Intelligent outline generation
│       ├── list-detector.ts              # Pattern detection using SimHash
│       ├── dom-simhash.ts                # SimHash implementation
│       └── remove-useless-wrappers.ts    # DOM cleanup
├── bin/
│   └── cli.js                            # CLI entry point
├── docs/
│   └── architecture.md                   # Detailed architecture documentation
├── package.json
├── tsconfig.json
└── README.md
- Port already in use
  - Change the port using the -p flag: npx better-playwright-mcp3 server -p 3103
  - Or set an environment variable: PORT=3103 npx better-playwright-mcp3 server
- Browser not launching
  - Ensure Chrome or Chromium is installed
  - Try using the --chromium flag for Chromium
  - Check system resources
- Element not found
  - Verify the ref identifier exists in the outline
  - Use searchSnapshot() to search for elements
  - Wait for elements using waitForSelector()
- Search returns too many results
  - Use more specific patterns
  - Use the lineLimit option to limit results
  - Leverage regex features for precise matching
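Because search patterns are regular expressions by default, literal text containing characters such as `$`, `.`, or `(` must be escaped to match precisely. The helper below is a hypothetical utility, not part of the SDK. It uses the common JavaScript escaping idiom, and backslash escapes of these characters are assumed to carry over to the ripgrep-backed search as well.

```javascript
// Escape characters that have special meaning in a regular expression so
// the text is matched literally.
function escapeRegex(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const literalPattern = escapeRegex('$19.99 (sale)');
console.log(literalPattern); // \$19\.99 \(sale\)
console.log(new RegExp(literalPattern).test('Now $19.99 (sale)')); // true
```

The escaped string can then be passed as the pattern argument of `searchSnapshot` when an exact phrase is wanted.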
Enable detailed logging:
DEBUG=* npx better-playwright-mcp3
Contributions are welcome! Please feel free to submit a Pull Request.
MIT