A real-time remote browser streaming system that allows users to interact with a browser instance running on a server through a web interface.
- 🏗️ Architecture & Design Choices
- Demo
- 🚀 Setup
- 🛠️ Prerequisites
- 💻 Local Development
- 🌐 Server
- 🌐 Client
- 📈 Future Improvements
    ┌─────────────────┐     ┌───────────────────────┐     ┌────────────────────┐
    │                 │     │                       │     │                    │
    │   Web Client    │◄───►│       Go Server       │◄───►│     Playwright     │
    │    (Browser)    │     │  (WebRTC Signaling)   │     │ (Headless Browser) │
    │                 │     │                       │     │                    │
    └─────────────────┘     └───────────────────────┘     └────────────────────┘
- Language/Framework: Go (Golang)
  - Chosen for its excellent concurrency model and performance
  - Built-in HTTP server for handling WebRTC signaling
  - Strong static typing and a rich standard library
- WebRTC (Pion Library)
  - Enables real-time, peer-to-peer communication
  - Handles NAT traversal using STUN servers
  - Efficient data channel for browser control messages
  - Low-latency video streaming capabilities
- Playwright-Go
  - Cross-browser automation
  - Headless browser control
  - Screenshot capture for screen sharing
  - JavaScript execution in the browser context
- WebRTC Data Channels
  - Bi-directional communication
  - Low-latency message passing
- Canvas API
  - Efficient rendering of remote screen updates
- WebRTC Over WebSockets
  - Chose WebRTC for its peer-to-peer capabilities
  - Lower latency for real-time interaction
  - Better handling of media streaming
  - Built-in NAT traversal
- Screenshot-Based Streaming
  - Simple implementation using Playwright's screenshot API
  - Frame-by-frame updates for screen sharing
- Graceful Shutdown
  - Proper cleanup of resources on server shutdown
  - Signal handling for Ctrl+C
  - Graceful degradation of services
- Security Considerations
  - Input validation on both client and server
  - CORS protection
- Error Handling
  - Comprehensive logging throughout the application
  - Graceful recovery from panics
  - User-friendly error messages
The system is built around several key interfaces that define the contract between different components:
- Purpose: Defines the contract for browser automation and interaction
- Key Methods:
  - `Start(url string) error`: Initializes and navigates the browser
  - `CaptureScreenshot() ([]byte, error)`: Takes screenshots of the current viewport
  - `HandleEvent(event UserEvent) error`: Processes various user input events
  - `HandleClick(x, y int) error`: Simulates mouse clicks
  - `HandleKeyUp/HandleKeyDown(key string) error`: Simulates keyboard input
  - `HandleScroll(deltaX, deltaY float64) error`: Simulates scrolling
- Why?
  - Decouples browser implementation from the streaming logic
  - Makes it easy to swap different browser automation backends (e.g., Playwright, CDP, etc.)
  - Simplifies unit testing through mocking
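
As a rough sketch of how this contract and the mocking benefit might look in Go: the method set below follows the list above, while the interface name `BrowserSession`, the `UserEvent` fields, and `mockSession` are assumptions made for illustration and may not match the repository.

```go
package main

import (
	"errors"
	"fmt"
)

// UserEvent is a simplified, hypothetical event shape used only in this sketch.
type UserEvent struct {
	Type           string
	X, Y           int
	Key            string
	DeltaX, DeltaY float64
}

// BrowserSession (assumed name) mirrors the method list above.
type BrowserSession interface {
	Start(url string) error
	CaptureScreenshot() ([]byte, error)
	HandleEvent(event UserEvent) error
	HandleClick(x, y int) error
	HandleKeyUp(key string) error
	HandleKeyDown(key string) error
	HandleScroll(deltaX, deltaY float64) error
}

// mockSession is a hypothetical test double: it records calls
// instead of driving a real browser.
type mockSession struct{ calls []string }

func (m *mockSession) record(format string, args ...interface{}) error {
	m.calls = append(m.calls, fmt.Sprintf(format, args...))
	return nil
}

func (m *mockSession) Start(url string) error             { return m.record("start:%s", url) }
func (m *mockSession) CaptureScreenshot() ([]byte, error) { return []byte("fake-frame"), nil }
func (m *mockSession) HandleClick(x, y int) error         { return m.record("click:%d,%d", x, y) }
func (m *mockSession) HandleKeyUp(key string) error       { return m.record("keyup:%s", key) }
func (m *mockSession) HandleKeyDown(key string) error     { return m.record("keydown:%s", key) }
func (m *mockSession) HandleScroll(dx, dy float64) error {
	return m.record("scroll:%.0f,%.0f", dx, dy)
}

// HandleEvent dispatches on the event type and rejects unknown types.
func (m *mockSession) HandleEvent(e UserEvent) error {
	switch e.Type {
	case "click":
		return m.HandleClick(e.X, e.Y)
	case "keyup":
		return m.HandleKeyUp(e.Key)
	case "keydown":
		return m.HandleKeyDown(e.Key)
	case "scroll":
		return m.HandleScroll(e.DeltaX, e.DeltaY)
	default:
		return errors.New("unsupported event type: " + e.Type)
	}
}

func main() {
	m := &mockSession{}
	var s BrowserSession = m // fails to compile if the mock drifts from the interface

	_ = s.Start("https://example.com")
	_ = s.HandleEvent(UserEvent{Type: "click", X: 10, Y: 20})
	err := s.HandleEvent(UserEvent{Type: "drag"})
	fmt.Println(m.calls, err)
}
```

Because the streaming logic only sees the interface, a test can pass in the mock and assert on the recorded calls without launching a browser.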
- Purpose: Represents user interactions in a serializable format
- Key Types:
  - `click`: Mouse click events with X,Y coordinates
  - `keyup` / `keydown`: Keyboard events
  - `scroll`: Scroll events with delta values
  - `navigate*`: Browser navigation commands
- Why?
  - Standardizes event representation across the application
  - Enables easy JSON serialization for WebRTC data channels
  - Makes it simple to add new event types in the future
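
To make the serialization point concrete, here is one possible JSON shape for such an event, with basic server-side validation. The field names, JSON keys, and the `decodeEvent` helper are assumptions for illustration. Treating every type that starts with `navigate` as a navigation command is likewise a guess at what `navigate*` means.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// UserEvent is a hypothetical wire format; the real field names may differ.
type UserEvent struct {
	Type   string  `json:"type"`
	X      int     `json:"x,omitempty"`
	Y      int     `json:"y,omitempty"`
	Key    string  `json:"key,omitempty"`
	DeltaX float64 `json:"deltaX,omitempty"`
	DeltaY float64 `json:"deltaY,omitempty"`
	URL    string  `json:"url,omitempty"`
}

// decodeEvent (hypothetical) parses one data-channel message and
// rejects malformed or unknown events before they reach the browser.
func decodeEvent(raw []byte) (UserEvent, error) {
	var ev UserEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return UserEvent{}, fmt.Errorf("decode event: %w", err)
	}
	switch ev.Type {
	case "click":
		if ev.X < 0 || ev.Y < 0 {
			return UserEvent{}, errors.New("click: negative coordinates")
		}
	case "keyup", "keydown":
		if ev.Key == "" {
			return UserEvent{}, errors.New(ev.Type + ": missing key")
		}
	case "scroll":
		// Any delta values are accepted in this sketch.
	default:
		// Assumption: all navigation commands share the "navigate" prefix.
		if !strings.HasPrefix(ev.Type, "navigate") {
			return UserEvent{}, fmt.Errorf("unknown event type %q", ev.Type)
		}
	}
	return ev, nil
}

func main() {
	ev, err := decodeEvent([]byte(`{"type":"click","x":120,"y":48}`))
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	out, _ := json.Marshal(ev)
	fmt.Println(string(out))
}
```

Adding a new event type then amounts to adding a field if needed and one more `case` in the validation switch.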
- Purpose: Manages the lifecycle and state of a streaming session
- Key Responsibilities:
  - Coordinates between WebRTC and browser sessions
  - Handles video frame streaming
  - Manages user input forwarding
- Why?
  - Encapsulates all session-related state
  - Simplifies connection management
  - Makes the system more maintainable and testable
- Interface-Based Design
  - All major components communicate through well-defined interfaces
  - Makes the system more modular and easier to maintain
- Event-Driven Architecture
  - User interactions are modeled as discrete events
  - Makes the system more responsive and easier to debug
- Separation of Concerns
  - Clear separation between browser control, networking, and UI
  - Each component has a single responsibility
  - Makes the codebase more maintainable and easier to understand
Click the image above to watch the demo video on Loom
- Clone the repository and navigate to the project directory:

      git clone https://github.com/johntharian/Remote-Browser-Client-Server-System.git
      cd Remote-Browser-Client-Server-System

- Install Go:
  - Download the installer from golang.org/dl
  - Run the installer and follow the prompts
- Verify the installation:

      go version
- Install Go dependencies:

      go mod tidy

- Install Playwright:

      # Install Playwright
      go get -u github.com/playwright-community/playwright-go
      # Install browsers (this might take a few minutes)
      go run github.com/playwright-community/playwright-go/cmd/playwright@v0.5200.0 install --with-deps

- Start the WebRTC server:

      go run cmd/server/main.go

- Open index.html (client) in a browser
- Runs a browser instance using Playwright.
- Streams the rendered contents of the browser (e.g., a webpage opened in the browser instance) to connected clients.
- Accepts user actions (e.g., clicks, typing, navigation) from a client and applies them to the Playwright browser instance.
- Runs locally in a browser.
- Displays the streamed contents of the remote Playwright browser session.
- Lets the user interact with the page (e.g., by clicking, scrolling, or typing).
- Sends those interactions back to the server, where they are executed in the Playwright-controlled browser.
- Chrome DevTools Protocol (CDP) Integration
  - Leverage Playwright's CDP support for more efficient browser session streaming
  - Implement frame-by-frame video streaming for smoother remote interaction
- Security Features
  - Add authentication and authorization
  - Implement session encryption
  - Add rate limiting and abuse prevention
- Multi-User Support
  - Session sharing capabilities
  - Collaborative browsing features
- User Interface Improvements
  - Fix issue with input bar not working
  - Add a loading spinner while waiting for the connection to be established
  - Add a button to disconnect from the server
  - Add a button to refresh the page
  - Add a button to close the browser