A real-time remote browser streaming system that allows users to interact with a browser instance running on a server through a web interface.

johntharian/Remote-Browser-Client-Server-System

Remote-Browser-Client-System

🏗️ Architecture and Design Choices

System Architecture

┌─────────────────┐     ┌───────────────────────┐     ┌────────────────────┐
│                 │     │                       │     │                    │
│  Web Client     │◄───►│  Go Server            │◄───►│  Playwright        │
│  (Browser)      │     │  (WebRTC Signaling)   │     │ (Headless Browser) │
│                 │     │                       │     │                    │
└─────────────────┘     └───────────────────────┘     └────────────────────┘

Technology Stack

1. Server-Side (Go)

  • Language/Framework: Go (Golang)

    • Chosen for its excellent concurrency model and performance
    • Built-in HTTP server for handling WebRTC signaling
    • Strong type safety and standard library
  • WebRTC (Pion Library)

    • Enables real-time, peer-to-peer communication
    • Handles NAT traversal using STUN servers
    • Efficient data channel for browser control messages
    • Low-latency video streaming capabilities
  • Playwright-Go

    • Cross-browser automation
    • Headless browser control
    • Screenshot capture for screen sharing
    • JavaScript execution in the browser context

2. Client-Side (JavaScript/HTML5)

  • WebRTC Data Channels

    • Bi-directional communication
    • Low-latency message passing
  • Canvas API

    • Efficient rendering of remote screen updates

Key Design Decisions

  1. WebRTC Over WebSockets

    • Chose WebRTC for its peer-to-peer capabilities
    • Lower latency for real-time interaction
    • Better handling of media streaming
    • Built-in NAT traversal
  2. Screenshot-Based Streaming

    • Simple implementation using Playwright's screenshot API
    • Frame-by-frame updates for screen sharing
  3. Graceful Shutdown

    • Proper cleanup of resources on server shutdown
    • Signal handling for Ctrl+C
    • Graceful degradation of services
  4. Security Considerations

    • Input validation on both client and server
    • CORS protection
  5. Error Handling

    • Comprehensive logging throughout the application
    • Graceful recovery from panics
    • User-friendly error messages
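
The "graceful recovery from panics" point can be illustrated with a small wrapper. This is a minimal sketch, not the repository's actual code: the helper name `safely` and the error wording are assumptions made for the example. It shows one common way to turn a panic inside an event handler into a logged, ordinary error.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// safely is a hypothetical helper: it runs fn and converts a panic into an
// ordinary error, so one bad event cannot crash the goroutine serving a session.
func safely(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("recovered from panic: %v", r)
			err = fmt.Errorf("internal error: %v", r)
		}
	}()
	return fn()
}

func main() {
	// A handler that succeeds, one that fails normally, and one that panics.
	fmt.Println(safely(func() error { return nil }))
	fmt.Println(safely(func() error { return errors.New("bad input") }))
	fmt.Println(safely(func() error { panic("nil page handle") }))
}
```

Because the panic is returned as an error, the caller can log it and report a user-friendly message to the client while keeping the session alive.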

Key Interfaces

The system is built around several key interfaces that define the contract between different components:

1. BrowserSession (internal/browser/browser.go)

  • Purpose: Defines the contract for browser automation and interaction
  • Key Methods:
    • Start(url string) error: Initializes and navigates the browser
    • CaptureScreenshot() ([]byte, error): Takes screenshots of the current viewport
    • HandleEvent(event UserEvent) error: Processes various user input events
    • HandleClick(x, y int) error: Simulates mouse clicks
    • HandleKeyUp/HandleKeyDown(key string) error: Simulates keyboard input
    • HandleScroll(deltaX, deltaY float64) error: Simulates scrolling
  • Why?
    • Decouples browser implementation from the streaming logic
    • Makes it easy to swap in a different browser automation backend (e.g., Playwright or CDP)
    • Simplifies unit testing through mocking
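
As a rough illustration of the mocking benefit, the sketch below restates the method list as a Go interface and adds a test double. The interface follows the methods listed above, but the exact signatures in internal/browser/browser.go may differ. The `UserEvent` stand-in and the `fakeSession` type are hypothetical and exist only for this example.

```go
package main

import "fmt"

// UserEvent is a stand-in for the type in internal/events; its fields are assumed.
type UserEvent struct {
	Type string
}

// BrowserSession follows the method list above; the real signatures may differ.
type BrowserSession interface {
	Start(url string) error
	CaptureScreenshot() ([]byte, error)
	HandleEvent(event UserEvent) error
	HandleClick(x, y int) error
	HandleKeyDown(key string) error
	HandleKeyUp(key string) error
	HandleScroll(deltaX, deltaY float64) error
}

// fakeSession is a hypothetical test double that records calls instead of
// driving a real browser.
type fakeSession struct {
	calls []string
}

func (f *fakeSession) record(format string, args ...any) error {
	f.calls = append(f.calls, fmt.Sprintf(format, args...))
	return nil
}

func (f *fakeSession) Start(url string) error             { return f.record("start %s", url) }
func (f *fakeSession) CaptureScreenshot() ([]byte, error) { return []byte("fake-frame"), nil }
func (f *fakeSession) HandleEvent(e UserEvent) error      { return f.record("event %s", e.Type) }
func (f *fakeSession) HandleClick(x, y int) error         { return f.record("click %d,%d", x, y) }
func (f *fakeSession) HandleKeyDown(key string) error     { return f.record("keydown %s", key) }
func (f *fakeSession) HandleKeyUp(key string) error       { return f.record("keyup %s", key) }
func (f *fakeSession) HandleScroll(dx, dy float64) error  { return f.record("scroll %.0f,%.0f", dx, dy) }

// Compile-time check that the fake satisfies the interface.
var _ BrowserSession = (*fakeSession)(nil)

func main() {
	fake := &fakeSession{}
	var s BrowserSession = fake // streaming code would only see the interface
	s.Start("https://example.com")
	s.HandleClick(120, 48)
	s.HandleScroll(0, 240)
	fmt.Println(fake.calls)
}
```

Code that depends only on `BrowserSession` can be unit tested against `fakeSession` without launching Playwright.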

2. UserEvent (internal/events/events.go)

  • Purpose: Represents user interactions in a serializable format
  • Key Types:
    • click: Mouse click events with X,Y coordinates
    • keyup/keydown: Keyboard events
    • scroll: Scroll events with delta values
    • navigate*: Browser navigation commands
  • Why?
    • Standardizes event representation across the application
    • Enables easy JSON serialization for WebRTC data channels
    • Makes it simple to add new event types in the future
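
A serializable event might look like the following. This is a sketch under assumptions: the field names, JSON tags, and the `decodeEvent` helper are illustrative and are not taken from internal/events/events.go. It shows how a message arriving on the data channel could be decoded and minimally validated on the server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UserEvent is an assumed shape; the real struct may use different fields.
type UserEvent struct {
	Type   string  `json:"type"` // "click", "keydown", "keyup", "scroll", ...
	X      int     `json:"x,omitempty"`
	Y      int     `json:"y,omitempty"`
	Key    string  `json:"key,omitempty"`
	DeltaX float64 `json:"deltaX,omitempty"`
	DeltaY float64 `json:"deltaY,omitempty"`
	URL    string  `json:"url,omitempty"`
}

// decodeEvent (hypothetical helper) parses one data-channel message and
// rejects payloads that are not valid JSON or carry no event type.
func decodeEvent(data []byte) (UserEvent, error) {
	var ev UserEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return UserEvent{}, fmt.Errorf("decode event: %w", err)
	}
	if ev.Type == "" {
		return UserEvent{}, fmt.Errorf("decode event: missing type")
	}
	return ev, nil
}

func main() {
	ev, err := decodeEvent([]byte(`{"type":"click","x":120,"y":48}`))
	fmt.Println(ev.Type, ev.X, ev.Y, err)

	out, _ := json.Marshal(UserEvent{Type: "scroll", DeltaY: 240})
	fmt.Println(string(out))
}
```

Adding a new event type in this layout means choosing a new `type` string and, if needed, another optional field; existing messages keep decoding unchanged.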

3. StreamingSession (internal/streaming/session.go)

  • Purpose: Manages the lifecycle and state of a streaming session
  • Key Responsibilities:
    • Coordinates between WebRTC and browser sessions
    • Handles video frame streaming
    • Manages user input forwarding
  • Why?
    • Encapsulates all session-related state
    • Simplifies connection management
    • Makes the system more maintainable and testable
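
The frame-streaming responsibility can be sketched as a ticker-driven loop. Everything here is an assumption made for illustration: `frameSource`, `streamFrames`, `staticSource`, and `demo` are hypothetical names, and the real session presumably sends frames over WebRTC rather than to a plain callback.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// frameSource is the one method of the browser session this loop needs.
type frameSource interface {
	CaptureScreenshot() ([]byte, error)
}

// streamFrames captures a screenshot every interval and hands it to send,
// until ctx is cancelled or a capture or send fails.
func streamFrames(ctx context.Context, src frameSource, send func([]byte) error, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			frame, err := src.CaptureScreenshot()
			if err != nil {
				return fmt.Errorf("capture: %w", err)
			}
			if err := send(frame); err != nil {
				return fmt.Errorf("send: %w", err)
			}
		}
	}
}

// staticSource is a stand-in that returns the same bytes for every frame.
type staticSource struct{}

func (staticSource) CaptureScreenshot() ([]byte, error) { return []byte("png-bytes"), nil }

var errPeerClosed = errors.New("peer closed")

// demo streams until the sink has accepted limit frames, then simulates the
// remote peer going away by failing the next send.
func demo(limit int) (int, error) {
	sent := 0
	send := func([]byte) error {
		if sent == limit {
			return errPeerClosed
		}
		sent++
		return nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := streamFrames(ctx, staticSource{}, send, 5*time.Millisecond)
	return sent, err
}

func main() {
	sent, err := demo(3)
	fmt.Println("frames sent:", sent, "stopped by:", err)
}
```

Tying the loop to a context means that closing the session (or shutting down the server) stops the capture goroutine without extra bookkeeping.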

Design Decisions

  1. Interface-Based Design

    • All major components communicate through well-defined interfaces
    • Makes the system more modular and easier to maintain
  2. Event-Driven Architecture

    • User interactions are modeled as discrete events
    • Makes the system more responsive and easier to debug
  3. Separation of Concerns

    • Clear separation between browser control, networking, and UI
    • Each component has a single responsibility
    • Makes the codebase more maintainable and easier to understand
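
One way to picture the event-driven design is a small registry that maps event types to handlers. This is a sketch only: `Dispatcher`, `Handler`, and the trimmed-down `UserEvent` are hypothetical names for this example, and the repository may route events differently (for instance with a switch inside HandleEvent).

```go
package main

import "fmt"

// UserEvent is a reduced stand-in for the type in internal/events.
type UserEvent struct {
	Type string
	X, Y int
	Key  string
}

// Handler processes a single event.
type Handler func(UserEvent) error

// Dispatcher routes each event to the handler registered for its type.
type Dispatcher struct {
	handlers map[string]Handler
}

func NewDispatcher() *Dispatcher {
	return &Dispatcher{handlers: make(map[string]Handler)}
}

// On registers (or replaces) the handler for one event type.
func (d *Dispatcher) On(eventType string, h Handler) {
	d.handlers[eventType] = h
}

// Dispatch returns an error for unknown types instead of silently dropping them.
func (d *Dispatcher) Dispatch(ev UserEvent) error {
	h, ok := d.handlers[ev.Type]
	if !ok {
		return fmt.Errorf("no handler for event type %q", ev.Type)
	}
	return h(ev)
}

func main() {
	d := NewDispatcher()
	d.On("click", func(ev UserEvent) error {
		fmt.Printf("click at %d,%d\n", ev.X, ev.Y)
		return nil
	})
	d.On("keydown", func(ev UserEvent) error {
		fmt.Printf("key down: %s\n", ev.Key)
		return nil
	})

	fmt.Println(d.Dispatch(UserEvent{Type: "click", X: 10, Y: 20}))
	fmt.Println(d.Dispatch(UserEvent{Type: "hover"}))
}
```

Each handler stays small and independently testable, and an unknown event surfaces as an error that can be logged during debugging.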

🎥 Demo

Working Demo

Click the image above to watch the demo video on Loom

🚀 Setup

  1. Clone the repository and navigate to the project directory:
    git clone https://github.com/johntharian/Remote-Browser-Client-Server-System.git
    cd Remote-Browser-Client-Server-System

Prerequisites

Install Go

  • Download the installer from golang.org/dl
  • Run the installer and follow the prompts

Verify the installation:

go version

Local Development

  1. Install Go dependencies:

    go mod tidy
  2. Install Playwright

    # Install Playwright
    go get -u github.com/playwright-community/playwright-go
    
    # Install browsers (this might take a few minutes)
    go run github.com/playwright-community/playwright-go/cmd/playwright@v0.5200.0 install --with-deps
  3. Start the WebRTC server:

    go run cmd/server/main.go
  4. Open index.html (client) in a browser

Server

  • Runs a browser instance using Playwright.
  • Streams the rendered contents of the browser (e.g., a webpage opened in the browser instance) to connected clients.
  • Accepts user actions (e.g., clicks, typing, navigation) from a client and applies them to the Playwright browser instance.

Client

  • Runs locally in a browser.
  • Displays the streamed contents of the remote Playwright browser session.
  • Lets the user interact with the page (e.g., by clicking, scrolling, or typing).
  • Sends those interactions back to the server, where they are executed in the Playwright-controlled browser.

Future Improvements

Core Functionality

  1. Chrome DevTools Protocol (CDP) Integration
    • Leverage Playwright's CDP support for more efficient browser session streaming
    • Implement frame-by-frame video streaming for smoother remote interaction

Security & Reliability

  1. Security Features
    • Add authentication and authorization
    • Implement session encryption
    • Add rate limiting and abuse prevention

Scalability

  1. Multi-User Support
    • Session sharing capabilities
    • Collaborative browsing features

Client

  1. User Interface Improvements
    • Fix issue with input bar not working
    • Add a loading spinner while waiting for the connection to be established
    • Add a button to disconnect from the server
    • Add a button to refresh the page
    • Add a button to close the browser
