WebRover

Your AI Co-pilot for Web Navigation 🚀

Autonomous Web Agent | Task Automation | Information Retrieval | Deep Research

Overview

WebRover is an autonomous AI agent designed to interpret user input and execute actions by interacting with web elements to accomplish tasks or answer questions. It leverages advanced language models and web automation tools to navigate the web, gather information, and provide structured responses based on the user's needs.

Key Features

Agent Capabilities

Three specialized agents for different use cases (Task, Research, Deep Research)
Dynamic agent selection based on task complexity
Real-time agent state visualization
Streaming agent actions and thoughts
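
How "dynamic agent selection based on task complexity" works is not spelled out in this README. The following is only a minimal sketch, assuming a crude keyword heuristic. The names `AgentType` and `choose_agent` and both hint-word sets are hypothetical and are not WebRover's actual logic, which may well ask an LLM to classify the request instead.

```python
from enum import Enum


class AgentType(Enum):
    TASK = "task"
    RESEARCH = "research"
    DEEP_RESEARCH = "deep_research"


# Hypothetical hint words (assumption); a real selector would likely be richer.
ACTION_HINTS = {"book", "buy", "order", "fill", "click", "login", "submit"}
DEPTH_HINTS = {"report", "paper", "survey", "comprehensive", "in-depth", "literature"}


def choose_agent(query: str) -> AgentType:
    """Pick an agent type from a query using a simple keyword heuristic."""
    words = {w.strip(".,!?").lower() for w in query.split()}
    if words & DEPTH_HINTS:
        # Requests for long-form output go to the most thorough agent.
        return AgentType.DEEP_RESEARCH
    if words & ACTION_HINTS:
        # Requests that ask to *do* something on a page go to the Task agent.
        return AgentType.TASK
    # Everything else is treated as a plain information lookup.
    return AgentType.RESEARCH
```

With these assumptions, `choose_agent("Book a table for two in Boston")` selects the Task agent, while a plain question falls through to the Research agent.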

Browser Integration

Local browser instance for privacy and control
Multi-tab management
PDF document handling
Secure browsing sessions

User Interface

Modern chat interface with real-time updates
Interactive agent selection
Action streaming with visual feedback
Real-time page annotations and highlights

Output Options

Direct chat responses
One-click Google Docs export
PDF download functionality
Copy to clipboard support

Research Tools

Vector store for information retention
Multi-source verification
Academic paper generation
Reference management
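
To illustrate the idea behind a vector store for information retention, here is a small in-memory stand-in. It is a sketch under clear assumptions: it swaps learned embeddings for plain word counts with cosine similarity, and `TinyVectorStore`, `embed`, and `cosine` are hypothetical names, not part of WebRover or any library.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory store that keeps page snippets together with their source URL."""

    def __init__(self) -> None:
        self._items: list[tuple[Counter, str, str]] = []

    def add(self, text: str, source: str) -> None:
        """Store a snippet and remember where it came from."""
        self._items.append((embed(text), text, source))

    def search(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        """Return the k (text, source) pairs most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [(text, source) for _, text, source in ranked[:k]]
```

Keeping the source URL next to each snippet is what makes later reference management possible: whatever is retrieved can be cited.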

Technical Features

State-of-the-art LLM integration (GPT-4o, o3-mini-high, Claude 3.5 Sonnet)
RAG pipeline for enhanced responses
LangGraph for state management
Playwright for reliable web automation

Agent Types

1. Task Agent

A specialized automation agent for executing web-based tasks and workflows.

Custom action planning for multi-step tasks
Dynamic element interaction based on context
Real-time task progress monitoring
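
Multi-step action planning with progress monitoring could be sketched as below. This is an illustration only: the `Action` fields, the action kinds, and `run_plan` are hypothetical, and the `execute` callable stands in for whatever actually drives the browser.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class Action:
    """One planned step, e.g. open a URL, click a button, or type into a field."""
    kind: str        # hypothetical kinds: "goto", "click", "type"
    target: str      # URL or element description
    text: str = ""   # text to type, if any


def run_plan(plan: list[Action], execute: Callable[[Action], bool]) -> Iterator[str]:
    """Execute actions in order, yielding one progress line per step.

    Stops at the first failing action so the caller can re-plan.
    """
    total = len(plan)
    for i, action in enumerate(plan, start=1):
        ok = execute(action)
        status = "done" if ok else "failed"
        yield f"[{i}/{total}] {action.kind} {action.target}: {status}"
        if not ok:
            return


plan = [
    Action("goto", "https://example.com"),
    Action("type", "search box", "webrover"),
    Action("click", "Search button"),
]
# Simulated executor that always succeeds; a real one would act on the page.
for line in run_plan(plan, execute=lambda action: True):
    print(line)
```

Because `run_plan` is a generator, each progress line is available as soon as its step finishes, which is the shape a streaming UI needs.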

2. Research Agent

An information gathering specialist with smart content processing.

Intelligent source selection and validation
Adaptive search refinement
Single-pass comprehensive information gathering

3. Deep Research Agent (New! 🎉)

An advanced research agent that produces academic-quality content through systematic topic exploration.

Automatic topic decomposition and structured research
Independent subtopic exploration
Academic paper generation with proper citations
Cross-referenced bibliography compilation
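
The flow above (decompose a topic, explore subtopics independently, then merge everything into one cited paper) can be sketched as follows. This is a simplified illustration, assuming each subtopic already carries its findings and source URLs; `Subtopic`, `compile_bibliography`, and `render_paper` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Subtopic:
    """A subtopic explored independently, with the URLs it drew on."""
    title: str
    findings: str
    sources: list[str] = field(default_factory=list)


def compile_bibliography(subtopics: list[Subtopic]) -> dict[str, int]:
    """Assign one citation number per unique source, in first-seen order."""
    numbering: dict[str, int] = {}
    for sub in subtopics:
        for url in sub.sources:
            # setdefault keeps the existing number if the URL was seen before.
            numbering.setdefault(url, len(numbering) + 1)
    return numbering


def render_paper(topic: str, subtopics: list[Subtopic]) -> str:
    """Render sections with [n] citations followed by a shared reference list."""
    refs = compile_bibliography(subtopics)
    lines = [topic, ""]
    for sub in subtopics:
        cites = "".join(f"[{refs[url]}]" for url in sub.sources)
        lines += [sub.title, f"{sub.findings} {cites}".rstrip(), ""]
    lines.append("References")
    lines += [f"[{n}] {url}" for url, n in refs.items()]
    return "\n".join(lines)
```

Numbering sources once across all subtopics is what makes the bibliography cross-referenced: a source used in two sections keeps the same citation number in both.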

Agent Architecture Diagrams

Deep Research Agent Flow

Deep Research Agent's workflow for comprehensive research and content generation

Research Agent Flow

Research Agent's workflow for information gathering and synthesis

Task Agent Flow

Task Agent's workflow for automating web interactions

Architecture

The system is built on a modern tech stack with three distinct agent types, each powered by:

State Management
- LangGraph for maintaining agent state
- Handles complex navigation flows and decision making
- Structured workflow management
Browser Automation
- Playwright for reliable web interaction
- Custom element detection and interaction system
- Automated navigation and content extraction
Content Processing
- RAG (Retrieval Augmented Generation) pipeline
- Vector store integration for efficient information storage
- PDF and webpage content extraction
- Automatic content structuring and organization
AI Decision Making
- Integration with multiple LLMs (GPT-4, Claude)
- Context-aware navigation
- Self-review mechanisms
- Structured output generation
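
The state-management idea can be pictured as nodes that read and update a shared state, with each node deciding which node runs next. The sketch below is a plain-Python stand-in for that pattern, not the LangGraph API; the state fields, node names, and the `run` helper are assumptions made for illustration.

```python
from typing import Callable, TypedDict


class AgentState(TypedDict):
    question: str
    notes: list[str]
    answer: str


# Each node takes the state and returns (updated state, name of the next node).
Node = Callable[[AgentState], tuple[AgentState, str]]
END = "__end__"


def navigate(state: AgentState) -> tuple[AgentState, str]:
    """Pretend to visit a page and record what was found."""
    note = f"page {len(state['notes']) + 1} about {state['question']}"
    return {**state, "notes": state["notes"] + [note]}, "review"


def review(state: AgentState) -> tuple[AgentState, str]:
    """Self-review: keep browsing until two notes have been collected."""
    next_node = "respond" if len(state["notes"]) >= 2 else "navigate"
    return state, next_node


def respond(state: AgentState) -> tuple[AgentState, str]:
    """Produce an answer from the collected notes."""
    return {**state, "answer": "; ".join(state["notes"])}, END


def run(nodes: dict[str, Node], start: str, state: AgentState, max_steps: int = 20) -> AgentState:
    """Walk the graph from `start` until a node routes to END."""
    current = start
    for _ in range(max_steps):
        state, current = nodes[current](state)
        if current == END:
            return state
    raise RuntimeError("step limit reached without finishing")
```

The `review` node is where a self-review mechanism would sit: it can send the agent back to `navigate` instead of letting it answer too early, and `max_steps` guards against looping forever.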

Setup Instructions

Backend Setup

Clone the repository:

```
git clone https://github.com/hrithikkoduri18/webrover.git
cd webrover
cd backend
```

Install Poetry (if not already installed):

Mac/Linux:

```
curl -sSL https://install.python-poetry.org | python3 -
```

Windows (PowerShell):

```
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
```

Set the Python version for Poetry:
```
poetry env use python3.12
```
Install dependencies using Poetry:
```
poetry install
```

Activate the Poetry shell.

For Unix/Linux/macOS:

```
poetry shell
# or manually
source $(poetry env info --path)/bin/activate
```

For Windows (PowerShell):

```
poetry shell
# or manually
& ((poetry env info --path) + "\Scripts\activate.ps1")
```

Set up environment variables in .env:

```
OPENAI_API_KEY="your_openai_api_key"
LANGCHAIN_API_KEY="your_langchain_api_key"
LANGCHAIN_TRACING_V2="true"
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_PROJECT="your_project_name"
ANTHROPIC_API_KEY="your_anthropic_api_key"
```

Run the backend:

Make sure you are in the backend folder:

```
uvicorn app.main:app --reload --port 8000
```

On Windows, run without --reload:

```
uvicorn app.main:app --port 8000
```

Access the API at http://localhost:8000
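
If the backend fails on startup, a common cause is a key in .env that is missing or still set to its placeholder. The helper below is a minimal pre-flight sketch, not part of WebRover: `parse_env` and `missing_keys` are hypothetical names, and treating exactly these three keys as required is an assumption.

```python
# Assumption: these are the keys the backend cannot run without.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LANGCHAIN_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: tuple[str, ...] = REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [k for k in required if not env.get(k) or env[k].startswith("your_")]


# Usage from the backend folder:
#   from pathlib import Path
#   print(missing_keys(parse_env(Path(".env").read_text())))
```

An empty list means every required key has a non-placeholder value. The check does not verify that the keys are valid with their providers.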

Frontend Setup

Open a new terminal and make sure you are in the WebRover folder:
```
cd frontend
```
Install dependencies:
```
npm install
```
Run the frontend:
```
npm run dev
```
Access the frontend at http://localhost:3000

Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ by @hrithikkoduri
