A curated list of tools, frameworks, and resources for building AI agents that can browse and interact with the web.

Steel is an open-source browser API built specifically for AI agents. We make it easy to build AI applications that can effectively interact with the web.
✨ Get started for free here.
AI agents that autonomously navigate and interact with the web through a user-friendly interface. (a.k.a. Browser Agents)
- Surf.new - An open-source playground for chatting with different web agents.
- OpenAI Operator - OpenAI's AI agents that can browse the web for you.
- Browser-Use - SOTA agent and framework that makes the web LLM-friendly.
- Skyvern-AI - Framework to automate browser-based workflows.
- Proxy by Convergence - Proxy is your AI-powered digital assistant that explores the web and executes tasks through simple conversation.
- Google Project Mariner - A research prototype exploring the future of human-agent interaction, starting with your browser.
- Runner H - Runner H is a state-of-the-art AI agent that will allow anyone to automate complex, cumbersome, multi-step tasks without repetitive and manual input.
- WebVoyager (Agent) - Vision-enabled web agent.
- AgentGPT - Deploy autonomous AI agents in your browser.
- Agent-E - Agent & framework with HTML DOM distillation.
- Kura - Web Agents for the Enterprise.
- Manus - A general AI agent that can execute long-running tasks across tools like browsers, terminals, and text editors.
- doBrowser - An AI-powered Chrome extension that understands natural language and takes actions in your browser on your behalf.
- WebSurfer (Autogen) - MultimodalWebSurfer is a multimodal agent that can search the web and visit web pages.
- Magentic-One - A generalist multi-agent system for solving complex tasks including surfing the web via Autogen's MultimodalWebSurfer.
- Harpa.ai - An AI-powered Chrome extension & browser agent that understands natural language and takes actions on your behalf.
- Yutori - A multi-agent system that executes browser-based tasks in parallel given a natural language prompt.
- Automina - AI browser automation tool with natural language control.
- rtrvr.ai - AI Web Agent Chrome Extension that autonomously does tasks, scrapes to Sheets, and calls APIs, all with just prompts and your own browser!
- Anthropic Computer Use - Computer use agent that can control your browser.
- Self-Operating Computer Framework - A framework to enable multimodal models to operate a computer.
- Highlight - Highlight AI lets models understand your desktop activity. Get stuff done faster.
- OpenInterpreter - An open-source CLI-based agent that can write & execute code as well as control your browser.
- UI-TARS - A GUI agent model designed to interact seamlessly with GUIs using human-like perception, reasoning, and action capabilities.
Tools, frameworks and libraries that translate natural language instructions into web interactions.
- Asteroid.ai - Hosted Browser Agents for SMEs to automate complex workflows.
- PulsarRPA - AI-powered browser automation for data extraction.
- VimGPT - Experimental project using GPT-4 Vision to browse the web via the Vimium extension.
- Cekura.io - An AI browser agent that helps companies maintain up-to-date documentation.
- Dex by Dexterity - An AI coworker embedding into and controlling your browser.
- Autobrowser - A free, experimental Chrome extension that leverages Claude Computer Use to automate tasks in your browser.
- Bytebot - Bytebot provides AI-powered scraping automations that evolve with your target sites.
- Runcopycat - A no-code browser automation platform that turns screen recordings into reusable automated workflows.
- Bardeen.ai - A Chrome extension that enables AI-powered browser automations, allowing users to automate tasks and workflows directly within the browser.
- Starizon.ai - Browser assistant for web task automation.
- BrowserGPT - Browser extension for page summaries and Q&A.
- Browse.ai - Chrome extension for web scraping that can leverage AI for structured data extraction.
- Strawberry Browser - A personal assistant that sits in your browser, automates repetitive web actions, and learns your workflows.
- Deta.surf - An integrated platform that combines a browser, file manager, and AI assistant with browser-level context.
- Comet by Perplexity - An AI-powered browser by Perplexity. Few details have been released so far.
- Dia Browser - Dia Browser is envisioned as an entirely new web browser built with AI at the center by The Browser Company (Arc).
- Reworkd - No-code web data extraction solution using agentic AI.
- Ottogrid - Spreadsheet-based web agents to automate manual research.
- Steel.dev - Open-source headless browser API built specifically for AI agents and apps.
- Omniparser - Tool for parsing GUIs for vision-based agents.
- LaVague - Framework for natural language web automation.
- Langchain Playwright toolkit - Toolkit integration with AI agents.
- Browserbase - A headless browser API for AI workflows.
- Stagehand - AI web browsing framework.
- Tarsier - Vision utilities library for web interaction agents.
- AutoGPT - Experimental agent for task completion and web browsing.
- Bytebot - Containerized computer use agent framework with a virtual desktop environment.
Web crawlers & scrapers that leverage AI to navigate websites and extract content.
- FireCrawl - APIs for turning websites into LLM-friendly markdown.
- Crawl4AI - Open-source LLM Friendly Web Crawler & Scraper.
- ScrapeGraphAI - Python scraper based on AI.
- WebAgent (OpenAgents) - The web-browsing agent module of the OpenAgents platform (HKU). Enables autonomous navigation of websites via natural language, as part of a larger multi-modal agent framework.
- Expand.ai - Turns any website into a type-safe API you can rely on.
- LLM Scraper - Uses LLMs for intelligent scraping and content understanding.
Utilities that help agents search the web or query web data via natural language.
- AgentQL - A query language and toolkit that makes the web AI-ready.
- SerpAPI - Search API that provides Google Search results for your agents.
- Serper.dev - Performant and cost-effective search API that provides Google Search results for your agents.
- Jina.ai - Neural search platform for web data.
- Exa.ai - Semantic Search Engine for AI.
Datasets, benchmarks, and notable research efforts for evaluating and advancing web-capable AI agents.
- Web Agent Leaderboard - Web agent leaderboard compiling different AI agent products and how they perform on the widely used WebVoyager benchmarks.
- Web Games by Convergence - A collection of challenges designed for testing general-purpose web-browsing AI agents.
- Bananalyzer - An open-source evaluation framework for web-based AI agents.
- Mind2Web - A large-scale dataset for generalist web agents.
- World of Bits: An Open-Domain Platform for Web-Based Agents - OpenAI's research paper that introduces World of Bits: a platform where agents complete tasks on the internet by performing low-level keyboard and mouse actions.
- MiniWoB++ - A classic suite of 104 mini web browser tasks in a synthetic environment. It is an extension of the OpenAI MiniWoB benchmark.
- WebArena - A realistic, self-hostable web environment for autonomous agents. Includes an official leaderboard tracking agent performance.
- WebCanvas - An online evaluation framework for dynamic web environments. Tests agents on live websites.
- WebGPT - OpenAI's browser-assisted question-answering research project.
- WebShop - A simulated e-commerce shopping environment with 1.18M real Amazon products.
- WebVoyager (Benchmark) - Vision-enabled web agent using GPT-4V for real-world website interaction.
- WorkArena - A suite of 33 browser-based tasks for enterprise "knowledge worker" scenarios.
- BrowserGym by ServiceNow - A gym environment for web task automation.
Resources for learning how to build, deploy, or utilize AI web agents.
- LangGraph WebVoyager Tutorial - Tutorial demonstrating how to build a web navigation agent using LangGraph Agents, Vision Models, and Web Voyager.
- Build an AI Browser Agent - Step-by-step guide to create an AI that browses the web using Playwright and the Browser-Use library.
- Install & Run Browser-Use Locally - Instructions on installing the open-source Browser-Use agent with a local LLM.
- Build a Browser Agent with DeepSeek - Walks through deploying a Browser-Use web UI agent powered by the DeepSeek model on a cloud VM.
Feel free to reach out at team@steel.dev or on Discord.
- Follow @steeldotdev on X.
- Join the Discord community.
Contributions of any kind are welcome; just follow the guidelines!