🔥 A list of tools, frameworks, and resources for building AI web agents

steel-dev/awesome-web-agents

Awesome Web Agents

A curated list of tools, frameworks, and resources for building AI agents that can browse and interact with the web.

About Steel

Steel is an open-source browser API built specifically for AI agents. We make it easy to build AI applications that can effectively interact with the web.

✨ Get started for free here.

Autonomous Web Agents

AI agents that autonomously navigate and interact with the web through a user-friendly interface (a.k.a. browser agents).

  • Surf.new - An open-source playground for chatting with different web agents.
  • OpenAI Operator - OpenAI's AI agent that can browse the web for you.
  • Browser-Use - SOTA agent and framework that makes the web LLM-friendly.
  • Skyvern-AI - Framework to automate browser-based workflows.
  • Proxy by Convergence - Proxy is your AI-powered digital assistant that explores the web and executes tasks through simple conversation.
  • Google Project Mariner - A research prototype exploring the future of human-agent interaction, starting with your browser.
  • Runner H - Runner H is a state-of-the-art AI agent that will allow anyone to automate complex, cumbersome, multi-step tasks without repetitive and manual input.
  • WebVoyager (Agent) - Vision-enabled web agent.
  • AgentGPT - Deploy autonomous AI agents in your browser.
  • Agent-E - Agent & framework with HTML DOM distillation.
  • Kura - Web Agents for the Enterprise.
  • Manus - A general AI agent that can execute long-running tasks across tools like browsers, terminals, and text editors.
  • doBrowser - An AI-powered Chrome extension that understands natural language and takes actions in your browser on your behalf.
  • WebSurfer (Autogen) - MultimodalWebSurfer is a multimodal agent that can search the web and visit web pages.
  • Magentic-One - A generalist multi-agent system for solving complex tasks including surfing the web via Autogen's MultimodalWebSurfer.
  • Harpa.ai - An AI-powered Chrome extension & browser agent that understands natural language and takes actions on your behalf.
  • Yutori - A multi-agent system that executes browser-based tasks in parallel given a natural language prompt.
  • Automina - AI browser automation tool with natural language control.
  • rtrvr.ai - AI Web Agent Chrome Extension that autonomously does tasks, scrapes to Sheets, and calls APIs – all with just prompts and your own browser!

Computer-use Agents

  • Anthropic Computer Use - Computer use agent that can control your browser.
  • Self-Operating Computer Framework - A framework to enable multimodal models to operate a computer.
  • Highlight - Highlight AI lets models understand your desktop activity. Get stuff done faster.
  • OpenInterpreter - An open-source CLI-based agent that can write & execute code as well as control your browser.
  • UI-TARS - A GUI agent model designed to interact seamlessly with GUIs using human-like perception, reasoning, and action capabilities.

AI Web Automation Tools

Tools, frameworks and libraries that translate natural language instructions into web interactions.

  • Asteroid.ai - Hosted Browser Agents for SMEs to automate complex workflows.
  • PulsarRPA - AI-powered browser automation for data extraction.
  • VimGPT - Experimental project using GPT-4 Vision to browse the web via the Vimium extension.
  • Cekura.io - An AI browser agent that helps companies maintain up-to-date documentation.
  • Dex by Dexterity - An AI coworker embedding into and controlling your browser.
  • Autobrowser - A free, experimental Chrome extension that leverages Claude Computer Use to automate tasks in your browser.
  • Bytebot - Bytebot provides AI-powered scraping automations that evolve with your target sites.
  • Runcopycat - A no-code browser automation platform that turns screen recordings into reusable automated workflows.
  • Bardeen.ai - A Chrome extension that enables AI-powered browser automations, allowing users to automate tasks and workflows directly within the browser.
  • Starizon.ai - Browser assistant for web task automation.
  • BrowserGPT - Browser extension for page summaries and Q&A.
  • Browse.ai - Chrome extension for web scraping that can leverage AI for structured data extraction.
  • Strawberry Browser - A personal assistant that sits in your browser, automates repetitive web actions, and learns your workflows.
  • Deta.surf - An integrated platform that combines a browser, file manager, and AI assistant with browser-level context.
  • Comet by Perplexity - An AI-powered browser by Perplexity. Few details are available yet.
  • Dia Browser - Dia Browser is envisioned as an entirely new web browser built with AI at the center by The Browser Company (Arc).
  • Reworkd - No-code web data extraction solution using agentic AI.
  • Ottogrid - Spreadsheet-based web agents to automate manual research.

Dev Tools

  • Steel.dev - Open-source headless browser API built specifically for AI agents and apps.
  • Omniparser - Tool for parsing GUIs for vision-based agents.
  • LaVague - Framework for natural language web automation.
  • Langchain Playwright toolkit - Toolkit integration with AI agents.
  • Browserbase - A headless browser API for AI workflows.
  • Stagehand - AI web browsing framework.
  • Tarsier - Vision utilities library for web interaction agents.
  • AutoGPT - Experimental agent for task completion and web browsing.
  • Bytebot - Containerized computer use agent framework with a virtual desktop environment.

AI Web Scrapers/Crawlers

Web crawlers & scrapers that leverage AI to navigate websites and extract content.

  • FireCrawl - APIs for turning websites into LLM-friendly markdown.
  • Crawl4AI - Open-source LLM-friendly web crawler & scraper.
  • ScrapeGraphAI - Python scraper based on AI.
  • WebAgent (OpenAgents) - The web-browsing agent module of the OpenAgents platform (HKU). Enables autonomous navigation of websites via natural language, as part of a larger multi-modal agent framework.
  • Expand.ai - Turns any website into a type-safe API you can rely on.
  • LLM Scraper - Uses LLMs for intelligent scraping and content understanding.

Web Search & Query Tools

Utilities that help agents search the web or query web data via natural language.

  • AgentQL - A query language and toolkit that makes the web AI-ready.
  • SerpAPI - Search API that provides Google Search results for your agents.
  • Serper.dev - Performant and cost-effective search API that provides Google Search results for your agents.
  • Jina.ai - Neural search platform for web data.
  • Exa.ai - Semantic Search Engine for AI.

Benchmarks & Research

Datasets, benchmarks, and notable research efforts for evaluating and advancing web-capable AI agents.

  • Web Agent Leaderboard - Web agent leaderboard compiling different AI agent products and how they perform on the widely used WebVoyager benchmarks.
  • Web Games by Convergence - A collection of challenges designed for testing general-purpose web-browsing AI agents.
  • Bananalyzer - An open-source evaluation framework for web-based AI agents.
  • Mind2Web - A large-scale dataset for generalist web agents.
  • World of Bits: An Open-Domain Platform for Web-Based Agents - OpenAI's research paper that introduces World of Bits: a platform where agents complete tasks on the internet by performing low-level keyboard and mouse actions.
  • MiniWoB++ - A classic suite of 104 mini web browser tasks in a synthetic environment. It is an extension of the OpenAI MiniWoB benchmark.
  • WebArena - A realistic, self-hostable web environment for autonomous agents. Includes an official leaderboard tracking agent performance.
  • WebCanvas - An online evaluation framework for dynamic web environments. Tests agents on live websites.
  • WebGPT - OpenAI's browser-assisted question-answering research project.
  • WebShop - A simulated e-commerce shopping environment with 1.18M real Amazon products.
  • WebVoyager (Benchmark) - Vision-enabled web agent using GPT-4V for real-world website interaction.
  • WorkArena - A suite of 33 browser-based tasks for enterprise "knowledge worker" scenarios.
  • BrowserGym by ServiceNow - A gym environment for web task automation.

Tutorials & Guides

Resources for learning how to build, deploy, or utilize AI web agents.

Interested in implementing Steel?

Feel free to reach out at team@steel.dev or on Discord.

Steel is an open-source browser API built specifically for AI agents. Get started for free here.

Join the Community

Contributing

Contributions of any kind are welcome; just follow the guidelines!

Contributors

Thanks go to these contributors!
