A lightweight CLI interface that combines LLM agents and browser automation to scrape dynamic web content using natural language commands.
This project uses LangGraph, Anthropic's Claude 3.5, and Bright Data MCP tools to build a fully agentic browser-based scraping agent.
- 🌐 JavaScript-aware scraping via BrightData's MCP and headless Chrome
- 🧠 LLM-powered reasoning using Claude 3.5 Sonnet to interpret tasks and choose scraping tools
- 🧰 LangGraph ReAct agent orchestration for chaining tool use
- 🔧 Extensible tool loading using LangChain MCP adapters
- 🤖 Conversational interface — type queries like "Extract the titles of the top Hacker News posts" and let the agent figure it out
- Clone the repo
git clone https://github.com/rrishabhjn/llm-scraper.git
cd llm-scraper- Install dependencies (Python ≥ 3.13)
pip install -r requirements.txt
# or, if you're using PEP 621:
uv pip install -r uv.lock- Set up environment variables
Create a .env file at the root of the project with the following values:
API_TOKEN=your_brightdata_api_token
BROWSER_AUTH=your_browser_auth_string
WEB_UNLOCKER_ZONE=your_zoneRun the agent:
python main.pyThen simply type your queries like:
You: Go to https://news.ycombinator.com and extract the top 5 post titles, authors, and points.
The agent will reason about the tools it needs, interact with the page via MCP, and return structured results.
langgraph.prebuilt.create_react_agent– Core LLM agent orchestrationChatAnthropic– Claude 3.5 LLM interfaceBrightData MCP– Headless browser interfacelangchain_mcp_adapters– Auto-loaded scraping tools
.
├── main.py # Entrypoint CLI chat loop
├── pyproject.toml # Dependency and project metadata
├── .env # API tokens and browser zone credentials
└── README.md # You're reading it!This is a proof-of-concept, and contributions are welcome! Ideas for improvement:
- Add schema-based scraping using pydantic or Zod
- Plug in other LLM providers like GPT-4 or Gemini
- Create browser screenshots or download PDFs via tools
MIT © Rishabh Jain
Rishabh Jain – PM with a builder mindset. I work at the intersection of AI, automation, and developer tools.