🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Updated Apr 26, 2025 - TypeScript
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
Run a high-fidelity browser-based web archiving crawler in a single Docker container
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
Lightweight scraper for Google News
Model Context Protocol (MCP) Server for Graphlit Platform
A powerful Chrome extension for web scraping
Web Scraper and Crawler for LLM Apps and AI Workflows with NoCode / LowCode. Plug and play with your own logic and customize it flexibly and scalably on BuildShip.
🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖
Web crawling & scraping framework for Node.js on top of headless Chrome browser
A simple TypeScript framework for declaratively composing bots with Puppeteer
Crawler written in TypeScript using ES6 generators.
Official TypeScript/JavaScript SDK for the Supadata API.
A web crawling library written in TypeScript.
Spring Boot + Keycloak Backend / Angular Web App
Awesome boilerplate for writing browser automations using Playwright, with debugging and tests ready to go.