The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
Updated Sep 6, 2025 - TypeScript
Crawlee: A web scraping and browser automation library for Node.js to build reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Self-hosted web scraper.
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
🤖 Cross-platform modified Chromium build for benchmarking automation compatibility with: Cloudflare, Akamai, Kasada, F5 Shape, reCAPTCHA, PerimeterX, Imperva, DataDome, hCaptcha, FunCaptcha
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails, deployment, and more.
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
🧶 Extract data from any website without code, just clicks.
GoogleBard - A reverse-engineered API for the Google Bard chatbot for Node.js
n8n node for browser automation using Puppeteer
A free & open source IMDb front-end.
Facebook Group Members Extractor. Download Facebook group members in CSV.
Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.
estela, an elastic web scraping cluster 🕸
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Turn topics, links, and files into AI-generated research notebooks — summarize, explore, and ask anything.