🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Updated Dec 9, 2025 - TypeScript
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Self-hosted web scraper.
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
🤖 Cross-platform browser for automation testing: Cloudflare, Akamai, Kasada, Shape, DataDome, PerimeterX, hCaptcha, FunCaptcha, Imperva, reCAPTCHA, ThreatMetrix, Adscore
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails, deployment, and more.
n8n node for browser automation using Puppeteer
🧶 Extract data from any website without code, just clicks.
GoogleBard - A reverse-engineered API for the Google Bard chatbot, for Node.js
A free & open source IMDb front-end.
Facebook Group Members Extractor. Download Facebook group members in CSV.
Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.
estela, an elastic web scraping cluster 🕸
A high-performance proxy rotation engine with automated IP management and real-time health monitoring
Turn topics, links, and files into AI-generated research notebooks: summarize, explore, and ask anything.