# scraping

Here are 435 public repositories matching this topic...

firecrawl / firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

markdown crawler scraper ai html-to-markdown web-crawler scraping web-scraper web-scraping data-extraction webscraping web-data-extraction ai-agents web-search ai-search web-data llm ai-crawler ai-scraping

Updated Dec 11, 2025
TypeScript

apify / crawlee

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers, in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Dec 11, 2025
TypeScript

jaypyles / Scraperr

Self-hosted webscraper.

python docker kubernetes opensource helm scraping webscraper web-scraper self-hosted web-scraping web-scrapers webscraping playwright

Updated Oct 12, 2025
TypeScript

any4ai / AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

data html-to-markdown scraping webscraper crawl scrape serp rag aitools ai-scraping

Updated Dec 7, 2025
TypeScript

botswin / BotBrowser

🤖 Cross-platform browser for automation testing: Cloudflare, Akamai, Kasada, Shape, DataDome, PerimeterX, hCaptcha, FunCaptcha, Imperva, reCAPTCHA, ThreatMetrix, Adscore

automation webdriver scraping cloudflare chromedriver akamai web3 perimeterx bot-detection incapsula puppeteer antibot datadome anti-detection threatmetrix kasada shapesecurity

Updated Dec 7, 2025
TypeScript

apify / fingerprint-suite

Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

typescript scraping fingerprinting puppeteer playwright

Updated Dec 11, 2025
TypeScript

ulixee / secret-agent

The web scraper that's nearly impossible to block - now called @ulixee/hero

browser proxy scraping devtools mitm chromium stealth automated mitmproxy puppeteer playwright secretagent

Updated Mar 7, 2023
TypeScript

josephlimtech / linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Updated Apr 5, 2024
TypeScript

devflowinc / firecrawl-simple

➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.

search markdown crawler data scraper ai html-to-markdown web-crawler scraping embeddings webscraping rag llm ai-scraping

Updated May 23, 2025
TypeScript

adrianhajdin / pricewise

Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails, deployment, and more.

scraping webscraping

Updated Jul 6, 2024
TypeScript

drudge / n8n-nodes-puppeteer

n8n node for browser automation using Puppeteer

pdf screenshot screenshots browser script scraping proxy-server chromium scrape puppeteer n8n n8n-nodes stealth-mode

Updated Nov 15, 2025
TypeScript

baptisteArno / tinking

🧶 Extract data from any website without code, just clicks.

scraping scrapping scrapper scraping-websites harvesting puppeteer

Updated Apr 15, 2021
TypeScript

PawanOsman / GoogleBard

GoogleBard - A reverse-engineered API for the Google Bard chatbot, for Node.js

api google ai reverse-engineering scraping prompt assistant assistant-chat-bots chatgpt google-bard

Updated Jan 11, 2024
TypeScript

zyachel / libremdb

A free & open source IMDb front-end.

sass front-end privacy typescript scraping foss imdb alternative-frontends

Updated Oct 26, 2025
TypeScript

floriandiud / facebook-group-members-scraper

Facebook Group Members Extractor. Download Facebook group members in CSV.

facebook csv scraping growth growth-hacking facebook-scraper facebook-data-extract facebook-scraping facebook-data-scraper

Updated Oct 9, 2025
TypeScript

shihabmridha / educative.io-downloader

Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.

nodejs pdf typescript scraping hacktoberfest puppeteer educativeio

Updated Apr 13, 2024
TypeScript

bitmakerla / estela

estela, an elastic web scraping cluster 🕸

react python docker kubernetes scraper django scraping crawling requests web-scraping scrapy hacktoberfest python-requests scrapyd scrapy-visualization webscraping-python

Updated Dec 3, 2025
TypeScript

Anish-Agnihotri / tweetdrop

Generate dispersable airdrops from Twitter threads.

twitter crypto twitter-api ethereum scraping tweet airdrop

Updated Jan 3, 2022
TypeScript

alpkeskin / rota

A high-performance proxy rotation engine with automated IP management and real-time health monitoring

golang http proxy scraping rate-limiting proxy-server http-proxy socks5 rotating-proxy proxy-list socks5-proxy proxy-checker proxy-rotator ip-rotation

Updated Nov 12, 2025
TypeScript

mtwn105 / decipher-research-agent

Turn topics, links, and files into AI-generated research notebooks — summarize, explore, and ask anything.

agent ai mcp scraping ml artificial-intelligence gemini web-scraping openai qdrant llm vector-db crewai notebooklm agentic-ai model-context-protocol

Updated Jun 6, 2025
TypeScript
