🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
Updated Mar 12, 2026 - JavaScript
A powerful MCP server extension providing web search and content extraction capabilities. Integrates DuckDuckGo search functionality and URL content extraction into your MCP environment, enabling AI assistants to search the web and extract webpage content programmatically.
🔍 Model Context Protocol (MCP) tool for parsing websites using the Jina.ai Reader
Local browser toolkit for AI agents: deep research and browser-use automation with local Chrome (CDP) + Playwright. Flexible, extensible scripts for web navigation, extraction and workflow automation - built for reproducible research and agent-driven browsing.
A userscript that adds a button to YouTube video pages for copying the transcript with or without timestamps.
Extract meaningful content from the chaos of a web page
Live Web Access for Your Local AI — Tunable Search & Clean Content Extraction
Chrome extension to copy YouTube transcripts with AI-friendly features
Convert webpages to clean Markdown for LLM and RAG workflows. Browser-based UI + Node.js CLI with selector drilling, metadata extraction, and batch processing.
Prysm is a blazing-smart Puppeteer-based web scraper that doesn't just extract - it understands structure. Capable of scraping virtually any website with intelligent content detection and 14 specialized scroll strategies that adapt to different page layouts, Prysm excels at extracting content that other scrapers miss.
Example project demonstrating how to use PDFix SDK WebAssembly build in Node.js. Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Simple Node.js server to extract relevant content from website source code using Mozilla's Readability.js
📋 WebMD is a Chrome extension that transforms web pages into Markdown documents with surgical precision.
A Chrome extension for chatting with the webpage
A web-based utility for fetching, categorizing, summarizing and managing global news and articles using the GDELT 2.0 API. Designed for content creators, news aggregators, and researchers, this tool simplifies access to up-to-date articles with an intuitive UI and customizable configurations.
A fast Chrome extension for students, researchers, and power users. It captures text from any web page and sends it to Obsidian, Notion, and Anki. Keeps questions and lists in order. No manual copy-paste. Just grab and save.
HTML to clean Markdown optimized for LLMs. Replaces readability + turndown. One function: content extraction + conversion + token estimation.
Extract clean Markdown, JSON or MCP from any web page. Chrome Extension + MCP Server.
🚀 Smart, lightweight AI summarization widget for websites. Features intelligent article extraction, JSON-LD SEO (AIO) metadata support, and seamless native app deep-linking for ChatGPT, Claude, and Gemini. Supports 10+ languages including RTL.