Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON do…

Pascal 681 42 Updated Apr 20, 2024

jeffjose / tget

tget is wget for torrents

JavaScript 622 51 Updated Dec 11, 2020

edgi-govdata-archiving / awesome-website-change-monitoring

A curated list of awesome tools for website diffing and change monitoring.

494 31 Updated Aug 9, 2022

iipc / awesome-web-archiving

An Awesome List for getting started with web archiving

2,040 156 Updated Nov 6, 2024

datatogether / research

📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity

91 11 Updated Sep 27, 2018

lorien / awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing.

Makefile 6,671 787 Updated Oct 27, 2024

transitive-bullshit / awesome-puppeteer

A curated list of awesome puppeteer resources.

2,403 161 Updated Jul 19, 2024

Germey / AwesomeWebScraping

List of libraries, tools and APIs for web scraping and data processing.

Makefile 240 33 Updated Apr 5, 2024

simon987 / awesome-datahoarding

List of data-hoarding related tools

1,084 83 Updated Sep 14, 2023

machawk1 / wail

🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation

Roff 350 35 Updated Oct 4, 2024

internetarchive / heritrix3

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Java 2,829 763 Updated Nov 7, 2024

webrecorder / webrecorder-player

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)

JavaScript 437 38 Updated Sep 17, 2020

steffenfritz / html2warc

Simple script to convert web resources to a single WARC file

Python 18 2 Updated May 11, 2023

wallabag / wallabag

wallabag is a self-hostable application for saving web pages: Save and classify articles. Read them later. Freely.

PHP 10,448 768 Updated Nov 8, 2024

xarantolus / Collect

A server to collect & archive websites that also supports video downloads

TypeScript 78 10 Updated Feb 11, 2023

qarmin / czkawka

Multi-functional app to find duplicates, empty folders, similar images etc.

Rust 20,166 656 Updated Oct 12, 2024

QL-Win / QuickLook

Bring macOS “Quick Look” feature to Windows

C# 17,442 1,091 Updated Apr 11, 2024

blaCCkHatHacEEkr / PENTESTING-BIBLE

articles

12,908 2,346 Updated Apr 3, 2023

noxone / regex-generator

Generate regular expressions from sample texts.

Kotlin 401 66 Updated Oct 30, 2024

alxnbl / onenote-md-exporter

ConsoleApp to export OneNote notebooks to Markdown formats

C# 922 75 Updated Jul 8, 2024


Butters3388214


Favorites

Anorov / cloudflare-scrape

ArchiveTeam / grab-site

ArchiveBox / ArchiveBox

rockdaboot / wget2

mirror / wget

ArchiveTeam / wpull

wkentaro / gdown

mhogomchungu / media-downloader

circulosmeos / gdown.pl

melbahja / got

benibela / xidel