Butters3388214

Stars

Favorites

31 repositories

A Python module to bypass Cloudflare's anti-bot page.

Python 3,387 461 Updated Oct 14, 2023

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Python 1,395 135 Updated Jul 7, 2024

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Python 22,243 1,178 Updated Nov 4, 2024

The successor of GNU Wget. Contributions preferred at https://gitlab.com/gnuwget/wget2. But accepted here as well 😍

C 562 76 Updated Nov 1, 2024

Wget Git mirror

C 393 132 Updated Sep 25, 2024

Wget-compatible web downloader and crawler.

HTML 555 77 Updated Apr 29, 2024

Google Drive Public File Downloader when Curl/Wget Fails

Python 4,300 350 Updated Aug 12, 2024

Media Downloader is a Qt/C++ front end to yt-dlp, youtube-dl, gallery-dl, lux, you-get, svtplay-dl, aria2c, wget and safari books.

C++ 1,667 128 Updated Nov 4, 2024

Google Drive direct download of big files

Perl 937 196 Updated May 12, 2023

Got: Simple golang package and CLI tool to download large files faster 🏃 than cURL and Wget!

Go 723 46 Updated Jan 16, 2024

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON do…

Pascal 681 42 Updated Apr 20, 2024

tget is wget for torrents

JavaScript 622 51 Updated Dec 11, 2020

A curated list of awesome tools for website diffing and change monitoring.

494 31 Updated Aug 9, 2022

An Awesome List for getting started with web archiving

2,040 156 Updated Nov 6, 2024

📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity

91 11 Updated Sep 27, 2018

List of libraries, tools and APIs for web scraping and data processing.

Makefile 6,671 787 Updated Oct 27, 2024

A curated list of awesome puppeteer resources.

2,403 161 Updated Jul 19, 2024

List of libraries, tools and APIs for web scraping and data processing.

Makefile 240 33 Updated Apr 5, 2024

List of data-hoarding related tools

1,084 83 Updated Sep 14, 2023

🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation

Roff 350 35 Updated Oct 4, 2024

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Java 2,829 763 Updated Nov 7, 2024

Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)

JavaScript 437 38 Updated Sep 17, 2020

simple script to convert web resources to a single warc file

Python 18 2 Updated May 11, 2023

wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.

PHP 10,448 768 Updated Nov 8, 2024

A server to collect & archive websites that also supports video downloads

TypeScript 78 10 Updated Feb 11, 2023

Multi functional app to find duplicates, empty folders, similar images etc.

Rust 20,166 656 Updated Oct 12, 2024

Bring macOS “Quick Look” feature to Windows

C# 17,442 1,091 Updated Apr 11, 2024

Generate regular expressions from sample texts.

Kotlin 401 66 Updated Oct 30, 2024

ConsoleApp to export OneNote notebooks to Markdown formats

C# 922 75 Updated Jul 8, 2024