Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium with or without a head
Updated May 19, 2020 - JavaScript
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Save web pages as Safari webarchive files from the command line
A dockerized, queued, high-fidelity web archiver based on Squidwarc
Links on the web break all the time, robustify them!
A Rails engine supporting the discovery of web archives.
Seeder - Czech webarchive curating tool and public site
Docker image for the Archives Unleashed Toolkit
Rails application for the Archives Unleashed Cloud.
A Python utility for publishing a social media story built from archived web pages to multiple services.
Parser for WARC (aka WebArchive) files
A library for interacting with web archive collections at Archive-It, Trove, Pandora, and more.
Add-On for Google Sheets to help those working with web archives.
A toolkit for developing algorithms that sample mementos from a web archive collection.
Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.
Create webarchive entries on archive.org from your raindrop.io bookmarks list using waybackpy