A suggestion for making WACZ and WARC-requests #663

Closed
hamoudak opened this issue Aug 3, 2024 · 4 comments
Labels
question Further information is requested

Comments

@hamoudak

hamoudak commented Aug 3, 2024

I would like there to be a wacz-requests section, as there is already a zim-requests one, because I've seen that many people lack either the ability (for instance, the system requirements or enough disk space) or the knowledge to crawl a website with Browsertrix. Manual archiving with ArchiveWeb.page is good and easy to handle, but archiving a whole website that way means going through many links. So I think this idea would help many people.

@ikreymer
Member

We offer a service, https://browsertrix.com/, where you can sign up and run crawls via a UI. The crawls are run via Browsertrix Crawler. Unfortunately, we don't have the resources to offer WACZ files of sites on demand, like the Zimit service does. One idea, perhaps, is that Zimit could offer WACZ files alongside ZIM as part of the same crawl - that's a question for @benoit74 @rgaudin and others.

@ikreymer ikreymer added the question Further information is requested label Aug 29, 2024
@hamoudak
Author

Thank you for clearing this up. I have known about the website for a long time, but all I see is [log in] and the premium offers. Will there be a free option or something similar?

@tw4l
Member

tw4l commented Aug 29, 2024

Thank you for clearing this up. I have known about the website for a long time, but all I see is [log in] and the premium offers. Will there be a free option or something similar?

Our hosted service is and will remain a paid service, but the software is FOSS and it is possible to self-host if you're comfortable with Kubernetes: https://github.com/webrecorder/browsertrix. That is probably more than you want to do given the system limitations mentioned in the issue description, but it is an option.

@benoit74
Contributor

benoit74 commented Sep 2, 2024

One idea, perhaps, is that Zimit could offer WACZ files alongside ZIM as part of the same crawl

Definitely not something "light" to implement: we assume in multiple places that we are dealing with ZIM files ^^ I'm pretty sure we will never implement it unless there is a stronger reason than someone wishing to have this feature.

@ikreymer ikreymer closed this as completed Sep 2, 2024
@github-project-automation github-project-automation bot moved this from Triage to Done! in Webrecorder Projects Sep 2, 2024
@ikreymer ikreymer closed this as not planned Sep 2, 2024