Manually create a profile.tar.gz #731
Comments
Hi @djhmateer, part of the issue may be that Browsertrix Crawler uses Brave Browser. Brave and Chrome are both Chromium-based, so their browser profile data structures are similar, but I would guess they diverge at some points. I'm also not sure whether user profiles differ by operating system; the current browsertrix-browser-base Dockerfile is based on Ubuntu 24. My guess is that a manually saved and tarred/gzipped user data directory from a Brave browser installation would work, but I haven't tested this myself, and I'm not sure whether it would also have to come from the same OS.
Hi @tw4l - thank you so much for the reply. Will test and report back.
This strategy has worked well, thank you @tw4l. Essentially I ran Release Channel Brave on my WSL2 (Ubuntu 22) instance, using the instructions from https://brave.com/linux/, then did something like:
brave-browser
# now log in to whatever site, e.g. https://www.osr4rightstools.org
cd ~/.config/BraveSoftware/Brave-Browser
tar -czvf profile.tar.gz *
mv profile.tar.gz ~/auto-archiver/tmp/.
cd ~/auto-archiver/tmp
chmod 777 profile.tar.gz
# test
docker run --rm -v /home/dave/auto-archiver/tmp:/crawls/ webrecorder/browsertrix-crawler crawl --url https://www.osr4rightstools.org --scopeType page --generateWACZ --text --screenshot fullPage --collection 2 --id 2 --saveState never --behaviors autoscroll,autoplay,autofetch,siteSpecific --behaviorTimeout 200 --timeout 200 --profile /crawls/profile.tar.gz
# unzip the resulting .wacz (a WACZ file is a ZIP archive, not a tar.gz)
# look for archive/screenshot .warc
# use replayweb.page to see if the screenshot is correct (easy to see if the site is logged in)
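For anyone repeating these steps, the packaging part could be wrapped in a small helper. This is only a sketch under assumptions: `make_profile_tarball` is a hypothetical name, not part of Browsertrix Crawler, and it assumes the crawler accepts an archive whose entries sit at the top level of the tarball, as the `tar -czvf profile.tar.gz *` approach above produces. Unlike `*`, `-C "$src_dir" .` also picks up top-level dotfiles and keeps the archive out of the directory being archived. `chmod a+r` is used as a narrower alternative to `chmod 777`, on the assumption that the container only needs to read the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Browsertrix Crawler): package a browser
# user data directory into a gzipped tarball.
#   $1 = profile directory, e.g. ~/.config/BraveSoftware/Brave-Browser
#   $2 = output file, which should live OUTSIDE the profile directory
make_profile_tarball() {
  local src_dir="$1"
  local out_file="$2"

  if [ ! -d "$src_dir" ]; then
    echo "profile directory not found: $src_dir" >&2
    return 1
  fi

  # -C changes into the profile directory first, so archive entries are
  # relative to it; "." includes top-level dotfiles that "*" would skip.
  tar -czf "$out_file" -C "$src_dir" . || return 1

  # Make the archive readable by other users (assumption: read access is
  # all the container needs).
  chmod a+r "$out_file"
}
```

Usage would look something like `make_profile_tarball ~/.config/BraveSoftware/Brave-Browser ~/auto-archiver/tmp/profile.tar.gz`, followed by the same `docker run ... --profile /crawls/profile.tar.gz` command as above. It is probably safest to close Brave before running it so the profile files are not being written while they are archived.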
Is it possible to manually create a profile.tar.gz as in
I started looking in here:
browsertrix-crawler/src/create-login-profile.ts
Line 1 in fb8ed18
C:\Users\djhma\AppData\Local\Google\Chrome\User Data
- I tried tar.gz'ing this directory but it didn't seem to work. I've also posted here: https://forum.webrecorder.net/t/manually-create-and-use-a-profile-tar-gz/702
Facebook is not happy with the docker profile.tar.gz creation process.