Open
Description
Context
Some of Brave's settings aren't good defaults for crawling.
- Embedded LinkedIn posts are blocked by default to preserve privacy — probably with good intention, but it also means they won't be captured by the crawler!
- AMP pages are redirected automatically — unsure if this is good or bad, seems like possible added hassle when figuring out scoping?
- Tracker and ad blocking is set to "Standard"; unsure what this does.
- Cross-site cookies are blocked; likely fine?
- Widevine is off by default — likely not an issue, crawling DRM content doesn't work.
- IPFS is set to "Ask", which requires user input and so won't work when crawling.
- Brave's analytics are on, would probably be good to turn these off? We're not real users :P
- Brave's shields (default adblock and tracking protection) are on by default
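
To make the list above easier to act on, here is a minimal sketch of how the defaults could be compared against crawl-friendly targets. The setting names are hypothetical labels invented for illustration and are not real Brave preference keys. The target values are assumptions too: only the LinkedIn, IPFS and analytics changes follow from the notes above, and anything still in question is left as `None`.

```python
# Hypothetical labels for the settings listed above.
# These are NOT real Brave preference keys.
BRAVE_DEFAULTS = {
    "linkedin_embeds": "blocked",
    "amp_redirect": "on",
    "tracker_ad_blocking": "standard",
    "cross_site_cookies": "blocked",
    "widevine": "off",
    "ipfs": "ask",
    "analytics": "on",
    "shields": "on",
}

# Assumed targets for crawling. None means "not decided yet".
CRAWL_TARGETS = {
    "linkedin_embeds": "allowed",   # so embedded posts get captured
    "amp_redirect": None,           # good or bad for scoping? undecided
    "tracker_ad_blocking": None,    # unclear what "Standard" does
    "cross_site_cookies": "blocked",
    "widevine": "off",
    "ipfs": "disabled",             # assumption: any value that never prompts
    "analytics": "off",             # crawlers are not real users
    "shields": None,                # undecided
}


def plan_overrides(defaults, targets):
    """Return (settings to change, settings still undecided)."""
    changes = {}
    undecided = []
    for name, current in defaults.items():
        wanted = targets.get(name)
        if wanted is None:
            undecided.append(name)
        elif wanted != current:
            changes[name] = wanted
    return changes, undecided


changes, undecided = plan_overrides(BRAVE_DEFAULTS, CRAWL_TARGETS)

if __name__ == "__main__":
    print("change:", changes)
    print("undecided:", undecided)
```

How these overrides would actually be applied to a Brave profile is left open here; the sketch only separates the settings that clearly need changing from those that still need a decision.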
User Story
As a user, I'd like my crawler's default settings to offer the best chances of success!
Status: Todo