Fix empty urlClassification  #122

@franciscawijaya

As mentioned in the previous meeting, the first batch of the crawl collected all of the data except Firefox's urlClassification. When I open the same sites manually, without the crawl, I can see the sites flagged by Firefox's tracking protection.

Some fundamental things that I checked:

  • Opened the protections dashboard while the crawl was running to determine whether first-party and third-party sites were being blocked but not recorded, or whether Firefox's Enhanced Tracking Protection was not flagging and blocking them during the crawl at all. The observation showed that it is the latter.
  • Used the code version from the June crawl and ran the crawl for one isolated site. It still didn't collect the urlClassification, and the protections dashboard also reported 'No Trackers known to Nightly were detected on this page'.
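
For reference when comparing a manual run with a crawl run, here is a minimal sketch of what "empty urlClassification" means at the data level. It assumes the extension reads `urlClassification` from the `details` object of a Firefox `webRequest` event, where it carries `firstParty` and `thirdParty` arrays of flag strings. The `RequestDetails` type and both helper functions are hypothetical names for illustration, not code from the crawler or the extension.

```typescript
// Shape of the classification data Firefox attaches to webRequest event details.
interface UrlClassification {
  firstParty: string[];
  thirdParty: string[];
}

// Hypothetical, trimmed-down stand-in for the webRequest details object.
interface RequestDetails {
  url: string;
  urlClassification?: UrlClassification;
}

// True when Firefox attached at least one classification flag to the request.
function isClassified(details: RequestDetails): boolean {
  const c = details.urlClassification;
  if (!c) return false;
  return c.firstParty.length > 0 || c.thirdParty.length > 0;
}

// Counts classified requests so a manual run and a crawl run can be compared.
function countClassified(requests: RequestDetails[]): number {
  return requests.filter(isClassified).length;
}
```

If the same site gives a non-zero count when opened manually and zero during the crawl, that matches the observation above that nothing is being flagged during the crawl, rather than flagged but not recorded.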

Since then, I have tried out different things and ruled out some of the possible causes:

  • VPN: I ran the crawl with the VPN both on and off and with different IP locations, but the issue persists.
  • Firefox version: I confirmed that the version I use manually and the version used for the crawl are the same by opening Firefox's settings while the crawl was running, before the crawl exited Firefox (Firefox had a recent version update on July 9).
  • Dependabot update on the OptMeowt extension: I repackaged the XPI extension to include the dependabot update as well as @Mattm27's edit to the workflow for the extension, then checked it manually by dropping it into Firefox as an extension and opening a site. I was able to get the urlClassification when doing it manually. However, when I ran the crawl for just one isolated site with the newly updated XPI file, the urlClassification data was not collected.
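
A possible next check, not yet tried here, is to compare the tracking-protection preferences of the manual profile with those of the profile the crawl launches. The sketch below assumes the values are copied by hand from about:config in each session. The preference names are standard Firefox ones, but the `diffPrefs` helper and the sample values are hypothetical, and nothing in this issue confirms that a preference difference is the cause.

```typescript
type PrefValue = string | number | boolean;
type Prefs = Record<string, PrefValue>;

// Firefox preferences that control Enhanced Tracking Protection.
const TRACKING_PREFS: string[] = [
  "browser.contentblocking.category",
  "privacy.trackingprotection.enabled",
  "privacy.trackingprotection.pbmode.enabled",
];

// Returns the names of the preferences whose values differ between the two
// profiles. A preference missing from one side counts as a difference.
function diffPrefs(
  manual: Prefs,
  crawl: Prefs,
  names: string[] = TRACKING_PREFS,
): string[] {
  return names.filter((name) => manual[name] !== crawl[name]);
}

// Hypothetical values for illustration only, not measured from either profile.
const manualProfile: Prefs = {
  "browser.contentblocking.category": "strict",
  "privacy.trackingprotection.enabled": true,
  "privacy.trackingprotection.pbmode.enabled": true,
};
const crawlProfile: Prefs = {
  "browser.contentblocking.category": "standard",
  "privacy.trackingprotection.enabled": false,
  "privacy.trackingprotection.pbmode.enabled": true,
};

console.log(diffPrefs(manualProfile, crawlProfile));
```

An empty result would rule out these preferences in the same way the VPN and the Firefox version were ruled out above.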

Metadata

Labels

  • bug: Something isn't working
  • core functionality: New big feature
  • infrastructure: An issue related to underlying compute or selecting technologies
