Unable to override default setting for cloudfront.net #2587
Hello, and thanks for opening a new issue! Yellowlists should already get automatically updated for all Privacy Badger users. We could try to dig into what's going on here. To start, could you follow steps 2 and 3 of our debugging instructions for "cloudfront.net"? Let me know if you have any questions. |
Please refrain from posting on unrelated issues. Let's continue the discussion here where it belongs. |
Alexei,
No problem! I am happy to help out, especially since those generated
cloudfront.net URLs are, or soon will be, a lot more popular on many sites,
those using Amazon AWS or similar commercial hosting/tracking/advertising/SEO
services in particular.
Give me a couple-three hours to chill out while gaming and have
dinner, and I will follow the steps you mentioned above. Covid-19 social
distancing has well and truly f*cked up my already strange sleep cycle and
caused me to put on weight, since I am bored and sitting near more than the
month's worth of food and beverages I usually keep on hand. <hehehe>
Regards,
Mark Gibson
|
Regarding your question posted elsewhere:
"How do I keep a small number of those random third-level domains with [random characters].cloudfront.net URLs that I wish to (partially) allow (set to green or yellow) and have cloudfront.net set to red (blacklisted) by default, so large numbers of unwanted third-level cloudfront.net domains do not get marked red and added to the list of user-controlled domains on my computer?"
I think this is the crux of it. The reason cloudfront.net itself isn't on the Tracking Domains list is that cloudfront.net is on Mozilla's Public Suffix List <https://publicsuffix.org/>, which Privacy Badger uses to decide what is and what isn't a Top Level Domain. Because cloudfront.net is a "private" TLD, it's not on the list, the same way com or org is not on the list. However, perhaps private TLDs such as cloudfront.net and blogspot.com (#2401 (comment)) should be on the list of tracking domains to allow easy overriding of the default behavior. We should also perhaps ignore private TLDs on the PSL altogether, and treat them the same as any other domain. |
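The PSL mechanic being discussed can be sketched in a few lines of Python. This is a toy illustration with a hardcoded three-entry suffix set, not the real list or Privacy Badger's actual code; the point is just that listing "cloudfront.net" as a suffix moves the "base domain" one label to the left:

```python
# Minimal sketch of how a Public Suffix List lookup changes the
# "base domain" a blocker learns about. Illustrative subset only;
# the real PSL at https://publicsuffix.org/ has thousands of entries,
# with "cloudfront.net" in its PRIVATE section.
PUBLIC_SUFFIXES = {"com", "net", "org", "cloudfront.net"}

def base_domain(hostname: str) -> str:
    """Return the registrable ("base") domain: the longest matching
    public suffix plus one extra label to its left."""
    labels = hostname.lower().split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            start = max(i - 1, 0)
            return ".".join(labels[start:])
    return hostname

print(base_domain("www.example.net"))           # example.net
print(base_domain("d1234abcd.cloudfront.net"))  # d1234abcd.cloudfront.net
```

So with cloudfront.net treated as a suffix, every generated subdomain becomes its own base domain, which is why each one shows up as a separate tracker entry.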
Somewhat related: #2259 |
Alexei,
It might be better in the long run to treat TLDs whose URLs have
third-level (Nth-level, where N > 2?) parts randomly generated for tracking
purposes as different beasts entirely. Single-name TLDs such as com, org,
mil, net, etc. were almost useless for learning about the contents of
subdomains in many (most?) cases, and the hundreds of new ones that people
were clamoring for, finally officially legitimized by ICANN, just add to
the confusion.
What immediately struck me is that the [random].cloudfront.net concept could be
used to keep track of links between people, businesses, locations, etc. If
the random part is long enough, it could be used to encode all kinds of
additional information and still look random, using the miracle of
steganography. Much of the subscription email I get at my Gmail accounts
is encrypted, so if I want to visit a site without click-transferring a
bunch of information embedded in the clickable link, I have to just search
for the title of an article, and can almost always find it wherever it was
actually published. govDelivery has scrozzled the links in most of the US
federal government publications I subscribe to, but the same trick works
when I want to read an article I know came from NIST without using the
obvious link, which transfers unknown information about me and my reading
habits to third parties.
Now that I may have succeeded in convincing you that I consider a healthy
amount of paranoia regarding one's privacy to be a Good Thing™, I think
your idea of setting up a class of domains that ignores how they are listed
in the PSL, or how many parts of the domain name form the effective TLD,
has merit. Most of the more popular blocking tools I have used do not permit
people to use regex expressions to define rules for dealing with URLs, but I
think a working definition of the kinds of special TLDs we are discussing
would be that they involve a generated-on-the-fly component which would
quickly make blocking them individually an exercise in futility. The
randomly generated part of the offending cloudfront.net domains tends to
clog up Privacy Badger with a redlist full of them when an unsophisticated
user just keeps trying to block them and then gives up at some point...
having added dozens of such URLs to lists used within Privacy Badger.
The offending cloudfront.net domains are easy for a human to spot once
the way they are generated and used is understood, but we want to give the
user a way either to block traditional cloudfront.net-based domains as
normal TLDs or to treat them as special cases, because certain sites will
keep generating zillions of them, and automagically blocking each new one
that shows up just makes matters worse. We could check to see if the
offending domain fits a [regex-defn].cloudfront.net format and then
ask if the user wants to keep one or more that a particular site will
spot and reuse in the future, like the two that I currently have
yellowlisted here.
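The [regex-defn].cloudfront.net idea above could be sketched like this. The 14-character [a-z0-9] label and the allowlisted host are assumptions for illustration (taken from the examples quoted in this thread), not anything Privacy Badger actually does:

```python
import re

# Pattern for the generated cloudfront.net hosts described in this
# thread: a 14-character [a-z0-9] third-level label. The length is an
# assumption based on the "xxxxxxxxxxxxxx.cloudfront.net" examples.
GENERATED_CF = re.compile(r"^[a-z0-9]{14}\.cloudfront\.net$")

# Hypothetical user exceptions the site is expected to reuse.
allowlist = {"d1111111111111.cloudfront.net"}

def classify(host: str) -> str:
    if host in allowlist:
        return "allow"    # one of the few kept variants
    if GENERATED_CF.match(host):
        return "block"    # generated variant: block without recording it
    return "default"      # let the blocker's normal heuristics decide

print(classify("d1111111111111.cloudfront.net"))  # allow
print(classify("dabc123xyz0456.cloudfront.net"))  # block
print(classify("www.example.com"))                # default
```

The key property is that blocked generated variants are never stored, so the user's lists do not fill up with dozens of throwaway entries.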
You should see how blocking just one of them and selecting another to
replace it several sessions and reboots later will affect some sites that
use more than one.
A similar situation exists for a popular internet speed test site on
charter.com. I have no fewer than 18 lines in the test site URL page's
settings list, just to allow several different tests to be run during a
session. Spectrum comprises Charter Communications and Time Warner Cable,
plus the defunct Road Runner ISP and Bright House Networks. Spectrum now
touts its internet speed test for customers to see just how wonderful a
deal they are getting from Spectrum. One of the 18 URLs I have allowed in
NoScript to better utilize that speed test is
https://spt01chynwy.chyn.wy.charter.com
If I wanted to test connection speed from here to AUS or parts of the EU,
Japan, Brazil, etc. the list of URLs I would have to permit would be huge.
This is the second time I have observed that those 18 URLs seem to be
matching pairs of identical URLs, but I thought NoScript would merge those
pairs automatically as being redundant. Maybe this time I will investigate
further, but I have so many sets of rules for pages in NoScript that an
extra 9 entries for one page hardly matters.
There is another case that might be on-topic here, namely a URL of the form
"https://firefly-125161.s3.amazonaws.com" (where I have changed the
six-digit number for this message, since it is somehow associated with a
financial institution I do business with). I always block anything from
[s3].amazonaws.com when I see it, but this is an entry I have to allow. I
do not know if the six digits are unique to an account, or a branch of the
company, or specific to me as a unique customer (since I tend to have
multiple accounts at financial institutions I like and have trusted for
years). This one is associated with a bank, insurance company, or
brokerage... I forget which.
Maybe I should wait until tomorrow, because this is the sort of problem that
seems to become more well-defined after I sleep on it. :-) I have clients
who can attest to how relieved they were when I chatted with them a day or
two after I was thinking out loud about ways to solve a problem (mostly
because some of them were very savvy about the business they were in and
would practically light up when I mentioned some method or technique they
had heard of before -- clients often do not realize that even an
experienced consultant might save a lot of time, and their money, if they
somehow hint at the way a problem has been addressed in another place and
time).
What one particular client I had for a dozen or more years taught me while
I was his computer consultant made some of the finance and accounting
classes I took to get an MBA seem rather trivial, and he would actually
sometimes pay me by the hour just to read some particular books related to
a project and call him if I had questions. He knew me well enough to know
I would learn a lot of things that would save him money on that project or
some future one he had in mind.
Have your eyes glazed over yet? I am really developing a bad habit of
rambling on when I write these days.
Bad covid-19. Bad virus. Horrible social distancing and PPE to wear -- all
coronavirus' fault. Be gone, vile and pernicious pandemic!
<grin>
Be well,
Mark Gibson
|
Alexei,
I expected to get back to the problem of how to deal with domains such as
"cloudfront.net", which often come with seemingly randomly generated
third-level domains of the form "xxxxxxxxxxxxxx.cloudfront.net" (where "x"
can be [a-z0-9]), a lot sooner, but medical issues and bills intervened. One
of the medical issues I was hoping to deal with prior to the pandemic was
upgrading the prescriptions on my reading and computer glasses, and that is
an expense I am not ready to assume right now. I was reasonably familiar
with JavaScript circa 2002 but have not used it much since then; that would
not matter, though, if I could make sense of how to use the debugger
provided by GitHub on Privacy Badger as it acts on the site
"www.worldofwarcraft.com", which is one where I allow two or three
generated cloudfront.net URLs to stay in the yellowlist. As you recall, I
cannot find just "cloudfront.net" in any PB table at this time, which makes
it hard for me to figure out what is going on.
The goal here is to leave just plain "cloudfront.net" on the yellowlist,
but somehow mark it as being read-only, so that modified versions (with the
prepended random 14-char string as the third-level domain) can be
separately added to the yellowlist, while those that are not specifically
added just get red-listed (black-listed) by not being saved anywhere by PB.
www.worldofwarcraft.com (hereinafter WoW.com) is where I first got
frustrated by not having a way of keeping just a small set of pairs of the
longer "randomtag".cloudfront.net URLs for use in later sessions, while
letting all other variants of cloudfront.net just be blocked.
My difficulty is not so much not knowing of a means of doing what I want to
accomplish as not being familiar with the dev tools on GitHub. The
debugger gives me headaches because it generates a lot of output when I
merely load WoW.com and run the JS snippet in the console, with the subject
domain (cloudfront.net) substituted for the "XXX" placeholder in said JS
code. There are many things going on that have nothing to do with the
problem I want to deal with, and a few that I will need to pay enough
attention to that I can recognize when they change the debugger output
associated with what I am doing.
The simplest solution I have thought of is having a way of marking any
entry, in any table of domains that PB uses, as read-only, so it is always
in the default category for the base domain name even when the user makes
exceptions for variations on the base domain. The newly added exceptions
should be marked user-controlled so that they will not show up again when
PB is reloaded from scratch. That should allow the user to manage a select
few variations on the base domain, without overwriting it, and just let
other variations get cleared when PB is restarted.
Really, this is just treating URLs as tracking cookies, with the generated
portion containing the "cookie" used to identify the user, or whatever,
and with more than one variation of the base URL being like a set of
tracking cookies, each with a different purpose.
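The read-only base entry plus separately persisted user exceptions described above could look roughly like this. All names here are hypothetical sketches of the idea, not Privacy Badger's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class DomainSettings:
    """Two-layer store: an immutable default layer for base domains,
    and a user-controlled layer for variant exceptions."""
    base_defaults: dict = field(default_factory=dict)   # read-only layer
    user_overrides: dict = field(default_factory=dict)  # persisted layer

    def set_action(self, domain: str, action: str) -> None:
        if domain in self.base_defaults:
            raise ValueError(f"{domain} is read-only; add a variant instead")
        self.user_overrides[domain] = action

    def action_for(self, domain: str) -> str:
        # User overrides win for variants; the base entry stays untouched.
        if domain in self.user_overrides:
            return self.user_overrides[domain]
        return self.base_defaults.get(domain, "block")

s = DomainSettings(base_defaults={"cloudfront.net": "block"})
s.set_action("dabc123xyz0456.cloudfront.net", "cookieblock")
print(s.action_for("cloudfront.net"))                 # block
print(s.action_for("dabc123xyz0456.cloudfront.net"))  # cookieblock
```

On restart, only `user_overrides` would be reloaded, so unrecorded generated variants simply fall back to the base default and never accumulate.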
The system I describe above would work for at least a couple of other kinds
of URL-as-tag/tracking-cookie I have spotted recently. I think Google uses a
similar system when users want to view other people's videos on YouTube.
There, it is probably so the ads seen as part of the content are noted and
used to update the amount Google owes the content provider... I am just
guessing, but that seems like a very reasonable guess to me.
Regards,
TMG
|
Regarding "keeping just a small set of pairs of the longer "randomtag".cloudfront.net URLs for use in later sessions, but letting all other variants of cloudfront.net just be blocked": it sounds like you use other extensions together with Privacy Badger. If any one of your extensions blocks a domain, that domain is blocked everywhere, regardless of the settings of your other extensions. So you should be able to get what you want by configuring one of your other content blockers to block cloudfront.net. Once that's done, there is no need to change anything in Privacy Badger, which will continue to cookieblock cloudfront.net subdomains. |
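The point above, that one extension blocking a domain blocks it everywhere, amounts to an AND across independently acting blockers. A minimal sketch (the blocker rules here are made up for illustration):

```python
# A request goes through only if EVERY blocker allows it, so a single
# extension blocking cloudfront.net blocks it regardless of the others.
def request_allowed(host: str, blockers: list) -> bool:
    return all(not blocks(host) for blocks in blockers)

# Hypothetical rules: one blocker blocks all of cloudfront.net,
# the other (standing in for Privacy Badger here) blocks nothing.
noscript_blocks = lambda h: h.endswith("cloudfront.net")
badger_blocks = lambda h: False

blockers = [noscript_blocks, badger_blocks]
print(request_allowed("dabc123xyz0456.cloudfront.net", blockers))  # False
print(request_allowed("www.example.com", blockers))                # True
```

This is why the allow-exceptions have to be configured in the blocker that does the blocking; allowing a host in Privacy Badger alone cannot override another extension's block.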
I figured out a dirt-simple workaround for the many URLs of the form xxxxxxxxxxxxxx.cloudfront.net that were ending up in the list of trackers that were individually user-controllable in Privacy Badger. Ghostwords seems to have found the same solution over three weeks ago. I likely would have spotted it months ago but for the fact that I was specifically allowing a few of these RG'd cloudfront.net domains in Privacy Badger (PB), NoScript, and uBlock Origin, because I often use two or three ad- or tracker-blocker addons simultaneously, since doing so does not slow this machine down enough for me to notice, and some blockers (think: uBlock Origin) might block specific aspects or elements of the content located at a given URL. So I did what was suggested above and removed all references to the cloudfront.net domain from the managed sites lists of NoScript and PB, then fully blocked cloudfront.net references in both addons, saved my changes, then added in the very few explicit exceptions I need to use a couple-three sites. The only detail to remember, as ghostwords mentioned, is to have only one instance of the RG'd domain that is fully blocked in PB when you are done. I am presuming that any blocker which fully blocks a URL will be given priority over ones that may look at certain aspects or elements of it or its content to give it at least a partial pass. I will quit complexifying things now... I should just look at the code as soon as I have a few extra decades of useful life in which to do the many things I am actually expected to do. Regards, |
Recently I was told in the forum that cloudfront.net is already on the yellowlist. I checked and it is, here. On my system there is no entry at all for "cloudfront.net", but there are the two entries of the form "xxxxxxxxxxxxxx.cloudfront.net" which resulted from me making them user-controlled (so the site that keeps generating new random URLs will find them and not create new clutter every time I go there).
How do I merge the current official yellowlist with the one I have changed a bit on my computer? I do not want to overwrite any changes I have made locally, but do want to merge in any new entries from the official list. I am guessing that the default entry for just "cloudfront.net" was somehow overwritten when I was trying to figure out how to locally blacklist that domain entirely and then make exceptions as needed.
I suspect there are many others who would like to be able to merge the current master yellowlist into their local Privacy Badger installation's version, with the local version always taking precedence. I guess you have your reasons for not letting users maintain their own redlist, as well as a modified (user-controlled) yellowlist, on their local machine, and for not providing a merge function that would let their local lists be updated to reflect changes to the master list(s) without entries being stepped on locally.
I hope the above makes sense. It has been many years since I used the terminology related to list maintenance, and I am fairly sure the kind of updating merge I am looking for has a name.
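The updating merge being asked about is essentially set arithmetic. A sketch under the assumption that local additions and local deletions are tracked separately (which is not how Privacy Badger actually stores things; this just names the operation):

```python
# "Updating merge": take the official yellowlist, pull in any new
# entries, but never overwrite local changes -- local additions always
# survive and locally deleted entries stay deleted.
def merge_yellowlist(official: set, local_added: set,
                     local_removed: set) -> set:
    return (official - local_removed) | local_added

official = {"cloudfront.net", "newtracker.example"}
local_added = {"dabc123xyz0456.cloudfront.net"}   # user's kept variant
local_removed = {"cloudfront.net"}                # user deleted the base
print(sorted(merge_yellowlist(official, local_added, local_removed)))
# ['dabc123xyz0456.cloudfront.net', 'newtracker.example']
```

In list-maintenance terms this is a three-way merge with local precedence: the official list supplies updates, while the local diff (additions and removals) is replayed on top.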