This repository has been archived by the owner on Dec 1, 2022. It is now read-only.

[English only] Questions about uBlock Origin/AdGuard #4

Closed
geddyhub opened this issue Jul 10, 2020 · 57 comments
Labels
question Further information is requested

Comments

@geddyhub

hi there yuki, hope you're doing ok. so, i need your help. do you think it's a bad idea to globally noop "wp.com, wordpress.com, wix.com, parastorage.com" and similar commonly used cms's with frequent/known vulnerabilities? tia for your time.
p.s.: feel free to delete this issue after a few days.

@geddyhub
Author

@Yuki2718 forgot to tag you.

@Yuki2718
Owner

As always, that's a trade-off. If you're okay with nooping them locally every time, of course that's better security-wise, but it's too much trouble for people like me who browse a wide variety of sites, so, as you can see in my dynamic-rules.txt, I nooped them globally. In terms of numbers, the majority of threats can still be prevented because they're served from uncommon 3p domains. OTOH, I have seen cases where malicious scripts are served 1p or inline, and those bypass medium mode no matter how you micro-manage rules. wp.com can be used to proxy scripts, but then there are many more paths for that (amazonaws, cloudfront, etc.). And if Wix can be abused, why not Jimdo or Strikingly? I don't see a reason to treat the domains you mentioned as special. Even with loose rules, medium mode gives practically enough protection, but if you want more for peace of mind, it's better to plug the hole medium mode inherently has: the 1p path. Using separate browser profiles when you log in or buy something covers most of this. Block unneeded plugins (WordPress, jQuery, etc.) by subscribing to good Social and Annoyances lists; this cannot save you from all plugin problems, but in some cases it mitigates the damage.

No need to ping me as it's my repo. I'll close this, but I'm thinking of keeping the issue so people like you can ask about uBO/AdGuard, probably with the title changed once I think of a good one.

@geddyhub
Author

thank you so much. take care.

@Yuki2718 changed the title from "ubo & wp" to "[English only] Questions about uBlock Origin/AdGuard" on Jul 20, 2020
@Yuki2718
Owner

Yuki2718 commented Sep 2, 2020

https://www.wilderssecurity.com/threads/ublock-a-lean-and-fast-blocker.365273/page-194#post-2943997
https://github.com/uBlockOrigin/uAssets/issues/7853

Of note: https://www.wilderssecurity.com/threads/ublock-a-lean-and-fast-blocker.365273/page-192#post-2939831
https://github.com/DandelionSprout/adfilt/pull/94

In addition, the only rule Rasheed needs:
@@||wilderssecurity.com/threads/$xhr,1p

@Yuki2718 Yuki2718 added the question Further information is requested label Sep 2, 2020
@Yuki2718
Owner

@geddyhub @nicolaasjan @SampeiNihira I'd appreciate it if one of you could tell a Wilders member, rethink, that his issue has been fixed in uBlock filters - Unbreak. All he needs to do is Purge all caches -> Update now and remove all the added filters from My filters, if he doesn't use default-deny mode. In this case a simple cosmetic filter is enough; there's no need to allow anything. I'll later forward this to Greek AdBlock Filter and remove the fix from Unbreak once it's fixed in the regional list.

@Yuki2718
Owner

@SampeiNihira Thank you for telling him :)

@Yuki2718
Owner

Yuki2718 commented Dec 25, 2020

@SampeiNihira Actually ||glassdoor.com^*/gd-user-hardsell-overlay. should work, as the login modal is called from this. But he has to clear the cache after adding the rule to prevent the already loaded script from being reused. cdnjs.cloudflare.com/ajax/libs/react-dom/17.0.1/umd/react-dom.production.min.js is a common library for React apps; blocking it just breaks the page and other websites. Please, if you have an issue, REPORT IT to an appropriate place: Github, Reddit, Lanik's forum, or even AdGuard's anonymous reporting tool if you use AG filters. You'll rarely get a correct answer in security forums. This seems to be a login modal, so I think it can be blocked in Annoyances filters (if the filter doesn't break anything else), but not in uBlock Annoyances, as so far it doesn't require the advanced capabilities of uBO.

@Yuki2718
Owner

@SampeiNihira I'm sorry for putting you through all this indirect conversation. Anyway, he can block BOTH the css & js of gd-user-hardsell-overlay without obvious side effects, BUT he has to test with default settings. If Continue reading doesn't work, it probably means he didn't unblock cdnjs.cloudflare.com/ajax/libs/react-dom/17.0.1/umd/react-dom.production.min.js, because that button relies on this script to work.

Blocking #HardsellOverlay and locking overscroll will make the page always jump to the top

Can be easily fixed by e.g. the following (Do not add these! -> rethink):

glassdoor.com##+js(aopr, scrollTo)
glassdoor.com###HardsellOverlay
glassdoor.com##body.loggedOut[style*="overflow: hidden"]:style(overflow: auto !important;)

I haven't tested it, but cookie-remover may be another way, as the site seems to check a cookie to fire the modal. But there's no reason to use cosmetic filters and scriptlets unless ||glassdoor.com^*/gd-user-hardsell-overlay. actually causes a problem.

@Yuki2718
Owner

Yuki2718 commented Jan 15, 2021

@SampeiNihira Can you convey my warning to MT (https://malwaretips.com/threads/adguard-tweaking-yes-i-confess-i-did-it-again-changed-adblocking-strategy.104523/page-3)?

Please, DO NOT use queryprune unless you know how to write correct and efficient filters. FilterAuthorMode is reserved for filter authors who have such skills. queryprune has been used by built-in lists and a few other lists; it's not something planned, and the only reason FilterAuthorMode is required is that gorhill doesn't want people to play with this. He has been irritated to see so many people write and share inefficient filters; as a result, he warned that he will restrict the option to built-in lists and My filters once he sees people abusing it [1].
||google.*^$queryprune=|ei|gs_lcp|psi|psy|uact|ved,domain=google.*
is such an inefficient filter: untokenizable, not even type-scoped, with a wrong regex (queryprune is regex-based internally), not to mention that this format itself will soon be deprecated. This time I won't explain why it's bad; tokenization is not that simple to explain, and apparently nobody understood the much simpler case of why
HTTP://*^$third-party,~image,~stylesheet
is wrong, even though I explained it when I dropped by MT [2]. Why not prefix it with |? Why the ^? I explained that ^ is a separator and NOT something to be added to the end. The * can be omitted. And why ||*^$ping? || is for a domain, but no domain is specified? Not to mention that dynamic filtering is much better suited for the purpose [EDIT: Never mind, I forgot ping is not supported in dynamic filtering]. Sorry if this sounds toxic, but I repeat that "working" does not mean "correct". If you make a filter by imitating the appearance of a good filter, there's a high risk that yours is not one. Everything needed to write correct and efficient filters is available online. Unless you're ready to dig into Github for that, it's better to use other extensions to remove tracking parameters. Of note, the option was not added for tracking parameters; it was added for SSAI. TBH I'm in a dilemma: I'm asking as an individual and trying to keep this out of gorhill's sight so that a few list authors and I can keep using the option in public lists [3], but if I see more abuse, I'll have to bring it up for discussion even though that's to my detriment, as I'm now a member of the uBO team.

[1] https://www.reddit.com/r/uBlockOrigin/comments/jzezl3/is_there_yet_any_major_filterlist_with_the/gdc0k1s/
[2] BTW even in its correct form it's still an inefficient filter. Performance-wise, a single inefficient filter is worse than 10,000 efficient filters - but I'm not saying that no inefficient filter should ever be used.
[3] So far I use the option only in a Japanese list, to remove parameters not covered by other addons or user scripts.

@Yuki2718
Owner

Yuki2718 commented Jan 15, 2021

Changing to AG's $removeparam doesn't help; it's still untokenizable and should never be used on uBO. Hm, AdGuard TPL seems to have

||digikey.com^$removeparam=/^mkt_tok/
||digikey.com^$removeparam=/^utm_cid/

These are tokenizable in the domain part, unlike the aforementioned filter, but are not ideal for uBO. I'll take this back to discuss with the relevant parties.

@Yuki2718
Owner

Yuki2718 commented Jan 16, 2021

Pinging @SeriousHoax too. I could be more helpful if it were another staff member, but I'm quite uneasy, as gorhill has been very quick to do what he warned (I searched Reddit for the latest example but can't find it - he restricted something soon after a user didn't respect what is written in the documentation of advanced settings). I'm pretty sure he will restrict access to queryprune once he spots this.

@Yuki2718
Owner

Yuki2718 commented Jan 16, 2021

Lenny, can't you come here to discuss? I don't have an MT account, but you have a GH account. Don't get emotional. It's simple: if gorhill finds it, he will soon restrict access, and FilterAuthorMode is Filter Author Mode; it's not meant for everyone to easily turn on. This is the final barrier, as uBO can't hide anything as an open source project. There's no centralized documentation about tokenization. Your new filters are better only in that search is now tokenizable and only requests from Google search will be checked, but they are still problematic because the parameter parts are not tokenizable (whether the old or new format is used is unimportant; that corresponds to what I called "soon deprecated". Also, there's no need to change * to nl). You can't become a good filter author in one day; there's no royal road or cheat sheet - otherwise there would be many (but surely you can in the long term). Even if you get there, how do you guarantee that all those who read your posts will too? Some people will just imitate your filters and start to abuse the option. Turning on FilterAuthorMode should simply not be recommended; I mean, even if you use it yourself, do not recommend that others follow. I'm ready to explain if you want, but then please do respect the desire of the developer, gorhill, if you use his product.

@Bruce-Bane

@Yuki2718 what do you think about using these tracking parameters to clean URLs with AdGuard Stealth Mode? (removed from MT forum)

__hssc,__hstc,_hsenc,_hsmi,_reqid,_trkparms,ad_bucket,ad_size,ad_slot,adid,adserverid,adserveroptimizedid,adtype,adurl,AffiliateGuid,assetId,assetType,bdref,bstk,c_id,Campaign,campaign_id,campaignId,cid,clickid,client,clkurlenc,cmpid,dclid,elqTrack,elqTrackId,exitPop,fb,fb_action_ids,fb_action_types,fb_ref,fb_source,fbclid,first_visit,ga_content,ga_fc,ga_hid,ga_medium,ga_place,ga_source,ga_vid,gclsrc,glcid,gs_gbg,gs_l,gs_Lcp,gs_mss,gs_rn,gws_rd,hmb_campaign,hmb_medium,hmb_source,hsCtaTracking,ImpressionGuid,itm_campaign,itm_content,itm_medium,itm_source,itm_term,matchid,mbid,mc_cid,mc_eid,mediatadaid,minbid,mkt_tok,nr_email_referer,num_ads,origin,page_referrer,payload,piggiebackcookie,pk_campaign,pk_content,pk_kwd,pk_medium,pk_source,providerid,pubclick,pubid,recipientId,referrer,reftype,revmod,rurl,s_cid,sc_campaign,sc_channel,sc_content,sc_country,sc_geo,sc_medium,sc_outcome,sclient,sei,siteId,sourceid,spJobID,spMailingID,spReportId,spUserID,tldid,trackid,tracking,uact,uid,usegapi,utm_campaign,utm_channel,utm_cid,utm_content,utm_medium,utm_name,utm_place,utm_pubreferrer,utm_reader,utm_referrer,utm_social,utm_social-type,utm_source,utm_swu,utm_term,utm_userid,utm_viz_id,ved,vero_conv,vero_id,zoneid

@Yuki2718
Owner

Yuki2718 commented Jan 17, 2021

@LennyFox I'm just asking whether you're willing to collaborate or not. The only condition is that you stop recommending that others enable FilterAuthorMode and withdraw that recommendation. This is to prevent people from abusing queryprune/removeparam and the other advanced features unlocked by the mode, which would spoil all the effort of the uBO developer and is threatening to us - complaints from non-advanced users who somehow enabled advanced settings are already a major annoyance for this volunteer-driven project. You don't need that, as you have a GH account and can distribute a list without requiring such a dangerous practice - then you'll share the same interest as me: if somebody starts to abuse the option, your list may become invalid. Documentation is not a bible, more so for a new feature that is still changing, and it clearly states

Poorly crafted removeparam filters can have deleterious effects on performance, experienced filter authors are expected to understand well how to craft optimal filters.

as pointed out by HarborFront. It says nothing about "how to craft optimal filters", and that's not something that can fit in a thin manual - even if it could, writing such a manual would take time. We write filters based on solid understanding rather than just following examples in the wiki. At best I can give case-by-case advice, which may not cover all cases, but I can at least tell you the most important points for avoiding inefficient filters. Please come here if you want to collaborate. If you don't, I'll simply report all this and beg him not to restrict the option.

@Yuki2718
Owner

@Bruce-Bane IDK what you mean by "what do you think". Most are obvious tracking parameters, while some look questionable or may cause trouble, e.g. zoneid is used on some shopping sites, but IDK whether removing it causes actual trouble or not.
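
For readers unsure what "removing a tracking parameter" amounts to, here is a minimal sketch in Python. It is not how AdGuard Stealth Mode or uBO implement it; the function name `strip_tracking`, the small `TRACKING_PARAMS` subset, and the example URLs are all made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subset of the parameter names quoted above (assumption:
# a real setup would use a longer, carefully reviewed list).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "mc_eid", "ved"}


def strip_tracking(url: str, names=TRACKING_PARAMS) -> str:
    """Return url with every query parameter whose name is in names removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    # Rebuild the URL with only the parameters that were kept.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com/item?id=42&utm_source=mail&fbclid=abc"))
```

A name like zoneid shows the risk: if a site uses it functionally, stripping it changes which page is requested, which is why questionable names are better left out until tested.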

@Yuki2718
Owner

Yuki2718 commented Jan 17, 2021

@LennyFox So you don't want to collaborate? I clearly stated that your multi-line rules are also problematic. You may think you copied from gorhill's in "exactly the same manner", but you did NOT. You still miss what's important, all because you look only at appearance rather than meaning. Okay, that's enough. I'll report all this.

@Yuki2718
Owner

Just want to say thank you to @SeriousHoax for relaying my comments, and sorry to all forum members, including Lenny, if these made you uncomfortable.

@gorhill

gorhill commented Jan 17, 2021

Many iterations have gone into the original design of queryprune (which is now deprecated in favor of removeparam, following discussion with AdGuard people to align both syntaxes for the sake of filter list authors), and on my side into further optimizing the enforcement of removeparam filters as much as possible, with as little friction as possible when creating those filters. I didn't share publicly all those changes[1] (sorry about this) except when I finally released the official documentation (last week), when I considered the filter option was finally mature.

When no pattern is provided, i.e. *$removeparam=..., uBO 1.32.0 is able to extract tokens from the removeparam value when possible. For example, from the "Actually Legitimate URL Shortener Tool" list:

$removeparam=utm_campaign

This filter is fine. It has no pattern from which to extract a token, but uBO will then fall back to extracting a token from the removeparam value, when possible. However, a filter like the following would be a performance concern:

$removeparam=/utm_campaign|utm_content/

Because no token can be extracted from either the pattern or the removeparam value. In such case, it's best to split the above filter into two distinct filters:

$removeparam=utm_campaign
$removeparam=utm_content

So mainly the usual concerns must be raised when writing removeparam filters as with any other filters, you want them as narrow as possible given what the filter is meant to accomplish:

  • Is it tokenizable? (tokens act as very good narrowing option)
    • If pattern is *, uBO will try to extract token from removeparam value if any
  • Does it have a type option? (script, image, etc)
  • Does it have a party option? (1p, 3p)
  • Does it have a domain= option?

All those narrowing options help uBO to know when to visit the filter, and not visiting a removeparam filter if it's not going to accomplish anything in the end is the goal here.
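
The splitting advice above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `split_removeparam` is made up, it handles nothing but a plain alternation of simple names with no further options after the removeparam value, and the literal form may not be strictly equivalent to the regex form in what it matches.

```python
import re

# Matches '<anything>$removeparam=/name1|name2|.../' where each name is a
# simple word; filters with extra options or real regex syntax do not match.
_ALTERNATION = re.compile(r"(.*\$removeparam=)/([\w-]+(?:\|[\w-]+)+)/")


def split_removeparam(filter_line: str) -> list[str]:
    """Split a regex-alternation removeparam filter into one literal filter
    per alternative; return the line unchanged if it is not that shape."""
    match = _ALTERNATION.fullmatch(filter_line)
    if match is None:
        return [filter_line]
    prefix, alternation = match.groups()
    return [prefix + name for name in alternation.split("|")]


for line in split_removeparam("$removeparam=/utm_campaign|utm_content/"):
    print(line)
```

Each resulting filter carries a literal removeparam value, which is the shape the comment above describes as allowing a token to be extracted when the pattern is *.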


[1] Commits:

@Yuki2718
Owner

@gorhill Thx for dropping by my thread to clear things up. So now token, other than search, can be extracted from filters like ||google.nl/search?*aq$removeparam=aq,domain=google.nl. My bad.

@gorhill

gorhill commented Jan 17, 2021

So now token, other than search, can be extracted from filters like

To be accurate, for the filter you present, the tokens google, nl, search can be extracted from the pattern.

However if the pattern had been *, uBO would have then fallen back onto the removeparam value to try to extract a token, with the result that aq would have been extracted as a valid token.

@Yuki2718
Owner

Yuki2718 commented Jan 18, 2021

To be accurate, for the filter you present, the tokens google, nl, search can be extracted from the pattern.

However if the pattern had been *, uBO would have then fallen back onto the removeparam value to try to extract a token, with the result that aq would have been extracted as a valid token.

Even though google is a bad token? I missed that nl is not a bad token; I tend to assume a TLD is a bad token, but in this case it's not. Anyway, it's domain-scoped so that part doesn't matter much, and no fallback means that in the end the parameter part won't be tokenized, so I was not so wrong. And if what gwarser said is true, the mentioned filter can be improved anyway, in the way I would have written it myself.

@gorhill

gorhill commented Jan 18, 2021

A "bad" token is still better than no token, so uBO would still pick google if none better were available.

@Yuki2718
Owner

Yuki2718 commented Jan 18, 2021

Ah, okay. BTW it seems he is still running the flawed benchmark. Besides the point you mentioned in internal discussion, I don't think he eliminated network latency, which significantly affects the results. As you said, there's not much we can do; I'm writing this solely so that those who care about the truth aren't too influenced.

@Yuki2718
Owner

Yuki2718 commented Feb 9, 2021

@geddyhub @SampeiNihira I don't remember why I closed this issue, maybe because I want to keep open only the issues I have to address. But open or closed is irrelevant - anyone can comment on closed issues. As far as I am aware, I haven't been contacted by imdb recently.

So the question is Twitch ads? The recent Twitch video ads have from the beginning been Server Side Ad Injection, which is impossible to block by nature. All countermeasures were NOT about blocking, but about how to avoid being chosen by Twitch as a target for ad delivery. Initially, removing some parameters was effective, so a scriptlet was updated and subsequently queryprune was added to uBO. But that didn't last long and a cat-and-mouse game began. Changing the UA to Google bot was effective at one time but was soon countered, and according to https://github.com/pixeltris/TwitchAdSolutions the current best solution is proxying m3u8 resolution to a country which doesn't get ads. uBO is a content blocker and NOT an "extension to do whatever it takes to avoid Twitch ads". I also really hope people understand that these ads are context-dependent; i.e. whether you see ads or not depends on various factors, even with the same setup.

dailymail floating video? Add
dailymail.co.uk##+js(aopr, Object.prototype.videoAdvertisingMode)
in AG Annoyances.

Cookie? Annoyances filters that block cookie consent have nothing to do with cookies in a direct sense. They either hide the consent or block the script initiating it; the latter may consequently affect cookies tho. If you want to block cookies, just set your browser to do so or use uMatrix-like addons. Sure, some websites take skipping the consent as a go sign to set cookies, but so what?

uBlock Annoyances? It can be used stand-alone, and we recommend enabling it when people complain on Reddit about soft or dismissable anti-adb. The major part of this list is anti soft anti-adb and anti right-click, copy, etc. It also complements the Fanboy and AG Annoyances lists: say, if Fanboy Annoyances can't solve a problem because it doesn't use advanced syntax, then that is a job for uBlock Annoyances. It addresses issues specific to uBO too. Just do not expect it to be a comprehensive annoyance list.

Want to use only a specific part of a list? Some popular lists are combinations of many sublists. If you want only the cookie consent parts of AG Annoyances, these are the ones:

https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/AnnoyancesFilter/sections/cookies_general.txt
https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/AnnoyancesFilter/sections/cookies_specific.txt
https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/AnnoyancesFilter/sections/cookies_whitelist.txt

However, these are not optimized for uBO, so they're generally not very recommended. AG puts lists optimized for uBO under the https://filters.adtidy.org/extension/ublock/ path, but these are compiled and thus no longer split into sublists. uBO ignores incompatible filters, but AG occasionally fixes uBO-specific issues with the NOT_PLATFORM directive, which uBO doesn't understand (nor does AG, actually).

If you have an issue, report it. Most issues reported in security forums were a minute's work for me to fix, silently or not. I can't fully test BBC iPlayer as it requires registration, but I confirmed that AG Annoyances broke it, so I fixed this part.

@Yuki2718
Owner

Yuki2718 commented Apr 19, 2021

@SampeiNihira
On Firefox, uBO 1.34.1+ can block cookies with a filter of the following form: example.com##^responseheader(set-cookie), which works like AdGuard's $cookie except that it is limited to documents. But using such filters for cookie management is not recommended - HTML filtering has its own cost and should be used only if no other solution works. The cost is less of an issue on AG apps, where HTML filtering takes place anyway because of a single network rule.

cookie-remover (and AG's remove-cookie) is a cookie remover: it doesn't block cookies and instead removes cookie(s) from local storage [1]. I'm not aware of any limitation of the scriptlet and it has been working perfectly; it's simply not meant for blocking or managing cookies. Its most common use is to prevent a site from taking a certain action based on a cookie. IIRC the last time I used it was to block fastly click-through tracking, where other solutions had side effects. Add beaumontenterprise.com#@#+js(cookie-remover, realm.cookiesAndJavascript) to temporarily disable the filter, click any article on the site, and carefully watch what happens.

[1] Obviously, this does not mean local storage in the HTML5 technical sense.

@gorhill

gorhill commented Apr 19, 2021

HTML filtering has its own cost

^responseheader() however is a different code path than response body filtering, the cost is not an issue for those filters.

@Yuki2718
Owner

@gorhill Thx, good to know.

@spirillen
Contributor

@spirillen TBH I don't know what https://mypdns.org/my-external-stuff/ublockorigin-rules/-/blob/master/ublockorigin-rules.template is. Is this meant to be subscribable filter list?

did you take a look at the link??

@Yuki2718
Owner

Yuki2718 commented Sep 13, 2021

Yes, if it's meant to be a subscribable list (because I see [Adblock Plus 2.0] at the top), include doesn't work, at least for uBlock Origin. See https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#include-file-name

@spirillen
Contributor

spirillen commented Sep 13, 2021

Yes, if it's meant to be a subscribable list (because I see [Adblock Plus 2.0] at the top), include doesn't work, at least for uBlock Origin. See https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#include-file-name

😃 OK about the include statement, I'll read up on that later, but from a very quick review: are you saying my command flrender -v -i ublockorigin-rules=. ublockorigin-rules.template _public/blockrules.txt no longer works for building the compiled list?

But back to my original question: which of the files in ./filter, if any, would I need to include to get the complete blacklists from uAssets? The reason I ask is that it is unclear to me whether a compiled list actually exists, and I would like to include them all in my own list (mostly for the sake of the Tor Browser), but also to alert me to stuff I haven't noticed elsewhere.

own note to issue on flrender/include: https://mypdns.org/my-external-stuff/ublockorigin-rules/-/issues/12

@Yuki2718
Owner

If you can incorporate filters.txt, the whole of uBlock filters will be included too, because of these lines at the end of the file:

!#include filters-2020.txt
!#include filters-2021.txt

@spirillen
Contributor

Super, thanks @Yuki2718

I think I'll go all in and incorporate FOP.py to sort the output file.

TRUE or FALSE? filters.txt is a combined file of all the other files in the filter folder except filters-202*.txt, i.e.

cat annoyances.txt badlists.txt badware.txt \
    legacy.txt privacy.txt resource-abuse.txt \
    unbreak.txt > filters.txt

@Yuki2718
Owner

Of course annoyances.txt etc. are not included in filters.txt. Note that uBO's !#include function requires the target files to be in the same directory (in this case filters.txt and filters-2021.txt are in the same directory).
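
For a build script that wants to flatten such a list locally, a minimal sketch could look like the following. This is not uBO's implementation: the function name `expand_includes` is made up, it expands only one level, it follows the same-directory restriction described above, and keeping an unresolved directive in place is an assumption of this sketch.

```python
from pathlib import Path

DIRECTIVE = "!#include "


def expand_includes(list_path: Path) -> list[str]:
    """Return the lines of list_path with each '!#include name' line replaced
    by the lines of the sibling file 'name' (same directory, one level only)."""
    expanded = []
    for line in list_path.read_text(encoding="utf-8").splitlines():
        if line.startswith(DIRECTIVE):
            target = list_path.parent / line[len(DIRECTIVE):].strip()
            # Only accept files that sit directly next to the parent list.
            if target.parent == list_path.parent and target.is_file():
                expanded.extend(target.read_text(encoding="utf-8").splitlines())
                continue
        # Assumption: anything unresolved is kept verbatim.
        expanded.append(line)
    return expanded
```

Called on a local copy of filters.txt with filters-2020.txt and filters-2021.txt beside it, this would return one flat list of lines that a sorter can then process.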

@spirillen
Contributor

OK, that actually means I have to include them one by one with flrender 😭

https://pypi.org/project/python-abp/
https://github.com/adblockplus/python-abp

@spirillen
Contributor

@ghost

ghost commented Sep 14, 2021

You can use both of ||example.com^$3p and ||example.com^$1p to counter ||example.com^$badfilter

@spirillen
Contributor

You can use both of ||example.com^$3p and ||example.com^$1p to counter ||example.com^$badfilter

READ THE FOLLOWING WITH A SMILE ON THE LIPS

ok... so to re-enable one rule, you need two extra rules 😄 are we spinning in circles 👾

😃

List 1

||example.com^

List 2

||example.com^$badfilter

List 3

||example.com^$1p
||example.com^$3p

who started by loading lists 1 :priceless:
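
The three-list example can be traced with a toy model in Python. It is a deliberate simplification and not uBO's actual logic: the function name `effective_filters` is made up, and it only understands a filter whose sole option is badfilter, cancelling the filter with exactly the same text minus that option.

```python
SUFFIX = "$badfilter"


def effective_filters(*lists: list[str]) -> set[str]:
    """Toy model: 'X$badfilter' cancels a filter whose text is exactly 'X'.
    Filters that differ in any option (such as 'X$1p') are left untouched."""
    all_filters = {f for filter_list in lists for f in filter_list}
    cancelled = {f[: -len(SUFFIX)] for f in all_filters if f.endswith(SUFFIX)}
    return {
        f for f in all_filters
        if not f.endswith(SUFFIX) and f not in cancelled
    }


list_1 = ["||example.com^"]
list_2 = ["||example.com^$badfilter"]
list_3 = ["||example.com^$1p", "||example.com^$3p"]
print(sorted(effective_filters(list_1, list_2, list_3)))
```

In this model the plain filter from List 1 is gone, while the 1p and 3p pair from List 3 survives because neither is textually the filter that List 2 targets.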

@Yuki2718
Owner

Yuki2718 commented Oct 26, 2021

@nicolaasjan I'm pretty sure you misinterpreted something gorhill or gwarser said. In https://old.reddit.com/r/uBlockOrigin/comments/jebg1q/email_marketing_tracking_pixel_block_list/ gorhill says

I expect that EasyPrivacy blocks them by default. If you find examples of this not being the case, just report those here.

and in fact EP has been adding rules for email tracking pixels. Just search for email in the EL repo and you'll see them in the Commits tab. At least those from major global and European players should mostly be covered.

@Yuki2718
Owner

Yuki2718 commented Dec 6, 2021

I want to block only 3rd party fonts except 1st party fonts. Is there an AdGuard rule for this problem?

AG doesn't allow that uBO-style rule, but you can use an equivalent regex as a workaround:
/.*/$font,third-party
and if you want to exclude example.com
/.*/$font,third-party,domain=~example.com

The same goes for $ping, but if you use an AG app and not the AG browser extension, you'll have to wait for the next major release to make it work reliably. I'm not sure whether the uBO-style rule will work in the next major release. AG devs are working hard to get the release out, hopefully before the new year.

@ghost

ghost commented Dec 6, 2021

@Yuki2718 Got it. Thank you.

@Yuki2718
Owner

Yuki2718 commented Dec 6, 2021

In case you want to exclude more than 1 site:
/.*/$font,third-party,domain=~example1.com|~example2.com|~example3.com
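
If the exclusion list grows, the rule can be generated instead of typed by hand. A minimal sketch, assuming only the rule shape shown above; the function name `third_party_font_rule` is made up for illustration.

```python
def third_party_font_rule(excluded_sites: list[str]) -> str:
    """Build the regex-based rule that blocks third-party fonts, optionally
    excluding the given sites via negated domain= entries."""
    rule = "/.*/$font,third-party"
    if excluded_sites:
        # Each excluded site is prefixed with '~' and entries are joined by '|'.
        rule += ",domain=" + "|".join("~" + site for site in excluded_sites)
    return rule


print(third_party_font_rule(["example1.com", "example2.com", "example3.com"]))
```

With an empty list the function returns the plain rule without any domain= option.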

@spirillen
Contributor

spirillen commented Dec 6, 2021

What is the easiest way to find out if a domain is dead?
Which method do you use when detecting dead sites? Website or a program?

The easiest way depends on many things, like your OS etc.
But I know @Yuki2718 is using @PyFunceble from https://github.com/funilrys/PyFunceble to check the lists, as it can test both URLs and domains by default. For an individual domain check it can be easier and faster to just use drill -T $domain or curl -I $URI

Hoping this gives you some answers.

@Yuki2718
Owner

Yuki2718 commented Dec 7, 2021

Oh, sorry, I forgot to answer that. Sure, I use PyFunceble with --adblock for the first screening of dead domains - this is industry-standard and used by many filter authors. Keep in mind though that, as correctly described in their documentation, INACTIVE from PyFunceble is not a direct indicator of the domain being dead. There's no easy way to check whether all the subdomains of a domain are dead or not. For the flagged domains I check with Google search ("example.com" and "site:example.com") and publicwww (example.com depth:all - I have access to premium publicwww).
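
As a rough idea of what a quick single-domain check does, here is a minimal Python sketch of a DNS resolution test. It is not PyFunceble and not a substitute for it; the function name `resolves` and its injectable `resolver` parameter are made up for illustration, and the caveats above still apply: a failed lookup does not prove a domain is dead, and it says nothing about subdomains.

```python
import socket


def resolves(domain: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the name resolves to at least one address and False
    if the resolver reports a resolution error."""
    try:
        return len(resolver(domain, None)) > 0
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Needs network access; the result depends on the local resolver.
    print(resolves("example.com"))
```

The resolver is passed in so the function can be exercised without network access; domains that fail this first pass would still need the manual follow-up checks described above.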

@SKEIDs
Contributor

SKEIDs commented Dec 21, 2021

I was asked to add a rule for the twitter widget.
Is this required? Or if it is possible?
example: #103438 , #103450

@Yuki2718
Owner

I was asked to add a rule for the twitter widget. Is this required? Or if it is possible?

Our criterion is "block if it's not part of the content"; however, whether a widget is part of the content or not is sometimes a grey zone and requires individual assessment and discussion. This is why I asked the official members in some of your PRs. One thing for sure is that we do not block each and every one of these widgets.

@SKEIDs
Contributor

SKEIDs commented Dec 21, 2021

In my opinion, we don't have to block the official Twitter widget on most official anime websites.
For example, on these sites #103789 , #103828 it is part of the content, right?
I don't think I need to block them.
If I had to pick one, I think it might be a good idea to block it on this site. #103800

P.S. Sorry for my bad English. I don't speak English on a daily basis.

@Yuki2718
Owner

@SKEIDs I agree with you, I might discuss it with the team tomorrow.

@SKEIDs
Contributor

SKEIDs commented Dec 21, 2021

@SKEIDs I agree with you, I might discuss it with the team tomorrow.

OK,I apologize for the inconvenience.

@Yuki2718
Owner

OK,I apologize for the inconvenience.

No need to apologize.

@SKEIDs
Contributor

SKEIDs commented Jul 16, 2022

@Yuki2718
Owner

@SKEIDs http://www.city.aomori.aomori.jp/ https://www.city.chiba.jp/

@SKEIDs
Contributor

SKEIDs commented Jul 16, 2022

Sorry, I didn't explain enough.
Is aomori.jp###tmp_publicity working?
I am trying to fix this issue now, and I thought aomori.jp was a duplicate.

@Yuki2718
Owner

Yeah, apparently it's not working. I thought what wouldn't work was a bare TLD (jp###tmp_publicity). @ameshkov Shouldn't aomori.jp###tmp_publicity work?

@Yuki2718
Owner

Yuki2718 commented Jul 16, 2022

For the time being you can replace the rule with a more specific domain. Note that the rule can't be generic - ###tmp_publicity has false positives IIRC.

@SKEIDs
Contributor

SKEIDs commented Jul 16, 2022

I'd replace aomori.jp with city.aomori.aomori.jp.
