Marko :)

Level 16
Verified
IMO your list is pretty excessive and redundant. You don't need AdGuard Base + EasyList (choose one), and there's no need for "I don't care about cookies", Anti-Facebook, AdGuard Annoyances, or AdGuard Social Media when you have Fanboy's Annoyance.
There are two AdGuard Base filters:
  • the first has more than 118,000 rules and is used by AdGuard's browser extension and software;
  • the second (approx. 41,000 rules) is used in other browser extensions such as uBlock Origin.
The first is a fork of EasyList, so when using it you don't have to use the EasyList filter; it is written to work specifically in AdGuard. The second Base filter is optimized for other ad blockers (with incompatible strings removed) and is meant to be used alongside the EasyList filter.

Now, regarding the I don't care about cookies filter: it's by far the most complete cookie-notice filter. Actually, none of the filters you've mentioned block cookie notices on Croatian websites. A few of them might be blocked, but a large number of websites still show notices. That's where I don't care about cookies comes in; it's developed by a Croatian, so all our popular websites are covered.

And these social media filters block different things. Some won't block Facebook's widgets, but Anti-Facebook blocks them entirely.
Actually this is accurate more than 95% of the time, but not always. uBlock Origin can't parse all the rules from AdGuard filters, including AdGuard Base, AdGuard Tracking Protection, AdGuard Annoyances, and AdGuard Social Media. I've seen this combo miss ads and leave empty placeholders quite a few times; some rules only work with the AdGuard extension. I've seen the reverse too, the AdGuard extension failing to parse a rule from EasyList-related filters, but less frequently than with the previous combo. Using only AdGuard-related filters in uBlock Origin is not the best idea. If using only one family, I would suggest using the EasyList-related filters in uBlock Origin instead of AdGuard's; it works much better this way. Using both doesn't cause any harm either, but if using fewer filters is a priority, then uBO with EasyList and AdGuard with AdGuard is the way to go.
Basically, we're talking about two different filters with the same name.
I understand the logic that less can be nearly equal to more, but there are no prizes for having the shortest list, or for poor cosmetics, if speed is unaffected.
uBO medium mode and the rules in the screenshot I attached work for me.
You shouldn't use the full AdGuard Base filter, because that one is optimized for AdGuard specifically. It contains a lot of rules that uBlock Origin does not recognize. uBO skips those rules, but the filter itself still slows down browsing because of its size.
 

Yuki2718

New Member
This will be a temporary account; I'm just letting @Marko :) know that I provide such rules (the CDNs included in Decentraleyes but not in my rules are Chinese ones; no need for them unless you visit Chinese sites). Sad that nobody mentioned it, as it was brought up on Wilders very recently, where some people here are members. A few comments before leaving:

@Marko :) They're not for other ad blockers but for uBO: some filters are excluded by the NOT_PLATFORM directive and others are commented out, while a few issues are specifically addressed for uBO. There are still incompatible filters left, but you can see which they are by opening the logger and adding/updating the filter. Some filters are not in the stock lists, but you can directly type the URL of the filter you want; say, the optimized Annoyances filter is here. I PERSONALLY do not recommend subscribing to too many annoyances filters, because in many cases they hide the same elements with different rules. Unlike network rules, these are not discarded by uBO, and having too many cosmetic filters injected into a page is not good in terms of performance.

@JohnR There is no difference in speed from having fewer or more rules; it's provable both in theory (computational complexity) and by measurement (use dev tools or a benchmark tool, never rely on human perception; you also need to know how to measure properly, as network latency affects results much more than rule matching). More rules cause more memory consumption, but what causes slowdown is NOT the number of rules but the number of inefficient rules (1). The number of cosmetic filters injected into the page can also affect performance, more so if they're inefficient. The main point of having fewer filters is to reduce FPs and anti-adblock warnings; for these, and also for genuinely better performance, turning generic cosmetic filters off is the most effective step (2), but you may need to replace those rules yourself, which is a reason I provide Placeholder Hider, though it's not very comprehensive. Another point is stricter blocking: piling many filters up also piles whitelists up, and whitelists take precedence over blocking filters. I have been reporting unnecessary whitelists to the major lists and even provide Anti-whitelist, but I don't look at other lists enough, and occasionally see loose whitelists in some minor lists. One should really be careful before subscribing to unpopular lists; the Internet is full of low-quality ones.

(1) Efficient rules are tokenizable, or at least well narrowed down by request type, origin, domains, etc. The number of tokenizable rules doesn't affect speed, but for this tokenization what holds for ABP syntax does not always hold for uBO.
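The tokenization idea in the footnote above can be sketched in a few lines. This is a minimal illustration only, not uBO's actual implementation; the rule strings, class names, and bucketing heuristic are made up for the demo:

```python
# Minimal sketch of token-based filter matching: rules are bucketed by a
# token at build time, so matching a request only scans the few rules
# sharing one of its tokens, regardless of the total rule count.
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[0-9a-z%]{2,}")

def tokens(s):
    return TOKEN_RE.findall(s.lower())

class TokenIndex:
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, rule):
        toks = tokens(rule)
        # index each rule under a single token (real blockers pick a rare one)
        key = toks[0] if toks else ""
        self.buckets[key].append(rule)

    def match(self, url):
        hits = []
        # only the buckets for this URL's tokens are ever visited
        for t in set(tokens(url)) | {""}:
            for rule in self.buckets.get(t, ()):
                if rule.strip("*") in url:  # crude substring test for the demo
                    hits.append(rule)
        return hits

idx = TokenIndex()
idx.add("/adframe/*")
idx.add("/banner728/*")
assert idx.match("https://example.com/adframe/x.js") == ["/adframe/*"]
```

A rule without any token (e.g. a bare wildcard) lands in a catch-all bucket that is always scanned, which is exactly why such rules are the inefficient ones the footnote warns about.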

(2) The most common trigger for anti-adblock is cosmetic filtering (particularly ##.adsbygoogle); just see how many $generichide or $ghide (and #@#.adsbygoogle) rules are included in EL, uBlock filters, or AG Base. Of course, others are triggered by network rules (the champion in this category is probably pagead2.googlesyndication.com/*/adsbygoogle.js, but there are really many patterns).

@oldschool 23 probably means you write cosmetic filters on your own, and that's not bad. I'm just saying that while writing cosmetic filters is so easy that even the picker can do it for you, writing efficient cosmetic filters is not. If one ends up creating hundreds of cosmetic filters without understanding how to write efficient ones, it will probably be better to subscribe to good lists.

@SeriousHoax Just two rules in EP
/b/ss/*&aqe=
/id?d_visid_
have been blocking hundreds if not thousands of CNAME trackers, and it's really rare that Frogeye's list catches anything missed by EP (though in the logger, simple hostname rules take precedence). This is another reason you shouldn't only look at the number of rules.
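For illustration, here is a rough sketch of how those two EP patterns can each cover many cloaked endpoints. The conversion of the ABP-style `*` wildcard to a regex is simplified, and the sample URLs and hostnames are hypothetical; only the two patterns come from the post:

```python
# Toy demonstration: two wildcard patterns matching made-up first-party
# (CNAME-cloaked) analytics URLs. Patterns are from the quoted EP rules.
import re

def abp_to_regex(pattern):
    # escape everything, then turn ABP's '*' wildcard back into '.*'
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))

rules = [abp_to_regex("/b/ss/*&aqe="), abp_to_regex("/id?d_visid_")]

def blocked(url):
    return any(r.search(url) for r in rules)

# hypothetical first-party hostnames; only the path/query matters here
assert blocked("https://metrics.example.com/b/ss/rsid/1?AQB=1&aqe=x")
assert blocked("https://smetrics.example.org/id?d_visid_ver=5.2.0")
assert not blocked("https://example.com/news/article")
```

One path pattern thus covers every hostname a tracker hides behind, which is why two rules can outperform a long hostname list.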

@Lenny_Fox I haven't looked at MT for a while, but thank you for crediting me in another thread. However, you forgot the most important part: prefix rules starting with http or https with |, otherwise such rules are slow and can cause FPs. BTW, in the last few months I've seen even common bad actors switching to https (according to commits in my list), so I think the relevance of the rules is decreasing, though http was still used in redirect chains, and I've also found a few hijacked http sites (I haven't reported them to AG except in one case, because in the other cases the malicious scripts were blocked by EL rules such as &prvtof=*&poru=).
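The advice about the | prefix can be shown with a toy matcher. This is a sketch of the anchoring semantics only, not uBO's matcher, and the URLs are made up:

```python
# Why '|http://...' beats a bare 'http://...' rule: '|' anchors the
# pattern to the start of the URL in ABP/uBO syntax. An unanchored
# 'http://' also matches an https page that merely carries 'http://...'
# inside its query string, a classic false-positive source.
def matches_unanchored(url):
    return "http://" in url

def matches_anchored(url):
    return url.startswith("http://")  # what the '|' anchor expresses

fp = "https://example.com/redirect?to=http://other.example"
assert matches_unanchored(fp)       # false positive: https page matched
assert not matches_anchored(fp)     # anchored rule leaves it alone
assert matches_anchored("http://insecure.example/ad.js")
```

The speed point follows from the same idea: an anchored pattern is checked once at position 0, while an unanchored one must be searched for at every position in the URL.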

@Tiamati (related to the comment to Lenny) Why ||* ? If you don't specify a domain, just use *, as in *$object,ping,popunder. I'd warn that *$popunder may prevent some legitimate actions; just for demonstration, set Google search not to open results in a new window, middle-click a link first, then left-click the same or another link. Note that the rule is not tokenizable, though it still gets the benefit of request-type optimization; usually such rules are narrowed down by $domain=. For this kind of control I rather recommend dynamic filtering: having a GUI only for some request types doesn't mean other types are not supported.

I'll leave it here, but please don't take it as spam. I decided to spend my spare time contributing to uAssets/AG/EL/Brave rather than on discussion, and thus left Wilders.
 
Last edited:

Lenny_Fox

Level 12
There is no difference in speed from having fewer or more rules; it's provable both in theory (computational complexity) and by measurement (use dev tools or a benchmark tool, never rely on human perception; you also need to know how to measure properly, as network latency affects results much more than rule matching). More rules cause more memory consumption, but what causes slowdown is NOT the number of rules but the number of inefficient rules (1). The number of cosmetic filters injected into the page can also affect performance, more so if they're inefficient. The main point of having fewer filters is to reduce FPs and anti-adblock warnings; for these, and also for genuinely better performance, turning generic cosmetic filters off is the most effective step (2), but you may need to replace those rules yourself, which is a reason I provide Placeholder Hider, though it's not very comprehensive. Another point is stricter blocking: piling many filters up also piles whitelists up, and whitelists take precedence over blocking filters. I have been reporting unnecessary whitelists to the major lists and even provide Anti-whitelist, but I don't look at other lists enough, and occasionally see loose whitelists in some minor lists. One should really be careful before subscribing to unpopular lists; the Internet is full of low-quality ones.
RE: number of rules doesn't impact speed
Yuki, thanks for helping optimize my filters on GitHub, but the above is a load of (I will censor this myself). Select cases, SQL queries, regular expressions, even AI inference engines take more time when evaluating more instances or occurrences. I respect your opinion and experience in writing filters, but what you are telling me is simply not true.

I agree that fewer rules cause fewer FPs and "disable your ad blocker" prompts, but the main reason for having fewer rules is that there is simply not an infinite number of advertising networks. I refer to the academic research mentioned in the adblocking innovation thread on this forum. With computational power increasing with every new CPU launched, the performance argument is not relevant; the argument that there are simply not that many advertising and tracking networks is much more relevant.



RE: HTTP less relevant
Yes, because browsers are blocking HTTP websites, malware authors are moving to HTTPS. But when you look at the latest OpenDNS reports, it is still beneficial to block scripts and (i)frames from third-party content, even when your region (e.g. mine, NL) has converted 99.99% to HTTPS websites.

Reading the DNS report, you are right that there is a big increase of malware on HTTPS, but an increase from 5% to 11% in the last year means I still block the 89% of malicious websites that use HTTP by blocking third-party active HTTP content. I promise that when it increases from 11% to over 50% I will remove the rule |http://*$third-party,~stylesheet,~image,~media
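The arithmetic behind that claim, using the figures from the post, is a one-liner:

```python
# If 11% of malicious sites now use HTTPS, a rule blocking third-party
# active HTTP content still covers the remaining share of them.
https_share = 0.11
http_share = 1 - https_share
assert round(http_share, 2) == 0.89  # i.e. 89% still blocked
```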

;)
 

Yuki2718

New Member
@Lenny_Fox It's true. The idea that more rules cause slowdown comes from a wrong assumption of linear matching, which doesn't make sense: why wouldn't you use a decision tree of some kind (see the plain explanation by Ghostery) when most rules are rarely or never used for common requests? See gorhill's comment too (it's not only about pure hostname rules). Of course it's not that adding 1,000,000 rules doesn't add 1 ms; what matters is whether adding a reasonable number of rules adds a perceptible amount of time, and most people can't tell a difference under 100 ms. Mathematically, it's trivial to show that the complexity of matching an n-length request against X filters is at most O(n), meaning the number of rules doesn't matter. Seeing is believing: I measured it myself, but at first failed, as network latency, which changes every second, affects the result much more than the matching does, so I had to keep the browser cache. There's also Brave's article. You can see that adding the 16,000 rules of EP doesn't add 1 microsecond per request once the token-based approach was adopted.
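The point that rule count does not drive per-request cost can be illustrated with a toy hash-indexed rule set. This is a sketch under stated assumptions, not any real blocker's implementation, and the hostnames are synthetic:

```python
# With a hash-indexed rule set, the number of rule comparisons per request
# depends on the request itself, not on how many rules are loaded.
from collections import defaultdict

def build(n):
    idx = defaultdict(list)
    for i in range(n):
        host = f"ads{i}.example"
        idx[host].append(host)  # bucket rules by exact hostname
    return idx

def comparisons(idx, host):
    # count how many candidate rules are actually inspected for a request
    return len(idx.get(host, []))

small, big = build(100), build(100_000)
# a non-matching request inspects zero candidates in both indexes
assert comparisons(small, "news.example") == comparisons(big, "news.example") == 0
# a matching request inspects exactly one candidate in both indexes
assert comparisons(small, "ads5.example") == comparisons(big, "ads5.example") == 1
```

Growing the index 1,000x changes memory use and build time, but not the per-request work, which matches the O(n)-in-request-length argument above.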

Well, the number of ad networks is finite, but some of them use hundreds of different domains (a very notorious example) and others use CloudFront or amazonaws. Advertisers and webmasters have been seeking to bypass ad blockers, and it's becoming harder and harder to block them with simple rules. Say, "9to5" sites detect ad blockers and re-inject ads with ad-proxying so that they can't be blocked by simple rules (fortunately uBO has scriptlets to counter this). It's not only 9to5; incredibly many sites detect ad blockers, though how they react varies (some sites detect and do nothing, e.g. thewirecutter). It's also no longer rare for trackers to be implemented as first party even without CNAME cloaking; in my region (JP), Treasure Data does this with a data-sharing agreement. First-party ads are not rare either, indeed very common on WordPress sites, with images hosted on the site and href= pointing to advertisers or shopping sites with an affiliate code. Blocking the ad networks may block the redirect on click, but that's not what many users want; they want the images blocked, and here generic rules come into play, sometimes even regex rules (slow, but an acceptable trade-off if narrowed down). I tend to believe that those who spend their internet time only on major sites are actually rare. In my case most sites are one-time visits; aggregated together, these one-time sites take 50%+ of my internet time, but that doesn't mean I want to see ads or trackers there.

I only said it's decreasing; I didn't say it doesn't make sense ;) It was a bit surprising, as this lucky-visitor scam has long used http.

Anyway, I came here not for discussion; I need to leave soon.
 

Lenny_Fox

Level 12
Can't edit, so just adding an example: it's essentially the same as the reason the time for a Google search is not affected by the number of sites. What solved that issue is not computational power or code optimization, but the idea of indexing and a proper algorithm.
I explicitly mentioned non-linear search mechanisms in the examples I gave. Technically, you are forgetting the time it takes to convert and index the external linear data (the filters) into memory at first launch of the extension and when updating filters. But as mentioned earlier, the compelling argument against 'user-maintained filters with many rules' is not the processing power of the CPU or the extension's search mechanism, but the fact that those lists contain a lot of dead rules and have rules for websites with very low visitor rates (so the chance these rules will ever be beneficial or triggered is near zero).

@Lenny_Fox It's true. The idea that more rules cause slowdown comes from a wrong assumption of linear matching, which doesn't make sense: why wouldn't you use a decision tree of some kind (see the plain explanation by Ghostery) when most rules are rarely or never used for common requests? See gorhill's comment too (it's not only about pure hostname rules). Of course it's not that adding 1,000,000 rules doesn't add 1 ms; what matters is whether adding a reasonable number of rules adds a perceptible amount of time, and most people can't tell a difference under 100 ms. Mathematically, it's trivial to show that the complexity of matching an n-length request against X filters is at most O(n), meaning the number of rules doesn't matter. Seeing is believing: I measured it myself, but at first failed, as network latency, which changes every second, affects the result much more than the matching does, so I had to keep the browser cache. There's also Brave's article. You can see that adding the 16,000 rules of EP doesn't add 1 microsecond per request once the token-based approach was adopted.
Yes, I know of these articles; I used them in the adblocking innovation thread. As you participate in Brave's filters, you are also aware of "The mounting cost of stale ad blocking rules" article published by the Brave team. [quote from article] Brave found that a large percentage (> 90%) of EasyList appears to provide little benefit for common browsing cases, due to its large size and accumulation of stale (rarely used or even expired) rules. So your friends at Brave confirm what I stated in my response above :)

Well, the number of ad networks is finite, but some of them use hundreds of different domains (a very notorious example) and others use CloudFront or amazonaws. Advertisers and webmasters have been seeking to bypass ad blockers, and it's becoming harder and harder to block them with simple rules. Say, "9to5" sites detect ad blockers and re-inject ads with ad-proxying so that they can't be blocked by simple rules (fortunately uBO has scriptlets to counter this). It's not only 9to5; incredibly many sites detect ad blockers, though how they react varies (some sites detect and do nothing, e.g. thewirecutter). It's also no longer rare for trackers to be implemented as first party even without CNAME cloaking; in my region (JP), Treasure Data does this with a data-sharing agreement. First-party ads are not rare either, indeed very common on WordPress sites, with images hosted on the site and href= pointing to advertisers or shopping sites with an affiliate code. Blocking the ad networks may block the redirect on click, but that's not what many users want; they want the images blocked, and here generic rules come into play, sometimes even regex rules (slow, but an acceptable trade-off if narrowed down). I tend to believe that those who spend their internet time only on major sites are actually rare. In my case most sites are one-time visits; aggregated together, these one-time sites take 50%+ of my internet time, but that doesn't mean I want to see ads or trackers there.
Anyway, I came here not for discussion; I need to leave soon.
Now you are on my territory (I am a digital marketeer). Two reasons why your personal experience is not relevant for the broad majority of people:

1. Digital time-spending habits
Most people tend to spend more than 80% of their digital time on the same websites/games/social media/entertainment portals. So your 50 percent of time spent on 'other' websites is quite extraordinary (in marketing we call that a self-reference error: marketeers assuming that their own experience/preference also applies to their customers).

2. "Gone with the wind" views
In the marketing community, the customers (digital marketeers like me) are not worried about ad blockers; the producers (ad and tracking networks and Google, Facebook, Amazon and their display networks) are worried, because ad blockers reduce their income. This marketing controversy is called "Gone with the wind views" (a tongue-in-cheek reference to the famous last words of that movie). The controversy is about the buying readiness of the audience on websites that profit from advertising, versus the pay-per-view costs of ads for marketeers.

In short, most marketeers don't give a damn about ad blockers blocking ads, because these ads were shown to an audience that is not ready to buy anyway (those ad views would have low click-through and conversion rates). By the laws of capitalism, those means of bypassing ad blockers will probably not become mainstream, because there is no demand for such advanced services (bypassing ad blockers) from digital marketeers. Even the example you gave to prove your point, PopAds (using countless domains to bypass ad blockers, like PropellerAds), has a marginal, near-zero market share and proves the point I am making :).

source W3tech: PopAds vs. PropellerAds usage statistics, June 2020
 

Yuki2718

New Member
Well, my territory is so-called AI, and I was aware of the paper even before my participation; in fact I shared it here in Sept. 2019. I guess I'm one of the few people who read and understand most of the papers from Brave. Simply put, there is no perceptible, indeed no statistically significant (at the millisecond scale), difference in matching speed between having 500 or 50,000 rules (as long as the rules are efficient); nobody has proven there is, with a rigorous and unbiased benchmark.

I frankly don't care how others browse (1); all I know is that even with medium mode and 100,000 carefully chosen network filters I still see many (2) analytics scripts and trackers slip through, occasionally even ads (ads tend to get reported to the major filters soon, as they're often, though not always thanks to cosmetic filters, visible; analytics and trackers are not). I know that if I didn't use medium mode, the slips would more than triple. EL + EP + the best combination of regional filters are not at all satisfactory (in the above thread I also posted a paper proving this). Whatever mainstream is, as a matter of fact there are quite a lot of sites using circumvention techniques such as SSAI; it's undeniable (just see AG Base for example, some are high-traffic sites), as these are part of what we're fighting against every day on uAssets etc., and I myself occasionally find such a site. In the end it's a matter of personal preference and there is no 'right' choice, only 'wrong' choices, such as subscribing to too many redundant or non-optimal filters. As I can fix an FP and disarm anti-adblock usually in a minute, I don't need to worry about them too much (though I really dislike FPs), and when there's no other detriment to having 100,000 filters I just keep them; rather, I've been complementing them for years, and it's only recently that I started contributing to popular lists too. Anyway, it's time for me to say goodbye, of course without any bad or hostile meaning.

(1) Yet it's a finding. Admittedly I don't use social media or watch movies, but don't others use Google or Bing to learn, say, the meaning of a word XXX they heard today? This often, though not always, leads to a one-time visit. Even if it's 1% of their internet time, are all those adblock users happy to see ads there? I only accept ads if they're very relevant to the content and not at all excessive or intrusive. BTW, statistics often don't tell an individual anything meaningful; there is no single 'general' person. As an expert in statistics, I'm very concerned about recent usage of statistics.

(2) "Many" can have many meanings; as a number of rules it's 1,500+ just for analytics/trackers (beyond my capacity to report; with rules for ads it's over 2,000), but some are rarely hit (rarely means rarely; it's still a hit), others frequently.
 

Lenny_Fox

Level 12
@Yuki2718 (Yuki-san :) )

Thanks for dropping by. Let's agree to disagree. For my own feedback/education on improving the messages I post, I have a final question.

What triggers you (in my answers) to repeatedly respond that the time to process a thousand or a million rules is the same, when I am (also repeatedly) trying to tell you that processing time is not the reason I don't see any added value in using a million rules?

Regards

Lenny (karate-kid)
 

Yuki2718

New Member
@Lenny_Fox What do we disagree on? I don't see a conflict, but sorry if something I said made you feel I denied your choice. Your choice is clearly not the 'wrong choice' and is reasonable in one way, and mine is also reasonable in another way; just that. IDK how the discussion looks to a Westerner, but I enjoyed the arguments. Whatever the theme, it's always my pleasure to talk with someone who is brilliant and doesn't get emotional. :)

One clear reason is that I tend to skim-read posts in forums; it's a bug of mine, but sorry, I can't easily fix it :p. Another may be a matter of different perspectives. When a rule is rarely hit, do you think "it doesn't make sense" or "it's still a hit"? I am (or happened to be) on the latter side. While rules like /images/ad/* are commonly hit, I haven't ever seen /service/tracking/* hit anywhere other than eDreams, and I don't use eDreams. But I want to keep this rule: who knows whether I'll use eDreams in the future, or whether it will get hit on another site? In fact, another paper in the thread proves that EL often proactively blocks ads with generic rules; i.e. the blocking rules have been around even before the ads were born. IF the cost of having many rules is close to 0, there's only benefit left, even if it's little. And it's not necessarily little: each rule may be rarely hit, but aggregated together they can amount to quite a lot; this effect of aggregation is sometimes missed by statistics beginners. So the question is whether the cost can be regarded as 0, and I guess this is where we each choose different perspectives. Probably we agree they don't cause slowdown (remember, the paper measured an older implementation of ABP). Then to me, more memory usage (by >10MB) and initial launch time don't matter unless one uses a PC with 500MB of memory or so, or often launches and closes the browser (and even then it's still negligible on my 14-year-old laptop), and the default update period is 4 days unless specified by the filter (anyway, I usually don't notice when they're updated). So TO ME it's 0 cost. Maybe I said this before, but EL got a significant overhaul last autumn (Fanboy's tweet). I still see many redundant rules in EL/EP but don't care much (they're discarded by uBO); I only care about unneeded whitelists.

Thanks; though I don't practice karate, I appreciate your thoughtfulness, in Japanese kidukai (or kizukai), which probably has a slightly different sense; maybe @show-Zi can explain it. I love Old Amsterdam cheese (not the Japanese-made kind, really!).
 

Lenny_Fox

Level 12
@Yuki2718 thanks for taking the time to explain.

I understand your answers better now: when the cost is zero and the benefit is near zero, the aggregated benefit of a million near-zeros might be worth having a million-plus rules. With highly randomized surfing behavior like yours, this aligns with your personal experience (with 50% random visits it is beneficial to have a million-plus rules).

On the other hand, most people tend to free-style surf within a more or less stable set of digital communities and interest groups, so the accumulated benefit would be near zero. This is what I am experiencing (1,000 rules work as well as a million-plus rules).

:) Old Amsterdam cheese, I like it too
 

Yuki2718

New Member
I am about to close my account, but I hope this last comment gets posted. Well, a million would be excessive; I haven't measured with that many rules, but I guess the cost would no longer be 0. To clarify, I'm on neither the "more is more" nor the "less is more" side; a reasonable amount of filters should be determined by solid and informed reasoning incorporating all the variables, such as browsing habits, PC/browser usage habits and specs, how much you care about missed ads/trackers, how you address FPs or anti-adblock, etc.; i.e. the minimal amount which covers all your needs. If you had argued that more is more, I would have made different arguments; in fact I sometimes advise against too many filter subscriptions on AdGuardFilters, as such wrong choices are not rare among beginners. Thanks and bye to all (for some members, maybe see you again on GH)!
 