I am with you, but uMatrix can't block some sneaky pop-ups either. When a pop-up includes a redirect, the redirected website becomes first-party, so uMatrix does not cover all potentially dangerous holes. I am not going to point you to test-it-yourself websites (learned that lesson). uMatrix's power is restricted to the API it can use, so it does not qualify as a browser firewall (again, I could point you to websites which evade uMatrix's filtering capabilities).
2o7.net_full.txt         2o7.net first party trackers, both alias & CNAME
2o7.net.txt              only alias
cname.sqfs               cleaned up version of freely available rapid7 cname database
ebis.ne.jp_full.txt      ebis.ne.jp first party trackers, both alias & CNAME
ebis.ne.jp.txt           only alias
eulerian.net_full.txt    eulerian.net first party trackers, both alias & CNAME
eulerian.net.txt         only alias
omtrdc.net_full.txt      omtrdc.net first party trackers, both alias & CNAME
omtrdc.net.txt           only alias
For the rest of the missing first party trackers, check:
Using cname.sqfs on Linux
-------------------------
NOTE: you must have Squashfs and Squashfs XZ support in your kernel,
either built in or as a module.
If you have the file /proc/config.gz (or something like it), you can use the
following command to quickly check whether the needed pieces are there. If not,
then it's kernel compiling time!
zgrep -e SQUASHFS=y -e SQUASHFS_XZ=y -e SQUASHFS=m /proc/config.gz
If the pieces are there you see this for built-in:
CONFIG_SQUASHFS=y
CONFIG_SQUASHFS_XZ=y
Or if there is squashfs module with XZ compression support:
CONFIG_SQUASHFS=m
CONFIG_SQUASHFS_XZ=y
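Not every distro ships /proc/config.gz. The sketch below is one way to run the same check against whichever config file is readable. It assumes that kernels without /proc/config.gz keep their config at /boot/config-$(uname -r), which is common but not guaranteed, and check_squashfs is a made-up helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: report whether a kernel config enables SquashFS with XZ support.
# check_squashfs is a hypothetical helper, named here for illustration only.
check_squashfs() {
    cfg="$1"
    # Pick a reader depending on whether the config is gzip-compressed.
    case "$cfg" in
        *.gz) reader="gzip -dc" ;;
        *)    reader="cat" ;;
    esac
    # SQUASHFS may be built in (y) or a module (m); SQUASHFS_XZ must be y.
    $reader "$cfg" | grep -Eq '^CONFIG_SQUASHFS=[ym]$' &&
    $reader "$cfg" | grep -q '^CONFIG_SQUASHFS_XZ=y$'
}

# Try two common locations (an assumption); stop at the first readable one.
for f in /proc/config.gz "/boot/config-$(uname -r)"; do
    if [ -r "$f" ]; then
        if check_squashfs "$f"; then
            echo "$f: SquashFS with XZ is available"
        else
            echo "$f: SquashFS with XZ is missing"
        fi
        break
    fi
done
```

If neither file is readable the loop prints nothing, which only means the config could not be found, not that the support is missing.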
Once the pieces are in place, create a mount point and mount the image:
mkdir tmp
sudo mount -o loop cname.sqfs tmp
cat tmp/cname
...and watch the ##### scroll: around 170 million alias/CNAME combos.
Note that cname.sqfs is just a raw data dump of the CNAMEs collected
by the rapid7 project (updated once per month). I just removed the
extra stuff (like timestamps) from their JSON file to make it
more readable.
So not everything in there is a first party (or third party)
tracker!
However, here's how you can make your own tracker lists based on
that valuable raw data:
Let's say you found another company that has started using this
dirty first party tracking technique (like ebis.ne.jp).
You can create your very own tracker list by running the following
Linux command (after mounting cname.sqfs as above, of course):
grep "\.ebis\.ne\.jp$" tmp/cname > ebis.ne.jp_full.txt
That gives the full version with both alias and CNAME included. That
list is mostly just for informational purposes. Because the grep
pattern is a regular expression, it's recommended to escape the
dot (.) characters with \ and put a dollar ($) sign at the end,
to minimize the likelihood of false positives ending up in the list.
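To see why the escaping and the trailing $ matter, here is a small sketch on made-up data. The three sample lines and the .example names are invented, and the "alias<TAB>cname" layout is assumed from the sed commands used in this README.

```shell
#!/bin/sh
# Made-up sample lines in the assumed "alias<TAB>cname" layout.
sample=$(mktemp)
printf 'track.shop.example\tabc.ebis.ne.jp\n'            >  "$sample"
printf 'www.other.example\tcdn.ebisXneYjp\n'             >> "$sample"
printf 'm.third.example\tx.ebis.ne.jp.evil.example\n'    >> "$sample"

# Unescaped and unanchored: "." matches any character and the match may sit
# anywhere in the line, so all three lines are counted.
loose=$(grep -c ".ebis.ne.jp" "$sample")

# Escaped and anchored: only a CNAME that really ends in .ebis.ne.jp counts.
strict=$(grep -c "\.ebis\.ne\.jp$" "$sample")

echo "loose=$loose strict=$strict"
rm -f "$sample"
```

On this sample the loose pattern counts 3 lines and the strict one counts 1, so two of the three would have been false positives.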
The following, however, gives you the real deal:
grep "\.ebis\.ne\.jp$" tmp/cname | sed 's/\(.*\)\t\(.*\)/\1/g' > ebis.ne.jp.txt
That gives you a generic version of the tracker aliases without the
target CNAMEs.
If you already know the specific format you are going to need, it's easy
to modify the above command. There are just too many formats
for various programs out there, which is why I only have very generic ones
listed here. Maybe later I'll add hosts file format .... and Unbound.
But that's all.
For example, the following variation gives you the hosts file format of the
above:
grep "\.ebis\.ne\.jp$" tmp/cname | sed 's/\(.*\)\t\(.*\)/0.0.0.0\t\1/g' > ebis.ne.jp_hosts.txt
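An Unbound variation could look roughly like the sketch below. It is not one of the lists shipped here: it assumes you want one local-zone entry per alias with the always_nxdomain type, it uses awk instead of sed for the tab split, and the sample lines are made up so the snippet runs without the mounted image.

```shell
#!/bin/sh
# Made-up sample in the assumed "alias<TAB>cname" layout of tmp/cname.
sample=$(mktemp)
printf 'track.shop.example\tabc.ebis.ne.jp\n'         >  "$sample"
printf 'www.other.example\tcdn.unrelated.example\n'   >> "$sample"

# Keep lines whose CNAME target ends in .ebis.ne.jp, then print the alias
# (field 1) as an Unbound local-zone line that answers NXDOMAIN.
unbound_conf=$(grep "\.ebis\.ne\.jp$" "$sample" |
    awk -F '\t' '{ printf "local-zone: \"%s.\" always_nxdomain\n", $1 }')

echo "$unbound_conf"
rm -f "$sample"
```

Against the real data you would replace "$sample" with tmp/cname and redirect the output to a file of your choosing that your unbound.conf includes.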
Finally, if you just want to know how many entries for a specific tracker
there are in the data dump, do this:
grep "\.ebis\.ne\.jp$" tmp/cname | wc -l
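If you want that count for several trackers at once, a loop like the following sketch can do it. count_tracker is a made-up helper name, the sample data is invented, and the dots in each domain are escaped automatically so the pattern stays a strict match.

```shell
#!/bin/sh
# count_tracker FILE DOMAIN: count lines whose CNAME ends in .DOMAIN
# (hypothetical helper, for illustration only).
count_tracker() {
    # Escape every dot so it only matches a literal dot.
    escaped=$(printf '%s' "$2" | sed 's/\./\\./g')
    grep -c "\\.${escaped}\$" "$1"
}

# Made-up sample in the assumed "alias<TAB>cname" layout.
data=$(mktemp)
printf 'a.shop.example\tx.ebis.ne.jp\n'     >  "$data"
printf 'b.shop.example\ty.ebis.ne.jp\n'     >> "$data"
printf 'c.news.example\tz.eulerian.net\n'   >> "$data"

for d in ebis.ne.jp eulerian.net omtrdc.net; do
    printf '%s\t%s\n' "$d" "$(count_tracker "$data" "$d")"
done
rm -f "$data"
```

With the real dump, pass tmp/cname as the file argument instead of the sample.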
Other stuff ...
------------------------------
I have a work in progress for a (quick!) cname enumerator and
will put it here when it's ready.
P.S.
Ad blocking should ideally be a two-layer process:
1. The first line of defense should be either a hosts file or a local caching DNS server.
If you only need to block a few addresses, then a hosts file is okay for that.
But if you want to block all the s**t that's out there (and keeps coming),
then it is strongly recommended to use a local DNS server instead of a hosts file,
because in the end it is more flexible and, more importantly, more
scalable than the rigid, non-regular-expression entries in a hosts file.
You can't even use wildcards in a hosts file, so the resulting hosts
files will be many times larger than the special zone files used for ad blocking
in local DNS servers. Some have tried to tackle the problem of managing
ever-growing hosts files with a separate program (it still doesn't fix the scalability problem).
So if you need a separate program for that, then why not just use a local DNS server?
Also, as a nice bonus, if you put an ad blocking local DNS server on your
router/gateway/etc., then your whole private LAN will benefit from the junk filtering,
including your smartphones on Wi-Fi.
2. After the bad domains (and subdomains) have been blocked at the DNS level, the remaining
trackers are handled by a browser extension (like uBlock Origin).
Even though the first line should have blocked 90% of the stuff out there, the
second line is still very important. For example, many "good" sites track their
users with scripts (like ga.js or analytics.js). No DNS blocking will help there,
because with DNS you can only block (sub)domains, and you can't block an otherwise
"good" site that just has one or two (or one million) tracker scripts littered
on its pages. That's where the browser ad blocking extension picks up and filters
the remaining, non-domain-specific junk.
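To make the hosts file vs. local DNS server point from the first layer concrete, here is a rough sketch. It uses dnsmasq as the example DNS server, which is my pick for illustration and not something this README prescribes, and ads.example is a placeholder domain.

```shell
#!/bin/sh
# Made-up subdomains of one ad domain; ads.example is a placeholder.
names="a.ads.example b.ads.example c.ads.example"

# hosts file: no wildcards, so every single name needs its own line.
hosts_lines=$(for n in $names; do printf '0.0.0.0\t%s\n' "$n"; done)

# dnsmasq: one address= rule covers the domain and all of its subdomains.
dnsmasq_lines='address=/ads.example/0.0.0.0'

echo "hosts entries:   $(printf '%s\n' "$hosts_lines" | wc -l)"
echo "dnsmasq entries: $(printf '%s\n' "$dnsmasq_lines" | wc -l)"
```

The hosts list grows by one line for every new subdomain, while the single DNS rule stays as it is.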
Note: It is also possible to use special proxy software (like Privoxy or Squid) to
filter the non-DNS stuff, but only if it comes over HTTP....
In theory you could also do ad blocking in a Squid proxy for HTTPS connections,
but that would mean configuring it with your own SSL certificates,
which is not exactly fun and might be dangerous too. You would basically be MITMing your
own secure HTTPS connections!
P.S2:
And yes Geoffrey....still reading the logs
And that's an old file. So is this what you mean?