Brian Lovin / Hacker News

pixeldetracking

I'm the author. Good to see this on HN, raising awareness on the topic.

I don't know who made the translation and when it was made, but the original article in French (https://pixeldetracking.com/fr/google-tag-manager-server-sid...) contains more information on recent GTM "improvements": mainly how you can easily change JS library names, and detailed instructions on how to host your container in other clouds or self-host it.

gildas

> I don't know who made the translation and when it was made

This page was saved with SingleFile (I'm the author of SingleFile). Therefore, I can tell you that this page was produced on Tue Dec 08 2020.

easrng

Thank you for making SingleFile, it's been an absolute lifesaver in a project I'm working on. I was having a lot of trouble trying to manually save pages with Puppeteer, but the SingleFile CLI worked perfectly, even with added extensions. (To get extensions to work I had to add --browser-headless=false --browser-args ["--enable-features=UseOzonePlatform", "--ozone-platform=headless", "--disable-extensions-except=/path/to/extension", "--load-extension=/path/to/extension"].)

gildas

Thanks for the feedback! It's very timely, I just have an issue that discusses the problem of sideloaded extensions (and profile data).

samstave

Gawd I love HN you beautiful bastards.

sharps1

SingleFile and SingleFileZ are great!

It’s a shame that Manifest V3 will hamstring the size of pages that can be saved (to 43 MB, if I remember the SingleFile Lite GitHub page correctly).

gildas

Thank you! Actually, I did some tests recently with SingleFile Lite and was able to save a page weighing 120 MB+, so the 43 MB limit seems obsolete. There are still some annoying issues, though.

pixeldetracking

Thanks for the info! Maybe it's Jerry: https://info.woolyss.com/

jikoo

Hello pixeldetracking, yes it is me ;) I translated your excellent page and made an HTML archive with the excellent SingleFile extension. Thank you very much for all of it. I like to keep a copy of interesting content. https://chromium.woolyss.com/#guides

Regards

jikoo

Hello pixeldetracking, Excellent article! Bravo. About the translation, yes, it is me! ;)

gorhill

I read the original article back when it was published in November 2020[0]. This is what led me to introduce new static network filter options:

- strict1p, strict3p [1]

- header=, experimental, disabled by default [2]

I used Simo Ahava's blog as test case, and with these new options, I could craft a filter to block the Google Tag Manager script on Simo Ahava's blog. However due to the lack of more test cases, no more progress has been made about this since then.
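For illustration, a filter combining those options could look like the following. This is a hedged sketch based on uBO's documented syntax, not necessarily the exact filter used on Simo Ahava's blog:

```
! Sketch: block script requests that are first-party by site but
! third-party by strict hostname comparison, and whose response
! carries App Engine's "via: 1.1 google" header.
*$script,1p,strict3p,domain=simoahava.com,header=via:1.1 google
```

The idea is that a server-side tagging subdomain is same-site (so ordinary 1p/3p filters miss it) but fails a strict hostname comparison, and its responses reveal Google's proxy in a header.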

Things that stood out to me when reading about all this:

Simo Ahava refers to the CNAME approach as "vulnerable"[3]:

> This way you’ll be instructed to use A/AAAA DNS records rather than the vulnerable CNAME alias

"Vulnerable" to what? To uncloaking as I understand it, and by extension, "vulnerable" to users taking steps to protect their privacy.

Whether the very experimental solution in uBO ends up working or not, this case shows very well how Google Chrome's Manifest Version 3 (MV3) puts a lid on content-blocking innovation: All the new filter options introduced above can't be implemented with declarativeNetRequest.

===

[0] https://www.pixeldetracking.com/fr/google-tag-manager-server...

[1] https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#...

[2] https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#...

[3] https://www.simoahava.com/analytics/server-side-tagging-goog...

danShumway

Thanks for adding this comment. My immediate reaction when seeing this was that I thought it looked familiar to previous conversations I saw a while back. But I didn't know for sure that they lined up exactly, and I wasn't looking forward to doing the research to find out.

> All the new filter options introduced above can't be implemented with declarativeNetRequest.

My understanding was that stuff like CNAME uncloaking was already unsupported in Chrome[0]. Of course, Manifest V3 won't make the situation any better though.

[0]: https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b...

Vinnl

For context, gorhill is the author of uBlock Origin.

And for context on MV3, see https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b...

GoblinSlayer

Sure, that means vulnerable to widespread blocking.

jeroenhd

This kind of data collection abuse is why I think we need more addons like AdNauseam [1]. Unlike uBlock Origin, it's not available from the Chrome web store anymore, which is a good sign that Google hates these types of addons more than they hate simple blockers.

Blocking A/AAAA domains with custom URLs to prevent tracking is almost impossible, so instead let's flood the trackers with useless, incorrect data that's not worth collecting.

[1]: https://addons.mozilla.org/en-US/firefox/addon/adnauseam/

matheusmoreira

Completely agree. Stuff like uBlock Origin is just online self-defense against hostile megacorporations. Maybe it's time we started going on the offensive by poisoning their data sets with total junk data with negative value. They insist on collecting data despite our wishes? Okay, take it all.

samstave

I Like the cut of your jib, and I would like to subscribe to your newsletter.

y42

I worked for an agency a couple of years ago when, out of the blue, the tracked data contained tons of random data instead of the expected UTM parameters. It took us a while to figure out what was happening: it was some kind of obfuscating plugin that was messing up well-known tracking parameters.

What I want to say is: stuff like that could actually cause a lot of fun on the other side.
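A toy version of what such an obfuscating plugin might do; the generation scheme here is invented for the example, and real tools are far more sophisticated:

```python
import random
import string
from urllib.parse import urlencode

def junk_utm_params():
    """Replace well-known tracking parameters with random noise.

    The parameter names are the standard UTM set; the random-lowercase
    values are purely illustrative.
    """
    rand = lambda n: "".join(random.choices(string.ascii_lowercase, k=n))
    return {
        "utm_source": rand(8),
        "utm_medium": rand(6),
        "utm_campaign": rand(12),
    }

print(urlencode(junk_utm_params()))
```

From the analytics side, such hits are indistinguishable from a legitimate (if unknown) campaign, which is exactly why they are so disruptive to collected data.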

malermeister

Does anyone know which addon that might've been? Seems like a good addition to adnauseam.

toss1

Yup. I've used NoScript for years, and one of the most frequently appearing sites that remain blocked is googletagmanager.

I totally second the sentiment that this is merely minimal defense against hostile 'service providers'.

This avalanche of tracking libraries is now almost as toxic as email spam in its worst-controlled days. Much of the internet is literally unusable, as pages take dozens of seconds to minutes to load - on a CAD-level laptop that can rotate 30MB models with zero lag.

In fact, does anyone have a blacklist of trackers that we can just blackhole at the hosts-file or router level? Maybe it's time to set up a Pi-hole?
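As a starting point, hosts-file blackholing looks like the entries below (the domains are common examples; curated lists such as StevenBlack/hosts cover many thousands more):

```
# /etc/hosts - route tracker hostnames to a non-routable address
0.0.0.0 googletagmanager.com
0.0.0.0 www.googletagmanager.com
0.0.0.0 connect.facebook.net
```

Note that hostname-level blocking cannot help against first-party subdomains proxying to tracker backends, which is exactly the gap server-side tagging exploits.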

GoblinSlayer

In my experience the most popular NoScript-blocked trackers are googletagmanager and facebook, so with just two domains you can get a lot. But e.g. Bloomberg uses a full first-party proxy for the Facebook pixel with a pseudorandom base URL; it's difficult to block even by URL. I suspect they duplicate the page request to Facebook too, but this is unobservable on the client side. Hopefully this solution doesn't scale well.

troyvit

This is my go-to: https://github.com/StevenBlack/hosts

It helps a lot.

walterbell

Since this extension actively clicks on ads, which may trigger payments, how do ad-fraud services classify endpoints running this extension? Could they consider this malware and add the client IP to blacklists?

matheusmoreira

> Could they consider this malware and add the client IP to blacklists?

Do malware developers consider the countermeasure software created to resist them to be malware as well?

rplnt

If we were to split what malware does into infection (getting into the system), avoidance (hiding from the system or AV, or attacking the AV), and work (sniffing, sending spam, etc.), then avoidance would be by far the biggest, most complicated (and most interesting) category.

GoblinSlayer

They absolutely do.

ratww

Good. If it is a shopping or some other service that charges money, then they lose business.

If it is some service that you have no choice but to use, but that relies on network effects (like Facebook Events), then you can just send a screenshot to the interested party, and they might consider not using a service that is broken for other people.

danuker

Sure, and perhaps also the accounts of users running this while logged-in. Have contingency plans if you run this and your, say, GMail account is blocked.

malka

It is precisely why I degoogled my life.

I did not want to live under the constant threat of big G locking me out of my own life anymore.

User23

Anyone still using gmail today for anything other than throwaway purposes is behaving foolishly.

ohgodplsno

With a bit of luck, it gets server owners banned from AdMob/MoPub/etc for fraudulent clicks.

jeroenhd

I wish, but I haven't stopped receiving ads yet.

cobbzilla

Can uBlock do payload inspection? It would be easy to block an upstream json POST that matches a certain structure.
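uBO's static filters cannot inspect request bodies, but as a sketch of the idea, structural matching on a JSON POST could look like this (the marker field names are hypothetical, not a real GTM schema):

```python
import json

def looks_like_analytics_post(body: bytes) -> bool:
    """Heuristically detect an analytics-style JSON payload by the
    presence of typical identifier fields (field names invented here)."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON (or not decodable) - cannot be the payload we match.
        return False
    markers = {"client_id", "session_id", "events"}
    return isinstance(data, dict) and len(markers & data.keys()) >= 2

print(looks_like_analytics_post(b'{"client_id": "x", "events": []}'))  # True
print(looks_like_analytics_post(b'{"title": "hello"}'))                # False
```

The catch is that trackers control the payload shape and can rename or encrypt fields at will, so this is another cat-and-mouse surface rather than a durable fix.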

Const-me

Interesting idea, installed the addon.

I’m using MS Edge, BTW. Microsoft doesn't care about Google's advertising revenue, so the addon is available in their marketplace.

fartcannon

Microsoft doesn't care because they collect everything through their desktop environment. That's why you need an email to set windows up now.

Const-me

> they collect everything through their desktop environment

There are many relevant questions during the install. If one actually uses the OS installation wizard GUI instead of skipping it with the "next" buttons, Microsoft won't be collecting much.

Another thing: they don't have to, because their business model is honest. They're building software, and users are paying for it. Microsoft ain't an advertising company; they have little motivation to track people.

> you need an email to set windows up now

I did a clean installation of Windows 10 last week (recycled an old laptop after migrating to a new one); the email was optional.

GoblinSlayer

Luckily, MS provides throwaway mailboxes at outlook.com.

consumer451

I am very interested in this, thanks for sharing.

Adding another party into my web browsing is always a tough pill for me to swallow. I am also a noob at reading trust signaling. What are some of the reasons that I should trust this dev and their processes?

danuker

You should not trust them. You can download the add-on and inspect it yourself, if you know some JS. Right-clicking yields this URL:

https://addons.cdn.mozilla.net/user-media/addons/585454/adna...

But it seems to include a lot of code, including some uBlock Origin code.

Either way, this kind of sabotage might get you banned on Google. Be mindful of the risks, and have contingency plans.

jeroenhd

You should put the same amount of trust in this dev as you would in any other. I myself trust Mozilla's store reviews enough to run the addon, but if you're more conservative with trust, you can inspect the source code and build the addon yourself.

The addon comes down to a uBlock Origin fork with different behaviour. I believe most of the addon code is actually the base uBlock code base.

I haven't seen any obvious data exfiltration in my DNS logs, but then again I'm just another random on the internet. If you don't feel comfortable installing something with a privacy impact as broad as an ad blocker, you should definitely trust your instincts.

gigel82

God damn... this is it, this is the end-game. There's no way to fight this unless you customize and maintain blocking scripts for each individual website.

Yes, websites could always have done this, but the cost of the REST (CDN-bypassing) requests and the manual maintenance of the telemetry endpoints and storage were an impediment; Google now just gives them a drop-in solution :(

I think Google is happy to eat some of the cost of the "proxy" server given the abundance of data they'll be gobbling up (not just each request's query string and the user's IP address but, being on a subdomain, all the first-party cookies as well). I don't have the time or energy to block JavaScript and/or manually inspect each domain's requests to figure out if they use server-side tracking or not.

I honestly don't know if there's any solution to this at all. Maybe using an archive.is-like service that renders the static page (as an image at the extreme), or a Tor-like service that randomizes one's IP address and browser fingerprint.

hilbert42

"I don't have the time or energy to block JavaScript and/or manually inspect each domain's requests to figure out if they use server-side tracking or not."

By default, I don't run JavaScript. I don't see blocking JS as a problem - in fact, it's a blessing as the web is blinding fast without it - and also most of the ads just simply disappear if JS is not running.

On occasions when I need JS (only about 3-5% of sites) it's just a matter of toggling it on and refreshing the page. I've been working this way for at least 15 years - that's when I first realized JS was ruining my web experience.

I'm now so spoilt by the advantages of the non-JS world that I don't think I could ever return. I'm always acutely reminded of the fact whenever I use someone else's machine.

heavyset_go

> By default, I don't run JavaScript. I don't see blocking JS as a problem - in fact, it's a blessing as the web is blinding fast without it - and also most of the ads just simply disappear if JS is not running.

Years ago I was on the "people who block JavaScript are crazy" bandwagon, until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load. I spent more time waiting for things to finish loading than I spent browsing the actual sites, which killed my battery life. I'd get a couple of hours of battery life with JS on, and with it off, I could work all day on a single charge. It was nice.

Ever since then, I've been using NoScript without a problem. I've spent all of maybe 5 minutes, cumulative over the course of several years, clicking a single button to add domains to the whitelist. If whitelisting isn't something you want to do, you can use NoScript's blacklist mode, too.

> I'm now so spoilt by the advantages of the non-JS world that I don't think I could ever return. I'm always acutely reminded of the fact whenever I use someone else's machine.

I relate with this 100%.

Semaphor

> until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load.

That sounds like you not only didn’t block JS, you also didn’t block ads. Which is a very different argument. I only block 3rd-party JS by default (and that already requires a lot of whitelisting for almost every site that has any interaction) and I don’t have those issues because I also block ads.

paulryanrogers

Tried NoScript for years and it was a pain. Too many of the sites I use need so many domains full of JS. So I think this will vary widely depending on the person and their preferred/needed sites.

unicornporn

> Years ago I was on the "people who block JavaScript are crazy" bandwagon, until just loading a single news article online meant waiting for a dozen ads and autoplaying videos to load.

Seems like a clear case of "crossing the river to fetch water" (as the Swedish saying goes)? This is what I use uBlock Origin (with the right blocklists) for, and it happens automagically. I did use uMatrix for quite a while, but eventually ended up ditching it because uBlock Origin worked so well.

maccard

uBlock Origin solves the problem you had too, without breaking multiple sites.

forgotmypw17

There's another, indirect benefit to blocking JavaScript.

Over time I have noticed a strong correlation between sites which don't work right without JS and low-quality content which I regret having spent time reading.

Most of the time I encounter one of these sites I now just close the tab and move on with a clear conscience.

hilbert42

"Over time I have noticed a strong correlation between sites which don't work right without JS and low-quality content...."

Absolutely true, I can't agree with you more. I've reached the stage where if I land on a site and its main content is blocked if JavaScript is disabled then my conditioned reflex kicks in and I'm off the site within milliseconds.

Rarely is this a problem with sites that I frequent (and I too don't have time to waste reading low quality content).

zelphirkalt

Similar here. When I am searching for something and a website won't show it unless I enable JS, I usually find, after enabling JS to see the content, that the website's content is worth nothing and that I activated JS for naught, and I regret having spent time on that website.

bentcorner

I used to run NoScript then at some point (maybe switched browsers?) I stopped using it. You've persuaded me to re-enable it.

Also - Firefox on mobile supports NoScript!

behnamoh

No, only FF on Android supports extensions.

quambene

Concerning noscript, is this [1] still a thing?

[1] NoScript is harmful and promotes malware - https://news.ycombinator.com/item?id=12624000

kevin_thibedeau

Firefox has never been slow for me over the last 15 years because NoScript makes it light years better than Chrome. Conversely, I routinely have the Android assistant lock up on me from JS bloat despite the supposed performance enhancement of AMP pages.

mderazon

I don't know which web you're viewing where only 3-5% of websites need JS.

PhantomGremlin

HN totally usable for basic functionality w/o JS.

profootballtalk.com works great if you don't want to vote or comment

macrumors.com great functionality

nitter.net happily takes the place of twitter.com

drudgereport.com works great and I rarely turn on JS when I go to the sites he links to, usually the text on target sites is there if not as pretty as it could be

individual subreddits (e.g. old.reddit.com/r/Portland/ ) are quite good w/o JS. But the "old." is probably important.

I admit that there are lots of sites that don't work, e.g. /r/IdiotsInCars/ doesn't work because reddit uses JS for video. For so many sites the text is there but images and videos aren't. Also need to turn off "page style" for some recalcitrant sites.

In conclusion, contrary to your JS experience, I'd say that I spend over 90% of my time browsing w/o JS and am happy with my experience. Things are lightning fast and I see few or no ads. I don't need an ad blocker since 99% of ads just don't happen w/o JS.

hilbert42

Read my reply to paulryanrogers about whether one's a JavaScript or a non-JavaScript type person.

The 3-5% of sites I'm referring to are ones where I have to enable JS to view them. For the vast majority of the sites that I frequent, I do not have to enable JS.

Also note my reply to forgotmypw17, one doesn't need JS if one avoids low quality dross.

ajdude

This. I use the NoScript addon by default, and it's amazing how many different domains sites try to bring in. Then I hit Twitter, Imgur, Quora, etc., and I am left with nothing but a blank page with plain text telling me that I need JavaScript to view the site. It makes me wonder what kind of tracking they are pushing.

Syonyk

All of them. If you allow everything and have Ghostery running in "don't block anything but tell me what's there" mode, it's horrifying just how many things get loaded.

You can play with page load sizes in the debugger console with stuff blocked and without, too: about half the downloaded material on any major news website is stuff that Ghostery will block. It's quite terrifying.

kobalsky

> and also most of the ads just simply disappear if JS is not running.

Since we are talking about the future, I'd like to point out that they can always serve ads from the origin domain without JavaScript.

I mean, the anti-adblock battle will evolve until each page we visit is a single image file that we have to OCR to remove ads. Then we will need AI, and they will have captchas that ask which breakfast cereal is the best.

You can stay ahead of the curve, but it's always moving forward.

hilbert42

"...they can always serve ads from the origin domain without JavaScript."

But most of them don't. Yes, they can change their model and in time they likely will.

As it stands now, one doesn't have to watch ads on the internet if one doesn't want to - all it takes is a little perseverance and they're gone. If one can't rise to the occasion then one has a high tolerance for ads.

Even YouTube can be viewed without ads with packages such as NewPipe and similar.

You're right about AI, OCR etc. and I think in time it will come to that.

It seems to me people like us will always be ahead because we've the motivation to rid ourselves of ads. It reminds me of the senseless copyright debate - if I can see the image then I can copy it. No amount of hardware protection can stop me substituting a camera for my eyes. What's more, as the fidelity goes up HD, 4k etc. the better the optical transfer will be (less comparative fidelity loss).

That said, the oldest technology, standard TV, is still the hardest to remove ads from. Yes, one can record a program and race through the ads later (which most of us are very adept at doing), but it's still inconvenient.

What I want is a PVR/STB that figures out the ads and bypasses them. Say I want to watch TV from 7 to 11pm (4 hours) and there's a total of one hour of ads and other breaks in that time that I don't want to watch then I want my AI-aware PVR/STB to suggest that I start watching at 8pm instead of 7 as this will allow it to progressively remove ads on-the-fly across the evening.

The person who makes one of these devices will make a fortune. If the industry tries to ban it (as it will) then we resort to a software version and download it into the hardware. Sooner or later it's bound to happen, and I'll be an early adopter.

jcfrei

How does blocking javascript in this case prevent tracking? It's done via the same cookies the website uses, as I understand it. Do you disable cookies too?

ec109685

Apple’s Private Relay blocks this type of cross site tracking.

Given this tracking is all server side, third party cookies across sites aren’t possible using this mechanism, and private relay cycles through your IP addresses frequently and uses common IPs across multiple users.

Regarding your other point, unless Google execs want to be thrown in jail / sued, they can’t use things like first party cookies for their benefit since that is against their terms of service.

novok

How is Private Relay different from a VPN? A lot of fingerprinting scripts can also track you despite a VPN.

top_sigrid

Private Relay uses ingress and egress relays. The ingress proxy does know your IP but not which sites you are visiting and what you are doing. The egress proxy is only connected to the ingress, sees what you visit but does not know who you are. Both proxies are run by different parties.

With a VPN you would have to trust one provider, who sees all of your traffic.
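The knowledge split described above can be modeled as a tiny sketch; the field names are invented for illustration:

```python
def relay_view(client_ip, destination):
    """Model what each hop of a two-hop relay can observe: the ingress
    proxy sees who you are but not where you go; the egress proxy sees
    the reverse. A single-hop VPN provider would see both."""
    ingress = {"client_ip": client_ip, "destination": None}
    egress = {"client_ip": None, "destination": destination}
    return ingress, egress

ingress, egress = relay_view("203.0.113.7", "news.example")
print(ingress, egress)
```

Neither party alone can link a user to their browsing, which is the privacy property a lone VPN cannot offer.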

irrational

I wonder why Safari is required? I’d be interested in paying for this if it worked with Firefox.

GekkePrutser

Yeah that would be a useful service that Mozilla could offer and I'd actually pay for.

I don't like their VPN as it's too basic in terms of privacy protection and it's much more versatile to just sign up with Mullvad myself because then I can use it on other stuff than just the browser.

bhauer

I think in the short term the strategy is this, from the article:

> Or ... block all the IP addresses of Google App Engine, at the risk of blocking many applications having nothing to do with tracking.

Anyone hosting legitimate apps in the Google ecosystem is indirectly complicit in this, and at least for my personal network, I have no concern with blocking Google App Engine holistically.

Additionally, I think it's important to hurt Google as much as possible for escalating in this way. Widespread blocking of GAE may seem extreme but it's also arguably warranted.

reaperducer

I have no concern with blocking Google App Engine holistically

Unfortunately, it seems that more and more government web sites rely on Google services to function. And there's no replacement for those.

timbit42

Use two browsers: one where you don't block tracking and can access government sites and make purchases on shopping sites, and one where tracking is blocked and JavaScript is turned off.

paulryanrogers

How can it be legal for a government to make increasingly core services depend on these amoral, for profit monsters?

tgsovlerkhgsel

Aren't browsers shifting to a per-domain cookie jar?

While you can never prevent one specific site from tracking you, this still doesn't (directly) allow your activity on Site A to be linked to activity on Site B, does it?

Of course, fingerprinting combined with IP addresses will ultimately allow something that comes very close to it, so the current state (a few hundred trackers per website, all ending up harmlessly incrementing the adblocker's counter) is better for privacy for power-users, but I'm not sure if this is the big "game over".

josefx

Google is pushing to have the browser itself track your interests and share them with whoever asks. The first attempt, FLoC, backfired rather quickly, as it was an all-around privacy nightmare. The second attempt, Topics, promises to fix a lot of the problems FLoC had, but that is not a high bar, and Google has left itself a lot of room for future changes.

lewantmontreal

This is what I’m interested in. The article itself did not mention cross-site tracking.

Every website having its own tracking subdomain means third-party cookies don't work cross-site even without browser changes.

GekkePrutser

They can still cross-track based on IP or any other fingerprint-worthy information. I expect this is exactly what they're doing. Doing it all through a central service makes the process much easier, unfortunately...

pixeldetracking

Yes, they would need to get another identifier, and that's what is done with players like Facebook.

Sorry, another of my articles in French (https://pixeldetracking.com/fr/les-signaux-resilients-de-fac...), but Facebook is making it easy to integrate their "Conversions API" (CAPI) with GTM server-side tagging.

callmeal

The cross-site tracking is done by a third party. From reading the docs, the way it works is: the publisher sets a unique id, browsers send that unique id to the publisher's domain, and the publisher forwards it (via the Tag Manager App Engine instance) to the third party.
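Sketched as code, that flow might look like this; the payload fields and endpoint are hypothetical, not the real GTM server container protocol:

```python
def forward_hit(first_party_id, query_params, client_ip, vendor_endpoint):
    """Model the server-side hop: the browser only ever talks to the
    publisher's own (sub)domain; the container then re-sends the hit,
    enriched with connection data, to the third-party vendor."""
    return {
        "url": vendor_endpoint,
        "payload": dict(query_params, uid=first_party_id, ip=client_ip),
    }

hit = forward_hit("abc123", {"page": "/pricing"}, "203.0.113.7",
                  "https://vendor.invalid/collect")
print(hit["payload"])
```

Because the second hop happens server-to-server, nothing about it is visible to the browser or to any extension running in it.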

quicklime

> Maybe using an archive.is-like service that renders the static page (as an image at the extreme)

A lot of companies are starting to use "browser isolation" which is essentially what you're saying. A proxy runs between the client and the server, but it does more than just direct TLS streams - it actually builds the DOM and executes the JS. The resulting web page is sent to the actual client browser, which might send back things like mouse and touch events to the proxy, which will then update the page.

I think most companies are using this as a malware protection thing, but it does hide the actual client IP address and fingerprint, and I imagine it would make tracking very difficult.

https://en.wikipedia.org/wiki/Browser_isolation

GekkePrutser

Browser isolation isn't quite that. It's just running a browser that is heavily sandboxed from internal files and networks, or running on another machine so any exploits don't hit your machine.

It's very much like running a browser through Citrix (in particular the remote flavour, which is the most common as far as I've seen). But of course any data in the browser itself is still within reach of the malicious code... which only solves half the problem, unless you rigidly separate internal browsing from external sites.

But it doesn't run all the JavaScript and then send you a screenshot or anything. The resulting page is still interactive.

Remote browser isolation has the ability to change the landscape of personal computing enormously by the way. Right now we equip all our laptops with at least 16GB (32 for customer care) because some web apps like Salesforce Lightning are such memory hogs.

Considering the importance of the browser in modern computing, this model would basically make the PC more like a terminal and require far fewer resources.

Of course this has already been going on with web based apps and streaming of things like games but this could be the final nail in the coffin of the PC as we know it. Not sure I'm happy with that...

kibibu

Opera Mobile has been doing this for years and years

Quai

The Opera product you are thinking of is Opera Mini. Opera Mobile is a browser running mostly on your device (except for "Turbo", which optimized media through a proxy but did not, AFAIK, execute any of the JavaScript).

Opera Mini can be looked at as a browser running in the cloud, sending OBML (Opera Binary Markup Language, if I remember correctly) and causing the (very thin) client to draw things on the mobile screen, like text, images, etc., without having to transfer, parse, execute, flow and paint everything on the device.

cookiengineer

> Maybe using an archive.is-like service that renders the static page (as an image at the extreme), or a Tor-like service that randomizes one's IP address and browser fingerprint.

I'm building a peer-to-peer network of web browsers [1] that doesn't trust anything by default and only renders types of content incrementally, while disabling JS completely. Most of the time, you can find out what the content is with heuristics. The occasional crappy web apps that don't work without JS can be rendered temporarily in an isolated sandbox in /tmp anyway.

I think the only way to get ahead in the adblocking game is, instead of maintaining blocklists, to move to a system that has allowlists for content. The user has to be able to decide whether they're expecting a website to serve a video, or whether the expectation is to get text content, image content, audio content, etc. News websites are the prime example of how "wrong" ads can get: autoplayed videos, dozens of popups, flashing advertisements, and I haven't even had time to read a single paragraph of the article.

And to get ahead of the "if fanboy gets hit by a bus" problem, we need to crowdsource this kind of meta information in a decentralized and distributed manner.

[1] https://github.com/tholian-network/stealth
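A minimal sketch of such a content allowlist; the sites, MIME groupings, and policy table are invented for the example:

```python
# User-declared expectations per site: what kinds of content they want.
ALLOW = {
    "news.example": {"text", "image"},
    "videos.example": {"text", "video"},
}

# Coarse grouping of MIME types into content kinds.
MIME_KIND = {
    "text/html": "text",
    "image/png": "image",
    "image/jpeg": "image",
    "video/mp4": "video",
    "application/javascript": "script",
}

def permitted(site, mime):
    """Allow a response only if its content kind matches what the user
    expects from that site; everything unknown is denied by default."""
    return MIME_KIND.get(mime, "unknown") in ALLOW.get(site, set())

print(permitted("news.example", "image/png"))               # True
print(permitted("news.example", "application/javascript"))  # False
```

The deny-by-default stance is what distinguishes this from blocklists: a new tracker technique needs no new rule, because nothing unexpected is rendered in the first place.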

misterbwong

Called it [1]. It's a cat-and-mouse game and, unfortunately, advertising is just _that_ lucrative. Privacy-minded browsing will help those that care (for now...), but that's an unsustainable option with the current monetization channels available.

If a content publisher cannot monetize you, they will think nothing of blocking you. There will be some public backlash against companies that do so, and some sites will lose money because of it, but the rest of the publishers will simply follow the money as the industry shifts towards more intrusive tactics.

There needs to be a monetization channel that is 1) good for both users AND publishers and 2) pays just as much as current methods. Unfortunately none of the current systems support that.

[1] https://news.ycombinator.com/item?id=9975955

drusepth

>There needs to be a monetization channel that is 1) good for both users AND publishers and 2) pays just as much as current methods.

I agree, but what party would you like that money to originate from?

Ads work well right now for consumer-to-consumer (e.g. I create a blog and you view it) because there's a rich third party that money can flow from (a company running ads --> money to me) without having to charge you, the end user, who is more than likely significantly less well-off than a corporation.

To buck that pattern, you need the money to come from somewhere else. Subscriptions and direct payments are an obvious choice (see: the boom of SaaS over the past few years) but people are already complaining that they have so many subscriptions they lose track of them all, and spend too much money on what used to be a "free" internet.

So, I don't think there's a solution where the money comes from the end-user. However, any time you add in a third party for the money to flow from, they're going to want something in return. And unless you want that cash flowing from the site owner to that third party (...why would you?), they're gonna need to offer something else.

I don't see any solution other than "a third party pays for something users and/or the site can create for free". Is the answer to just find something free other than analytics/usage, or are there other approaches to monetize a site while still making it "free" to access?

misterbwong

Unfortunately I don't see a good solution either. Large direct to consumer business models like SaaS or subscriptions are really only sustainable at scale, and even then it's dicey. In a SaaS model, the big fish win and we lose the democratic nature of the current internet.

Society has driven the perceived price of content so low that the content itself is worth less than the aggregate audience. Really, in what other space does the average consumer set their price expectations at free AND balk at paying $5/mo for unlimited access to a product?

The only thing that seems to come close to moving the needle towards privacy is somehow pushing advertisers into in-market advertising (think early internet-style site banner ads) and out of programmatic/user tracked ads. There is some evidence that these programmatic ads don't really perform as well as they claim but from what I can gather, the data is still unclear.

booleandilemma

Simpler protocols (Gemini, Gopher...), outright refusing to use what the modern web has become. I only use HN and a few select sites. You don't need an ad-blocker if there are no ads in the first place.

ReactiveJelly

Using Gemini as an allowlist doesn't seem any better than allowlisting known-good domains for HTTPS sites

EE84M3i

HN is a link aggregator for HTTP(s) links. How do you read them?

aenis

Not sure about the parent poster, but I am here mostly for the comments, and rarely visit the linked content.

sdoering

Disclaimer: I am a data analyst. I consult for companies on ethical data collection. But I also know of black sheep.

I don't have a problem with websites measuring what I view, click, add to cart or buy. I want them to be able to see what doesn't work in terms of user experience.

And if they do marketing, I even want them to be able to see how many conversions (whatever comprises a conversion) stem from which source of traffic (aka marketing effort).

The problem imho isn't GTM (Google Tag Manager) running as a proxy. This would (or at least could) be a data privacy win if done ethically. At least under one imho essential condition: I should be able to run the proxy on any infrastructure that I like, not only on Google's cloud offering.

And on the second essential condition that marketing departments act ethically. They can send the web analytics data to whatever tool they like. But they should absolutely not send my identifying information with it. They should use the proxy as a privacy protector. The same when sending conversion data to the marketing tools. I am OK with the marketer sending information back that a specific ad (not a specific user clicking on a specific ad) led to a conversion.

I don't need Meta or Alphabet tracking me personally (or my clients' users) with every click. But I understand the business need to measure the effectiveness of marketing money spent. Solutions like these could be a way to achieve this. If done right. And not done in the way GTM does (only hosting on Google, using an A/AAAA subdomain, grabbing every cookie possible, and so on).

janpot

> I want them to be able to see what doesn't work in terms of user experience.

That's not what they're doing, at all. They want to be able to see what doesn't work in terms of maximising profits. That may correlate with good user experience sometimes, but more often it results in the opposite.

franga2000

Exactly! A company's goal is profit and most of the time, that does not align with the customer's goals. Amazon's goal is to sell me the highest margin item, I want the best value or highest quality.

I have very limited information about which items are a good value or high quality, so why should amazon have the tools to most effectively steer me towards high-margin items? They exist to provide us a service and we grant them the right to make a small % of profit while doing it. Not the other way around!

airstrike

> I have very limited information about which items are a good value or high quality, so why should amazon have the tools to most effectively steer me towards high-margin items? They exist to provide us a service and we grant them the right to make a small % of profit while doing it. Not the other way around!

As a small aside, the capitalist's answer is that regulating companies to prevent them from steering customers to the most profitable items is both impossible to do adequately and prohibitively costly. Even assuming cost isn't an issue, it's hard to imagine such regulation being equally applied to all market participants (or being equally effective). So we would be left with companies that cooperate and others that defect, and the defectors would be favored (more profitable) and outcompete the cooperators in the long run.

So instead we start from the assumption that companies are greedy and let them compete to offer customers the best value -- and if that value comes (at least in part) from not being tracked, companies that do not track will attract more customers. We probably just haven't made enough of a fuss about it with our dollar-votes.

For what it's worth, I block all ads without giving it a single thought. The way I think about it is that on the flip side of the prisoner's dilemma, I'm just defecting like some companies would. It's a race to the bottom in terms of the trust between customers and companies, but I didn't make the rules of the game...

collegeburner

That's not true. I run a site like this and I want both. Yes, I want to test what maximizes conversions, but this is also what helps me provide value to more users. And I also need it to determine how to improve the service I provide to users.

andrewingram

The bit you're replying to here hasn't yet introduced the problem of marketing teams.

The kind of tracking in the first section is "understanding how people use your product", and is usually introduced by the product team, rather than marketing. And most product teams I've worked on fiercely fight back against the addition of excessive tracking. While the goal of a business (and therefore a product team) is usually about maximising profits, it's not exclusively about that. I've worked for businesses that literally have a social charter in their articles of association, but they still want to measure how people use their products.

jeltz

You have been lucky then. At the places I have worked, the product people have not fought alongside tech but, if anything, have fought against tech on this matter.

y42

That's too easy IMHO. Yes, online marketing is about profit. But tracking is not always about profit. I work for a customer that offers a kind of job search engine. All they want to maximize is the rate of successful employments. Yes, they need to optimize their marketing budget. But not to sell useless stuff, rather to reach out to potential employees.

jacquesm

Marketing is almost by definition not going to act ethically: their whole goal is to create a need where there isn't an organic one, and the KPIs by which marketing departments are run are proof positive of that. Nobody starts off with 'what would be the natural limit of our product sales'; instead they start off with 'what is the total addressable market and how do we maximize our fraction of that', implying that if you are counted in their market, you are fair game whether you like it or not.

etempleton

Most people only see 1-2% of what a marketing department does. The primary goal of a marketing department is to inform and present information in a clear and attractive manner. A good marketing department is also an advocate for what the consumer wants based on research and consumer feedback.

Are there bad actors in marketing? Yes. A lot. Marketing agencies are full of them. Agencies, to generalize, only care about short-term results and selling the client on the next big idea. They won't be around or have to live with the repercussions of their bad actions. In fact, the client is their customer, and so they don't really care about the client's customers at all so long as the client is paying them. They just need superficial numbers to go up to show the client. They are screwing over the client, and customers are unfortunately collateral damage, but the agencies, again, don't really have to deal with that.

A lot of the most anti-consumer tactics do not work in the long-run. Most consumers aren’t so easily tricked into buying a product today and they most certainly won’t be tricked twice. It doesn’t take too long—usually—for the snake-oil salesman to get run out of town. They just do a lot of damage while around.

slx26

Even when recognizing that there are a lot of bad actors in marketing, that's still an extremely over-optimistic perspective: at some point, tricking people becomes easier than improving the products, value propositions become muddier, and snake-oil starts to be used as the lubricant for business relationships. Only the most obvious offenders get run out of town, while most evolve and get to raise the new normal boiling point; as long as refining the snake-oil is cheaper than refining the actual products, the situation keeps getting worse.

Either the dynamics work in favor of the people, or they don't. That we continually mistake the comfort of our ships with the state of the sea is just the blessing and tragedy of our ignorance.

Mezzie

As a 'white-hat' marketer (I work for some place that's similar to Vote411/ The League of Women Voters and I initially started in library outreach; I don't think anybody would consider my work unethical), the issue is the need for constant growth and profits.

You can do cool and interesting things in marketing and outreach and there are actual use cases for them. For example, libraries often carry unconventional items, and making the community aware that they can borrow a sewing machine/get seeds to plant/get museum passes is technically marketing and 'creating' a need, but it's not exploitative.

It's a very similar situation to dev work in that if I were willing to chuck my ethics out the window, I would make a lot more money, and marketing people do also like money.

jacquesm

The implicit observation that there is such a thing as a white hat marketeer relegating the remainder to black hats is an astute one.

I would rephrase the one as raising consciousness about important issues, and leave the other one under the label marketing, which to me is limited to commercial enterprises and indirect money grabs, a lot of which is related to politics and creating artificial divisions in society (the 'haves' vs the 'have nots' and so on).

JumpCrisscross

> their whole goal is to create a need where there isn't an organic one

This is reductionist. Was telling people about trains and cars creating a need where there wasn’t one? In a sense. But in another sense, it was broadcasting a better way of being. Marketing doesn’t have to be evil. Saying all marketing is evil is sort of a cop out for the people who do it badly.

ATsch

> Was telling people about trains and cars creating a need where there wasn’t one?

That's a great example actually, because the reason you can't get anywhere without a car these days is marketing campaigns by the automobile and oil industries. First by suggesting the newly necessary road safety standards and ridiculing people for being in the street without a car ("jaywalking") to the point that it was criminalized, then by sponsoring enormous displays about the glorious car-dependent future at multiple world fairs (GM's "Futurama" holds the attendance record at 5 million visitors to this day), shutting down streetcar companies via lobbying and acquisitions, and eventually even providing the US secretary of defense, who then used the defense budget to bulldoze inner cities to run highways through them. That development caused the US to have the highest car dependence, car ownership, and transport emissions of any large nation today.

So yes, I think it's fair to say there was a bit of artificial need created here.

jacquesm

> Was telling people about trains and cars creating a need where there wasn’t one?

People were telling each other about these.

> Marketing doesn’t have to be evil.

No, indeed it doesn't. But as a rule it definitely appears to be. It's a bit like arsenic: it doesn't have to be negative but usually it is.

> Saying all marketing is evil is sort of a cop out for the people who do it badly.

If 99% of the people engaging in an activity are doing it badly then I'm all for reining them in, in spite of the 1% that are doing a swell job.

GoblinSlayer

Marketing is basically hacking, so yes, it doesn't have to be evil. Apparently there are a few white hats.

Lamad123

There had always been a need to move people and stuff from point A to point B and move it fast!!!

slightwinder

> Marketing is almost by definition not going to act ethically: their whole goal is to create a need where there isn't an organic one

That's very single-minded. Marketing mainly informs about a product, which obviously also works even if you already have the need for it. And it can also help in realizing a specific need which the customer has not pinpointed yet. That's the whole point of acting ethically, to support, not to bait, trap and abuse.

magicalhippo

Indeed. That's exactly what our marketing department does.

Our product helps our customers comply with the law. The law created the need, we're just trying to make our customers lives easier by assisting them with complying.

So our marketing team focuses on informing potential customers what it takes to be in compliance as few are well aware of what it takes, and how our product can help with that.

scoutt

What would be the utmost, top dream of a Marketing team? I think it is to be able to read my mind. Followed by being able to project an ad into my retina (if writing into my mind is not possible).

If the above is not possible, then they will come to analyze my behavior online.

It's truly sad...

Paraphrasing The Godfather 3 "Finance is a gun, politics is knowing when to pull the trigger" and I would add "marketing is knowing HOW to pull the trigger".

> And it can also help in realizing a specific need which the customer has not pinpointed yet

Don't you love cold calls, spam and pop-ups?

Marketing helped to ruin the latest and finest revolution of our time, that is, the Internet.

collegeburner

Ridiculous. "Tracking" and your so called "artificial" metrics have significantly increased my site's conversions to paying users and my users' experience. I did nothing unethical in the process.

medium_spicy

- This thread is about marketing. Did you do all of the marketing, or did an existing infrastructure perform tracking and serve ads for you?

- What data supports your claims about your users' experience? Conversions are not a good metric of user experience.

- People generally have a hard time evaluating the ethical merits of things that benefit them. Do you have some kind of independent evaluation to support your claim that you did nothing unethical? If a politician hires a lawyer as a fixer, and pays them to make problems go away with a minimum of information returned, is that politician acting ethically? If the fixer hires a hitman for that problem, does the politician's ignorance of that act constitute ethical impunity?

black_puppydog

> the second essential condition that marketing departments act ethically

This seems like a pretty strong assumption, given that an ingrained culture, lived experience, and an analysis of the different parties' incentives all stand against it.

Until we have strong (and crucially, really enforced) legislation against this, I'd say technical means (blocking JS mostly) will be the only thing I'd be willing to bet on.

robalni

> I don't have a problem with websites measuring what I view, click, add to cart or buy. I want them to be able to see what doesn't work in terms of user experience.

The problem is not that they measure things. The problem is that they enter the user's private area; they run code on the user's computer and probably grab information about the user too (I don't know exactly how tag managers work because I have never used one). It's like if I enter your home and start measuring things, the problem is not that I measure things, it's that I entered your private area.

sdoering

Well, actually it is more like you are entering their store. They are measuring the number of people that come in. The number of items (and which items) these people look at. Add to their basket. How many stand in line at the cashier and how many buy. And how many filled baskets stand in the aisles at the end of the day.

But - they also could write down the gender of anyone entering the shop. Or the hair color. Or they could note down the license plate of your car. With whom you arrive. The brand of your car. The color. The brands of the clothes you visibly wear.

Then they correlate that to the payment method, your Visa card, the credit ranking they receive back from visa (digitally at least). And so on.

And they measure how often you return.

They could do all of this (and actually a lot of them do), and not only log it for themselves and do whatever analysis with it, but also happily send this data to the advertising agency that manages the big signs all over town, so that they can show you additional advertisements for a new car, because you have money but your car is old.

That is where the problem begins. It begins with overly invasive logging of user attributes that have only marginally anything to do with measuring how the shop (or the website) works. And more so when this data is being sent to who knows whom in the advertising space out there.

I have no problem with an online store storing the fact that I came by clicking on a display ad. Or on an email newsletter. Or that I am using Firefox. Or Chrome. And that I am on a Win10 desktop device. Or that I tend to add a lot of stuff to my shopping cart, wait two hours and then sort out what I don't need.

I don't even have a problem with them showing me additional products based on what I looked at in their shop.

But to correlate that with offsite data, sending this to advertisers and so on is a no go for me.

collegeburner

No, you are voluntarily downloading and running their code on your computer. What you describe is hacking into somebody's computer, that is different. Stores take measurements about their customers, so do sites.

matheusmoreira

I'm also voluntarily running uBlock Origin whose entire purpose is to sanitize their borderline malware code into something that I can actually consume. As you said, it's my computer and they really need to submit to my will instead of finding creative new ways to work around it like some malware developer.

shkkmo

> Stores take measurements about their customers, so do sites.

When stores use Bluetooth or other tech to track their customers' movements within their stores, that is also creepy and unethical.

Also, "voluntarily" is a complete misnomer, as nobody is volunteering for this; a more correct word would be "unwittingly" or possibly "begrudgingly", depending on their level of tech savviness.

kyrra

Btw, someone wrote up a guide on hosting this on AWS, which covers what it would take to run it yourself.

https://www.simoahava.com/analytics/deploy-server-side-googl...

If I'm not mistaken, the key bit is that Google makes the docker image available at: gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable

Edit: oh, Google published a guide to self host maybe? https://developers.google.com/tag-platform/tag-manager/serve...
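As a deployment sketch only (the port mapping and the CONTAINER_CONFIG environment variable are assumptions based on typical container setups; verify the exact parameters against Google's self-hosting guide):

```shell
# Pull the server-side tagging image named above.
docker pull gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable

# Run it on your own infrastructure. The CONTAINER_CONFIG value is the
# configuration string generated in the GTM web UI (placeholder below).
docker run -d -p 8080:8080 \
  -e CONTAINER_CONFIG="<container config string from the GTM UI>" \
  gcr.io/cloud-tagging-10302018/gtm-cloud-image:stable
```

You would then point a first-party subdomain (via CNAME or a reverse proxy) at wherever this runs.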

curiousmindz

Sadly, most publishers are not interested in developing their own proxy solution just for the sake of data privacy. They vastly prefer a ready-made solution that they can just use.

Much of the power of the advertising space comes from people (publishers, consumers and advertisers) generally choosing the path of least resistance. They don't have the technical know-how, and they would only acquire it if there were enough benefits. Sadly, privacy is not enough on its own.

I think the solution that can solve all that is when a company acts as a "wall" between consumers and publishers/advertisers. Then, that company can protect the consumer while keeping the user experience as simple as possible.

"Sign in with Apple" is one such solution. But of course, it brings its own (different) downsides.

rvanlaar

You're hitting the nail on the head. I'm not against the website owners seeing what I do on their website.

What I am against is what other parties are able to do with the data when sold. They're able to correlate website visits with specific businesses and linkedin profiles.

charcircuit

>I should be able to run the proxy on any infrastructure that I like, not only on Google's cloud offering.

This is already true.

https://developers.google.com/tag-platform/tag-manager/serve...

sdoering

Thanks - didn't know that. Interesting. Might be a solution for a client of the team I work in (but not my client).

thenanyu

As someone who has spent a lot of time on both sides of this, I think this is a great outcome, personally.

The most annoying part of ad-tech for me, as a user, was the fact that I was running all sorts of random javascript, any bit of which could blow up performance on my browser.

As someone who used to lead an e-commerce operation, I hated running all of this crap in my users' browsers because I knew it would get blocked randomly or cause hard-to-diagnose errors.

I eventually moved us to basically this approach using a home-grown solution and everyone was happier. It was even more robust because it just used session/cookie data and didn't require running any javascript execution to work.

croes

The most annoying part for me is getting tracked against my will.

So now it's worse.

Teever

Isn't it so strange that if you or I were to do these kinds of things to an individual it would be considered creepy cyber stalking but when companies do it they are rewarded?

snomad

No reason ad tech companies should have the freedom to associate real-world data with online data. This seems like the perfect candidate for a US state proposition: no company engaged in online ad tech may combine, or allow any other entity to combine, online identities with real-life ones.

gernb

Do what? Record who came in and out of your house?

nine_k

Blocking or feigning responses to particular remote APIs is still better than having to run a bunch of random tracking JS snippets, where the page just errors out if one of them is missing.

brundolf

I have mixed feelings about it, even just as a user. There are two reasons people block tracking scripts: 1) privacy, and 2) to stem the deluge of crap that marketing departments dump onto the page, harming performance (both load-time and otherwise) [1].

This basically gives everyone the benefit of #2, even if they don't or can't use an ad blocker. That's pretty cool, in isolation. But of course it also makes it much harder to accomplish #1.

[1] I've seen React-based websites with literally 10x as much JavaScript (by weight) coming in from GTM and other third-party marketing vendors, as the amount powering the actual app functionality. This happens (partly) because every single ad provider has you load their own arbitrary JS bundle onto the page, just so they can measure conversions. This is obscenely inefficient (and frankly, even though it makes things easier to block, in some ways it's potentially a lot more insecure/privacy-invading). People on here complain about frameworks ruining web performance, but in reality GTM is far more responsible (or has been, so far).

ghostpepper

Don't forget about ad-served malware

brundolf

GTM doesn't serve ads though, so my understanding is the OP doesn't apply to ads, only trackers

onion2k

All that random crap will still run in your browser though.

The important thing here is that Google is asking sites to proxy scripts from a Google server via a subdomain of their own domain. That's relatively trivial to do as far as the code and config go, and not costly for the site or for Google. The advantage to the site and Google is that those scripts now look like first-party files; Google is using a first-party subdomain to subvert the Same Origin Policy via a proxy.

Every other tracking and ad service will set up the same thing. The reason it hasn't happened in the past is that it was hard to configure. Google is giving every other service a gift by explaining to users how to do it. A website that used to have 50 tracking bugs from 50 domains will now have 50 tracking bugs from 50 Same-Origin-Policy-allowed subdomains, all unique to that site and all different, so blockers will have a much harder time working out what to block.

The code that runs in the browser doesn't change. The only difference is where it appears to originate from.

GrifMD

That’s inaccurate. You do still need to run a client side GTM script that adds event listeners for specific actions (like “clicks purchase button”). This is then sent to a server side container with whatever first party identifier the site may have (3rd party IDs aren’t supported as there’s no 3rd party cookies from Facebook and the like). From there a server to server network request is made to whatever tracking platforms (GA, Google Campaign Manager, etc).

Most of the tracking scripts these days are well written, at least the Google and Facebook libraries, so they generally don't affect page performance, but some of the smaller players have scripts that can slow things down.

With server-side GTM, only its client-side component needs to run; everything else happens server-side.
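As a hedged sketch of that client-side piece (the measurement ID and endpoint are placeholders; `transport_url` is the gtag.js parameter commonly used to point hits at a server-side container, but verify against the current docs):

```javascript
// Client side: a small snippet still runs in the page, but its hits go to
// a first-party endpoint (hypothetical subdomain) instead of Google's domain.
gtag("config", "UA-XXXXXXX-1", {
  transport_url: "https://metrics.example.com", // proxies to the server container
  first_party_collection: true,                 // keep identifiers first-party
});
```

From there, the server container decides which vendors receive which fields.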

Raed667

Been there, until you're instructed to inject ads in your website, and then you're back to GTM again.

btdmaster

Why not both? (Server-side and client-side spying synergise. Which computer is spending resources transferring telemetry?)

losteric

Citing adblock feels like clickbait. Google Tag Manager can't run ads so I don't follow the comparison. Marketing analytics could always side-step anti-adblocking tools through server-side tracking.

coffeefirst

Right. Everything in the article is wrong.

GTM is still GTM and can be trivially blocked; the container itself isn't moving server-side.

It's just gained the ability to proxy data to third parties instead of needing to load scripts for every tracker. This is better for performance, and puts the site explicitly in control of exactly what data is passed on to where.

All you really lose is the ability to block a subset of analytics scripts selectively.

probotect0r

How are you going to block it "trivially" if you don't know which script to block? They recommend changing the name of the GTM script, and paired with changing the content slightly, you won't be able to tell which script is GTM and which is actually important to the functioning of the site.

c0balt

Plus, it's gonna be very hard to detect once they actually bundle up the tag manager and obfuscate it (which I suspect only a few will do).

dlubarov

Where does Google recommend changing the name of the script? The author claims that they do, but their link just recommends self-hosting the script. In Google's recommended JS, the path is exactly the same, only the hostname is different ("www.googletagmanager.com" replaced with "<DOMAIN NAME>").

Self-hosting by itself might make blocking marginally more difficult, but there are other reasons to do it:

- Browsers these days segment caches by origin, so there's no caching benefit to using Google as a CDN.

- With HTTP2, a first-party request is likely to immediately go through an existing (multiplexed) connection, saving a handshake.

- It's arguably better for privacy, as users and legislators seem to be concerned about links to Google leaking IPs (https://news.ycombinator.com/item?id=30135264).
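To make the "only the hostname is different" point concrete, a toy sketch (the domains and container ID are placeholders):

```javascript
// Rewrites a stock googletagmanager.com script URL to a self-hosted one,
// leaving the path and query string untouched.
function selfHostedUrl(stockUrl, firstPartyHost) {
  const u = new URL(stockUrl);
  u.hostname = firstPartyHost; // only the host changes
  return u.toString();
}

console.log(
  selfHostedUrl("https://www.googletagmanager.com/gtm.js?id=GTM-ABC123", "www.example.com")
);
// → "https://www.example.com/gtm.js?id=GTM-ABC123"
```

Since the path stays identical, a blocker matching on `/gtm.js` would still catch it; only hostname-based rules are evaded.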

gizzlon

If it's 99.99% the same I think we will manage :)

jacquesm

You'll know that after loading the first couple of bytes though.

iamacyborg

> GTM is still GTM and can be trivially blocked; the container itself isn't moving server-side.

Not when the script sending data to the server side GTM is a first party one.

matt_heimer

Adblockers block ads and tracking; if the new tag manager makes it easier to defeat the tracking protections of an ad blocker, then the comparison seems accurate.

I think the key thing here is that ad/tracking blockers often rely on domains or requests being 3rd party. In the past it was more work to disguise 3rd-party trackers as 1st party; this makes it easy, so it's more likely to happen now.

fay59

Saying that adblock users want it 100% to block ads and 0% to protect their privacy is a misleadingly narrow analysis, and even the ad-blocking use isn't completely effective.

sodality2

Server side tracking based on what, server access logs? That's not particularly helpful compared to the info you get with clientside analytics libraries.

matt_heimer

No, the gtag script in the browser sends all the data to the server-side proxy. Then in the server-side config you can pick which parts of the data to share with 3rd parties. So there is still client-side data capture; it's just reduced to one component capturing the data.
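A minimal sketch of that "pick which parts to share" step; the field names and vendors below are made up for illustration, not GTM's actual API:

```javascript
// The proxy receives the full payload captured by the on-page snippet...
const incoming = {
  event: "purchase",
  value: 49.99,
  clientId: "abc123", // first-party identifier
  ip: "203.0.113.7",  // fields the operator may decide not to share
};

// ...and each downstream vendor gets only an allowlisted subset.
const vendorFields = {
  analytics: ["event", "value", "clientId"],
  adNetwork: ["event", "value"], // no identifier leaves the proxy
};

function forwardPayload(event, fields) {
  // Copy only the allowlisted keys; silently drop everything else.
  return Object.fromEntries(
    fields.filter((f) => f in event).map((f) => [f, event[f]])
  );
}

console.log(forwardPayload(incoming, vendorFields.adNetwork));
// → { event: 'purchase', value: 49.99 }
```

Whether the operator actually configures it this conservatively is, of course, the whole debate upthread.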

ghoomketu

Pretty sure a big ban hammer is coming for Google with all such shenanigans, especially in trigger happy places like Europe and India who don't like their citizens tracked and are happy to create legislative bans.

So you may win the cat and mouse adblock game but what are you gonna do when countries start making it illegal to use GA? (1)

(1) https://www.forbes.com/sites/emmawoollacott/2022/02/10/frenc...?

ulrikrasmussen

I can't wait for this to happen. Personally I think we just need to ban all targeted advertising based on viewer profiles, even session data such as IP and geo-location. This in turn should severely limit or destroy business models based on optimizing for engagement, as non-paying users are no longer profitable. It's going to cost a lot of people in ad-tech their jobs, but there is no shortage of demand for IT work, so surely they'll find something else to do.

pmoriarty

"I think we just need to ban all targeted advertising based on viewer profiles, even session data such as IP and geo-location"

I'd go further and ban all unsolicited advertising.

eitland

Not all ads are created equal:

The other day I learned from an ad that my favorite 6 year old Bergans jacket can be repaired at a shop next to where I work for a price that is next to nothing.

ouid

I don't hear these words enough :(.

drusepth

In $current_year, I kind of want to go even further and just ban the Internet.

zelphirkalt

We also need hefty fines for handing data to Google and their ilk behind users' backs. Banning businesses like Google's ad and spying business is required, but not sufficient.

dartharva

I'd wish so too but I don't see much that can happen in this context.

ignoramous

> India who don't like their citizens tracked...

Pretty sure the Indian govt bullied everyone into getting an Aadhaar, the quintessential tracking device.

aliswe

It's a government.

ignoramous

...which likes to police, surveil, impose, oppress the governed a bit too much.

buro9

> As we have seen, Google does not explain ( https://developers.google.com/tag-manager/serverside/custom-... ) the reason for creating a subdomain of the website for its "proxy" server:

> > The default server-side tagging deployment is hosted on an App Engine domain. We recommend that you modify the deployment to use a subdomain of your website instead.

The reason is simple: it effectively mounts a denial-of-service attack on the DNS block lists used by tools like Pi-hole and NextDNS. Sure, Google knows that some of the subdomains will end up blocked on some block lists... but the vast majority won't be blocked on the vast majority of block lists.

southerntofu

Looks like the only sane thing to do is to block routes to the GAFAM ASes directly on your router instead of relying on DNS tricks. I knew people doing that over ten years ago and I thought they were kind of crazy, but in retrospect they were right all along.

What if your website is hosted on Google Cloud or AWS, should we block it? I certainly would. Please find a decent host that does not use its customers as a human shield/leverage to engage in criminal conspiracies against privacy.

contravariant

Gorhill managed to get uBlock to block CNAME-masked domains, so surely this wouldn't be that far out of reach for an adblocker?

Good luck getting this to work in Google Chrome, though.

brobinson

Maybe this will drive people back to Firefox? It's a perfect opportunity for Mozilla to do a marketing drive... oh, wait, they are busy partnering with Facebook (er, Meta) to do advertising stuff. Sigh.

hwers

Blocking all GCP- and AWS-hosted sites is about as effective as turning off all JavaScript: it reduces the usable set of sites on the web to basically nothing.

southerntofu

I've found the web to be very enjoyable without JavaScript. Some sites don't work, but fortunately they're usually of the SEO-clickbait kind, without any sort of interesting content. I'm already blocked by many providers for using Tor, which is a red flag for abusive behavior on their part.

I just wish we had a manifesto, automated test suite, and dedicated search engine for websites that respect their users.

selfhoster69

A cron job on the network gateway that creates iptables rules to drop connections to those IPs sounds like a good plan.
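
A minimal sketch of what that cron job could look like; everything here is hypothetical (the blocklist path, the chain), and it prints the iptables commands instead of applying them, so you can review the output before piping it to `sh` as root:

```shell
#!/bin/sh
# Hypothetical sketch: turn a file of CIDR blocks (one per line, e.g. the
# cloud providers' published IP ranges) into iptables DROP rules.
# Prints the commands rather than running them, so they can be reviewed first.
emit_drop_rules() {
    while read -r cidr; do
        # skip blank lines and comments
        case "$cidr" in ''|'#'*) continue ;; esac
        echo "iptables -A FORWARD -d $cidr -j DROP"
    done < "$1"
}

# Example cron usage (assumed blocklist path):
#   emit_drop_rules /etc/blocklist/cloud-ranges.txt | sh
```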

mkdirp

It is clear Google is finally feeling the hurt from adblockers and the like. That means we are winning. Google knows it's not what people want, but they clearly do not care. In my opinion, if you work for Google on things like this, you are equally to blame. You have Google on your CV, you can easily go elsewhere and find a decent job.

Having said that, uBlock Origin, and I'm assuming other similar extensions, offers inline script filtering [0]. The code being served has to share some common code, since it's all coming from a single org. What is stopping a filter that matches on that common code?

The issue, obviously, is that this still prevents DNS filters from blocking Google, which is an equally big problem. Assuming the scripts do have some common code that can be matched, perhaps this is where we start crowdsourcing filters: something that runs in the background and inspects scripts; matches then get posted to a server, validated automatically, and later served as a block list that anyone can download.

[0] https://github.com/uBlockOrigin/uBlock-issues/wiki/Inline-sc...
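
A hypothetical example of such a filter, assuming the server-side-proxied scripts still contain the stock `gtm.start` bootstrap string from the GTM snippet (this uses uBlock Origin's `##^script:has-text()` HTML filtering, which works only in the Firefox version):

```
! Hypothetical: block scripts containing the GTM bootstrap marker
example.com##^script:has-text(gtm.start)
```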

was_a_dev

Yes, a Google employee could go work elsewhere. But is there an equally well-paid position at a more ethical company? As far as I can tell, all the FAANGs are as unethical as each other.

danparsonson

There are no perfect outcomes in life - if you're going to make an ethical decision then more often than not you'll have to compromise elsewhere. Another example would be cheap goods that come at an ethical and/or environmental cost - you'll usually have to pay extra to avoid those because the bad behaviour is what allows companies to keep costs down.

In some sense, FAANG employees are being paid extra to look the other way.

Perseids

I don't understand the reasoning here. How does being paid more justify unethical behavior? Especially since you can get by very, very well in the tech industry in general. Isn't that like saying "I'm kicking puppies all day, but it pays enough to finance the second Lamborghini, so how could I decide against it"?

(If you were referring to moral offsetting, that could indeed work, assuming you donate enough to charities, but your post didn't sound like that.)

was_a_dev

> How does being paid more justify unethical behavior?

Honestly, it doesn't. But trying to appeal to someone already working in an unethical position isn't going to work; that person is themselves acting unethically. Like most things, the driving force will come down to economics and prestige. What else is keeping that employee at Google? I doubt it's loyalty.

technion

Note that the Google announcement in question was from August 2020. It didn't seem to make any significant change to the ad-block space when it rolled out, and pretty much every site is still running the JavaScript frontend.

terrycody

Sorry, I can't understand the article, but is server-side Google Tag Manager already out?

technion

Yes, it's been out for quite some time.

It also requires running a proxy as a GCP application, so people running GTM largely because it's free/cheap aren't going to go along with this.

rootusrootus

If I am reading it right, the article says about 1/3 of all websites on the Internet already use GTM.

matt_heimer

Using Google Tag Manager doesn't mean you are using the server-side tagging. You have to configure it in your account. It is something you have to pay for. If you read the instructions on https://developers.google.com/tag-platform/tag-manager/serve... you have to have GCP billing setup to pay for the App Engine instance running the server-side tagging proxy.

danhilltech

Server-side tracking has been around for a while (indeed, this article is dated Nov 15, 2020; and of course, you could argue that simply parsing your Apache/nginx logs for visitor stats has existed forever). The article, I think, conflates several different pieces.

There's probably a few actual use cases marketers may care about for tagging/tracking/analytics:

1. Simplest: I want to know how many people use my site/app, how many come back, how many are real (not bots), which pages are popular, etc. I'd like to see all this in a nice UI where I can cut and filter the data.

2. Same as #1, but I'd like to do it across devices. Still all within my own site/app, but simply connecting a non-logged in session across desktop and mobile web. Google and FB probably have the largest available dataset on this.

3. I'd like to enrich all this information with data from other sources, for example to target ads, serve ads, etc.

Site owners/marketers then try to tackle these in a few ways, the first three roughly equally bad:

1. Just dump a bunch of scripts into your site (GA, FB, Segment, whatever). Pros: easy. Cons: very easily blocked, so your data is super biased.

2. Self host some of these scripts, or CNAME them. Pros: maybe a bit better for performance? Cons: still rather easily blocked with content signatures etc. A nightmare to ensure consistency if self-hosting.

3. Run your own JS that sends events to your server, and then your server fans out to whomever. Pros: much harder to block, and likely quite performant. Cons: it's unlikely your self-built lib will offer all the same 'features' as GA (features meaning device fingerprinting and so on).

4. Just get everything from HTTP logs. Pros: very performant, can't be blocked. Cons: much more limited data to work with.

Personally, I think #4 is the future (and also where we started 20 years ago). What I don't think anyone is doing yet is relaying that data out to all the other parts of the stack: GA, FB, Mixpanel, whatever. If you could solve both - giving users privacy and performance and giving marketers the same tools they're used to - sounds like a win. You might argue "well we'd be missing a bunch of user data", but you're already missing it with adblockers and iOS privacy features.
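
As a rough sketch of what option 4 looks like in practice (the log path and format are assumptions; this expects the default nginx/Apache "combined" format, where field 7 is the request path):

```shell
#!/bin/sh
# Count page views per path from a combined-format access log.
# In the combined format, "GET /path HTTP/1.1" splits into fields $6 $7 $8,
# so $7 is the request path.
count_hits() {
    awk '{ hits[$7]++ } END { for (p in hits) print hits[p], p }' "$1" \
        | sort -rn
}

# Example (assumed path): count_hits /var/log/nginx/access.log | head
```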

Raed667

> 3. Run your own JS that sends events to your server

If your platform is popular enough, those telemetry endpoints will end up on ad-blocker lists.

Then it is up to you, if you want to do an arms race of obfuscation or just accept it.

olliej

1) Can be done trivially with first-party cookies.

2) You can already tell what device someone is using. If you mean "I want to know if the same person is on different devices," get them to log in; don't try to spy ineffectively while also providing Google etc. with the ability to actually spy.

3) You cannot know how to target ads on a per-user basis unless you are spying on your users. You have no justification that supports a claim to such information.

danhilltech

Yea, I think we're saying the same thing. Ultimately, the best choice (for privacy, performance, etc.) and the most likely one (given adblockers and an ever-increasing push for privacy from browsers and OSes) is to stop trying to find a way around adblockers and simply invest in the technologies that work: HTTP, cookies, sessions, logins, and so on.

tootie

I think some of the whiplash in the market isn't just the tit-for-tat battle with ad blockers and regulators, but the realization that there's so much useless data being collected. The best data we get is first-party (i.e., things people click or type into forms on our sites) or qualitative feedback from surveys. GA and GTM are valuable tools for us, but Google's network isn't really.

danhilltech

Yea. Though GA does (at least) two things: it analyzes your own data, and it uses the data Google collects from all its other sites to improve your experience via better bot detection, recommendations, and insights. Google's network is useful, like it or not, for a) their cross-device graph (they know, roughly, which mobile devices and which desktop browsers belong to the same user) and b) from that, building better MTA (multi-touch attribution) models than you can with pure first-party data, especially if most of your traffic isn't logged in.

But I agree, the future is pointing toward a world where privacy and empowerment is more in the hands of the user, and that's a good thing.


Google Tag Manager, the new anti-adblock weapon (2020) - Hacker News