Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

andybak

My projects could genuinely benefit from telemetry as I have no idea about usage patterns and my community (mainly artists) is not famous for maintaining a close dialogue with software developers.

I haven't bothered because a) opt-out risks a backlash and b) opt-in affects the data so much it becomes useless (much smaller sample and probably self-selecting a certain type of user)

Skimming the comments here, it seems everybody assumes telemetry is always nefarious. I get the distrust of large corporations and other obvious bad actors - but the blanket cynicism for all telemetry here is kinda surprising. Have none of the developers here ever had a need for it themselves?

dahart

I’m sympathetic to both the default distrust and to devs like you who want telemetry to improve their software and won’t use the data for anything else, but it is because of bad actors and enough dark ad patterns that we just can’t trust companies to play nice, and it’s too difficult to expect people to scrutinize each and every app or site individually. So I get why the default assumption is nefarious behavior.

But you’re totally right - telemetry & crash dumps & analytics are helpful & great for devs who care about the customer UX and don’t use the data for advertising or anything other than fixing & writing good software, so it’s a real kind of tragedy of the commons that we can’t have safe, trustworthy, and pro-consumer telemetry.

I went from building a web app that used Google Analytics and some other kinds of anonymous telemetry (and using that data only for identifying functional software & site issues), to building driver software that absolutely cannot send data out, and I wish for telemetry all the time. Not only is it difficult to understand what users are doing, they usually don’t even know themselves and can’t tell me what happened when things crash. The result is that turnaround times for critical issues are in months, when it could be days or hours if we had crash dumps and analytics. The lack of automated reporting hurts users.

I’m not sure there’s a way to separate the good from the bad, to designate some kinds of telemetry as safe and to be able to trust it while disallowing the stuff we don’t want. If that were somehow possible, if anyone has ideas, I would love to help figure out how to make it a reality.

1vuio0pswjnm7

Opt-in data is "useless"

That's one I have not heard before

Useless for what

Targeting a certain "type of user" perhaps

"I get the distrust of large corporations and other obvious bad actors - but the blanket cynicism for all telemetry here is kinda surprising"

There is effectively no way for a user to determine whether an actor is "bad" or "good" and that definition may vary depending on the user

The user cannot verify how the data might be used or where it might be transferred. As such, there is almost zero incentive for the data collector not to engage in malfeasance (as the user defines that term); deterrents are lacking

Perhaps there is irony in criticising "blanket" cynicism whilst arguing for "default" telemetry. Both suffer from the same "one size fits all" error

deanishe

> I get the distrust of large corporations and other obvious bad actors […]

> the blanket cynicism for all telemetry here is kinda surprising

Who's providing the telemetry/analytics if not one of the same large corporations?

Many devs say they care about user privacy, but very few seem to care enough not to farm surveillance out to a 3rd-party they have no control over.

ragall

Telemetry only tells you what users do, not why, and it doesn't explain their mental models. Try asking directly: open a discussion board (for example GitHub's Discussions) and encourage them to post about aspects of the software they found puzzling/annoying/inefficient. Take 15 minutes a week to go through the posts to see if anything attracts your attention.

marshray

I feel that I have been a victim of "good telemetry" too, as when advanced product features were removed which were probably not popular but that I personally relied on.

charles_f

It's interesting that we're so used to being tracked at this point that no one balks at being opted-in by default. A flag called DO_NOT_TRACK sounds like a good idea, but also suggests the default is CONSENT_TO_TRACK=1, and I find that creepy.

d2p

> A flag called DO_NOT_TRACK sounds like a good idea, but also suggests the default is CONSENT_TO_TRACK=1, and I find that creepy.

It could also be used to prevent showing an opt-in notification at all even in software that requires opt-in.
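A minimal sketch of that idea, assuming a hypothetical CLI tool whose telemetry is opt-in and which remembers the user's earlier answer. The function name `telemetry_mode` and the `stored_consent` parameter are made up for illustration, and treating only the value `1` as "set" is an assumption of this sketch:

```python
import os
from typing import Mapping, Optional


def telemetry_mode(stored_consent: Optional[bool],
                   env: Mapping[str, str] = os.environ) -> str:
    """Return "off", "on", or "ask" for a hypothetical opt-in CLI tool."""
    # Assumption: DO_NOT_TRACK=1 means the user has opted out globally.
    if env.get("DO_NOT_TRACK") == "1":
        return "off"  # no telemetry, and no opt-in prompt either
    if stored_consent is None:
        return "ask"  # the user has never answered: show the opt-in prompt
    return "on" if stored_consent else "off"
```

The point of checking the variable first is that the tool never reaches the "ask" branch, so the user is not even shown the notification.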

rapidaneurism

It is a bit sad that in privacy, enthusiastic consent is understood as failing to shout 'NO' in the right way.

croes

Nitpicking: There is no being opted-in by default, that's opt-out

marshray

Hmm. I read a semantic difference between "opt-out" and "being opted-in by default".

The first denotes an abstract policy, the second an action that has been done to you in which you were a passive participant. And this is all about our lack of agency.

You may prefer that we speak of abstract policies. But to say "there is no" about an otherwise sensible phrase implies that you think that we have agreed to stay within some fixed set of terminology. I didn't think that we had.

croes

If you hadn’t the option to go in it is not opt-in.

If someone puts you in by default but you have the option to go out, it’s opt-out

So this is either opt-out or not an option at all

shevy-java

I actually consider such a flag to be problematic. I don't want to give out any information - of course I never want to be tracked, but marking this via an ENV variable alone, already makes zero sense to me. I don't understand people who like that while claiming they do not want to be tracked; if they give that information, then this means they are marked.

thephyber

Do not track WHEN?

This flag is sent by my browser when I connect to SOMEONE ELSE’s SERVER.

The internet only took off because the primary business model which ran on ads and derivative information that servers do to their users.

It’s not fun. It’s not private or secure. It’s not illegal (in most jurisdictions for most industries). The flag exists as a response to the de facto and de jure state of the world, not some fairytale scenario.

RugnirViking

> The internet only took off because the primary business model which ran on ads

No? It took off before advertising was widespread as a primary or sole funding business model? Also there's literally nothing about advertising that requires data collection about users. Sure they love to do it, and they might even believe that it helps their profits in some way. But it's not inherent, they got along just fine with billboards and newspaper classifieds. TV ads never required personal information. Nor did pre-roll cinema ads, or radio adverts. Nobody was bemoaning in the streets that they couldn't possibly find anything to buy.

y42

> The internet only took off because the primary business model which ran on ads and derivative information that servers do to their users.

quite the opposite I would argue:

https://nickyreinert.de/2020/2020-10-24-marketing-killed-the...

mkl

> This flag is sent by my browser when I connect to SOMEONE ELSE’s SERVER.

No, it's set in your command shell (e.g. bash) and tells CLI programs that support it to not connect to a server. It has nothing to do with browsers or ads. This is all very clear in the article.

thayne

You can have ads without tracking.

tdeck

The article is about local desktop / CLI tools that collect telemetry, not the web browser "do not track" standard.

righthand

You’re confusing the Internet with Google.

charles_f

Article quite literally talks about tracking of cli tools you run on your own computer, half of which are to pilot products that you pay with your own money.

Get off your high horse.

doginasuit

I would advocate for not getting your horse high to begin with, or hide your stash better.

Griffinsauce

> The internet only took off because the primary business model which ran on ads and derivative information that servers do to their users.

Arguable, on the other hand it did kill the internet. (or, almost so far, we'll see whether we rebound after decades of enshittification)

_flux

I always choose to go with positive terms for variables etc., so this would then be ALLOW_TRACKING=0. It brings in some consistency and makes it easier to reason about, as you get to avoid double negation.

Perhaps the "DO NOT TRACK" name is somewhat of an established term, though.

FrauElster

One could also implement ALLOW_TRACKING as a comma-separated list of the applications I choose to allow it for. Say I would like to share telemetry with go and brew, but not aws and the rest: ALLOW_TRACKING=go,brew

_flux

..and what kind of tracking, e.g. anonymous usage statistics vs update checks, e.g.

  *:analytics=1:google_analytics=0,syncthing:upgrade=1

The specification could go on and on!
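For the simpler comma-separated form, a tool-side check could look roughly like the sketch below. ALLOW_TRACKING is the hypothetical variable proposed in this sub-thread, not an existing standard, and the function names are illustrative. The per-category syntax above is deliberately not handled here.

```python
import os
from typing import Mapping, Set


def allowed_tools(env: Mapping[str, str] = os.environ) -> Set[str]:
    """Parse the hypothetical ALLOW_TRACKING variable into a set of tool names."""
    raw = env.get("ALLOW_TRACKING", "")
    # Unset, empty, or "0" all mean "nobody may track" in this sketch.
    if raw.strip() in ("", "0"):
        return set()
    return {name.strip().lower() for name in raw.split(",") if name.strip()}


def may_track(tool: str, env: Mapping[str, str] = os.environ) -> bool:
    """True only if the tool is explicitly named in ALLOW_TRACKING."""
    return tool.lower() in allowed_tools(env)
```

With ALLOW_TRACKING=go,brew, `may_track("brew")` is true and `may_track("aws")` is false, so the default for anything not listed stays "no".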

PufPufPuf

This is set up for the same fate as DNT in browsers. Collecting all the "do not track" env vars into a single "do_not_track.env" file, however, may not be a bad idea...
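As a rough sketch of the single-file idea: the snippet below renders a set of opt-out variables as `export` lines that a shell could source. It only lists variables named in this thread; the list is certainly incomplete, and the function name is illustrative.

```python
import shlex

# Opt-out variables named in this thread; real tools may need others.
OPT_OUT_VARS = {
    "DO_NOT_TRACK": "1",
    "DOTNET_CLI_TELEMETRY_OPTOUT": "1",
    "HF_HUB_DISABLE_TELEMETRY": "1",
}


def render_env_file(variables: dict) -> str:
    """Render variables as sorted `export NAME=value` lines for a shell to source."""
    lines = [
        f"export {name}={shlex.quote(value)}"
        for name, value in sorted(variables.items())
    ]
    return "\n".join(lines) + "\n"
```

Writing the result of `render_env_file(OPT_OUT_VARS)` to a file such as do_not_track.env and sourcing it from a shell profile would keep all the per-tool switches in one place.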

whitlock

https://toptout.me - exists and handles a lot of these problems, if not looking to create a new wheel.

Though if you just want a simple ENV var that handles this WHILE honoring the specification on this page: https://github.com/alloydwhitlock/do-not-track-cli

LocalH

Advertisers chose to ignore DNT because they claimed Microsoft making DNT enabled by default took agency away from the user. In reality, they probably weren't going to honor it anyway.

Gigachad

There's an inherent conflict. No one _wants_ to be tracked, there is no direct benefit to being tracked and only downsides. And advertisers want to track you. So there was no way to respect the flag other than making it obscure so only a few dedicated people turned it on.

xigoi

> No one _wants_ to be tracked

Plenty of people seem to genuinely believe that “personalized ads” are good for them.

LocalH

In other words, advertisers wrangled out of something that could help people because they claimed it wasn't the true intent of people?

Advertisers are the scum of the Earth, as someone with ADHD who doesn't ever consent to my attention being stolen in that way. I really don't care what their opinion is, since they're intruding into my headspace without permission

socalgal2

To play devil's advocate, there is a direct benefit to being tracked: at least theoretically, search and ads will be more relevant to you. I get that no one wants ads, but you do see ads here and there. It would arguably be better for you if every one of them was relevant than not. Similarly, search or even LLM answers could be better if the preferences of the asker are known.

No, I'm not making excuses for tracking, and I do lots of stuff myself to avoid being tracked.

I’m only responding to the false premise that there are no benefits. There are. You can just choose to believe they aren’t worth the cost. I believe they aren’t but I have friends who opt into all tracking and even register their presence with multiple apps. They believe they’ll make more positive connections

mmooss

Microsoft is too sophisticated to plead ignorance; they are responsible for that outcome and I think we can assume they knowingly chose it. (Though now Microsoft browsers are such a small portion of the market that it doesn't matter.)

The biggest failure of DNT was browser makers - including Mozilla - removing it. It has zero performance impact (1 bit?) or development cost. As long as it was out there, when there was momentum against tracking, advocates had evidence of both demand for privacy and of trackers ignoring user wishes.

applfanboysbgon

> advocates had evidence of both demand for privacy and of trackers ignoring user wishes.

This evidence both still exists and is also completely useless for anything. The more important consideration, by far, is that the DNT flag was actively harmful to users in the real world because, if it was acknowledged at all, it was used maliciously to help fingerprint and track users. There is no reason for browsers to continue providing to their users a toggle that not only misleads them about what will happen with the setting enabled, but actively contributes to the opposite outcome because we live in a world where being evil is the norm.

dylan604

Lately, I've come across websites that instead of a cookie banner display a banner that states they recognize and honor my wish to not be tracked. Whether they really do or not is something I did not spend time looking into. The first time I saw it I thought it was a fluke, and then it happened a few more times within a short time period. Couldn't tell you what sites they were though as it was just something from search results.

LocalH

No, the advertisers were responsible for that outcome, by using that as a flimsy excuse to ignore the setting

Browsers only removed it once it was clear that the advertising industry was going to refuse to honor it

pseudalopex

Global Privacy Control replaced Do Not Track.

whitlock

Love it. This is an annoying problem and likely the actual solution, rather than asking folks to use a universal one. I'll put something together as a starting point.

endgame

No. It shouldn't be an opt-out, and it is bad practice to write conditional settings in the negative.

monk_grilla

The original creator of this standard has retroactively called it “a mistake”

https://git.eeqj.de/sneak/consoledonottrack.com/src/branch/m...

bstsb

the original creator calls everyone implementing their standard a “scumbag” for having any form of analytics, which seems a bit of an overreaction

spudlyo

I was surprised how hard it was to stop the Python transformers library from phoning home to Hugging Face. I set HF_HUB_DISABLE_TELEMETRY=1, and when I called Wav2Vec2CTCTokenizer.from_pretrained I explicitly passed local_files_only=True, but I still got a warning about not having a valid HF_TOKEN. It wasn't until I stumbled upon HF_HUB_OFFLINE=1 that I became somewhat confident that I'm not making outgoing connections to HF every time I load a wav2vec2 model from disk.

I wouldn't have realized this was happening at all if it weren't for the obnoxious HF_TOKEN warning.
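One way to notice this kind of thing without relying on a warning is to make connection attempts fail loudly while the code under suspicion runs. The sketch below patches `socket.socket.connect` inside a context manager. It is an in-process check only: it does not cover `connect_ex`, DNS lookups, subprocesses, or native code that bypasses Python's socket module, so OS-level isolation is the stronger tool. The names `no_network` and `OutboundConnectionBlocked` are illustrative.

```python
import socket
from contextlib import contextmanager


class OutboundConnectionBlocked(RuntimeError):
    """Raised when code tries to open a connection inside no_network()."""


@contextmanager
def no_network(log=None):
    """Temporarily make every socket.connect() call in this process fail."""
    original_connect = socket.socket.connect

    def guarded_connect(self, address):
        if log is not None:
            log.append(address)  # record where the code tried to connect
        raise OutboundConnectionBlocked(f"blocked connection attempt to {address!r}")

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        # Always restore the real method, even if the body raised.
        socket.socket.connect = original_connect
```

Wrapping the model load in `with no_network():` would then turn a silent phone-home into an exception naming the address, at least for connections made through Python's socket module.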

woodson

HF is notorious for making it difficult to work offline (or at least not waste time trying to connect when everything needed is offline) and is constantly changing how it is being handled. Previously, there was TRANSFORMERS_OFFLINE, HF_DATASETS_OFFLINE, etc.

dylan604

Does something like Little Snitch catch these to help find the things doing hidden shenanigans?

Grom_PE

I was worried about .NET sending telemetry once I found out about the existence of the DOTNET_CLI_TELEMETRY_OPTOUT env.

Thankfully, the dotnet package installed by package manager on Arch Linux disables telemetry by default. I left the env set just in case.

But my trust towards "modern" software has lowered. I default to running CLI tools, especially those built in JavaScript or .NET, with network disabled:

    firejail --net=none

For ilspycmd, for example, I had to defuse its default "update checking" behavior:

    alias ilspycmd='ilspycmd --disable-updatecheck'

This is what I'd call user-hostile defaults.

ximm

Looks like a helpful honeypot! Any tool that publicly announces support for this spec is a tool I know to avoid because it collects telemetry without explicit opt-in in the first place.

GuB-42

DO_NOT_TRACK support doesn't mean tracking is not an explicit opt-in.

Example: the software crashes, and there is a crash handler that asks you if you want to send a crash dump. With DO_NOT_TRACK, the crash handler is disabled entirely, no question, no dump.
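A sketch of that crash-handler behavior, under the assumption that DO_NOT_TRACK=1 is the opt-out signal. `ask_user` and `send_dump` are hypothetical callbacks the application would supply; nothing here is taken from a real tool.

```python
import os
import traceback


def crash_reporting_allowed(env=os.environ) -> bool:
    # Assumption: DO_NOT_TRACK=1 disables crash reporting entirely.
    return env.get("DO_NOT_TRACK") != "1"


def make_excepthook(ask_user, send_dump, env=os.environ):
    """Build an excepthook that offers to send a crash dump, unless opted out.

    ask_user() -> bool and send_dump(text) are hypothetical callbacks
    supplied by the application.
    """
    def hook(exc_type, exc, tb):
        # Always show the error locally.
        traceback.print_exception(exc_type, exc, tb)
        if not crash_reporting_allowed(env):
            return  # no question, no dump
        if ask_user():
            send_dump("".join(traceback.format_exception(exc_type, exc, tb)))
    return hook
```

An application would install it with something like `sys.excepthook = make_excepthook(prompt_fn, upload_fn)`. Without DO_NOT_TRACK the dump is still opt-in, since it is only sent after the user says yes.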

If it gets some adoption, that's probably how it will work. Those who have a financial interest in using tracking (ex: ads) probably won't support such an option.

bstsb

i can't think of a single CLI that is possibly collecting analytics for ads

SpyCoder77

Most services are already collecting telemetry, them announcing support for it won't change that.

xandrius

Well, don't look too deep else you won't be using many modern tools.

msla

Hey, it's a list of services to feed fake data to!

drnick1

It's probably easier to run your own DNS and blacklist the offending domains. There are good blacklists with millions of telemetry domains, e.g. https://github.com/hagezi/dns-blocklists.
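To illustrate the matching a blocking resolver does, here is a small sketch. It assumes a plain one-domain-per-line list with `#` comments, which is only one of several formats such lists are published in, and it blocks a name if it or any parent domain is listed. The domains in the usage note are placeholders, not entries from a real list.

```python
def load_blocklist(text: str) -> set:
    """Parse one-domain-per-line text, skipping blanks and '#' comments."""
    domains = set()
    for line in text.splitlines():
        line = line.strip().lower()
        if line and not line.startswith("#"):
            domains.add(line.rstrip("."))
    return domains


def is_blocked(hostname: str, blocklist: set) -> bool:
    """True if hostname or any parent domain is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

With a list containing telemetry.example.com, both that name and eu.telemetry.example.com would be blocked, while example.com itself would still resolve.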

tosti

Better yet, don't allow such spyware crap on your computer.

0123456789ABCDE

pfft, just don't have a computer and you'll be good

MajorTakeaway

Some hobbies are more fun than others.

rvz

That is the correct way of handling this.

Everyone proclaiming a "standard" is just adding to the long list of (unofficial) alternatives.

dylan604

0123456789ABCDE

how is this relevant?

meling

For the record, Go’s telemetry is local by default (not uploaded): https://go.dev/doc/telemetry

tomtomtom777

This proposal is really harmed by the name.

There is a reason none of the existing methods use the word "TRACK". Although connecting home can be used for tracking, it doesn't have to be.

If a tool uses connecting home for telemetry, implementing "DO_NOT_TRACK" would suggest it does track its users without the setting, even if it may not.

Rename it to "DO_NOT_CONNECT_HOME" and it may be a useful standard.
