If you value your privacy, you may not be too pleased with recent developments in computer and mobile device technology. Computers and smartphones have gotten smarter and smarter, but this has come at the cost of having them communicate more and more data about you, including potentially personal data, back to the cloud so the cloud can know what you’re doing.

Ars Technica has a lengthy article exploring all the ways this is happening for recent operating systems, especially Windows 10. Siri, Cortana, and Google all send data upstream to improve speech recognition, for example—including information about contacts and appointments.

There are two common reasons for this kind of data collection. The first is that these services simply need to know these things to be useful. Siri needs to know the names of your contacts to be able to set up calls or send messages. Cortana needs to know when and where your appointments are to tell you when you need to leave the home or office to get to them.

But there’s a deeper reason: the software powering these capabilities is fundamentally heuristic, using approximation and guesswork to generate its results. Traditionally this wasn’t the case. A hardware keyboard with no autocompletion doesn’t need any fancy heuristics; it just needs to map key presses directly to characters. But speech recognition, software keyboards of all kinds, and handwriting recognition don’t have this precision. The software driving these things has to construct and evaluate a range of possible interpretations and then pick the most likely one.
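To make that "pick the most likely interpretation" idea concrete, here is a minimal sketch of how a software keyboard might guess a word from imprecise taps. It is only an illustration, not how any real keyboard works: the key positions, the word list, the frequencies, and the `score` and `best_guess` helpers are all made up for the example.

```python
import math

# Hypothetical key centers on a fragment of a keyboard (x, y in key widths).
KEY_CENTERS = {
    "w": (1.0, 0.0), "e": (2.0, 0.0), "r": (3.0, 0.0), "t": (4.0, 0.0),
    "a": (0.5, 1.0), "s": (1.5, 1.0), "d": (2.5, 1.0),
}

# Made-up relative word frequencies standing in for a real language model.
WORD_FREQ = {"sat": 0.5, "set": 0.3, "wet": 0.15, "rat": 0.05}


def score(word, taps, sigma=0.6):
    """Log-score: how common the word is, minus a penalty for each tap
    that landed far from the key the word would have needed."""
    total = math.log(WORD_FREQ[word])
    for letter, (tx, ty) in zip(word, taps):
        kx, ky = KEY_CENTERS[letter]
        total -= ((tx - kx) ** 2 + (ty - ky) ** 2) / (2 * sigma ** 2)
    return total


def best_guess(taps):
    """Evaluate every candidate word of the right length and keep the best."""
    candidates = [w for w in WORD_FREQ if len(w) == len(taps)]
    return max(candidates, key=lambda w: score(w, taps), default=None)


# Taps dead center on s, e, t: the geometry alone settles it.
print(best_guess([(1.5, 1.0), (2.0, 0.0), (4.0, 0.0)]))   # set
# Second tap lands exactly halfway between "a" and "e": frequency breaks the tie.
print(best_guess([(1.5, 1.0), (1.25, 0.5), (4.0, 0.0)]))  # sat
```

The second call is the interesting one. The taps alone can't tell "sat" from "set", so the guess depends entirely on the frequency table, and that table is exactly the sort of thing that gets better when it is built from lots of real-world typing.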

Analyzing real-world data allows the companies to improve their algorithms. Google offered its “GOOG-411” speech recognition directory assistance for a while for much the same reason: to hone its voice recognition algorithms for Google Voice voicemail transcription. Of course, sending data back from your computer is a good bit more personal than recording when you ask for a phone number.

Modern operating systems also monitor location. When your device checks its location, the operating system takes a snapshot of which Wi-Fi networks are in range and sends that upstream, so that the vendor can improve its location-sensing database for devices that don’t have GPS.
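As a rough sketch of why those Wi-Fi snapshots are useful, imagine a toy version of such a database. Everything here is an assumption for illustration: the `sightings` store, the `report_snapshot` and `estimate_location` helpers, and the simple averaging are invented, and real location services are certainly more sophisticated.

```python
from collections import defaultdict

# Hypothetical server-side store: Wi-Fi network ID -> list of (lat, lon)
# positions at which some device reported seeing that network.
sightings = defaultdict(list)


def report_snapshot(network_ids, lat, lon):
    """A device with a known position reports the networks it can see."""
    for network_id in network_ids:
        sightings[network_id].append((lat, lon))


def estimate_location(network_ids):
    """A device without GPS gets the average of all positions previously
    reported for the networks it can see, or None if none are known."""
    points = [p for n in network_ids for p in sightings.get(n, [])]
    if not points:
        return None
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return (lat, lon)


report_snapshot(["cafe-wifi", "library-wifi"], 10.0, 20.0)
report_snapshot(["library-wifi"], 12.0, 22.0)
print(estimate_location(["library-wifi"]))  # (11.0, 21.0)
print(estimate_location(["unknown-wifi"]))  # None
```

The more snapshots devices send in, the better the estimates get for everyone else, which is the trade being made with your location data.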

Also, whenever your Windows 10 device has a software glitch, it will send at least some information about your system up to the cloud to help Microsoft figure out and fix what went wrong in future patches. Unless you have Windows 10 Enterprise Edition, you can’t turn this off.

Most information the operating systems collect is filed under an anonymous identifier, and generally only examined in aggregate, which is some reassurance. But people who are concerned about privacy may not be sanguine about it nonetheless.

The problem is that as operating systems in general attempt to become more useful to people, anticipating things they want to do in order to help do them better, this is going to become more or less endemic. Using older operating systems will work for a while, but not forever—they’ll become less and less useful over time as they stop being updated.

More and more these days, it seems a lot of people console themselves about privacy by assuming they’re too unimportant and anonymous for anyone to bother tracking. But that’s only true until it isn’t. Who knows whose attention you’ll attract tomorrow? And at least some of that information can be subpoenaed.

It seems unlikely the trend toward making operating systems more useful is going to reverse. So where does that leave us?


  1. Well, I don’t know, I don’t seem to have that much trouble with my Linux boxes phoning home. Sure, Firefox is getting snoopier, but I can still turn most of that off. A good ad blocker, Ghostery and MAC address spoofing help my laptop blend into the scenery. So rather than “modern operating systems,” I’d say it like it is – “Windows, Mac and Chrome are gradually eroding user privacy”.

  2. Linux can be a useful tool for some users, but it has its limitations, especially if you want to buy software, digital music, or ebooks. I’m sure the true believers will pipe up and say DRM-free files will run on Linux just fine, which is more or less true, but depending on your personal preferences, the DRM-free selection is kind of slim.

    But what will really piss me off is the possibility of selling this data to advertising and marketing companies. What I want to buy, where I buy it, and when I buy it is my own bloody business, and I don’t need ads to help me along. But that is probably the direction we’re headed.

