Bookmarked 'Data is a fingerprint': why you aren't as anonymous as you think online by Olivia Solon (the Guardian)
More recently, Yves-Alexandre de Montjoye, a computational privacy researcher, showed how the vast majority of the population can be identified from the behavioural patterns revealed by location data from mobile phones. By analysing a mobile phone database of the approximate locations (based on the nearest cell tower) of 1.5 million people over 15 months (with no other identifying information) it was possible to uniquely identify 95% of the people with just four data points of places and times. About 50% could be identified from just two points.
Olivia Solon demonstrates some of the problems that we face with privacy. This touches on some of the challenges that David Golumbia addresses in his post on personal data. Both authors come to the same conclusion: we are expecting too much of the consumer.
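
To get a feel for what "identified from four data points" means, here is a rough sketch of how that kind of uniqueness could be estimated. It is not de Montjoye's code or dataset: the `unicity` function, the `traces` structure and the toy tower names are all my own assumptions for illustration. The idea is to pick a few random (place, hour) points from one person's trace and check whether anyone else's trace also contains all of them.

```python
import random


def unicity(traces, k, trials=200, seed=0):
    """Estimate how often k random points from a person's trace match only that person.

    traces: dict mapping a person id to a set of (place, hour) tuples.
    This is an illustrative assumption, not the format used in the study.
    """
    rng = random.Random(seed)
    # Only people with at least k recorded points can be sampled from.
    people = [p for p, points in traces.items() if len(points) >= k]
    if not people:
        return 0.0
    unique = 0
    for _ in range(trials):
        person = rng.choice(people)
        # Draw k known points about this person (sorted so sampling is repeatable).
        sample = set(rng.sample(sorted(traces[person]), k))
        # Everyone whose trace contains all of the sampled points.
        matches = [p for p, points in traces.items() if sample <= points]
        if matches == [person]:
            unique += 1
    return unique / trials


# Hypothetical toy data: nearest cell tower and hour of day.
traces = {
    "a": {("tower1", 8), ("tower2", 9), ("tower3", 18)},
    "b": {("tower1", 8), ("tower2", 9), ("tower4", 18)},
    "c": {("tower5", 8), ("tower6", 9), ("tower7", 18)},
}

for k in (1, 2, 3):
    print(k, unicity(traces, k))
```

Even in this tiny example, people who share a commute stop being interchangeable once a third point is known, which is the pattern the research describes at scale.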

via Ian O’Byrne

Bookmarked 18 best practices for working with data in Google Sheets - Ben Collins (Ben Collins)
This article describes 18 best practices for working with data in Google Sheets, including examples and screenshots to illustrate each concept in action.
Ben Collins provides a guide for working with data in Google Sheets. Some of the useful steps that stood out were documenting the steps you take, adding an index column for sorting and referencing, creating named ranges for your datasets and telling the story of one row to check the data. Another tip I picked up from Jay Atwood has been to import data, if moving from Excel to Sheets, rather than simply copying and pasting.
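
Two of those tips, the index column and the story of one row, carry over to any tabular data. Here is a minimal sketch in Python rather than Sheets; the `add_index` and `story_of_row` helpers and the sample rows are made up for illustration and do not come from the article.

```python
def add_index(rows):
    """Return copies of the rows with a 1-based 'index' column added first."""
    return [{"index": i, **row} for i, row in enumerate(rows, start=1)]


def story_of_row(row):
    """Spell out one row as a sentence-like string to sanity check the data."""
    return "; ".join(f"{key}: {value}" for key, value in row.items())


# Hypothetical sample data.
rows = [
    {"name": "Mia", "score": 72},
    {"name": "Ari", "score": 91},
    {"name": "Zed", "score": 85},
]

indexed = add_index(rows)

# Sort by score, then use the index column to get back to the original order.
by_score = sorted(indexed, key=lambda row: row["score"], reverse=True)
restored = sorted(by_score, key=lambda row: row["index"])

print(story_of_row(by_score[0]))
```

The index column is what makes the sort reversible: without it there is no record of the order the data arrived in.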
Bookmarked We Don’t Know What ‘Personal Data’ Means - uncomputing (uncomputing)
It’s Not Just What We Tell Them. It’s What They Infer. Many of us seem to think that “personal data” is a straightforward concept.  In discussions about Facebook, Cambridge Analytica, GDPR, and the rest of the data-drenched world we live in now, we proceed from the assumption that personal data means something like “data about myself that I provide to a
David Golumbia provides a list of six types of personal data: provided, observed, derived, inferred, anonymised and aggregate. In unpacking the work of Virginia Eubanks and Cathy O’Neil, he warns against focusing only on what we share when we do not really know who is collecting such information.

Yes, we should be very concerned about putting direct personal data out onto social media. Obviously, putting “Democrat” or even “#Resist” in your public Twitter profile tells anyone who asks what party we are in. We should be asking hard questions about whether it is wise to allow even that minimal kind of declaration in public and whether it is wise to allow it to be stored in any form, and by whom. But perhaps even more seriously, and much less obviously, we need to be asking who is allowed to process and store information like that, regardless of where they got it from, even if they did not get it directly from us.

Golumbia says that governments need to get on top of issues associated with data, because the public is struggling.

Replied to Analytical moves by Ian Guest (Marginal Notes)
Although these details are not attempting to satisfy the more positivist-leaning criterion of enabling replicability, they should nevertheless make it clear that I conducted a ‘rigorous’ study. Is there enough here to convince you of that? If not, what else would you like to see?
Once we trade in reproducibility I imagine that all we have is a case of ‘good-enough’ analysis? The problem I have is that if we were to approach this question from Fish’s interpretive communities then being convinced is not the challenge? If I am a positivist, will I ever be satisfied?
Bookmarked Unfollowing Everybody by Anil Dash (Anil Dash)
Keeping in mind that spirit of doing necessary maintenance, I recently did something I'd thought about doing for years: I unfollowed everyone on Twitter.
Anil Dash discusses the steps he took to unfollow everyone on Twitter and start again. There are some interesting ideas in this piece, such as archiving a list of people you are following. Might be one to come back to.
Replied to Too Long; Didn’t Read #158 (W. Ian O'Byrne)
Each week when I write this newsletter, it is always interesting to me to see stories that suggest that social media is downright bad for us. For people that are hooked, it is like a drug. For people that don’t use social media and networks, they don’t understand why people care, or use these tools.
Ian, the irony of the JSON change is that I downloaded my content and cleaned it up months ago. I am really hoping that someone develops an easy-to-use parser one day so that I can store all my statuses and check-ins on my site, even if they are private.
Liked Monetizing Your Device Location Data With LotaData (apievangelist.com)
In a world where our data is the new oil, I’m interested in any way that I can help level the playing field, and seeing how we can put more control back into the device owners hands. Allowing mobile phone, wearable, drone, automobile, and other connected device owners to aggregate and monetize their own data in a personal or professional capacity. Helping us all better understand the value of our own bits, and potentially generating some extra cash from its existence. I don’t think any of us are going to get rich doing this, but if we can put a little cash back in our own pockets, and limit the exploitation of our bits by other companies and device manufacturers, it might change the game to be a little more in our favor.
Replied to Too Long; Didn’t Read #157 (W. Ian O'Byrne)

Some computer science academics at Northeastern University ran an experiment testing over 17,000 of the most popular apps on Android to see if they’re collecting information and sending it back somewhere else. They found no evidence of an app unexpectedly activating the microphone or sending audio out when not prompted to do so. Like good scientists, they refuse to say that their study definitively proves that your phone isn’t secretly listening to you, but they didn’t find a single instance of it happening. Instead, they discovered a different disturbing practice: apps recording a phone’s screen and sending that information out to third parties.

I thought that it was just me with the strange feeling that I am being listened to. It is really disconcerting that they are instead capturing images. This is a worry on multiple levels: not only has any semblance of privacy seemingly left the building, but there is also the waste associated with collecting such data.

I am reminded of the discussion of a big data tax mentioned in Sabeel Rahman’s post The New Octopus. James Bridle also talks about the ‘Age of the Image’ in the New Dark Age:

As digital culture becomes faster, higher bandwidth, and more image-based, it also becomes more costly and destructive – both literally and figuratively. It requires more input and energy, and affirms the supremacy of the image – the visual representation of data – as the representation of the world.

Replied to Making Sense of Blog Post Content Data? My Own Spanner Found in the Bottom of the Toolbox (CogDogBlog)
For the obviously obvious statement, WordPress is built on a database. The question is, besides data like visitor counts, what can you infer from the data in the posts and metadata itself?
I am always fascinated by what data we are collecting, whether we are conscious of it or not.

This reminds me of Tom Woodward’s work with Sheets and data. I wonder if this will work for Post Kinds too? Off to dig around in the code.

The good news for both advertising and publishing is that neither needs adtech. What’s more, people can signal what they want out of the sites they visit—and from the whole marketplace. In fact the Internet itself was designed for exactly that. The GDPR just made the market a lot more willing to start hearing clues from customers that have been laying in plain sight for almost twenty years.

Doc Searls

A reflection on looking at cars and sharing data.