Understanding Personal Data

[This post also appears on #wethedata.]

It’s been a wild year. From living in Chongqing to getting married in the US and moving to the UK, we’ve been living an itinerant but momentous life. We’ve arrived in Oxford, gotten set up with broadband, and now we’re hitting the ground running and getting back to work.

So, what am I doing here at Oxford? I’m reading (as the Brits say) for my MSc in the Social Science of the Internet at the Oxford Internet Institute [and yes, at the end of this, I will be a Master of the Internet, and I ask that you please refer to me as such]. I’ve been working in tech and the internet in some capacity for the last few years, from enterprise IT to online video, so this feels like a pretty natural next step. The main difference now is one of focus: I’m studying personal data. I’ve been thinking about personal data for a while now, and I wanted a chance to study it more rigorously, and with the institutional backing to do so.

I’m also here to learn. I’m not a social scientist by training. I fashioned myself a comparative media scholar by way of English and Film Studies in undergrad, and emerged as a business analyst by teaching myself how to read financial statements. Now I’m here to explore how I might be an anthropologist, or an economist, or a legal scholar of how society understands (or perhaps doesn’t) personal data.

But I’m discovering that in getting trained in broad social science methods, I’m also here to meta-study the social scientists themselves. As much as Facebook and other platforms are using personal data to improve products and build business models, the social scientists are just as eager, if not more so, to make use of the wealth of social data that’s being collected. While the latter may have IRB sanctions and methodological ethics training, the former doesn’t necessarily, so I’m interested to explore some of the similarities and differences across these two sets of data users. The promise of Big Data looms large right now for both, but those messages haven’t necessarily been translated to consumers except in fear-mongering, sensationalized ways (see the Target pregnancy story). Bridging that gap is hopefully where I fit in.

One of my biggest motivations for studying personal data as opposed to privacy is that I believe we as a society don’t have practical frameworks for understanding the day-to-day decisions we make about our relationships to firms that use our data in exchange for their “free” services. It’s not just about the potential for breaches, security, or protecting ourselves from identity theft. We’re in an important moment right now: we are still in the process of exploring and delineating the new norms and expectations around data use, information flows and contexts (like many, I’m heavily influenced by Nissenbaum). I want to help shape and describe those norms and expectations as they develop. My project is one of user awareness and savviness. It’s not one of conservatism; it’s one of conscientiousness.

And I’m in good company in thinking about how we think about data. Projects like #wethedata are trying to raise awareness and show the potential for different relationships to data. The World Economic Forum has been describing data as a new asset class. These are all steps in the right direction. But there are also metaphors that liken data to oil, to be mined and refined by large corporations (i.e. Big Data). This metaphor disempowers and dehumanizes the individuals in the equation. People aren’t natural resources. I want to contribute to this dialog, and figure out better metaphors for talking about our data that could ultimately shape policy, consumer behavior and decisions, and even have an impact on the very economic model of the internet (“free” services and content in exchange for data). There won’t be market demand for changing practices around personal data if users don’t have a concrete grasp of how data is used today and could potentially be used tomorrow.

We are bringing more and more of our lives online, and in doing so, we are generating more data, leaving more crumbs, but we haven’t necessarily seen the benefits of our own quantification on a personal level. The existence of this data isn’t necessarily good or bad, but how we use it, who can use it, and where it goes in the future are all up for discussion right now, and we need savvier users to have a say in that discussion.