Always Already Data

UPDATE: John and I are no longer working on this book together. It turns out it is a huge challenge to merge voices into one story, even though we both deeply care about the subject.


I’m thrilled to be able to share this with the world, now that it’s real (because it’s on the internet).

After I wrap up my thesis here at the Oxford Internet Institute I’m moving on to the next big project: the book. I’m joining John Battelle in coauthoring if/then, an archaeology of the future of data. We will explain how today’s technological artifacts will inherently change society if widely adopted by the next generation. We’re using a set of artifacts, each in turn acting as a lens for asking the hard, existential questions about what we want our society to look like as it becomes data.

Here’s John talking about the book at LeWeb this week (starts at 14:00):

In some ways, my Oxford research has been a micro-lens for the larger story, and I’ve always intended it as such. In looking closely at how the Quantified Self community talks about data, I’ve discovered how the group’s conferences and Meetups serve as a space for asking some of these hard questions that arise as they engage with data in novel ways. They are asking what it means to filter our health through one company’s definition of a step, fuel, or fitness points. They are asking what new skills medical professionals need to acquire in order to engage with rich individual patient data for proactive and preventative healthcare. They are experimenting with how to say “don’t life-log me” when they encounter someone wearing Memoto or Google Glass. And, more philosophically, they are asking what it means to see ourselves objectively through data. These examples only scratch the surface, but follow on the kinds of questions we’ll be asking in the book.

John is the ideal collaborator in thinking through these hard questions. It turns out that we have been working in parallel for a while, drawing inspiration from many of the same books like Super Sad True Love Story and The Information. But we’re also bringing together complementary experiences in advertising and academia. I believe that the only way to tackle these hard questions is to blend varied interests and perspectives in a balanced and nuanced way.


It was so energizing to be able to share this over the past few days with people I met at LeWeb and OpenCo, but I noticed a pattern in the follow-up questions which all seemed to ask some form of: “What got you interested in data in the first place?” As it turns out, it’s a long story. In some form or another I’ve been thinking about the nature of information and digital data for a very, very long time. What follows is a kind of reverse intellectual history of my love affair with data.


Most recently, what motivated me to study at the OII was the fact that personal data keeps the internet running. Whether it’s browser cookies or clickstreams or networks, all that data is what drives the economy of the internet. And I also wanted to work with my advisor Viktor Mayer-Schönberger, who has written about both the risks and the potential of data in Delete and in Big Data. Drawing from law, economics, and ethnographic methods, I have been working out a more comprehensive framework for talking about our relationship with personal data, one that’s more positive and useful for understanding how data works in our everyday lives. The Quantified Self community has proved to be such an interesting case because its members are interested in using any data that’s about them, regardless of the circumstances of its collection. “If the data is about me, shouldn’t I be able to learn something from it?” They are hyper-aware of the fact that, between the internet of things and transactional data out there in the world (like Tesco and Tube databases), data exhaust is everywhere. Rather than reacting against this as an intrusion, they want to make personal use of that data to know themselves better and ultimately make better decisions and support better behaviors. [Stay tuned for more on this - my thesis is due August 1!]

I realized it was time to go back to school after I first posed these questions at the SXSW panel I moderated in 2011, Paying with Data: How Free Services Are Not Free. And I later wrote about this when Facebook asked me to categorize my relationship to my fiancé (now husband) and when Google explicitly started to see me as a holistic profile across all its products. While the media have raised awareness of personal data privacy concerns in recent years, the coverage often smacks of sensationalism, which does little to help people make practical decisions about data in their everyday lives.

Before SXSW, I had been thinking about data management from the point of view of Chief Information Officers in my time as an analyst at The Research Board, a Gartner subsidiary. I spent a lot of time talking with Fortune 500 CIOs and enterprise technology industry leaders about their Business Intelligence efforts. In just a few years since, that’s evolved into Big Data.

I cut my teeth on enterprise technology as a green but inquisitive intern in the Enterprise IT department at Liberty Mutual. I can’t tell you how much time I spent that summer poring over Wikipedia pages on computing acronyms. With next to no experience in IT, aside from hanging out on AOL a lot as a kid, I was fortunate to work with a great pair of mentors who believed in employing interns who asked the right questions. I distinctly remember walking through storage architecture diagrams and asking pointedly, “What is metadata?” It was the right question then, and it’s precisely the question on everyone’s mind when it comes to wrapping our heads around PRISM.

Even as an undergrad at Harvard I was thinking about the analog and digital nature of media, with a focus on the technical affordances and formal characteristics that change when media becomes data in digital format. I was absorbed in the philosophy of film (the literal celluloid kind) as cinema becomes digital, reading galley chapters from my advisor D.N. Rodowick’s The Virtual Life of Film, and I wrote my thesis on the influence of hypertext on the formal textual experiments in Mark Z. Danielewski’s House of Leaves. I wrote papers about how the iPhone was going to fundamentally change our relationship to the screen as a convergence device in our pockets, just months before it debuted in June of 2007. I suppose it’s no surprise how much my focus on these media objects runs parallel to the artifacts-of-the-future structure for the book.

But my interest in these binary, digital concepts goes back further still. I can follow the thread all the way to my seventh grade plan to become a geneticist when I grew up. I loved the elegant simplicity of Punnett squares: four simple bits of information, cross-tabulated to predict the expression of a recessive or dominant genotype. And, of course, the ATGC of DNA has been described as the body’s data and information processing system.

Which brings me right back to the book: talking about what happens when the lines between the physical world of atoms and the digital realm of bits become blurred. If we make the claim that DNA is data in the most basic ontological sense, our world has always been made up of data. What’s new (and what’s at stake) is the role we play as humans in creating the technologies that in turn create new data, and then in shaping what our world-as-data should look like. In order to have those conversations, we first need to be more aware of the dawn of the Data Age, and then we can start asking the right questions.

UPDATE: John wrote about our collaboration here.