Big Data with Human Characteristics

Due to sudden illness this past week, I am not able to travel to Beijing to attend the Tencent Internet and Society Institute Summit on Big Data on July 25, 2014. I'm grateful for the invitation from the Oxford Internet Institute, Renmin University, and Tencent, and I am sorry to miss what promises to be a fascinating event. I recorded this brief video to share my work and thoughts on "Big Data with Human Characteristics" and the importance of maintaining the subjectivity of individuals when we look at humanity through an otherwise "objective" big data lens. I'm including the full text of the script below. And huge thanks to my husband, Nick, for helping me with the Chinese. It's been a while!


I am thankful still to be able to join you by video today, and I wanted to share some of my work on big data.

Big data is becoming a dominant paradigm for making sense of the world around us. It promises novel insights and knowledge at scale. But the power dynamics of big data privilege those with the consolidated information and with the tools to analyze and interpret at that scale. This power has the potential to go unchecked and unquestioned because of its reliance on the authority and objectivity associated with "data" and because of the black box processes that obscure big data practices.

Of course, we came together today to discuss the potential power of big data because we are interested in what it might tell us about humanity. To be sure, there is great potential. But there is also something about operating at this scale that makes us susceptible to forgetting the individuals that collectively make up big data sets. We risk missing the trees for the forest.

We have to remember that big data is always made up of individuals. It might be our personal purchasing habits, our interest profile, our friends list, the collection of our published thoughts, or perhaps all of the above. On a macro scale, each of those data points allows researchers and firms to categorize populations or segment markets. But it takes work at the micro scale to grasp a contextual view of the individual. Research efforts and funding support must keep this in mind: big data methods can answer some questions, but certainly not all.

My work has focused on the lived experience of data, using qualitative interview methods to understand approaches to thinking about data. I look at data as a medium for personal knowledge creation, interpretation, and meaning making. I have closely studied the Quantified Self, a community in which individuals use mobile applications and wearable sensors to create data about our bodies and behaviors. The technology companies building these tools have interests in aggregate insights, but there is much to be learned from individuals about what our data means to us at a personal scale. 

So what do we need to do to avoid the potential biases and dehumanizing effects of looking at individuals through a big data lens? As those who are developing and supporting big data methods, we need to actively seek out means of preserving the subjectivity of the humans to which this data refers. I urge researchers, designers of internet platforms, and those in the business of data, to keep the humanity of data in mind. Remember to look at a small scale alongside large scale interpretations. This approach is sure to keep us in touch with what big data means across scales of humanity: from the globe, to national populations, to the user base of large internet platforms, to local communities, right down to the individual. 

I realize that Western thinking tends to privilege the position of the individual above the collective. And in turn, Eastern traditions tend to privilege the collective over the individual. But we all share a common interest in humanizing our policies and interventions based on big data. In China this is embodied in 以人为本, as a principle of human-centered policy. I argue that we need Big Data with Human Characteristics. Figuring out what that means across cultures will be hard work, but uncovering and engaging with the commonalities and differences in the way we think about humans in big data will be revealing. This is an important step in what is sure to be a fruitful collaboration, as we work together to grapple with what big data means to us all.


Living with Data: Personal Data Uses of the Quantified Self [MY THESIS!]

After a whirlwind of handing in, packing up, moving out, and arriving and getting set up back in the States, it’s finally time to share the work I have been building over the last year. I received my marks last week, and am excited to share that I have officially graduated with distinction and the Oxford Internet Institute MSc Thesis award for this work. Now I’m pleased to share the finished product with the QS community, my participants, and anyone else who has been interested in following my research.

The document is available as a PDF download and embedded below. And here’s the abstract:

Living with Data: Personal Data Uses of the Quantified Self

Between the internet, social media, sensor-enabled devices, and established industrial transactional systems, we are living in a world with more data about ourselves than ever before. Public discourse has largely focused on the opportunities for firms or the risks to individuals as this data environment expands. These framings do not give individuals enough practical understanding of how data impacts and integrates into their lives. The Quantified Self community is an advanced-user community of people who have begun to explore and experiment with novel uses of personal data. As the Homebrew Computer Club’s hobbyist experimentations paved the way for the personal computing revolution, the Quantified Self community offers a glimpse of what engagement with personal data in our everyday lives might soon look like. Through ethnographically-informed interviews and participant observations, this research explores how self-trackers derive personal meaning from personal data. I present a lifecycle of personal data use: from deciding what to track, through collection, analysis, and future uses. I explain how current barriers to use expose the need for revised policies to support individuals’ personal interest in the use of their data. By analyzing the metaphors individuals use to explain their personal uses of data, I put Quantified Self tracking practices in historical context and illuminate the novel affordances that self-knowledge through data provides. I argue the QS community offers ways of framing and engaging with personal data in our everyday lives that can help society at large begin to understand our roles as data selves in a Big Data world.

I’m proud of this work, but it is not without some compromises. We had only 10,000 words, but I probably could have written 100,000 from all the material that came out of my conversations with the QS community. And time permitting, I would have loved to keep talking with more people. I could have written an entire chapter on the legal and policy implications of the ways that the QS community expects to be able to use their data, but that in itself would have taken 10,000 words. And I could have spent 10,000 more words just telling individual stories, but instead I had to push those detailed profiles to the appendix. A lot of words were spent establishing methods, boxes that needed to be checked to demonstrate my mastery of my chosen social science methods for the program. In the end, I’m most pleased that I was able to ask the questions I found most interesting while working within the bounds of an empirically-focused, interdisciplinary department.

I’m posting this thesis here “as is,” with only a few minor edits post-submission to the exam schools. “As is,” in the sense that this is a work completed in fulfillment of my Master’s program. It’s an artifact of an academic experience with a specific context and purpose. It’s not indicative of the type of writing I aim to do going forward. Thankfully, there’s plenty more to write about in future articles and in the book, and in a more accessible, less jargony way. In the meantime, I’d love to hear your thoughts!

- Master of the Internet

PS. Here’s a webcam shot of me taken using Stan James’ LifeSlice, one of the Quantified Self tools I was playing with throughout my thesis process. Here I’m caught pondering edits in the Cornmarket Starbucks.


And the final days of editing.