Big Data with Human Characteristics

Due to sudden illness this past week, I am not able to travel to Beijing to attend the Tencent Internet and Society Institute Summit on Big Data on July 25, 2014. I'm grateful for the invitation from the Oxford Internet Institute, Renmin University, and Tencent, and I am sorry to miss what promises to be a fascinating event. I recorded this brief video to share my work and thoughts on "Big Data with Human Characteristics" and the importance of maintaining the subjectivity of individuals when we look at humanity through an otherwise "objective" big data lens. I'm including the full text of the script below. And huge thanks to my husband, Nick, for helping me with the Chinese. It's been a while!


I am thankful still to be able to join you by video today, and I wanted to share some of my work on big data.

Big data is becoming a dominant paradigm for making sense of the world around us. It promises novel insights and knowledge at scale. But the power dynamics of big data privilege those with the consolidated information and with the tools to analyze and interpret at that scale. This power has the potential to go unchecked and unquestioned because of its reliance on the authority and objectivity associated with "data" and because of the black box processes that obscure big data practices.

Of course, we came together today to discuss the potential power of big data because we are interested in what it might tell us about humanity. To be sure, there is great potential. But there is also something about operating at this scale that makes us susceptible to forgetting the individuals that collectively make up big data sets. We risk missing the trees for the forest.

We have to remember that big data is always made up of individuals. It might be our personal purchasing habits, our interest profile, our friends list, the collection of our published thoughts, or perhaps all of the above. On a macro scale, each of those data points allows researchers and firms to categorize populations or segment markets. But it takes work at the micro scale to grasp a contextual view of the individual. Research efforts and funding support must keep this in mind: big data methods can answer some questions, but certainly not all.

My work has focused on the lived experience of data, using qualitative interview methods to understand approaches to thinking about data. I look at data as a medium for personal knowledge creation, interpretation, and meaning making. I have closely studied the Quantified Self, a community in which individuals use mobile applications and wearable sensors to create data about our bodies and behaviors. The technology companies building these tools have interests in aggregate insights, but there is much to be learned from individuals about what our data means to us at a personal scale. 

So what do we need to do to avoid the potential biases and dehumanizing effects of looking at individuals through a big data lens? As those who are developing and supporting big data methods, we need to actively seek out means of preserving the subjectivity of the humans to whom this data refers. I urge researchers, designers of internet platforms, and those in the business of data to keep the humanity of data in mind. Remember to look at small-scale alongside large-scale interpretations. This approach is sure to keep us in touch with what big data means across scales of humanity: from the globe, to national populations, to the user base of large internet platforms, to local communities, right down to the individual.

I realize that Western thinking tends to privilege the position of the individual above the collective. And in turn, Eastern traditions tend to privilege the collective over the individual. But we all share a common interest in humanizing our policies and interventions based on big data. In China this is embodied in 以人为本, as a principle of human-centered policy. I argue that we need Big Data with Human Characteristics. Figuring out what that means across cultures will be hard work, but uncovering and engaging with the commonalities and differences in the way we think about humans in big data will be revealing. This is an important step in what is sure to be a fruitful collaboration, as we work together to grapple with what big data means to us all.