
Who Does Data Serve?
Introduction
Narrator: Imagine a community mapping the deaths of Black children on their city streets, creating a dataset that official records ignore. This isn't just an act of remembrance; it's an act of defiance. It's a powerful statement that some data is actively suppressed, that some lives are not counted, and that the seemingly neutral world of data science is, in fact, a battleground of power. This grassroots effort to make the invisible visible lies at the heart of a groundbreaking book. In Data Feminism, authors Catherine D’Ignazio and Lauren F. Klein challenge the core belief that data is objective, revealing how it can be wielded as a tool of oppression. They argue that to build a more just world, we first need a new way of seeing, analyzing, and using data—one informed by the principles of intersectional feminism.
Data Is Not Neutral; It Is a Product of Power
Key Insight 1
Narrator: The foundational argument of Data Feminism is that data is never raw or objective. Instead, every dataset has a "biography." It is collected, cleaned, and compiled by people, and every decision in that process is shaped by human biases and existing power structures. To explain how these power systems operate, the authors draw on sociologist Patricia Hill Collins's concept of the "matrix of domination." This matrix can suppress action on data that already exists or, more insidiously, prevent certain data from ever being collected, creating what have come to be known as "missing datasets."
A stark example can be found in public health. For years, the true scope of pregnancy and childbirth complications, especially among marginalized communities, went unrecognized because the data was incomplete. Power structures within medicine and society determined what was worth counting and what was not. The experiences of women, particularly women of color, were often dismissed or under-recorded, leading to a critical data gap. This isn't an accident; it's a symptom of a system that devalues certain lives and experiences. To counter this, movements like data activism and citizen science have emerged to fill these gaps, demonstrating that the very act of collecting data can be a political one.
Dominant Data Practices Can Reinforce Oppression
Key Insight 2
Narrator: When data scientists ignore the social context of their data, they risk perpetuating systemic harm. The authors highlight how a false sense of objectivity can be dangerous, citing scholar Ruha Benjamin's concept of "the New Jim Code": seemingly neutral software and algorithms that, in practice, reproduce discrimination and exert control over the lives of Black people and other people of color.
The community mapping project that documented the deaths of Black children on city streets serves as a powerful illustration. Official city data might have recorded these as isolated incidents or failed to capture racial patterns, creating a "deficit narrative" that blames victims or their communities. By creating their own data, the community challenged this official narrative. They exposed the structural oppression and mechanisms of privilege that made their streets unsafe. This act of counter-data collection revealed a truth that traditional, "objective" data science had missed, showing that ignoring race and power in data analysis isn't a neutral choice—it's a choice that upholds the status quo.
Emotion and Embodiment Are Valid Forms of Knowledge
Key Insight 3
Narrator: Traditional data visualization often prizes a minimalist, "emotion-free" style, believing it to be more objective. D’Ignazio and Klein argue this is a "god trick of seeing everything from nowhere," a false claim to a neutral, all-seeing perspective. In reality, this minimalist approach simply hides the editorial choices of the designer and strips the data of its human context.
To challenge this, the authors propose "data visceralization"—a method that engages our senses and emotions. They offer a compelling juxtaposition. Imagine seeing a standard bar chart showing the number of people killed in shootings. The bars go up, you register the numbers, and you move on. Now, imagine a different graphic. This one doesn't show frequencies; it visualizes the "number of years stolen" from each person whose life was cut short. Suddenly, the abstract statistic becomes a visceral representation of lost potential, of stolen futures. This approach doesn't manipulate the viewer; it provides a more complete, more human truth by valuing the knowledge that comes from "living, feeling bodies in the world."
Classification Is a Double-Edged Sword
Key Insight 4
Narrator: The book explores how the act of counting and classifying people is never simple. On one hand, being counted can have positive material consequences. For example, census results are used to allocate public funding for schools, hospitals, and infrastructure. In this sense, being visible in the data can bring tangible benefits to a community.
On the other hand, for vulnerable groups, this same visibility can be a weapon used against them. The authors point to the experience of undocumented people. For them, being counted and identified by the state doesn't lead to benefits; it can lead to expulsion. The "regime of visibility" becomes a direct threat. This reveals the double-edged nature of classification. The authors don't argue against using quantitative methods, but they warn against "naturalizing" these categories as if they are simply "the way things are." Every classification system is a human creation, and it must be critically examined for who it empowers and who it endangers.
Context Is Everything, and "Clean Data" Is a Myth
Key Insight 5
Narrator: Data science has a fixation on "clean data," but the authors argue that this cleanliness often hides diversity and erases crucial context. The "messiness" of data can be a rich source of information, revealing the circumstances of its collection. This leads to a crucial shift in perspective: from analyzing "datasets" to understanding "data settings." A data setting includes not just the numbers, but the entire human and technical process that created them.
This is why the Open Data movement, for all its good intentions, often falls short. It succeeds in "opening up" data, making it publicly available, but it frequently fails to provide the context needed to understand it. Data arriving on our "computational doorstep context-free" is not just less useful; it can be dangerous. To solve this, the authors propose the creation of "data user guides"—narrative portraits of a dataset that explain its history, its limitations, and the ethical considerations for its use. This ensures that analysts are not "strangers in the dataset," who risk committing "epistemic violence" by interpreting information without understanding its world.
The Labor of Data Science Must Be Made Visible
Key Insight 6
Narrator: The popular image of a data scientist is often a lone genius, a brilliant coder who single-handedly extracts truth from a sea of numbers. Data Feminism dismantles this myth. The final principle emphasizes the "many hands" involved in any data project. This includes the people who design the study, the enumerators who collect the information, the community members who provide local knowledge, the workers who clean and label the data, and the designers who visualize it.
However, the labor of these many hands is often downplayed or made invisible, a phenomenon driven by old stereotypes related to gender, race, and class. Work that is seen as less technical, such as data cleaning or community outreach, is often devalued, even though it is absolutely essential for the project's success. By making this labor visible, data feminism calls for a more equitable and honest recognition of who truly creates data knowledge, challenging the hierarchies that exist within the field itself.
Conclusion
Narrator: The single most important takeaway from Data Feminism is that data is not a mirror of reality; it is a powerful force that helps shape it. The choices we make about what to count, how to classify, and whose stories to tell have profound real-world consequences. By treating data as a neutral, objective resource, we risk reinforcing the very inequalities we hope to undo.
The book leaves us with a transformative challenge: to stop asking "What does the data say?" and start asking "Who does the data serve?" It calls on us to become critical consumers and creators of data, to question its origins, to embrace its context, and to wield it not as a tool for abstract analysis, but as a tool for justice, empathy, and co-liberation. The ultimate question is not whether we will use data to shape the future, but whether we will do so with intention, care, and a commitment to challenging power.