Abstract
Archival data processing consists of cleaning and formatting data between the moment a dataset is deposited and its publication on the archive’s website. In this article, I approach data processing by combining scholarship on invisible labor in knowledge infrastructures with a Marxian framework and show the relevance of considering data processing as factory labor. Using this perspective to analyze ethnographic data collected during six months of participatory observation at a U.S. data archive, I build a taxonomy of the forms of alienation that data processing produces and of the types of resistance that processors develop, across four categories: routine, speed, skill, and meaning. This synthetic approach demonstrates, first, that data processing reproduces typical forms of factory workers’ alienation: processors are asked to follow a strict, standardized pipeline at a fast pace, without acquiring substantive skills or having meaningful involvement in their work. It reveals, second, how data processors resist the alienating nature of this workflow by developing multiple tactics along the same four categories. Seen through this dual lens, data processors are therefore not only invisible workers, but also factory workers who follow and subvert a workflow organized as an assembly line. I conclude by proposing a four-step framework to better value the social contribution of data workers beyond the archive.