Abstract
What is the relationship between ideas of sameness and difference in machine learning and AI? Algorithms are often understood to participate in the continual displacement of the different and heterogeneous in society in favour of sameness, of that which is socio-politically similar and proximate. In contrast to this prevalent emphasis on sameness, however, this paper argues that there is a nascent heterophilic logic underpinning the intersection of synthetic data and machine learning, a move towards actively generating differences and heterogeneous data attributes to train, fine-tune, and optimize algorithms. Yet these synthetic attribute data are nonetheless always machine compatible, devoid of their socio-cultural dynamics and tensions. As such, through a critical examination of three core dimensions of this emergent politics of difference of synthetic data – disentanglement, compositionality, and normativity – the paper argues that this politics of difference has the potential to ultimately undercut a politics of intervention that seeks to foreground the systemic unfairness and violence of machine learning.