Synthese 205 (1):1-23 (2025)
Abstract
Recent years have seen incredible advances in our abilities to gather and store data, as well as in the computational power and methods (most prominently in machine learning) to do things with those data. These advances have given rise to the emerging field of “data science.” Because of its immense power for providing practically useful information about the world, data science is a field of increasing importance. This paper argues that a core part of what data scientists are doing should be understood as conceptual engineering. At all stages of the data science process, data scientists need to deliberate about, evaluate, and make classificatory choices in a variety of ways, including as part of training and evaluating machine learning models. Viewing these activities as conceptual engineering offers a new way to think about them, one that helps to clarify what is at stake in them, what sorts of considerations are relevant, and how to think systematically about the choices faced. Given the increasing importance of data science, if conceptual engineering is relevant for activities in data science, this also highlights the relevance and impact of conceptual engineering as a method. Furthermore, the paper suggests that machine learning opens distinctive and novel ways in which data scientists engage in conceptual engineering.