Abstract
This paper concerns top-down approaches in machine ethics. It is divided into three main parts. First, I briefly describe top-down design approaches, making clear what they are committed to and what they involve when it comes to training an AI to behave ethically. Second, I formulate two underappreciated motivations for endorsing them, one relating to the predictability of machine behavior and the other to the scrutability of machine decision-making. Finally, I present three major worries about such approaches and attempt to show that advocates of top-down approaches have some plausible avenues of response. I focus most of my attention on what I call the ‘technical manual objection’ to top-down approaches, inspired by the work of Annas (2004). In short, the idea is that top-down approaches treat ethical decision-making as merely a matter of following ethical instructions, in the same way that one might follow the instructions contained in a technical manual (e.g., a computer manual), and this invites sensible skepticism about the ethical wisdom of machines that have been trained on those approaches. I respond by claiming that the objection is successful only if it is understood as targeting machines that have certain kinds of goals, and that it should not compel us to abandon top-down approaches altogether. Such approaches could still be reasonably employed to design ethical AI that operates in contexts where ethical questions have fairly noncontroversial answers. In fact, I argue that we should prefer top-down approaches in those types of context, owing to the advantages I claim for them.