The Shutdown Problem: Incomplete Preferences as a Solution

Abstract

I explain and motivate the shutdown problem: the problem of creating artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I then propose a solution: train agents to have incomplete preferences. Specifically, I propose that we train agents to lack a preference between every pair of different-length trajectories. I suggest a way to train such agents using reinforcement learning: we give the agent lower reward for repeatedly choosing same-length trajectories.
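
The reward scheme mentioned at the end of the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the paper's actual training setup: the abstract says only that reward is lower for repeatedly choosing same-length trajectories. The geometric discount, the function name `discounted_reward`, the `discount` parameter, and the per-length counter are all hypothetical choices made here for concreteness.

```python
from collections import Counter


def discounted_reward(base_reward, length, counts, discount=0.5):
    """Return base_reward scaled down by how often `length` was chosen before.

    Assumption (not from the source): the penalty is geometric, i.e. the
    reward is multiplied by discount ** (number of earlier choices of this
    trajectory length). `counts` is updated in place.
    """
    reward = base_reward * discount ** counts[length]
    counts[length] += 1
    return reward


# Hypothetical sequence of choices: the agent picks a short trajectory twice,
# then a long one. The repeated short choice earns less than the first.
counts = Counter()
rewards = [discounted_reward(1.0, length, counts)
           for length in ("short", "short", "long")]
```

Under this assumed scheme, an agent that keeps choosing the same trajectory length sees its reward shrink, while switching lengths restores the full reward, so no single length is consistently favored.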

Links

PhilArchive

Similar books and articles

Shutdown-seeking AI. Simon Goldstein & Pamela Robinson - forthcoming - Philosophical Studies:1-13.
Decision theory for agents with incomplete preferences. Adam Bales, Daniel Cohen & Toby Handfield - 2014 - Australasian Journal of Philosophy 92 (3):453-70.
Of marbles and matchsticks. Harvey Lederman - forthcoming - In Tamar Szabó Gendler, John Hawthorne, Julianne Chung & Alex Worsnip (eds.), Oxford Studies in Epistemology, Vol. 8. Oxford University Press.
Opaque Sweetening and Transitivity. Ryan Doody - 2019 - Australasian Journal of Philosophy 97 (3):559-571.
Which symbol grounding problem should we try to solve? Vincent C. Müller - 2015 - Journal of Experimental & Theoretical Artificial Intelligence 27 (1):73-78.

Analytics

Added to PP
2024-03-05

Downloads
748 (#32,615)

6 months
227 (#12,282)

Author's Profile

Elliott Thornley
University of Oxford
