Off-Switching Not Guaranteed

Philosophical Studies:1-13 (forthcoming)

Abstract

Hadfield-Menell et al. (2017) propose the Off-Switch Game, a model of human-AI cooperation in which AI agents always defer to humans because they are uncertain about our preferences. I explain two reasons why AI agents might not defer. First, AI agents might not value learning. Second, even if AI agents value learning, they might not be certain to learn our actual preferences.
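The deference result in the Off-Switch Game can be illustrated with a small numerical sketch. In the standard setup, the robot is uncertain about the human's utility u for its proposed action and can act now (getting u), switch itself off (getting 0), or defer to the human, who is assumed to switch the robot off exactly when u < 0. The prior below (a standard normal over u) is an illustrative assumption, not part of the paper; the point is only that deferring weakly dominates when the human is modeled as perfectly rational.

```python
import random

random.seed(0)

def expected_values(prior_samples):
    """Expected utility of each robot option under a prior over the
    human's utility u, assuming a rational human who switches the
    robot off exactly when u < 0."""
    n = len(prior_samples)
    act = sum(prior_samples) / n                            # E[u]: act without asking
    off = 0.0                                               # switch self off
    defer = sum(max(u, 0.0) for u in prior_samples) / n     # E[max(u, 0)]: let human decide
    return act, off, defer

# Illustrative prior: robot thinks u ~ Normal(0, 1).
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
act, off, defer = expected_values(samples)

# Deferring weakly dominates both alternatives under these assumptions.
assert defer >= act and defer >= off
```

The paper's point is that this dominance argument depends on the idealizing assumptions baked in above: that the robot values the information gained by deferring, and that deferring actually reveals the human's true preferences.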

Similar books and articles

Domesticating Artificial Intelligence. Luise Müller - 2022 - Moral Philosophy and Politics 9 (2):219-237.

Author's Profile

Sven Neth
University of Pittsburgh
