Results for 'AI Alignment'

976 found
  1. AI Alignment vs. AI Ethical Treatment: Ten Challenges. Adam Bradley & Bradford Saad - manuscript
    A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications (...)
    1 citation
  2. Values in science and AI alignment research. Leonard Dung - manuscript
    Roughly, empirical AI alignment research (AIA) is an area of AI research which investigates empirically how to design AI systems in line with human goals. This paper examines the role of non-epistemic values in AIA. It argues that: (1) Sciences differ in the degree to which values influence them. (2) AIA is strongly value-laden. (3) This influence of values is managed inappropriately and thus threatens AIA’s epistemic integrity and ethical beneficence. (4) AIA should strive to achieve value transparency, critical (...)
  3. Expanding AI and AI Alignment Discourse: An Opportunity for Greater Epistemic Inclusion. A. E. Williams - manuscript
    The AI and AI alignment communities have been instrumental in addressing existential risks, developing alignment methodologies, and promoting rationalist problem-solving approaches. However, as AI research ventures into increasingly uncertain domains, there is a risk of premature epistemic convergence, where prevailing methodologies influence not only the evaluation of ideas but also determine which ideas are considered within the discourse. This paper examines critical epistemic blind spots in AI alignment research, particularly the lack of predictive frameworks to differentiate problems (...)
  4. Disagreement, AI alignment, and bargaining. Harry R. Lloyd - forthcoming - Philosophical Studies:1-31.
    New AI technologies have the potential to cause unintended harms in diverse domains including warfare, judicial sentencing, biomedicine and governance. One strategy for realising the benefits of AI whilst avoiding its potential dangers is to ensure that new AIs are properly ‘aligned’ with some form of ‘alignment target.’ One danger of this strategy is that – dependent on the alignment target chosen – our AIs might optimise for objectives that reflect the values only of a certain subset of (...)
  5. Beyond Preferences in AI Alignment. Tan Zhi-Xuan, Micah Carroll, Matija Franklin & Hal Ashton - forthcoming - Philosophical Studies:1-51.
    The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction of preferences, and (3) that AI systems should be aligned with the preferences of one or more humans to ensure that they behave safely and in accordance with our values. Whether implicitly followed or explicitly endorsed, these commitments constitute what we term a preferentist approach to AI alignment. In this paper, (...)
  6. Philosophical Investigations into AI Alignment: A Wittgensteinian Framework. José Antonio Pérez-Escobar & Deniz Sarikaya - 2024 - Philosophy and Technology 37 (3):1-25.
    We argue that the later Wittgenstein’s philosophy of language and mathematics, substantially focused on rule-following, is relevant to understand and improve on the Artificial Intelligence (AI) alignment problem: his discussions on the categories that influence alignment between humans can inform about the categories that should be controlled to improve on the alignment problem when creating large data sets to be used by supervised and unsupervised learning algorithms, as well as when introducing hard coded guardrails for AI models. (...)
    3 citations
  7. AI, alignment, and the categorical imperative. Fritz McDonald - 2023 - AI and Ethics 3:337-344.
    Tae Wan Kim, John Hooker, and Thomas Donaldson make an attempt, in recent articles, to solve the alignment problem. As they define the alignment problem, it is the issue of how to give AI systems moral intelligence. They contend that one might program machines with a version of Kantian ethics cast in deontic modal logic. On their view, machines can be aligned with human values if such machines obey principles of universalization and autonomy, as well as a deontic (...)
  8. Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback. Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde & William S. Zwicker - forthcoming - Proceedings of the Forty-First International Conference on Machine Learning.
    Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" (...)
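    A minimal sketch of the kind of aggregation the abstract gestures at (illustrative only; the rule choice, function name, and toy data are assumptions, not the authors' method): a Borda count over rater rankings of candidate model outputs, one classical social-choice rule for turning diverging individual preferences into a single collective ordering.

      # Illustrative Borda-count aggregation of diverging rater preferences.
      # All names and the toy data are hypothetical, not from the paper.
      from collections import defaultdict

      def borda_aggregate(rankings):
          """Each ranking lists output IDs from most to least preferred;
          returns outputs ordered by total Borda score."""
          scores = defaultdict(int)
          for ranking in rankings:
              n = len(ranking)
              for position, output in enumerate(ranking):
                  scores[output] += n - 1 - position  # top place earns n-1 points
          return sorted(scores, key=scores.get, reverse=True)

      # Three raters disagree pairwise, yet a collective ordering still
      # emerges; the compromise answer "hedge" wins overall.
      raters = [
          ["refuse", "hedge", "comply"],
          ["hedge", "refuse", "comply"],
          ["comply", "hedge", "refuse"],
      ]
      print(borda_aggregate(raters))  # ['hedge', 'refuse', 'comply']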
  9. Kantian Fallibilist Ethics for AI alignment. Vadim Chaly - 2024 - Journal of Philosophical Investigations 18 (47):303-318.
    The problem of AI alignment has parallels in Kantian ethics and can benefit from its concepts and arguments. The Kantian framework allows us to better answer the question of what exactly AI is being aligned to, what are the problems of alignment of rational agents in general, and what are the prospects for achieving a state of alignment. Having described the state of discussions about alignment in AI, I will reformulate them in Kantian terms. Thus, the (...)
  10. Reflections on the AI alignment problem. Dan Bruiger - forthcoming - AI and Society:1-10.
    The Alignment Problem in artificial intelligence concerns how to ensure that artificial general intelligence (AGI) conforms to human goals and values and remains under human control. The concept of general intelligence, modelled on human and animal behavior, lacks coherence. The ideal of autonomy inherent in AGI conflicts with the ideal of external control. Truly autonomous agents are necessarily embodied, but embodiment implies more than physical instantiation or sensory input. It means being an autopoietic system (like a natural organism), with (...)
  11. Aesthetic Value and the AI Alignment Problem. Alice C. Helliwell - 2024 - Philosophy and Technology 37 (4):1-21.
    The threat from possible future superintelligent AI has given rise to discussion of the so-called “value alignment problem”. This is the problem of how to ensure artificially intelligent systems align with human values, and thus (hopefully) mitigate risks associated with them. Naturally, AI value alignment is often discussed in relation to morally relevant values, such as the value of human lives or human wellbeing. However, solutions to the value alignment problem target all human values, not only morally (...)
  12. Calibrating machine behavior: a challenge for AI alignment. Erez Firt - 2023 - Ethics and Information Technology 25 (3):1-8.
    When discussing AI alignment, we usually refer to the problem of teaching or training advanced autonomous AI systems to make decisions that are aligned with human values or preferences. Proponents of this approach believe it can be employed as means to stay in control over sophisticated intelligent systems, thus avoiding certain existential risks. We identify three general obstacles on the path to implementation of value alignment: a technological/technical obstacle, a normative obstacle, and a calibration problem. Presupposing, for the (...)
    1 citation
  13. Discovering Our Blind Spots and Cognitive Biases in AI Research and Alignment. A. E. Williams - manuscript
    The challenge of AI alignment is not just a technological issue but fundamentally an epistemic one. AI safety research predominantly relies on empirical validation, often detecting failures only after they manifest. However, certain risks—such as deceptive alignment and goal misspecification—may not be empirically testable until it is too late, necessitating a shift toward leading-indicator logical reasoning. This paper explores how mainstream AI research systematically filters out deep epistemic insight, hindering progress in AI safety. We assess the rarity of (...)
  14. Aligning artificial intelligence with moral intuitions: an intuitionist approach to the alignment problem. Dario Cecchini, Michael Pflanzer & Veljko Dubljevic - 2024 - AI and Ethics:1-11.
    As artificial intelligence (AI) continues to advance, one key challenge is ensuring that AI aligns with certain values. However, in the current diverse and democratic society, reaching a normative consensus is complex. This paper delves into the methodological aspect of how AI ethicists can effectively determine which values AI should uphold. After reviewing the most influential methodologies, we detail an intuitionist research agenda that offers guidelines for aligning AI applications with a limited set of reliable moral intuitions, each underlying a (...)
    1 citation
  15. A Note on “Philosophical Investigations into AI Alignment: A Wittgensteinean Framework” by J.A. Pérez-Escobar and D. Sarikaya. [REVIEW] Sorin Bangu - 2024 - Philosophy and Technology 37 (3):1-5.
  16. Democratizing value alignment: from authoritarian to democratic AI ethics. Linus Ta-Lun Huang, Gleb Papyshev & James K. Wong - 2024 - AI and Ethics.
    Value alignment is essential for ensuring that AI systems act in ways that are consistent with human values. Existing approaches, such as reinforcement learning with human feedback and constitutional AI, however, exhibit power asymmetries and lack transparency. These “authoritarian” approaches fail to adequately accommodate a broad array of human opinions, raising concerns about whose values are being prioritized. In response, we introduce the Dynamic Value Alignment approach, theoretically grounded in the principles of parallel constraint satisfaction, which models moral (...)
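    The abstract invokes parallel constraint satisfaction without detail, so here is a generic sketch of how that family of models settles (an assumption about the model family, not the authors' own formulation): value nodes linked by excitatory and inhibitory weights relax iteratively toward a mutually consistent activation pattern.

      # Generic parallel-constraint-satisfaction relaxation; the weights and
      # the three-value example are invented purely for illustration.
      import numpy as np

      def relax(weights, act, steps=50, rate=0.1):
          """weights[i, j] > 0 links compatible values, < 0 conflicting ones;
          activations settle toward a mutually consistent assignment."""
          for _ in range(steps):
              net = weights @ act                  # pooled support per value node
              act = np.clip(act + rate * net, 0.0, 1.0)
          return act

      # Values 0 and 1 reinforce each other; value 2 conflicts with both.
      W = np.array([[ 0.0,  0.5, -0.6],
                    [ 0.5,  0.0, -0.6],
                    [-0.6, -0.6,  0.0]])
      print(relax(W, np.array([0.5, 0.5, 0.5])))  # settles near [1.0, 1.0, 0.0]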
  17. Applying AI for social good: Aligning academic journal ratings with the United Nations Sustainable Development Goals (SDGs). David Steingard, Marcello Balduccini & Akanksha Sinha - 2023 - AI and Society 38 (2):613-629.
    This paper offers three contributions to the burgeoning movements of AI for Social Good (AI4SG) and AI and the United Nations Sustainable Development Goals (SDGs). First, we introduce the SDG-Intense Evaluation framework (SDGIE) that aims to situate variegated automated/AI models in a larger ecosystem of computational approaches to advance the SDGs. To foster knowledge collaboration for solving complex social and environmental problems encompassed by the SDGs, the SDGIE framework details a benchmark structure of data-algorithm-output to effectively standardize AI approaches to (...)
  18. Aligning artificial intelligence with human values: reflections from a phenomenological perspective. Shengnan Han, Eugene Kelly, Shahrokh Nikou & Eric-Oluf Svee - 2022 - AI and Society 37 (4):1383-1395.
    Artificial Intelligence (AI) must be directed at humane ends. The development of AI has produced great uncertainties about ensuring AI alignment with human values (AI value alignment) through AI operations from design to use. To address this problem, we adopt the phenomenological theories of material values and technological mediation as a beginning step. In this paper, we first discuss AI value alignment as treated in the relevant AI studies. Second, we briefly present what are (...)
    5 citations
  19. Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
    Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to (...)
  20. AI in the noosphere: an alignment of scientific and wisdom traditions. Stephen D. Edwards - 2021 - AI and Society 36 (1):397-399.
  21. Artificial Intelligence, Values, and Alignment. Iason Gabriel - 2020 - Minds and Machines 30 (3):411-437.
    This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, (...)
    72 citations
  22. An explanation space to align user studies with the technical development of Explainable AI. Garrick Cabour, Andrés Morales-Forero, Élise Ledoux & Samuel Bassetto - 2023 - AI and Society 38 (2):869-887.
    Providing meaningful and actionable explanations for end-users is a situated problem requiring the intersection of multiple disciplines to address social, operational, and technical challenges. However, the explainable artificial intelligence community has not commonly adopted or created tangible design tools that allow interdisciplinary work to develop reliable AI-powered solutions. This paper proposes a formative architecture that defines the explanation space from a user-inspired perspective. The architecture comprises five intertwined components to outline explanation requirements for a task: (1) the end-users’ mental models, (...)
    1 citation
  23. ‘Interpretability’ and ‘Alignment’ are Fool’s Errands: A Proof that Controlling Misaligned Large Language Models is the Best Anyone Can Hope For. Marcus Arvan - forthcoming - AI and Society.
    This paper uses famous problems from philosophy of science and philosophical psychology—underdetermination of theory by evidence, Nelson Goodman’s new riddle of induction, theory-ladenness of observation, and “Kripkenstein’s” rule-following paradox—to show that it is empirically impossible to reliably interpret which functions a large language model (LLM) AI has learned, and thus, that reliably aligning LLM behavior with human values is provably impossible. Sections 2 and 3 show that because of how complex LLMs are, researchers must interpret their learned functions largely in (...)
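    The rule-following worry the abstract cites is easy to make concrete (a toy of my own construction, not from the paper): two different functions can agree on every observed case and still diverge beyond them, so finite behavioural evidence cannot settle which rule a model has learned.

      # Kripkenstein-style toy: "plus" and a deviant "quus" agree on all
      # observed inputs yet diverge outside them; 57 is the classic threshold.
      def rule_plus(x):
          return x + x

      def rule_quus(x):
          return x + x if x < 57 else 5  # deviant rule beyond the evidence

      observed = range(57)
      assert all(rule_plus(x) == rule_quus(x) for x in observed)
      assert rule_plus(100) != rule_quus(100)  # divergence off-distribution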
  24. Saliva Ontology: An ontology-based framework for a Salivaomics Knowledge Base. Jiye Ai, Barry Smith & David Wong - 2010 - BMC Bioinformatics 11 (1):302.
    The Salivaomics Knowledge Base (SKB) is designed to serve as a computational infrastructure that can permit global exploration and utilization of data and information relevant to salivaomics. SKB is created by aligning (1) the saliva biomarker discovery and validation resources at UCLA with (2) the ontology resources developed by the OBO (Open Biomedical Ontologies) Foundry, including a new Saliva Ontology (SALO). We define the Saliva Ontology (SALO; http://www.skb.ucla.edu/SALO/) as a consensus-based controlled vocabulary of terms and relations dedicated to the salivaomics (...)
    4 citations
  25. Multi-Value Alignment for ML/AI Development Choices. Hetvi Jethwani & Anna C. F. Lewis - 2025 - American Philosophical Quarterly 62 (2):133-152.
    We outline a four-step process for ML/AI developers to align development choices with multiple values, by adapting a widely-utilized framework from bioethics: (1) identify the values that matter, (2) specify identified values, (3) find solution spaces that allow for maximal alignment with identified values, and (4) make hard choices if there are unresolvable trade-offs between the identified values. Key to this approach is identifying resolvable trade-offs between values (Step 3). We survey ML/AI methods that could be used to this (...)
  26. Toleration and Justice in the Laozi: Engaging with Tao Jiang's Origins of Moral-Political Philosophy in Early China. Ai Yuan - 2023 - Philosophy East and West 73 (2):466-475.
    In lieu of an abstract, here is a brief excerpt of the content: This review article engages with Tao Jiang's ground-breaking monograph on the Origins of Moral-Political Philosophy in Early China, with particular focus on the articulation of toleration and justice in the Laozi (otherwise called the Daodejing). Jiang discusses a naturalistic turn and the re-alignment of values in the Laozi, resulting in a (...)
  27. Honor Ethics: The Challenge of Globalizing Value Alignment in AI. Stephen Tze-Inn Wu, Dan Demetriou & Rudwan Ali Husain - 2023 - 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023.
    Some researchers have recognized that privileged communities dominate the discourse on AI Ethics, and other voices need to be heard. As such, we identify the current ethics milieu as arising from WEIRD (Western, Educated, Industrialized, Rich, Democratic) contexts, and aim to expand the discussion to non-WEIRD global communities, who are also stakeholders in global sociotechnical systems. We argue that accounting for honor, along with its values and related concepts, would better approximate a global ethical perspective. This complex concept already underlies (...)
  28. Human-aligned artificial intelligence is a multiobjective problem. Peter Vamplew, Richard Dazeley, Cameron Foale, Sally Firmin & Jane Mummery - 2018 - Ethics and Information Technology 20 (1):27-40.
    As the capabilities of artificial intelligence systems improve, it becomes important to constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of ethical, legal and safety-based frameworks have been proposed as a basis for designing these constraints. Despite their variations, these frameworks share the common characteristic that decision-making must consider multiple potentially conflicting factors. We demonstrate that these alignment frameworks can be represented as utility functions, but that the widely used Maximum Expected Utility paradigm provides (...)
    11 citations
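    A small sketch of the multiobjective point (the function and toy numbers are assumptions for illustration, not the authors' framework): a scalar expected-utility score commits to one fixed trade-off in advance, whereas a vector-valued comparison keeps every non-dominated option on the table.

      # Pareto filter over vector-valued outcomes; any single weighted sum
      # would collapse the (performance, safety) trade-off prematurely.
      def pareto_front(options):
          """options maps names to objective tuples; returns non-dominated names."""
          def dominates(a, b):
              return (all(x >= y for x, y in zip(a, b))
                      and any(x > y for x, y in zip(a, b)))
          return [name for name, vec in options.items()
                  if not any(dominates(other, vec)
                             for o, other in options.items() if o != name)]

      # Objectives: (task performance, safety).
      options = {"fast": (0.9, 0.4), "careful": (0.6, 0.9), "idle": (0.1, 0.5)}
      print(pareto_front(options))  # ['fast', 'careful']; 'idle' is dominated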
  29. Current cases of AI misalignment and their implications for future risks. Leonard Dung - 2023 - Synthese 202 (5):1-23.
    How can one build AI systems such that they pursue the goals their designers want them to pursue? This is the alignment problem. Numerous authors have raised concerns that, as research advances and systems become more powerful over time, misalignment might lead to catastrophic outcomes, perhaps even to the extinction or permanent disempowerment of humanity. In this paper, I analyze the severity of this risk based on current instances of misalignment. More specifically, I argue that contemporary large language models (...)
    7 citations
  30. Value Alignment for Advanced Artificial Judicial Intelligence. Christoph Winter, Nicholas Hollman & David Manheim - 2023 - American Philosophical Quarterly 60 (2):187-203.
    This paper considers challenges resulting from the use of advanced artificial judicial intelligence (AAJI). We argue that these challenges should be considered through the lens of value alignment. Instead of discussing why specific goals and values, such as fairness and nondiscrimination, ought to be implemented, we consider the question of how AAJI can be aligned with goals and values more generally, in order to be reliably integrated into legal and judicial systems. This value alignment framing draws on AI (...)
    2 citations
  31. A Justifiable Investment in AI for Healthcare: Aligning Ambition with Reality. Kassandra Karpathakis, Jessica Morley & Luciano Floridi - 2024 - Minds and Machines 34 (4):1-40.
    Healthcare systems are grappling with critical challenges, including chronic diseases in aging populations, unprecedented health care staffing shortages and turnover, scarce resources, unprecedented demands and wait times, escalating healthcare expenditure, and declining health outcomes. As a result, policymakers and healthcare executives are investing in artificial intelligence (AI) solutions to increase operational efficiency, lower health care costs, and improve patient care. However, current level of investment in developing healthcare AI among members of the global digital health partnership does not seem to (...)
  32. Variable Value Alignment by Design; averting risks with robot religion. Jeffrey White - forthcoming - Embodied Intelligence 2023.
    One approach to alignment with human values in AI and robotics is to engineer artificial systems isomorphic with human beings. The idea is that robots so designed may autonomously align with human values through similar developmental processes, to realize project ideal conditions through iterative interaction with social and object environments just as humans do, such as are expressed in narratives and life stories. One persistent problem with human value orientation is that different human beings champion different values as (...)
  33. Value alignment, human enhancement, and moral revolutions. Ariela Tubert & Justin Tiehen - forthcoming - Inquiry: An Interdisciplinary Journal of Philosophy.
    Human beings are internally inconsistent in various ways. One way to develop this thought involves using the language of value alignment: the values we hold are not always aligned with our behavior, and are not always aligned with each other. Because of this self-misalignment, there is room for potential projects of human enhancement that involve achieving a greater degree of value alignment than we presently have. Relatedly, discussions of AI ethics sometimes focus on what is known as the (...)
  34. Automation, Alignment, and the Cooperative Interface. Julian David Jonker - 2024 - The Journal of Ethics 28 (3):483-504.
    The paper demonstrates that social alignment is distinct from value alignment as it is currently understood in the AI safety literature, and argues that social alignment is an important research agenda. Work provides an important example for the argument, since work is a cooperative endeavor, and it is part of the larger manifold of social cooperation. These cooperative aspects of work are individually and socially valuable, and so they must be given a central place when evaluating the (...)
  35. “Desired behaviors”: alignment and the emergence of a machine learning ethics. Katia Schwerzmann & Alexander Campolo - forthcoming - AI and Society:1-14.
    The concept of alignment has undergone a remarkable rise in recent years to take center stage in the ethics of artificial intelligence. There are now numerous philosophical studies of the values that should be used in this ethical framework as well as a technical literature operationalizing these values in machine learning models. This article takes a step back to address a more basic set of critical questions: Where has the ethical imperative of alignment come from? What is the (...)
  36. Explicability as an AI Principle: Technology and Ethics in Cooperation. Moto Kamiura - forthcoming - Proceedings of the 39th Annual Conference of the Japanese Society for Artificial Intelligence, 2025.
    This paper categorizes current approaches to AI ethics into four perspectives and briefly summarizes them: (1) Case studies and technical trend surveys, (2) AI governance, (3) Technologies for AI alignment, (4) Philosophy. In the second half, we focus on the fourth perspective, the philosophical approach, within the context of applied ethics. In particular, the explicability of AI may be an area in which scientists, engineers, and AI developers are expected to engage more actively relative to other ethical issues in (...)
  37. From Confucius to Coding and Avicenna to Algorithms: Cultivating Ethical AI Development through Cross-Cultural Ancient Wisdom. Ammar Younas & Yi Zeng - manuscript
    This paper explores the potential of integrating ancient educational principles from diverse eastern cultures into modern AI ethics curricula. It draws on the rich educational traditions of ancient China, India, Arabia, Persia, Japan, Tibet, Mongolia, and Korea, highlighting their emphasis on philosophy, ethics, holistic development, and critical thinking. By examining these historical educational systems, the paper establishes a correlation with modern AI ethics principles, advocating for the inclusion of these ancient teachings in current AI development and education. The proposed integration (...)
  38. (1 other version) An Enactive Approach to Value Alignment in Artificial Intelligence: A Matter of Relevance. Michael Cannon - 2021 - In Vincent C. Müller, Philosophy and Theory of AI. Springer Cham. pp. 119-135.
    The “Value Alignment Problem” is the challenge of how to align the values of artificial intelligence with human values, whatever they may be, such that AI does not pose a risk to the existence of humans. Existing approaches appear to conceive of the problem as "how do we ensure that AI solves the problem in the right way", in order to avoid the possibility of AI turning humans into paperclips in order to “make more paperclips” or eradicating the human (...)
  39. The linguistic dead zone of value-aligned agency, natural and artificial. Travis LaCroix - 2024 - Philosophical Studies:1-23.
    The value alignment problem for artificial intelligence (AI) asks how we can ensure that the “values”—i.e., objective functions—of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research programmes that attempt to ensure value alignment for AI systems—or, more loftily, those programmes that seek to design robustly beneficial or (...)
  40. Instilling moral value alignment by means of multi-objective reinforcement learning. Juan Antonio Rodriguez-Aguilar, Maite Lopez-Sanchez, Marc Serramia & Manel Rodriguez-Soto - 2022 - Ethics and Information Technology 24 (1).
    AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists in formalising moral values and value-aligned behaviour based on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent’s individual and ethical objectives. The second step consists in (...)
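    A minimal sketch of the two-objective setup the abstract describes (the weighting scheme, names, and numbers are assumptions; the paper's own construction may differ): the agent's individual and ethical objectives are combined into a single learning signal, with the ethical weight chosen large enough that compliant behaviour dominates.

      # Linear scalarization of a two-objective reward; values are hypothetical,
      # chosen only to show the ethical weight doing its job.
      def scalarize(r_individual, r_ethical, w_ethical=2.0):
          """Collapse (individual, ethical) rewards into one scalar signal."""
          return r_individual + w_ethical * r_ethical

      # An unethical shortcut pays more individually but is penalised ethically.
      shortcut = scalarize(r_individual=1.0, r_ethical=-1.0)   # -> -1.0
      compliant = scalarize(r_individual=0.5, r_ethical=0.0)   # -> 0.5
      assert compliant > shortcut  # the aligned policy is preferred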
  41. Robustness to Fundamental Uncertainty in AGI Alignment. G. G. Worley III - 2020 - Journal of Consciousness Studies 27 (1-2):225-241.
    The AGI alignment problem has a bimodal distribution of outcomes with most outcomes clustering around the poles of total success and existential, catastrophic failure. Consequently, attempts to solve AGI alignment should, all else equal, prefer false negatives (ignoring research programs that would have been successful) to false positives (pursuing research programs that will unexpectedly fail). Thus, we propose adopting a policy of responding to points of philosophical and practical uncertainty associated with the alignment problem by limiting and (...)
  42. Challenges of Aligning Artificial Intelligence with Human Values. Margit Sutrop - 2020 - Acta Baltica Historiae Et Philosophiae Scientiarum 8 (2):54-72.
    As artificial intelligence systems are becoming increasingly autonomous and will soon be able to make decisions on their own about what to do, AI researchers have started to talk about the need to align AI with human values. The AI ‘value alignment problem’ faces two kinds of challenges—a technical and a normative one—which are interrelated. The technical challenge deals with the question of how to encode human values in artificial intelligence. The normative challenge is associated with two questions: “Which (...)
    5 citations
  43. A comment on the pursuit to align AI: we do not need value-aligned AI, we need AI that is risk-averse. Rebecca Raper - forthcoming - AI and Society:1-3.
  44. The problem of alignment. Tsvetelina Hristova, Liam Magee & Karen Soldatic - forthcoming - AI and Society:1-15.
    Large language models (LLMs) produce sequences learned as statistical patterns from large corpora. Their emergent status as representatives of the advances in artificial intelligence (AI) have led to an increased attention to the possibilities of regulating the automated production of linguistic utterances and interactions with human users in a process that computer scientists refer to as ‘alignment’—a series of technological and political mechanisms to impose a normative model of morality on algorithms and networks behind the model. Alignment, which (...)
  45. Instilling moral value alignment by means of multi-objective reinforcement learning. M. Rodriguez-Soto, M. Serramia, M. Lopez-Sanchez & J. Antonio Rodriguez-Aguilar - 2022 - Ethics and Information Technology 24 (9).
    AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists in formalising moral values and value-aligned behaviour based on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent’s individual and ethical objectives. The second step consists in (...)
  46. Aligning Anatomy Ontologies in the Ontology Alignment Evaluation Initiative. Patrick Lambrix, Qiang Liu & He Tan - forthcoming - The Swedish AI Society Workshop, May 27-28, 2009, IDA, Linköping University.
  47. AI Survival Stories: a Taxonomic Analysis of AI Existential Risk. Herman Cappelen, Simon Goldstein & John Hawthorne - forthcoming - Philosophy of AI.
    Since the release of ChatGPT, there has been a lot of debate about whether AI systems pose an existential risk to humanity. This paper develops a general framework for thinking about the existential risk of AI systems. We analyze a two-premise argument that AI systems pose a threat to humanity. Premise one: AI systems will become extremely powerful. Premise two: if AI systems become extremely powerful, they will destroy humanity. We use these two premises to construct a taxonomy of ‘survival (...)
    1 citation
  48. The state as a model for AI control and alignment. Micha Elsner - forthcoming - AI and Society:1-11.
    Debates about the development of artificial superintelligence and its potential threats to humanity tend to assume that such a system would be historically unprecedented, and that its behavior must be predicted from first principles. I argue that this is not true: we can analyze multiagent intelligent systems (the best candidates for practical superintelligence) by comparing them to states, which also unite heterogeneous intelligences to achieve superhuman goals. States provide a model for several problems discussed in the literature on superintelligence, such (...)
  49. Knowledge-augmented face perception: Prospects for the Bayesian brain-framework to align AI and human vision. Martin Maier, Florian Blume, Pia Bideau, Olaf Hellwich & Rasha Abdel Rahman - 2022 - Consciousness and Cognition 101:103301.
    2 citations
  50. Minangkabaunese matrilineal: The correlation between the Qur’an and gender. Halimatussa’Diyah Halimatussa’Diyah, Kusnadi Kusnadi, Ai Y. Yuliyanti, Deddy Ilyas & Eko Zulfikar - 2024 - HTS Theological Studies 80 (1):7.
    Upon previous research, the matrilineal system seems to oppose Islamic teaching. However, the matrilineal system practiced by the Minangkabau society in West Sumatra, Indonesia has its uniqueness. Thus, this study aims to examine the correlation between the Qur’an and gender roles within the context of Minangkabau customs, specifically focusing on the matrilineal aspect. The present study employs qualitative methods for conducting library research through critical analysis. This study discovered that the matrilineal system practiced by the Minangkabau society aligns with Qur’anic (...)
Showing 1–50 of 976