Results for 'multiagent systems, robocup soccer, keepaway, reinforcement learning, reward design'

971 found
  1.  23
    An Experimental Study of Reward Design in Multiagent Continuing Tasks: The RoboCup Soccer Keepaway Task as an Example.Nobuyuki Tanaka & Sachiyo Arai - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21 (6):537-546.
    In this paper, we discuss guidelines for the reward design problem, which defines when and how much reward should be given to the agent(s), within the context of the reinforcement learning approach. We take keepaway soccer as a standard task of the multiagent domain, one that requires skilled teamwork. The difficulties of designing a reward for this task stem from the following features: i) since it belongs to the class of continuing tasks, which (...)
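Entry 1 concerns when and how much reward to give in the keepaway task. As a purely illustrative sketch (not the reward design proposed by Tanaka and Arai), a common baseline in the keepaway literature rewards the keepers for every time step the ball stays in their possession, so that maximizing return amounts to maximizing episode duration; the function and flag names below are hypothetical.

```python
def keepaway_step_reward(dt: float, episode_over: bool) -> float:
    """Reward for one simulator step of keepaway (illustrative baseline).

    Each step in which the keepers still hold the ball earns a reward
    proportional to the elapsed time dt; the episode ends (and reward
    stops) when the takers gain possession or the ball leaves the region.
    """
    if episode_over:
        return 0.0   # terminal step: possession lost, no further reward
    return dt        # e.g. 0.1 for a 100 ms simulator step
```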
  2.  35
    Deep Reinforcement Learning for Vectored Thruster Autonomous Underwater Vehicle Control.Tao Liu, Yuli Hu & Hui Xu - 2021 - Complexity 2021:1-25.
    Autonomous underwater vehicles are widely used to accomplish various missions in the complex marine environment; the design of a control system for AUVs is particularly difficult due to the high nonlinearity, variations in hydrodynamic coefficients, and external force from ocean currents. In this paper, we propose a controller based on deep reinforcement learning in a simulation environment for studying the control performance of the vectored thruster AUV. RL is an important method of artificial intelligence that can learn behavior (...)
  3.  45
    Optimization of English Online Learning Dictionary System Based on Multiagent Architecture.Ying Wang - 2021 - Complexity 2021:1-10.
    As a universal world language, English has become a necessary communication tool under the globalization of trade. An intelligent, efficient, and well-designed English language-assisted learning system helps to further improve the English ability of language learners. The English online learning dictionary, as an important query tool for English learners, is an important part of English online learning. This paper optimizes the design of an English online learning dictionary system based on a multiagent architecture. Based on the hybrid (...) cooperative algorithm, this paper addresses the disadvantages of the online English learning dictionary system and proposes an appropriate dictionary application evaluation function. At the same time, an improved reinforcement learning algorithm is introduced for the corresponding English online learning dictionary navigation problem so as to improve the efficiency of the online English learning dictionary system, making the English online learning dictionary more intelligent and efficient. A new online learning dictionary system optimization algorithm is proposed and compared with the traditional system algorithm. The experimental results show that the algorithm proposed in this paper solves the collaborative confusion problem of the English online learning dictionary to a certain extent and further solves the corresponding navigation problem so as to improve efficiency.
  4. Karlsruhe Brainstormers: A Reinforcement Learning Approach to Robotic Soccer. In P. Stone, T. Balch & G. Kraetzschmar (eds.), RoboCup 2000: Robot Soccer World Cup IV. [REVIEW] M. Riedmiller & A. Merke - 1999 - In P. Brezillon & P. Bouquet (eds.), Lecture Notes in Artificial Intelligence. Springer.
  5.  22
    Iterative Learning Tracking Control of Nonlinear Multiagent Systems with Input Saturation.Bingyou Liu, Zhengzheng Zhang, Lichao Wang, Xing Li & Xiongfeng Deng - 2021 - Complexity 2021:1-13.
    A tracking control algorithm for nonlinear multiagent systems with undirected communication is studied, where each multiagent system is affected by external interference and input saturation. A control design scheme combining iterative learning and adaptive control is proposed to perform adaptive time-varying parameter adjustment, and the effectiveness of the control protocol is proved by designing Lyapunov functions. Simulation results show that high-precision tracking control of the nonlinear multiagent system based on adaptive iterative learning control can be well realized (...)
    1 citation
  6.  17
    Reinforcement Learning-Based Collision Avoidance Guidance Algorithm for Fixed-Wing UAVs.Yu Zhao, Jifeng Guo, Chengchao Bai & Hongxing Zheng - 2021 - Complexity 2021:1-12.
    A deep reinforcement learning-based computational guidance method is presented, which is used to identify and resolve the problem of collision avoidance for a variable number of fixed-wing UAVs in limited airspace. The cooperative guidance process is first analyzed for multiple aircraft by formulating flight scenarios using multiagent Markov game theory and solving them with a machine learning algorithm. Furthermore, a self-learning framework is established by using the actor-critic model, which is proposed to train collision avoidance decision-making neural networks. To (...)
  7.  31
    Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.
    Can humans get arbitrarily capable reinforcement learning agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design (...)
    4 citations
  8.  12
    Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices.Pedro Juan Rivera Torres, Carlos Gershenson García, María Fernanda Sánchez Puig & Samir Kanaan Izquierdo - 2022 - Complexity 2022:1-15.
    The area of smart power grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power in a resilient grid, while managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities and novel methodologies to detect, classify, and isolate faults and failures and model and simulate processes with predictive algorithms and analytics. In this paper, we showcase the application of a complex-adaptive, self-organizing (...)
  9.  36
    Melatonin Secretion during a Short Nap Fosters Subsequent Feedback Learning.Christian D. Wiesner, Valentia Davoli, David Schürger, Alexander Prehn-Kristensen & Lioba Baving - 2018 - Frontiers in Human Neuroscience 11:304534.
    Sleep helps to protect and renew hippocampus-dependent declarative learning. Less is known about forms of learning that mainly engage the dopaminergic reward system. Animal studies showed that exogenous melatonin modulates the responses of the dopaminergic reward system and acts as a neuroprotectant promoting memory. In humans, melatonin is mainly secreted in darkness during evening hours supporting sleep. In this study, we investigate the effects of a short period of daytime sleep (nap) and endogenous melatonin on reward learning. (...)
  10.  24
    A Stable Distributed Neural Controller for Physically Coupled Networked Discrete-Time System via Online Reinforcement Learning.Jian Sun & Jie Li - 2018 - Complexity 2018:1-15.
    The large scale, time-varying nature, and diversity of physically coupled networked infrastructures such as power grids and transportation systems make their controller design, implementation, and expansion complex. To tackle these challenges, we suggest an online distributed reinforcement learning control algorithm with a one-layer neural network for each subsystem (or agent) to adapt to variations in the networked infrastructures. Each controller includes a critic network and an action network for approximating the strategy utility function and the desired control (...)
  11.  59
    SAwSu: An Integrated Model of Associative and Reinforcement Learning.Vladislav D. Veksler, Christopher W. Myers & Kevin A. Gluck - 2014 - Cognitive Science 38 (3):580-598.
    Successfully explaining and replicating the complexity and generality of human and animal learning will require the integration of a variety of learning mechanisms. Here, we introduce a computational model which integrates associative learning (AL) and reinforcement learning (RL). We contrast the integrated model with standalone AL and RL models in three simulation studies. First, a synthetic grid‐navigation task is employed to highlight performance advantages for the integrated model in an environment where the reward structure is both diverse and (...)
    1 citation
  12.  29
    Action control, forward models and expected rewards: representations in reinforcement learning.Jami Pekkanen, Jesse Kuokkanen, Otto Lappi & Anna-Mari Rusanen - 2021 - Synthese 199 (5-6):14017-14033.
    The fundamental cognitive problem for active organisms is to decide what to do next in a changing environment. In this article, we analyze motor and action control in computational models that utilize reinforcement learning (RL) algorithms. In reinforcement learning, action control is governed by an action selection policy that maximizes the expected future reward in light of a predictive world model. In this paper we argue that RL provides a way to explicate the so-called action-oriented views of (...)
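Entry 12 describes action selection as a policy that maximizes expected future reward under a predictive world model. For readers outside the field, the standard discounted objective this refers to (general RL notation, not anything specific to the paper) is:

```latex
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \right],
\qquad 0 \le \gamma < 1,
```

where r_{t+1} is the reward received after the action at time t and gamma is the discount factor weighting future rewards.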
  13.  25
    Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning.Xiaoyi Long, Zheng He & Zhongyuan Wang - 2021 - Complexity 2021:1-7.
    This paper suggests an online solution for the optimal tracking control of robotic systems based on a single-critic neural network-based reinforcement learning method. To this end, we rewrite the robotic system model in state-space form, which facilitates the realization of the optimal tracking control synthesis. To maintain the tracking response, a steady-state control is designed, and then an adaptive optimal tracking control is used to ensure that the tracking error converges in an optimal sense. (...)
  14.  80
    Online Supervised Learning with Distributed Features over Multiagent System.Xibin An, Bing He, Chen Hu & Bingqi Liu - 2020 - Complexity 2020:1-10.
    Most current online distributed machine learning algorithms have been studied in a data-parallel architecture among agents in networks. We study online distributed machine learning from a different perspective, where the features of the same samples are observed by multiple agents that wish to collaborate but do not exchange the raw data with each other. We propose a distributed feature online gradient descent algorithm and prove that the local solution converges to the global minimizer at a sublinear rate O 2 T. Our (...)
  15.  15
    Dynamic Large-Scale Server Scheduling for IVF Queuing Network in Cloud Healthcare System.Yafei Li, Hongfeng Wang, Li Li & Yaping Fu - 2021 - Complexity 2021:1-15.
    As one of the most effective medical technologies for infertile patients, in vitro fertilization (IVF) has become more and more widely available in recent years. However, prolonged waiting for IVF procedures has become a problem of great concern, since this technology is mastered only by large general hospitals. To deal with the insufficiency of IVF service capacity, this paper studies an IVF queuing network in an integrated cloud healthcare system, where the two key medical services, that is, egg retrieval (...)
  16.  20
    Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning.Rui Wang, Xianghua Gan, Qing Li & Xiao Yan - 2021 - Complexity 2021:1-17.
    We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies that consider a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneous ordering and pricing policy to maximize the expected discounted profit over the planning horizon. (...)
  17.  37
    Processing speed enhances model-based over model-free reinforcement learning in the presence of high working memory functioning.Daniel J. Schad, Elisabeth Jünger, Miriam Sebold, Maria Garbusow, Nadine Bernhardt, Amir-Homayoun Javadi, Ulrich S. Zimmermann, Michael N. Smolka, Andreas Heinz, Michael A. Rapp & Quentin J. M. Huys - 2014 - Frontiers in Psychology 5:117016.
    Theories of decision-making and its neural substrates have long assumed the existence of two distinct and competing valuation systems, variously described as goal-directed vs. habitual, or, more recently and based on statistical arguments, as model-free vs. model-based reinforcement-learning. Though both have been shown to control choices, the cognitive abilities associated with these systems are under ongoing investigation. Here we examine the link to cognitive abilities, and find that individual differences in processing speed covary with a shift from model-free to (...)
    3 citations
  18.  12
    Combination of fuzzy control and reinforcement learning for wind turbine pitch control.J. Enrique Sierra-Garcia & Matilde Santos - forthcoming - Logic Journal of the IGPL.
    The generation of the pitch control signal in a wind turbine (WT) is not straightforward due to the nonlinear dynamics of the system and the coupling of its internal variables; in addition, they are subjected to the uncertainty that comes from the random nature of the wind. Fuzzy logic has proved useful in applications with changing system parameters or where uncertainty is relevant as in this one, but the tuning of the fuzzy logic controller (FLC) parameters is neither straightforward nor (...)
  19.  17
    Learning a Rational Policy that Avoids Penalties.Sogo Tsuboi & Kazuteru Miyazaki - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (2):185-192.
    Reinforcement learning is a kind of machine learning that aims to adapt an agent to a given environment using rewards as a cue. In general, the purpose of a reinforcement learning system is to acquire an optimal policy that maximizes the expected reward per action. However, this is not always what matters in every environment; in particular, when we apply a reinforcement learning system to engineering environments, we expect the agent to avoid all penalties. In Markov Decision Processes, a (...)
    1 citation
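Entry 19 asks for policies that avoid all penalties while still collecting reward. The following is a minimal sketch of that idea (masking actions that have ever led to a penalty before acting greedily); it is an assumption-laden illustration, not the penalty-avoiding algorithm of Tsuboi and Miyazaki, and all names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical tables: Q-values and the set of (state, action) pairs that
# have ever produced a penalty (negative reward).
Q = defaultdict(float)
penalty_pairs = set()

def choose_action(state, actions, epsilon=0.1):
    """Pick an action, preferring those never observed to cause a penalty."""
    safe = [a for a in actions if (state, a) not in penalty_pairs]
    candidates = safe if safe else actions      # fall back if everything is marked
    if random.random() < epsilon:
        return random.choice(candidates)        # exploration among candidates
    return max(candidates, key=lambda a: Q[(state, a)])

def update(state, action, reward, lr=0.1):
    """Record penalties and nudge the value estimate toward the observed reward."""
    if reward < 0:
        penalty_pairs.add((state, action))      # remember penalized pairs
    Q[(state, action)] += lr * (reward - Q[(state, action)])
```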
  20. Current cases of AI misalignment and their implications for future risks.Leonard Dung - 2023 - Synthese 202 (5):1-23.
    How can one build AI systems such that they pursue the goals their designers want them to pursue? This is the alignment problem. Numerous authors have raised concerns that, as research advances and systems become more powerful over time, misalignment might lead to catastrophic outcomes, perhaps even to the extinction or permanent disempowerment of humanity. In this paper, I analyze the severity of this risk based on current instances of misalignment. More specifically, I argue that contemporary large language models and (...)
    5 citations
  21.  22
    Emotional State and Feedback-Related Negativity Induced by Positive, Negative, and Combined Reinforcement.Shuyuan Xu, Yuyan Sun, Min Huang, Yanhong Huang, Jing Han, Xuemei Tang & Wei Ren - 2021 - Frontiers in Psychology 12:647263.
    Reinforcement learning relies on the reward prediction error (RPE) signals conveyed by the midbrain dopamine system. Previous studies showed that dopamine plays an important role in both positive and negative reinforcement. However, whether various reinforcement processes will induce distinct learning signals is still unclear. In a probabilistic learning task, we examined RPE signals in different reinforcement types using an electrophysiology index, namely, the feedback-related negativity (FRN). Ninety-four participants were randomly assigned into four groups: base (no (...)
  22.  28
    CortexVR: Immersive analysis and training of cognitive executive functions of soccer players using virtual reality and machine learning.Christian Krupitzer, Jens Naber, Jan-Philipp Stauffert, Jan Mayer, Jan Spielmann, Paul Ehmann, Noel Boci, Maurice Bürkle, André Ho, Clemens Komorek, Felix Heinickel, Samuel Kounev, Christian Becker & Marc Erich Latoschik - 2022 - Frontiers in Psychology 13.
    Goal: This paper presents an immersive Virtual Reality system to analyze and train Executive Functions of soccer players. EFs are important cognitive functions for athletes. They are a relevant quality that distinguishes amateurs from professionals. Method: The system is based on immersive technology; hence, the user interacts naturally and experiences a training session in a virtual world. The proposed system has a modular design supporting the extension of various so-called game modes. Game modes combine selected game mechanics with specific simulation content to (...)
  23.  65
    Evolutionary psychology, learning, and belief signaling: design for natural and artificial systems.Eric Funkhouser - 2021 - Synthese 199 (5-6):14097-14119.
    Recent work in the cognitive sciences has argued that beliefs sometimes acquire signaling functions in virtue of their ability to reveal information that manipulates “mindreaders.” This paper sketches some of the evolutionary and design considerations that could take agents from solipsistic goal pursuit to beliefs that serve as social signals. Such beliefs will be governed by norms besides just the traditional norms of epistemology. As agents become better at detecting the agency of others, either through evolutionary history or individual (...)
    1 citation
  24. HCI Model with Learning Mechanism for Cooperative Design in Pervasive Computing Environment.Hong Liu, Bin Hu & Philip Moore - 2015 - Journal of Internet Technology 16.
    This paper presents a human-computer interaction model with a three-layer learning mechanism in a pervasive environment. We begin with a discussion around a number of important issues related to human-computer interaction, followed by a description of the architecture for a multi-agent cooperative design system for a pervasive computing environment. We present our proposed three-layer HCI model and introduce the group formation algorithm, which is predicated on a dynamic sharing niche technology. Finally, we explore the cooperative reinforcement learning (...)
    1 citation
  25.  13
    Optimization of the Rapid Design System for Arts and Crafts Based on Big Data and 3D Technology.Haihan Zhou - 2021 - Complexity 2021:1-10.
    In this paper, to solve the problem of slow design of arts and crafts and to improve design efficiency and aesthetics, the existing big data and 3D technology are used to conduct an in-depth analysis of the optimization of the rapid design system of arts and crafts machine salt baking. In the system requirement analysis, the functional modules of this system are identified as nine functional modules such as design terminology management system and external information import (...)
  26.  30
    Acquisition of Walking Motion of a Multi-Legged Robot Using QDSEGA.Fumitoshi Matsuno & Kazuyuki Ito - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:363-372.
    Reinforcement learning is very effective for robot learning because it does not need a priori knowledge and has a high capability for reactive and adaptive behaviors. In our previous work, we proposed a new reinforcement learning algorithm: “Q-learning with Dynamic Structuring of Exploration Space Based on Genetic Algorithm (QDSEGA).” It is designed for complicated systems with a large action-state space, such as a robot with many redundant degrees of freedom, and we applied it to a 50-link manipulator, where effective behavior was acquired. However (...)
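Entry 26 (QDSEGA) layers a genetic algorithm over tabular Q-learning to restructure a large action-state space. The GA layer is beyond a short sketch, but the underlying one-step Q-learning update it builds on is the standard

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right],
```

with learning rate alpha and discount factor gamma.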
  27. An Analysis of the Interaction Between Intelligent Software Agents and Human Users.Christopher Burr, Nello Cristianini & James Ladyman - 2018 - Minds and Machines 28 (4):735-774.
    Interactions between an intelligent software agent and a human user are ubiquitous in everyday situations such as access to information, entertainment, and purchases. In such interactions, the ISA mediates the user’s access to the content, or controls some other aspect of the user experience, and is not designed to be neutral about outcomes of user choices. Like human users, ISAs are driven by goals, make autonomous decisions, and can learn from experience. Using ideas from bounded rationality, we frame these interactions (...)
    40 citations
  28.  81
    Reinforcement learning and artificial agency.Patrick Butlin - 2024 - Mind and Language 39 (1):22-38.
    There is an apparent connection between reinforcement learning and agency. Artificial entities controlled by reinforcement learning algorithms are standardly referred to as agents, and the mainstream view in the psychology and neuroscience of agency is that humans and other animals are reinforcement learners. This article examines this connection, focusing on artificial reinforcement learning systems and assuming that there are various forms of agency. Artificial reinforcement learning systems satisfy plausible conditions for minimal agency, and those which (...)
    3 citations
  29.  91
    Précis of the brain and emotion.Edmund T. Rolls - 2000 - Behavioral and Brain Sciences 23 (2):177-191.
    The topics treated in The brain and emotion include the definition, nature, and functions of emotion (Ch. 3); the neural bases of emotion (Ch. 4); reward, punishment, and emotion in brain design (Ch. 10); a theory of consciousness and its application to understanding emotion and pleasure (Ch. 9); and neural networks and emotion-related learning (Appendix). The approach is that emotions can be considered as states elicited by reinforcers (rewards and punishers). This approach helps with understanding the functions of (...)
    25 citations
  30.  28
    A Study on the Reinforcement Function in the Profit Sharing Method.Shoji Tatsumi & Wataru Uemura - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.
    In this paper, we consider Profit Sharing, which is one of the reinforcement learning methods. An agent learns a candidate solution of a problem from the reward that it receives from the environment if and only if it reaches the destination state. A function that distributes the received reward to each action of the candidate solution is called the reinforcement function. In this learning system, the agent can reinforce the set of selected actions when it gets (...)
    1 citation
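Entry 30 studies the reinforcement function of Profit Sharing, i.e. how a terminal reward is spread over the actions of the episode that reached the goal. A minimal sketch, assuming a geometrically decaying share per step (a common choice in the Profit Sharing literature, not necessarily the function analyzed in the paper); the function and parameter names are illustrative.

```python
def distribute_reward(episode, reward, decay=0.5, lr=1.0, weights=None):
    """Profit-sharing style credit assignment (illustrative only).

    episode: list of (state, action) pairs ending at the rewarded state.
    The last action receives the full reward; earlier actions receive a
    share that decays geometrically with their distance from the goal.
    """
    if weights is None:
        weights = {}
    share = reward
    for state, action in reversed(episode):
        weights[(state, action)] = weights.get((state, action), 0.0) + lr * share
        share *= decay                      # geometric decay toward earlier steps
    return weights
```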
  31.  73
    Distributed Coordination for a Class of High-Order Multiagent Systems Subject to Actuator Saturations by Iterative Learning Control.Nana Yang & Suoping Li - 2022 - Complexity 2022:1-18.
    This paper investigates distributed coordination control for a class of high-order uncertain multiagent systems. Under the framework of iterative learning control, a novel fully distributed learning protocol is devised for the coordination problem of MASs with time-varying parameter uncertainties as well as actuator saturations. Meanwhile, learning update laws for the various parameters are proposed. Utilizing Lyapunov theory combined with graph theory, the proposed algorithm can make each follower track the leader completely over a limited time interval even (...)
  32.  24
    Passively learned spatial navigation cues evoke reinforcement learning reward signals.Thomas D. Ferguson, Chad C. Williams, Ronald W. Skelton & Olave E. Krigolson - 2019 - Cognition 189 (C):65-75.
  33.  13
    Learning reward machines: A study in partially observable reinforcement learning.Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Ethan Waldie & Sheila A. McIlraith - 2023 - Artificial Intelligence 323 (C):103989.
  34.  57
    Supporting Acquisition of Spelling Skills in Different Orthographies Using an Empirically Validated Digital Learning Environment.Heikki Juhani Lyytinen, Margaret Semrud-Clikeman, Hong Li, Kenneth Pugh & Ulla Richardson - 2021 - Frontiers in Psychology 12.
    This paper discusses how the association learning principle works to support the acquisition of basic spelling and reading skills using a digital game-based learning environment with the Finland-based GraphoLearn technology. This program has been designed and validated to work with early readers of different alphabetic writing systems, using repetition and reinforcing connections between spoken and written units. Initially, GL was developed and found effective in training children at risk of reading disorders in Finland. Today, GL training has been shown to support learning (...)
  35.  25
    Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example.輿石 尚宏 謙吾 片山 - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.
    Reinforcement learning is a promising technique for creating agents that can be applied to real-world problems. The most important features of RL are trial-and-error search and delayed reward; thus, agents act randomly in the early learning stage. However, such random actions are impractical for real-world problems. This paper presents a novel model of RL agents. A feature of our learning agent model is the integration of the Analytic Hierarchy Process into the standard RL agent model, which consists (...)
  36.  24
    Evolutionary Reinforcement Learning for Adaptively Detecting Database Intrusions.Seul-Gi Choi & Sung-Bae Cho - 2020 - Logic Journal of the IGPL 28 (4):449-460.
    The relational database management system (RDBMS) is the most popular type of database system. It is important to protect its data from information leakage and corruption. An RDBMS can be attacked by an outsider or an insider, and an insider attack is difficult to detect because its patterns are constantly changing and evolving. In this paper, we propose an adaptive database intrusion detection system that is resistant to potential insider misuse, using evolutionary reinforcement learning, which combines reinforcement learning and evolutionary learning. (...)
    1 citation
  37.  19
    Object‐Label‐Order Effect When Learning From an Inconsistent Source.Timmy Ma & Natalia L. Komarova - 2019 - Cognitive Science 43 (8):e12737.
    Learning in natural environments is often characterized by a degree of inconsistency from an input. These inconsistencies occur, for example, when learning from more than one source, or when the presence of environmental noise distorts incoming information; as a result, the task faced by the learner becomes ambiguous. In this study, we investigate how learners handle such situations. We focus on the setting where a learner receives and processes a sequence of utterances to master associations between objects and their labels, (...)
    1 citation
  38.  17
    Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance.W. Bradley Knox & Peter Stone - 2015 - Artificial Intelligence 225 (C):24-50.
    1 citation
  39. When, What, and How Much to Reward in Reinforcement Learning-Based Models of Cognition.Christian P. Janssen & Wayne D. Gray - 2012 - Cognitive Science 36 (2):333-358.
    Reinforcement learning approaches to cognitive modeling represent task acquisition as learning to choose the sequence of steps that accomplishes the task while maximizing a reward. However, an apparently unrecognized problem for modelers is choosing when, what, and how much to reward; that is, when (the moment: end of trial, subtask, or some other interval of task performance), what (the objective function: e.g., performance time or performance accuracy), and how much (the magnitude: with binary, categorical, or continuous values). (...)
    6 citations
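Entry 39 separates reward design into when (the moment), what (the objective function), and how much (the magnitude). A hedged sketch of how a modeler might make those three choices explicit as configuration; the class and field names are illustrative assumptions, not from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Literal

@dataclass
class RewardDesign:
    # when: the moment at which reward is delivered
    moment: Literal["end_of_trial", "end_of_subtask", "every_step"]
    # what: the objective function the reward is computed from
    objective: Callable[[dict], float]   # e.g. performance time or accuracy
    # how much: the magnitude scheme
    magnitude: Literal["binary", "categorical", "continuous"]

# Example: reward (negative) task completion time, once per trial, as a continuous value.
design = RewardDesign(
    moment="end_of_trial",
    objective=lambda stats: -stats["completion_time"],
    magnitude="continuous",
)
```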
  40. Reinforcement learning: A brief guide for philosophers of mind.Julia Haas - 2022 - Philosophy Compass 17 (9):e12865.
    In this opinionated review, I draw attention to some of the contributions reinforcement learning can make to questions in the philosophy of mind. In particular, I highlight reinforcement learning's foundational emphasis on the role of reward in agent learning, and canvass two ways in which the framework may advance our understanding of perception and motivation.
    4 citations
  41.  21
    A Brief Overview of Optimal Robust Control Strategies for a Benchmark Power System with Different Cyberphysical Attacks.Bo Hu, Hao Wang, Yan Zhao, Hang Zhou, Mingkun Jiang & Mofan Wei - 2021 - Complexity 2021:1-10.
    Security against different attacks is a core topic of cyberphysical systems. In this paper, optimal control theory, reinforcement learning, and neural networks are integrated to provide a brief overview of optimal robust control strategies for a benchmark power system. First, benchmark power system models with actuator and sensor attacks are considered. Second, we investigate the optimal control problem for the nominal system and review state-of-the-art RL methods along with the NN implementation. Third, we propose several robust (...)
  42. Authors' Response: What to Do Next: Applying Flexible Learning Algorithms to Develop Constructivist Communication.B. Porr & P. Di Prodi - 2014 - Constructivist Foundations 9 (2):218-222.
    Upshot: We acknowledge that our model can be implemented with different reinforcement learning algorithms. Subsystem formation has been successfully demonstrated on the basal level, and in order to show full subsystem formation in the communication system at least both intentional utterances and acceptance/rejection need to be implemented. The comments about intrinsic vs extrinsic rewards made clear that this distinction is not helpful in the context of the constructivist paradigm but rather needs to be replaced by a critical reflection on (...)
     
  43. Bidding in Reinforcement Learning: A Paradigm for Multi-Agent Systems.Chad Sessions - unknown
    The paper presents an approach for developing multi-agent reinforcement learning systems that are made up of a coalition of modular agents. We focus on learning to segment sequences (sequential decision tasks) to create modular structures, through a bidding process that is based on reinforcements received during task execution. The approach segments sequences (and divides them up among agents) to facilitate the learning of the overall task. Notably, our approach does not rely on a priori knowledge or a priori structures. (...)
     
    1 citation
  44.  60
    Reinforcement Learning and Counterfactual Reasoning Explain Adaptive Behavior in a Changing Environment.Yunfeng Zhang, Jaehyon Paik & Peter Pirolli - 2015 - Topics in Cognitive Science 7 (2):368-381.
    Animals routinely adapt to changes in the environment in order to survive. Though reinforcement learning may play a role in such adaptation, it is not clear that it is the only mechanism involved, as it is not well suited to producing rapid, relatively immediate changes in strategies in response to environmental changes. This research proposes that counterfactual reasoning might be an additional mechanism that facilitates change detection. An experiment is conducted in which a task state changes over time and (...)
  45.  23
    Predictive Movements and Human Reinforcement Learning of Sequential Action.Roy Kleijn, George Kachergis & Bernhard Hommel - 2018 - Cognitive Science 42 (S3):783-808.
    Sequential action makes up the bulk of human daily activity, and yet much remains unknown about how people learn such actions. In one motor learning paradigm, the serial reaction time (SRT) task, people are taught a consistent sequence of button presses by cueing them with the next target response. However, the SRT task only records keypress response times to a cued target, and thus it cannot reveal the full time‐course of motion, including predictive movements. This paper describes a mouse movement (...)
    1 citation
  46.  72
    Novelty and Inductive Generalization in Human Reinforcement Learning.Samuel J. Gershman & Yael Niv - 2015 - Topics in Cognitive Science 7 (3):391-415.
    In reinforcement learning, a decision maker searching for the most rewarding option is often faced with the question: What is the value of an option that has never been tried before? One way to frame this question is as an inductive problem: How can I generalize my previous experience with one set of options to a novel option? We show how hierarchical Bayesian inference can be used to solve this problem, and we describe an equivalence between the Bayesian model (...)
    3 citations
  47.  16
    Iterative Learning Consensus Control for Nonlinear Partial Difference Multiagent Systems with Time Delay.Cun Wang, Xisheng Dai, Kene Li & Zupeng Zhou - 2021 - Complexity 2021:1-15.
    This paper considers the consensus control problem of nonlinear spatial-temporal hyperbolic partial difference multiagent systems and parabolic partial difference multiagent systems with time delay. Based on the system's own fixed topology and a method of generating the desired trajectory by introducing a virtual leader, and using the consensus tracking error between each agent, the virtual leader agent, and neighboring agents in the last iteration, an iterative learning algorithm is proposed. The sufficient condition for the system consensus error to converge (...)
  48.  81
    Altricial self-organising information-processing systems.Aaron Sloman - unknown
    It is often thought that there is one key design principle or at best a small set of design principles, underlying the success of biological organisms. Candidates include neural nets, ‘swarm intelligence’, evolutionary computation, dynamical systems, particular types of architecture or use of a powerful uniform learning mechanism, e.g. reinforcement learning. All of those support types of self-organising, self-modifying behaviours. But we are nowhere near understanding the full variety of powerful information-processing principles ‘discovered’ by evolution. By attending (...)
    2 citations
  49.  26
    Leader-Following Consensus for Second-Order Nonlinear Multiagent Systems with Input Saturation via Distributed Adaptive Neural Network Iterative Learning Control.Xiongfeng Deng, Xiuxia Sun, Shuguang Liu & Boyang Zhang - 2019 - Complexity 2019:1-13.
    3 citations
  50.  21
    Enforcing ethical goals over reinforcement-learning policies.Guido Governatori, Agata Ciabattoni, Ezio Bartocci & Emery A. Neufeld - 2022 - Ethics and Information Technology 24 (4):1-19.
    Recent years have yielded many discussions on how to endow autonomous agents with the ability to make ethical decisions, and the need for explicit ethical reasoning and transparency is a persistent theme in this literature. We present a modular and transparent approach to equip autonomous agents with the ability to comply with ethical prescriptions, while still enacting pre-learned optimal behaviour. Our approach relies on a normative supervisor module, that integrates a theorem prover for defeasible deontic logic within the control loop (...)
    1 citation
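Entry 50 interposes a normative supervisor between a pre-learned policy and the environment. Abstracting away the defeasible deontic logic theorem prover, the control-loop idea can be sketched as follows; `is_compliant`, the ranked-action interface, and the fallback rule are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")   # state type
A = TypeVar("A")   # action type

def supervised_action(
    state: S,
    ranked_actions: Sequence[A],                # policy's actions, best first
    is_compliant: Callable[[S, A], bool],       # stands in for the normative reasoner
) -> A:
    """Return the highest-ranked action permitted by the supervisor.

    If no action is compliant, fall back to the policy's top choice
    (other fallback rules are possible; this is only a sketch).
    """
    for action in ranked_actions:
        if is_compliant(state, action):
            return action
    return ranked_actions[0]
```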
1 — 50 / 971