An Experimental Study of Reward Design in Multiagent Continuing Tasks: The RoboCup Soccer Keepaway Task as an Example

Transactions of the Japanese Society for Artificial Intelligence 21 (6):537-546 (2006)

Abstract

In this paper, we discuss guidelines for the reward design problem, which defines when and how much reward should be given to the agent or agents, in the context of reinforcement learning. We take keepaway soccer as a standard task of the multiagent domain, one that requires skilled teamwork. Designing rewards for this task is difficult for two reasons: (i) since it is a continuing task with no explicit goal to achieve, it is hard to tell when reward should be given to the agents; (ii) since it is a multiagent cooperative task, it is hard to decide what share of the reward fairly reflects each agent's contribution to achieving the goal. Through experiments, we show that reward design has a major effect on the agents' behavior, and we introduce a successful reward function that makes agents perform keepaway better, and in a more interesting manner, than the conventional one does. Finally, we explore the relationship between 'reward design' and 'acquired behaviors' from the viewpoint of teamwork.

Similar books and articles

Constructing a Learning Agent That Manipulates Its Own Reward According to Environmental Conditions.沼尾 正行 森山 甲一 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:676-683.
A Study of the Reinforcement Function in the Profit Sharing Method.Tatsumi Shoji Uemura Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.
Learning Rational Policies That Avoid Penalties.坪井 創吾 宮崎 和光 - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (2):185-192.
Reward-based distractor interference: associative learning and interference stage.Bing Li - 2021 - Dissertation, Ludwig Maximilians Universität, München
A Reinforcement Learning Method Using a Temperature Distribution Based on Likelihood Information.鈴木 健嗣 小堀 訓成 - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:297-305.

Analytics

Added to PP
2014-03-19