A Multiagent Online Learning Method Based on Internal Inference

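Abstract: In a multiagent environment, an agent's optimal policy depends on the policies of the other agents, which makes the learning objective hard to define clearly. Methods that model opponents from objectively observed behavior do not adequately reflect the individual rationality of agents. This paper proposes an efficient online learning method for agents in multiagent environments based on internal inference. It combines objectively observed behavior, captured by an opponent model, with subjective inference of the opponent's intentions, obtained by reasoning from the opponent's point of view. Through internal inference, an agent can obtain more information about its opponents. Simulation experiments on a classical coordination game show that the method achieves good coordination performance.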

Keywords: multiagent system, online learning, internal inference, electronic market

Abstract: In a multiagent environment, the optimal policy of an agent depends on the policies of the other agents, which makes learning more problematic. Previous algorithms based on the observed behavior of opponents cannot fully reflect individual rationality. An efficient online learning algorithm based on internal inference is proposed, which integrates the objectively observed behavior of the opponents with subjective inference about their intentions. Through internal inference, agents can obtain more information about their opponents and thus learn more efficiently. Simulation results show that the proposed algorithm performs well in a classical coordination game.

Key words: Multiagent System, Online Learning, Internal Inference, Electronic Market

Received: 2005-05-16

Chinese Library Classification (ZTFLH): TP181.1

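About the authors: HAN Wei, male, born in 1975, Ph.D. His main research interests are multiagent systems and autonomic computing. E-mail: dallashw@gmail.com. CHEN YouGuang, male, born in 1971, Ph.D. His main research interests are pattern recognition and image processing. JIANG ChangHua, male, born in 1973, Ph.D. His main research interest is intelligent algorithms.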

Cite this article:

HAN Wei, CHEN YouGuang, JIANG ChangHua. An Internal Inference Based Multiagent Learning Method [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2007, 20(2): 254-260.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2007/V20/I2/254
