An Internal-Inference Based Multiagent Learning Method
HAN Wei1,2, CHEN You-Guang2, JIANG Chang-Hua2
1. Information Science and Engineering College, Nanjing University of Finance and Economics, Nanjing 210046; 2. Information Science and Technology College, East China Normal University, Shanghai 200062
Abstract: In a multiagent environment, the optimal policy of an agent depends on the policies of the other agents, which makes learning more difficult. Previous algorithms that rely only on the observed behavior of opponents cannot fully capture individual rationality. An efficient online learning algorithm based on internal inference is proposed, which integrates the objectively observed behavior of the opponents with their subjectively inferred intentions. Through internal inference, agents can obtain more information about their opponents and thus learn more efficiently. Simulation results show that the proposed algorithm performs well in a classical coordination game.