Abstract:Interactive Dynamic Influence Diagrams(I-DIDs), as graphic models based on probabilistic graphical theory, are proposed to represent, the sequential decision-making problem over multiple time steps in the presence of other interacting agents. The algorithms for solving I-DIDs are haunted by the challenge of an exponentially growing space of candidate models ascribed to other agents over time. In this paper, in order to reduce the candidate model space according the behaviorally equivalent theory, a more efficient way to construct Epsilon behavior equivalence classes is discussed that using belief-behavior graph (BBG). A method of solving I-DIDs approximately is presented, which avoids solving all candidate models by clustering models with beliefs that are spatially close and selecting a representative one from each cluster. The simulation results show the validity of the improved algorithm.
[1] Tatman J A,Shachter R D.Dynamic Programming and Influence Diagrams.IEEE Trans on Systems,Man and Cybernetics,1990,20: 365-379 [2] Yao Hongliang,Wang Hao,Zhang Yousheng,et al.Multi-Agent Dynamic Influence Diagrams and Its Approximation of Probability Distribution.Pattern Recognition and Artificial Intelligence,2007,20(4): 521-532 (in Chinese) (姚宏亮,王 浩,张佑生,等.多Agent动态影响图及其概率分布的近似方法.模式识别与人工智能,2007,20(4): 521-532) [3] Yao Hongliang,Wang Hao,Wang Ronggui,et al.Approximate Computation of Multi-Agent Dynamic Influence Diagrams.Journal of Computer Research and Development,2008,45(3): 487-495 (in Chinese) (姚宏亮,王 浩,汪荣贵,等.多Agent动态影响图的近似计算方 法.计算机研究与发展,2008,45(3): 487-495) [4] Gmytrasiewicz P J,Doshi P.A Framework for Sequential Planning in Multi-Agent Settings.Journal of Artificial Intelligence Research,2005,24(1): 49-79 [5] Doshi P,Zeng Y F,Chen Q Y.Graphical Models for Interactive POMDPs: Representation and Solutions.Journal of Autonomous Agents and Multi-Agent Systems,2009,18(3): 376-416 [6] Polich K,Gmytrasiewicz P J.Interactive Dynamic Influence Diagrams // Proc of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems,New York,USA: ACM Press,2007: 147-149 [7] Zeng Y F,Doshi P,Chen Q Y.Approximate Solutions of Interactive Dynamic Influence Diagrams Using Model Clustering // Proc of the 22nd International Conference on Association for the Advancement of Artificial Intelligence.Vancouver,Canada: AAAI Press,2007: 782-787 [8] Zeng Y F,Doshi P.Speeding up Exact Solutions of Interactive Dynamic Influence Diagrams Using Action Equivalence // Proc of the 21st International Joint Conference on Artificial Intelligence.Pasadena,USA,2009: 1996-2001 [9] Doshi P,Zeng Y F.Improved Approximation of Interactive Dynamic Influence Diagrams Using Discriminative Model Updates // Proc of the 8th International Conference on Autonomous Agents and Multi-Agent Systems.Budapest,Hungray,2009: 907-914 [10] Smallwood R D,Sondik E J.The Optimal Control of Partially Observable Markov Decision Processes over a Finite Horizon.Operations Research,1973,21(5): 1071-1088 [11] Pynadath D V,Marsella S C.Minimal Mental Models // Proc of the 22nd International Conference on Association for the Advancement of Artificial Intelligence.Vancouver,Canada,2007: 1038-1044 [12] Geng S Y,Qun W L.Discrete Mathematics.Beijing: Higher Education Press,1998 (耿素云,屈婉玲.离散数学.北京:高等教育出版社,1998)