将多Agent影响图(MAIDs)在时间上进行扩展,提出一种决策模型:多Agent动态影响图(MADIDs),用于表示动态环境中多Agent协作的结构关系.为了有效计算MADIDs的概率分布,以Agents之间的策略偏序关系为指导,给出概率分布的一种分解近似方法,进而讨论概率分布在推理中的近似.对MADIDs概率分布计算的复杂性、误差以及误差在时间上的传播进行分析,进而基于KL差分,给出一个可对近似分布的精度和复杂性进行均衡的函数.最后,针对一个表示协作关系的MADID模型,进行实验和算法比较,实验结果显示该概率分布近似方法的有效性.
Abstract
MultiAgent dynamic influence diagrams (MADIDs) are presented by extending MultiAgent Influence Diagrams (MAIDs) over time. Thus the structural relationships of coordination can be represented in dynamic environment. With the guidance of the strategic relevance, a decomposition approximation method of probability distribution and the approximation of probability distribution in inference are discussed to compute the probability distribution of MADIDs efficiently. The complexity, inducing error and error propagation over time are analyzed. Furthermore, based on the KLdivergence, a function is introduced to establish equilibrium between the precision and the complexity of approximate distribution. Finally, the experimental results on a dynamic coordination model show the validity of the probability distribution approximation method.
关键词
多Agent动态影响图(MADIDs) /
KL差分 /
联合树 /
扩展BK(EBK)算法
{{custom_keyword}} /
Key words
MultiAgent Dynamic Influence Diagrams (MADIDs) /
Kullbak Leibler (KL)Divergence /
Junction Tree /
Extensive Boyen Kollen (EBK) Algorithm
{{custom_keyword}} /
{{custom_sec.title}}
{{custom_sec.title}}
{{custom_sec.content}}
参考文献
[1] Oliver N M, Rosario B, Pentland A P. A Bayesian Computer Vision System for Modeling Human Interactions. IEEE Trans on Pattern Analysis and Machine Intelligence, 2000, 22(8): 831843
[2] Wang Hongwei, Li Shen, Liu Huixin. Entropic Measurements of Complexity for Markov Decision Processes. Control and Decision, 2004, 19(9): 983987,993 (in Chinese)
(王红卫,李 琛,刘会新.马尔可夫决策过程复杂性的熵测度.控制与决策, 2004, 19(9): 983987,993)
[3] Boutilier C, Poole D. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations // Proc of the 13th National Conference on Artificial Intelligence. Portland, USA, 1996: 11681175
[4] Barto A G, Mahadevan S. Recent Advances in Hierarchical Reinforcement Learning. Discrete Event Dynamic Systems, 2003, 13(1/2): 4177
[5] Dagum P, Luby M. Approximating Probabilistic Inference Using Bayesian Networks Is NPHard. Artificial Intelligence, 1993, 60(1): 141153
[6] Howard R A, Matheson J E. Influence Diagrams. Readings on the Principles and Applications of Decision Analysis, 1984, 11(2): 719762
[7] Koller D, Milch B. MultiAgent Influence Diagrams for Representing and Solving Games. Games and Economic Behavior, 2003, 45(1): 181221
[8] Gal Y, Pfeffer A. A Language for Modeling Agents Decision Making Processes in Games // Proc of the 2nd International Joint Conference on Autonomous Agents and Multiagent Systems. Melbourne, Australia, 2003: 265272
[9] Boyen X, Kollen D. Tractable Inference for Complex Stochastic Processes // Proc of the 14th Annual Conference on Uncertainty in Artificial Intelligence. Madison, USA, 1998: 3342
[10] Frick M, Groiie M. Deciding FirstOrder Properties of Locally TreeDecomposable Graphs. Journal of the ACM, 2001, 48(6): 11841206
[11] Draper D. Clustering without (Thinking about) Triangulation // Proc of the 11th Annual Conference on Uncertainty in Artificial Intelligence. Montreal, Canada, 1995: 125133
[12] Rached Z, Alajaji F, Campbell L L. The KullbackLeibler Divergence Rate between Markov Sources Information Theory. IEEE Trans on Information Theory, 2004, 50(5): 917921
[13] Kjaerulff U. Reduction of Computational Complexity in Bayesian Networks through Removal of Weak Dependences // Proc of the 10th Annual Conference on Uncertainty in Artificial Intelligence. Seattle, USA, 1994: 374382
[14] Paskin M A. Thin Junction Tree Filters Frontier for Simultaneous Localization and Mapping // Proc of the 18th International Joint Conference on Artificial Intelligence. Acapulco, Mexico, 2003: 11571164
[15] Burkhard H D, Duhaut D, Fujita M, et al. The Road to RoboCup 2050. Robotics & Automation Magazinge, 2002, 9(2): 3138
[16] Tian Fengzhan, Zhang Hongwei, Lu Yuchang, et al. Simplification of Inferences in Multiply Sectioned Bayesian Networks. Journal of Computer Research and Development, 2003, 40(8): 12301237 (in Chinese)
(田凤占,张宏伟,陆玉昌,等.多模块贝叶斯网络中推理的简化.计算机研究与发展, 2003, 40(8): 12301237)
[17] Murphy K. The Bayes Net Toolbox for Matlab. Computing Science Statics, 2001, 33(2): 331351
{{custom_fnGroup.title_cn}}
脚注
{{custom_fn.content}}
基金
国家自然科学基金(No.60575023)、教育部博士点基金(No.20050359012)资助项目
{{custom_fund}}