1.合肥工业大学 计算机科学与信息学院 合肥 230009
2.常州纺织服装职业技术学院 计算机科学与技术系 常州 213164
3.Department of Computer Science and Technology, University of Vermont, Burlington, USA, 05405
Learning Bayesian Networks Using a Parallel EM Approach
YU Kui1,2, WANG Hao1, WU Xin-Dong1,3, YAO Hong-Liang1
1.Department of Computer Science and Technology, Hefei University of Technology, Hefei 2300092.
Department of Computer Science and Technology, Institute of Textile and Garment of Changzhou, Changzhou 2131643.
Department of Computer Science and Technology, University of Vermont, Burlington, USA, 05405
Computing the expected statistics is the main bottleneck in learning Bayesian networks. Firstly, a parallel expectation-maximization (PL-EM) algorithm for leaning Bayesian network parameters is presented. The PL-EM algorithm parallelizes the E-step and M-step and the greatly reduces the time complexity of the parameter learning. Then PL-EM algorithm is applied to learning Bayesian networks structure, and a parallel learning algorithm is proposed for learning Bayesian networks based on an existing structural EM algorithm (SEM), called PL-SEM. PL-SEM exploits PL-EM algorithm to compute the expected statistics at the structural E_Step. Thus, it can implement the parallel computation of the expected statistics and greatly reduce the time complexity of learning Bayesian networks.
[1] Ghahramani Z. An Introduction to Hidden Markov Models and Bayesian Networks. Journal of Pattern Recognition and Artificial Intelligence, 2001, 15(1): 9-42 [2] Chickering D M, Heckerman D. Efficient Approximations for the Marginal Likelihood of Bayesian Networks with Hidden Variables. Machine Learning, 1997, 29(2/3): 181-221 [3] Wang Shuangcheng, Yuan Senmiao. Research on Learning Bayesian Networks Structure with Missing Data. Journal of Software, 2004, 15(7): 1042-1048 (in Chinese) (王双成,苑森淼.具有丢失数据的贝叶斯网络结构学习研究. 软件学报, 2004, 15(7): 1042-1048) [4] Friedman N. The Bayesian Structural EM Algorithm // Proc of the 14th Conference on Uncertainty in Artificial Intelligence. Madison, USA, 1998: 129-168 [5] Neal R M, Hinton G E. A View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants // Proc of the NATO Advanced Study Institute on Learning in Graphical Models. Norwell, USA, 1998: 355-368 [6] Thiesson B, Meek C, Heckerman D. Accelerating EM for Large Databases. Machine Learning, 2001, 45(3): 279-299 [7] Myers J W, Laskey K B, de Tong K A. Learning Bayesian Networks from Incomplete Data Using Evolutionary Algorithms // Proc of the Conference on Genetic and Evolutionary Computation. Orlando, USA, 1999: 458-465 [8] Tian Fengzhen, Lu Yuchang, Shi Chunyi. Learning Bayesian Networks with Hidden Variables Using the Combination of EM and Evolutionary Algorithms // Proc of the Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining. Hongkong, China, 2001: 568-574 [9] Luna J E O, Zanusso M B, Revisited E M. Algorithms for Learning Structure and Parameters in Bayesian Networks // Proc of the International Conference on Artificial Intelligence. Las Vegas, USA, 2005: 572-578 [10] Jeng W M, Huang S H S. Inter-Iteration Optimization of Parallel EM Algorithm on Message-Passing Multicomputers // Proc of the IEEE International Conference on Parallel Processing. Minneapolis, USA, 1998: 245-252 [11] Chen C M, Lee S Y. Optimal Data Replication: A New Approach to Optimizing Parallel EM Algorithms on a Mesh-Connected Multiprocessor for 3D PET Image Reconstruction. IEEE Trans on Nuclear Science, 1995, 42(4): 1235-1245 [12] Kruengkrai C, Jaruskulchai C. A Parallel Learning Algorithm for Text Classification // Proc of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Canada, 2002: 201-206 [13] Leonid G, Gagan A. Parallelizing EM Clustering Algorithm on a Cluster of SMPs // Proc of the International Euro-Par Conference. Pisa, Italy, 2004: 372-380 [14] Chu Tongsheng, Xiang Yang. Exploring Parallelism in Learning Belief Networks // Proc of the 13th Conference on Uncertainty in Artificial Intelligence. Providence, USA, 1997: 90-98 [15] Lam W, Segre A M. A Distributed Learning Algorithm for Bayesian Inference Networks. IEEE Trans on Knowledge and Data Engineering, 2002, 14(1): 93-105 [16]Munetomo M, Murao N, Akama K. Empirical Studies on Parallel Network Construction of Bayesian Optimization Algorithms // Proc of the IEEE Congress on Evolutionary Computation. Edinburgh, UK, 2005: 1524-1531 [17] Gropp W, Lusk E, Skjellum A. Using MPI: Portable Parallel Programming with the Message-Passing Interface. Cambridge, USA: MIT Press, 1999 [18] Norsys Software Corp. Netica [DB/OL]. [2007-03-27]. http://www.norsys.com