Chinese Geographic Entity Resolution Based on Markov Logic Network
HU Yi-Min1,2, SONG Liang-Tu1,2, CHEN Peng1,2, WEI Yuan-Yuan1, SU Ya-Ru1,2
1. Research Center for Intelligent Information Systems, Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031
2. School of Information Science and Technology, University of Science and Technology of China, Hefei 230026
Abstract: A Markov logic network combines the expressive representation of first-order logic with the uncertainty handling of probabilistic graphical models. To improve named entity resolution on unstructured data, an entity resolution method based on Markov logic networks is proposed, together with a property extraction algorithm that uses an ontology and web search. The method is then applied to the resolution of Chinese geographic names. The experimental results show that the proposed method is effective for geographic entity resolution.