A Deep Web Query Interface Matching Approach Based on Evidence Theory and Task Assignment
DONG Yong-Quan1,2, LI Qing-Zhong1, DING Yan-Hui1, ZHANG Yong-Xin1
1 School of Computer Science and Technology, Shandong University, Jinan 250101; 2 School of Computer Science and Technology, Xuzhou Normal University, Xuzhou 221006
Abstract: Existing query interface matching approaches have two limitations: the weights of the matchers are difficult to set, and matching decisions are not processed efficiently. To address these limitations, a Deep Web query interface matching approach based on evidence theory and task assignment is proposed, called ETTA-IM (evidence theory and task assignment based query interface matching). First, an improved Dempster-Shafer (D-S) evidence theory is used to combine multiple matchers automatically, so the weight of each matcher does not need to be set by hand and human involvement is reduced. Then, the one-to-one matching decision is converted into an extended task assignment problem, which selects a proper correspondence in the target query interface for each source attribute. Finally, based on the one-to-one matching results, heuristic rules over the tree structure are used to make one-to-many matching decisions. Experimental results show that ETTA-IM achieves high precision and recall.
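The first two steps of the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: it uses the classic Dempster combination rule rather than the improved D-S variant the authors propose, it solves a plain one-to-one assignment by brute force rather than the extended task assignment formulation, and the function names `dempster_combine` and `best_one_to_one`, as well as all scores, are hypothetical.

```python
from itertools import permutations


def dempster_combine(m1, m2):
    """Combine two mass functions with the classic Dempster rule.

    Each mass function maps a frozenset of hypotheses (candidate target
    attributes) to a mass, and the masses sum to 1. This is the textbook
    rule, not the improved variant described in the paper.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("matchers are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def best_one_to_one(score):
    """Pick the one-to-one assignment with the highest total score.

    score[i][j] is the combined belief that source attribute i matches
    target attribute j. Assumes len(score) <= len(score[0]). Brute force
    is used only to keep the sketch dependency-free; it is exponential.
    """
    n_src, n_tgt = len(score), len(score[0])
    best, best_total = None, float("-inf")
    for perm in permutations(range(n_tgt), n_src):
        total = sum(score[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best, best_total = list(perm), total
    return best, best_total


# Hypothetical evidence from two matchers for one source attribute.
TITLE, AUTHOR = frozenset({"title"}), frozenset({"author"})
THETA = TITLE | AUTHOR  # the whole frame: "cannot tell"
name_matcher = {TITLE: 0.6, AUTHOR: 0.1, THETA: 0.3}
type_matcher = {TITLE: 0.5, AUTHOR: 0.2, THETA: 0.3}
belief = dempster_combine(name_matcher, type_matcher)

# Hypothetical combined scores: rows are source attributes, columns targets.
scores = [[0.9, 0.8], [0.85, 0.1]]
assignment, total = best_one_to_one(scores)
```

In this toy score matrix, picking the best target for each source attribute independently would send both source attributes to target 0, whereas the assignment view chooses the pairing with the best overall score. For larger interfaces, a polynomial-time solver such as the Hungarian method would replace the brute-force search.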
DONG Yong-Quan, LI Qing-Zhong, DING Yan-Hui, ZHANG Yong-Xin. A Deep Web Query Interface Matching Approach Based on Evidence Theory and Task Assignment. Pattern Recognition and Artificial Intelligence, 2011, 24(2): 262-271.