A Deep Web Query Interface Matching Approach Based on Evidence Theory and Task Assignment
DONG Yong-Quan1,2, LI Qing-Zhong1, DING Yan-Hui1, ZHANG Yong-Xin1
1 School of Computer Science and Technology, Shandong University, Jinan 250101; 2 School of Computer Science and Technology, Xuzhou Normal University, Xuzhou 221006
Abstract Existing query interface matching approaches have two limitations: the weights of the individual matchers are difficult to set, and matching decisions are not processed efficiently. To address these limitations, a Deep Web query interface matching approach based on evidence theory and task assignment is proposed, called the evidence theory and task assignment based query interface matching approach (ETTA-IM). First, an improved Dempster-Shafer (D-S) evidence theory is used to combine multiple matchers automatically, so the weight of each matcher does not need to be set by hand and human involvement is reduced. Then, the one-to-one matching decision is converted into an extended task assignment problem, which selects a proper correspondence in the target query interface for each source attribute. Finally, based on the one-to-one matching results, heuristic rules over the tree structure are used to make one-to-many matching decisions. Experimental results show that ETTA-IM achieves high precision and recall.
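The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses the classical Dempster rule of combination rather than the improved D-S theory, and a brute-force search rather than the extended task assignment formulation. The names `dempster_combine` and `best_one_to_one`, the attribute labels, and all mass values are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset of candidate target attributes
    to a mass in [0, 1]; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("matchers are in total conflict")
    # Renormalise so the combined masses sum to 1 again.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def best_one_to_one(sim):
    """Pick the one-to-one assignment with the highest total similarity.

    sim[i][j] is the combined belief that source attribute i matches
    target attribute j. Brute force, so only suitable for small
    interfaces; assumes no more source than target attributes.
    """
    n_src, n_tgt = len(sim), len(sim[0])
    best, best_score = None, float("-inf")
    for perm in permutations(range(n_tgt), n_src):
        score = sum(sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = list(perm), score
    return best, best_score


# Hypothetical example: two matchers judge one source attribute against
# the target attributes "title" and "author". Mass on the whole frame
# expresses a matcher's ignorance, so no hand-set weight is needed.
frame = frozenset({"title", "author"})
name_matcher = {frozenset({"title"}): 0.6, frozenset({"author"}): 0.1, frame: 0.3}
type_matcher = {frozenset({"title"}): 0.5, frozenset({"author"}): 0.2, frame: 0.3}
fused = dempster_combine(name_matcher, type_matcher)

# Hypothetical combined beliefs for two source and two target attributes.
sim = [[0.9, 0.8],
       [0.85, 0.1]]
assignment, total = best_one_to_one(sim)
print(fused[frozenset({"title"})], assignment, total)
```

In the second example, picking the best target for each source attribute independently would map both source attributes to the first target attribute. Treating the decision as an assignment problem enforces the one-to-one constraint and maximises the total similarity instead. For larger interfaces a polynomial-time solver such as the Hungarian method would replace the brute-force search.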
DONG Yong-Quan, LI Qing-Zhong, DING Yan-Hui, et al. A Deep Web Query Interface Matching Approach Based on Evidence Theory and Task Assignment[J]. , 2011, 24(2): 262-271.