Abstract:Skyline query plays an important role in multi-decision and data mining. However, with the growth of data dimension, Skyline set becomes very large. Skyline representative point query is studied to overcome this shortcoming. A new evaluation function is proposed to improve the score-computing of Skyline points so as to select k representative Skyline points. A dynamic programming based algorithm(DPBA) in two-dimensional space is presented. The Eulerian distance between representative point and non-representative point is determined by the cover circle. k representative points are got by computing the evaluation function iteratively. In high-dimensional space, an approximate solution based on aR-tree index is proposed to solve the NP-hard problem. The index tree is traversed to judge whether it is dominated by the candidate Skyline sets. If it is dominated, it should be pruned to reduce the computation cost. The experiments of synthetic and real data show that the proposed algorithms are effective and efficient.
[1] B rzs nyi S, Kossmann D, Stocker K. The Skyline Operator // Proc of the 17th International Conference on Data Engineering. Heidelberg, Germany, 2001: 421-430 [2] Chan C Y, Jagadish H V, Tan K L, et al. Finding k-Dominant Skylines in High Dimensional Space // Proc of the ACM SIGMOD International Conference on Management of Data. Chicago, USA, 2006: 503-514 [3] Yin J, Yao S Y, Xue S E, et al. An Index Based Efficient k-Dominant Skyline Algorithm. Chinese Journal of Computers, 2010, 33(7): 1236-1245 (in Chinese) (印 鉴,姚树宇,薛少锷,等.一种基于索引的高效k-支配Skyline算法.计算机学报, 2010, 33(7): 1236-1245) [4] Yiu M L, Mamoulis N. Efficient Processing of Top-k Dominating Queries on Multi-dimensional Data // Proc of the 33rd International Conference on Very Large Data Bases. Vienna, Austria, 2007: 483-494 [5] Lin X M, Yuan Y D, Zhang Q, et al. Selecting Stars: The k Most Representative Skyline Operator // Proc of the 23rd IEEE International Conference on Data Engineering. Istanbul, Turkey, 2007: 86-95 [6] Tao Y F, Ding L, Lin X M, et al. Distance-Based Representative Skyline // Proc of the 25th IEEE International Conference on Data Engineering. Shanghai, China, 2009: 892-903 [7] Chen L, Wu J, Deng S G, et al. Service Recommendation: Similarity-Based Representative Skyline // Proc of the 6th World Congress on Services. Miami, USA, 2010: 360-366 [8] Das Sarma A, Lall A, Nanongkai D, et al. Representative Skylines Using Threshold-Based Preference Distributions // Proc of the 27th IEEE International Conference on Data Engineering. Hannover, Germany, 2011: 387-398 [9] Magnani M, Assent I, Mortensen M L. Taking the Big Picture: Representative Skylines Based on Significance and Diversity. The VLDB Journal, 2014, 23(5): 795-815 [10] Papadias D, Tao Y F, Fu G, et al. Progressive Skyline Computation in Database Systems. ACM Transactions on Database Systems, 2005, 30(1): 41-82