Abstract: Spatial clustering is one of the most important spatial data mining techniques. In this paper, an improved spatial clustering algorithm (AISCA) based on DBSCAN is proposed. To cluster large-scale spatial databases effectively, the proposed algorithm adopts a new sampling technique. In addition, it considers not only spatial attributes but also non-spatial attributes by introducing the concept of the matching neighborhood. Experimental results on 2D spatial datasets show that the proposed algorithm is feasible and efficient.