Volume 32 Issue 11
Dec.  2010
Zong Yu, Li Ming-Chu, Xu Guan-Dong, Zhang Yan-Chun. High Dimensional Clustering Algorithm Based on Local Significant Units[J]. Journal of Electronics & Information Technology, 2010, 32(11): 2707-2712. doi: 10.3724/SP.J.1146.2009.01589

High Dimensional Clustering Algorithm Based on Local Significant Units

doi: 10.3724/SP.J.1146.2009.01589
  • Received Date: 2009-12-11
  • Revised Date: 2010-05-20
  • Publish Date: 2010-11-19
  • High-dimensional clustering algorithms based on equal-width or random-width density grids cannot guarantee high-quality clustering results on complicated data sets. To address this problem, this paper proposes a High dimensional Clustering algorithm based on Local Significant Units (HC_LSU), built on kernel density estimation and spatial statistical theory. First, a structure called the Local Significant Unit (LSU) is defined using local kernel density estimation and a spatial statistical test. Second, a greedy algorithm, the Greedy Algorithm for LSU (GA_LSU), is proposed to find the local significant units in a data set quickly. Finally, the single-linkage algorithm is run on the local significant units that share the same attribute subset to generate the clustering results. Experimental results on 4 synthetic and 6 real-world data sets show that HC_LSU can effectively find high-quality clusters in highly complicated data sets.
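The three stages named in the abstract (density estimation plus a significance test, a search for significant units, and single-linkage merging) can be illustrated with a small one-dimensional sketch. This is not the authors' HC_LSU or GA_LSU: the abstract does not give their definitions, so everything here is an assumption made for illustration. In particular, the sketch scans fixed-width cells (the very simplification the paper argues against), uses a Gaussian kernel, and substitutes a Poisson tail test against a uniform baseline for the paper's spatial statistical test. The function names, `bandwidth`, `alpha`, and `max_gap` are all hypothetical.

```python
import math

def gaussian_kde(points, x, bandwidth):
    """Gaussian kernel density estimate at x from 1-D sample points."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)

def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed as 1 minus the lower tail."""
    term = math.exp(-mu)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= mu / (i + 1)  # P(X = i + 1)
    return max(0.0, 1.0 - cdf)

def significant_units(points, lo, hi, n_cells=20, bandwidth=0.05, alpha=0.01):
    """Return cells (a, b) that pass both checks (assumed stand-ins for an LSU):
    the KDE at the cell centre exceeds the uniform density, and the cell count
    is significantly larger than expected under a uniform Poisson baseline."""
    width = (hi - lo) / n_cells
    uniform_density = 1.0 / (hi - lo)
    expected = len(points) / n_cells
    units = []
    for c in range(n_cells):
        a, b = lo + c * width, lo + (c + 1) * width
        if gaussian_kde(points, 0.5 * (a + b), bandwidth) <= uniform_density:
            continue  # not locally dense
        count = sum(1 for p in points if a <= p < b)
        if poisson_tail(count, expected) < alpha:
            units.append((a, b))
    return units

def single_linkage(units, max_gap):
    """Single-linkage merge of 1-D intervals: units whose gap is at most
    max_gap end up in the same cluster."""
    clusters = []
    for a, b in sorted(units):
        if clusters and a - clusters[-1][1] <= max_gap:
            clusters[-1] = (clusters[-1][0], b)
        else:
            clusters.append((a, b))
    return clusters

def demo_points():
    """Deterministic toy data: one wide and one narrow dense region plus a
    sparse, evenly spread background (all values are illustrative)."""
    wide = [0.105 + 0.00088 * i for i in range(100)]
    narrow = [0.805 + 0.0008 * i for i in range(50)]
    background = [i / 20 + 0.025 for i in range(20)]
    return wide + narrow + background
```

Calling `single_linkage(significant_units(demo_points(), 0.0, 1.0), max_gap=1e-9)` merges adjacent significant cells into one cluster and keeps distant ones apart. Because the sketch is one-dimensional, every unit trivially shares the same attribute subset; a faithful version would group units by subspace before linking them.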
