Overlapping-Based Rough Clustering Algorithm
doi: 10.3724/SP.J.1146.2007.00450
Abstract: Most traditional clustering algorithms produce a partition of the sample set in which classes are strictly mutually exclusive, whereas classes in the real world often have no clear boundaries. This paper introduces rough set theory into clustering analysis and proposes an overlapping-based rough clustering algorithm, KMMRSC. It represents each class with multiple centroids and describes the membership of samples with upper and lower approximations, so that classes may overlap one another. Experimental results show that KMMRSC achieves better clustering quality than the k-means algorithm and can discover non-spherical clusters.
Keywords:
- Rough clustering; Overlapping; Multiple centroids
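The abstract names two ideas: a class represented by several centroids, and sample membership expressed through lower and upper approximations. The sketch below illustrates only those two ideas and is not the KMMRSC procedure itself, whose steps are not given here. The function name `rough_assign`, the `ratio` threshold, and the toy data are all assumptions made for illustration: a sample is placed in the upper approximation of every class whose nearest centroid is within `ratio` times the distance to the closest class, and in a lower approximation only when exactly one class qualifies.

```python
import numpy as np

def rough_assign(X, centroids, labels, ratio=1.2):
    """Illustrative rough assignment with multi-centroid classes.

    X: (n, d) samples; centroids: (m, d); labels: (m,) class id of each centroid.
    ratio: hypothetical threshold (an assumption, not taken from the paper).
    Returns class ids and boolean (n, k) masks for lower and upper approximations.
    """
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Distance from every sample to every centroid, shape (n, m).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

    # Distance to a class = distance to the nearest of its centroids, shape (n, k).
    class_d = np.stack([d[:, labels == c].min(axis=1) for c in classes], axis=1)

    # Upper approximation: every class that is almost as close as the closest one.
    nearest = class_d.min(axis=1, keepdims=True)
    upper = class_d <= ratio * nearest

    # Lower approximation: the sample belongs to exactly one class.
    lower = upper & (upper.sum(axis=1, keepdims=True) == 1)
    return classes, lower, upper

# Toy data: class 0 is elongated and uses two centroids, class 1 uses one.
centroids = [[0.0, 0.0], [2.0, 0.0], [6.0, 0.0]]
labels = [0, 0, 1]
X = [[1.0, 0.0], [2.1, 0.0], [4.1, 0.0], [6.0, 0.5]]

classes, lower, upper = rough_assign(X, centroids, labels)
for x, lo, up in zip(X, lower, upper):
    print(x, "lower:", classes[lo].tolist(), "upper:", classes[up].tolist())
```

In this toy run the sample at (4.1, 0) lies between the two classes, so it falls in both upper approximations and in no lower approximation, while the other three samples each belong to the lower approximation of a single class. Using two centroids for class 0 is what lets an elongated, non-spherical group be covered without being split.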