Most traditional clustering algorithms partition a sample set into mutually exclusive classes, whereas real-world classes often have no explicit boundary between them. By introducing rough set theory into cluster analysis, this paper proposes an overlapping rough clustering algorithm called KMMRSC. The algorithm represents each class with multiple centroids and describes the membership of samples using the concepts of upper approximation and lower approximation, so that classes are allowed to overlap. Experiments show that KMMRSC, which can find non-spherical clusters, outperforms classic k-means.
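The abstract does not state the actual assignment or update rules of KMMRSC, so the following is only a minimal sketch of the general idea: classes represented by several centroids, with sample membership expressed through lower and upper approximations. It assumes a ratio-threshold rule in the style of rough k-means variants. The function name `rough_assign`, the `ratio` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def rough_assign(samples, class_centroids, ratio=1.3):
    """Assign samples to lower/upper approximations of each class.

    Assumption (not from the paper): the distance from a sample to a class
    is the distance to the nearest of that class's centroids, and a sample
    belongs to the upper approximation of every class whose distance is
    within `ratio` times the smallest class distance.
    """
    n_classes = len(class_centroids)
    lower = [[] for _ in range(n_classes)]
    upper = [[] for _ in range(n_classes)]
    for i, x in enumerate(samples):
        # Distance to a class = distance to its nearest centroid.
        dists = [min(math.dist(x, c) for c in cents) for cents in class_centroids]
        d_min = min(dists)
        # Classes that are "almost as close" as the nearest class.
        close = [k for k, d in enumerate(dists) if d <= ratio * d_min]
        for k in close:
            upper[k].append(i)
        # A sample that is close to exactly one class certainly belongs to it.
        if len(close) == 1:
            lower[close[0]].append(i)
    return lower, upper

# Toy data: two classes, each described by two centroids.
samples = [(0.0, 0.0), (4.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
class_centroids = [
    [(0.0, 0.0), (3.0, 0.0)],   # class 0
    [(7.0, 0.0), (10.0, 0.0)],  # class 1
]
lower, upper = rough_assign(samples, class_centroids)
print("lower:", lower)
print("upper:", upper)
```

In this toy run the sample at (5, 0) is equally close to both classes, so it appears in both upper approximations and in neither lower approximation, which is the overlapping relationship the abstract describes. Using several centroids per class lets a class cover an elongated region that a single centroid could not.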