Volume 27 Issue 4
Apr.  2005
Citation: Jiang Yuan, Zhang Zhao-yang, Qiu Pei-liang, Zhou Dong-fang. Clustering Algorithms Used in Data Mining[J]. Journal of Electronics & Information Technology, 2005, 27(4): 655-662.

Clustering Algorithms Used in Data Mining

  • Received Date: 2003-12-22
  • Revision Received Date: 2004-04-26
  • Publish Date: 2005-04-19
  • Data mining extracts interesting information from Very Large DataBases (VLDB), and clustering plays a prominent role in data mining applications. Clustering divides a database into groups of similar objects. From a machine learning perspective, clusters correspond to hidden patterns, and the search for clusters is a form of unsupervised learning. Dozens of clustering algorithms are now used in fields such as statistics, pattern recognition, and machine learning. This paper surveys the clustering algorithms used in data mining, classifies them into seven classes, summarizes each class, and analyzes its performance.

