Robust Fuzzy C-means Clustering Algorithm Integrating Between-cluster Information

Yunlong GAO, Chengyu YANG, Zhihao WANG, Sizhe LUO, Jinyan PAN

Citation: Yunlong GAO, Chengyu YANG, Zhihao WANG, Sizhe LUO, Jinyan PAN. Robust Fuzzy C-means Clustering Algorithm Integrating Between-cluster Information[J]. Journal of Electronics & Information Technology, 2019, 41(5): 1114-1121. doi: 10.11999/JEIT180604

doi: 10.11999/JEIT180604
Funds: The National Natural Science Foundation of China (61203176), The Natural Science Foundation of Fujian Province (2013J05098, 2016J01756)
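Details
    About the authors:

    Yunlong GAO: male, born in 1979, associate professor. Research interests: machine learning, time series analysis, and optimization and scheduling of manufacturing systems.

    Chengyu YANG: male, born in 1996, undergraduate student. Research interest: machine learning.

    Zhihao WANG: male, born in 1993, master's student. Research interests: pattern recognition and machine learning.

    Sizhe LUO: male, born in 1995, master's student. Research interests: dimensionality reduction, pattern recognition, and machine learning.

    Jinyan PAN: female, born in 1978, associate professor. Research interests: artificial intelligence, and machine learning theory and methods.

    Corresponding author:

    Jinyan PAN gaoyl@xmu.edu.cn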

  • Chinese Library Classification (CLC) number: TP311.13

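  • Abstract:

    Compared with the classical K-means clustering algorithm, the fuzzy C-means (FCM) clustering algorithm introduces a fuzzy factor to account for the relationships between different clusters, and thereby obtains clustering results with better separability. However, the fuzzy factor also gives every sample point some degree of fuzziness, which makes FCM highly susceptible to noise and outliers and leaves its clustering results with poor generalization performance. This paper therefore proposes a robust FCM algorithm with between-cluster separability (RBI-FCM). RBI-FCM exploits the sparsity that the K-means algorithm imposes on fuzzy memberships to reduce the interaction between different clusters and to emphasize separability in the regions where clusters adjoin. In addition, RBI-FCM takes the separability between clusters into account while minimizing within-cluster scatter, which can improve the generalization performance of the clustering model. An effective iterative algorithm is designed to solve the model. Experimental results show that RBI-FCM improves the robustness of FCM, effectively reduces the sensitivity of FCM to differences in cluster distributions and to unbalanced sampling, and yields the desired clustering results.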
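    As a point of reference for the abstract, the following is a minimal sketch of standard FCM in its classical form, not of RBI-FCM, whose objective function and update rules are not given on this page. The function names, the default fuzzifier `m=2.0`, and the random initialization are illustrative assumptions. `hard_memberships` shows the one-hot (sparse) assignment used by K-means, which the abstract contrasts with fuzzy memberships.

```python
import math
import random


def fuzzy_memberships(points, centers, m=2.0):
    """Standard FCM rule: u_ik = d_ik^(-2/(m-1)) / sum_j d_jk^(-2/(m-1))."""
    rows = []
    for p in points:
        d = [math.dist(p, v) for v in centers]
        if min(d) == 0.0:
            # Point sits exactly on a center: split membership among such centers.
            hits = [1.0 if dk == 0.0 else 0.0 for dk in d]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            rows.append([x / sum(inv) for x in inv])
    return rows


def hard_memberships(points, centers):
    """K-means style assignment: each row is one-hot, hence sparse."""
    rows = []
    for p in points:
        d = [math.dist(p, v) for v in centers]
        nearest = d.index(min(d))
        rows.append([1.0 if i == nearest else 0.0 for i in range(len(centers))])
    return rows


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain FCM (not RBI-FCM). Returns (centers, memberships)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]  # assumed initialization
    dims = len(points[0])
    for _ in range(iters):
        u = fuzzy_memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]  # weights u_ik^m
            total = sum(w)
            new_centers.append(
                [sum(wk * p[dim] for wk, p in zip(w, points)) / total
                 for dim in range(dims)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fuzzy_memberships(points, centers, m)
```

    Every row returned by `fuzzy_memberships` sums to 1 and is generally dense, so each point contributes to every center update. This is the property the abstract identifies as the source of FCM's sensitivity to noise and outliers.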

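  • Figure 1  Distribution of the maximum membership value curves of the clustering results

    Figure 2  Synthetic dataset with clusters of differing density

    Figure 3  Accuracy curves of the clustering results

    Figure 4  Synthetic dataset with unevenly distributed sample sizes

    Figure 5  Accuracy curves of the clustering results

    Figure 6  Synthetic non-spherical dataset and clustering results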

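    Table 1  Experiment 1: main parameters of the synthetic datasets

    | Dataset | Class centers | Covariance matrices | Samples per class |
    |---|---|---|---|
    | 1 | (5, 5), (15, 15) | [1 0; 0 1], [1 0; 0 1] | 50, 50 |
    | 2 | (5, 5), (15, 15) | [1 0; 0 1], [2 0; 0 2] | 50, 50 |
    | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
    | 10 | (5, 5), (15, 15) | [1 0; 0 1], [10 0; 0 10] | 50, 50 |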
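    The datasets of Table 1 could be regenerated along the following lines. This is a sketch under assumptions: the elided rows are taken to follow the pattern [k 0; 0 k] for k = 1, ..., 10, and the function name `make_density_dataset` and the seeds are hypothetical. Because each covariance matrix is a multiple of the identity, the two coordinates can be drawn independently with standard deviation sqrt(k).

```python
import math
import random


def make_density_dataset(k, n_per_class=50, seed=0):
    """Dataset k of Table 1: two 2-D Gaussian classes with covariances I and k*I."""
    rng = random.Random(seed)
    specs = [((5.0, 5.0), 1.0), ((15.0, 15.0), float(k))]  # (center, variance)
    points, labels = [], []
    for label, (center, var) in enumerate(specs):
        std = math.sqrt(var)  # covariance var*I means per-axis std sqrt(var)
        for _ in range(n_per_class):
            points.append((rng.gauss(center[0], std), rng.gauss(center[1], std)))
            labels.append(label)
    return points, labels


# One dataset per row of Table 1 (assumed pattern k = 1..10).
datasets = {k: make_density_dataset(k, seed=k) for k in range(1, 11)}
```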

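    Table 2  Experiment 2: main parameters of the synthetic datasets

    | Dataset | Centers of the circles in which samples are randomly distributed | Samples per class |
    |---|---|---|
    | 1 | (5, 5), (15, 15) | 50, 50 |
    | 2 | (5, 5), (15, 15) | 50, 51 |
    | $\vdots$ | $\vdots$ | $\vdots$ |
    | 151 | (5, 5), (15, 15) | 50, 200 |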
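    A comparable sketch for Table 2 follows. The table gives only the circle centers and class sizes, so the circle radius (`radius=3.0`), uniform sampling inside each circle, and the rule that dataset i has 50 + (i - 1) samples in the second class are all assumptions, as is the function name `make_imbalanced_dataset`.

```python
import math
import random


def make_imbalanced_dataset(index, radius=3.0, seed=0):
    """Dataset `index` of Table 2: class sizes 50 and 50 + (index - 1)."""
    rng = random.Random(seed)
    specs = [((5.0, 5.0), 50), ((15.0, 15.0), 50 + (index - 1))]
    points, labels = [], []
    for label, (center, n) in enumerate(specs):
        for _ in range(n):
            # Uniform point in a disk: radius scales with sqrt of a uniform draw.
            r = radius * math.sqrt(rng.random())
            theta = 2.0 * math.pi * rng.random()
            points.append((center[0] + r * math.cos(theta),
                           center[1] + r * math.sin(theta)))
            labels.append(label)
    return points, labels
```

    With these assumptions, `make_imbalanced_dataset(1)` yields the balanced 50/50 case and `make_imbalanced_dataset(151)` the 50/200 case in the last row of the table.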

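    Table 3  NMI and RI scores of the clustering experiments on UCI datasets

    | UCI dataset | Metric | FCM | PFCM | GIFP-FCM | RBI-FCM |
    |---|---|---|---|---|---|
    | Auto-mgp | NMI | 0.5190 | 0.5167 | 0.5008 | 0.5443 |
    | Auto-mgp | RI | 0.7534 | 0.7537 | 0.7505 | 0.7895 |
    | Wine | NMI | 0.4169 | 0.4168 | 0.3946 | 0.4911 |
    | Wine | RI | 0.7104 | 0.7105 | 0.6700 | 0.7287 |
    | Zoo | NMI | 0.6760 | 0.6824 | 0.6284 | 0.6873 |
    | Zoo | RI | 0.8381 | 0.8400 | 0.8236 | 0.8464 |
    | Balance Scale | NMI | 0.1223 | 0.1232 | 0.1293 | 0.1326 |
    | Balance Scale | RI | 0.5887 | 0.5900 | 0.5806 | 0.5947 |
    | Parkinsons | NMI | 0.0926 | 0.0936 | 0.0526 | 0.1071 |
    | Parkinsons | RI | 0.5934 | 0.5934 | 0.5693 | 0.6266 |
    | House Votes | NMI | 0.4743 | 0.4743 | 0.2917 | 0.4948 |
    | House Votes | RI | 0.7752 | 0.7752 | 0.6688 | 0.7890 |
    | Credit Approval | NMI | 0.0304 | 0.0304 | 0.0365 | 0.1020 |
    | Credit Approval | RI | 0.5048 | 0.5048 | 0.5207 | 0.5448 |
    | Vowel | NMI | 0.3019 | 0.3127 | 0.3357 | 0.3737 |
    | Vowel | RI | 0.7755 | 0.7988 | 0.8275 | 0.8153 |
    | Banknote Authentication | NMI | 0.0292 | 0.0292 | 0.1145 | 0.5249 |
    | Banknote Authentication | RI | 0.5236 | 0.5236 | 0.5555 | 0.8053 |
    | Mammographic Masses | NMI | 0.1054 | 0.1065 | 0.1020 | 0.1130 |
    | Mammographic Masses | RI | 0.5676 | 0.5683 | 0.5524 | 0.5746 |

    Note: each dataset has two rows of results, the first giving NMI and the second RI.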
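    For readers who want to compute scores of the kind reported in Table 3, the sketch below implements the Rand index (RI) and normalized mutual information (NMI) for two hard labelings. It assumes fuzzy memberships have already been converted to labels (for example by taking the largest membership) and normalizes mutual information by the geometric mean of the two entropies. Other normalizations exist, and this page does not say which one the paper uses.

```python
import math
from collections import Counter
from itertools import combinations


def rand_index(truth, pred):
    """Fraction of sample pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs)
    return agree / len(pairs)


def nmi(truth, pred):
    """Mutual information divided by sqrt(H(truth) * H(pred)) (assumed variant)."""
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    h_t = -sum(v / n * math.log(v / n) for v in ct.values())
    h_p = -sum(v / n * math.log(v / n) for v in cp.values())
    if h_t == 0.0 or h_p == 0.0:
        # Degenerate single-cluster labelings: 1 if both are trivial, else 0.
        return 1.0 if h_t == h_p else 0.0
    mi = sum(v / n * math.log(v * n / (ct[a] * cp[b]))
             for (a, b), v in joint.items())
    return mi / math.sqrt(h_t * h_p)
```

    Both scores are invariant to how clusters are numbered, so a prediction that matches the ground truth up to a relabeling scores 1 on each.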
Publication history
  • Received:  2018-06-20
  • Revised:  2018-12-24
  • Published online:  2018-12-28
  • Issue published:  2019-05-01
