Volume 42 Issue 8
Aug.  2020
Mansheng XIAO, Longxin ZHANG, Xiaoli ZHANG, Yongxiang HU. An Improved Fuzzy Clustering Method for Interval Uncertain Data[J]. Journal of Electronics & Information Technology, 2020, 42(8): 1968-1974. doi: 10.11999/JEIT190591

An Improved Fuzzy Clustering Method for Interval Uncertain Data

doi: 10.11999/JEIT190591
Funds: The National Natural Science Foundation of China (61702178), The Natural Science Foundation of Hunan Province (2018554068), The Research Project of Hunan Provincial Department of Education (18C0499)
  • Received Date: 2019-08-06
  • Rev Recd Date: 2020-02-19
  • Available Online: 2020-03-14
  • Publish Date: 2020-08-18
  • An Improved Fuzzy C-Means clustering algorithm for Interval Uncertain data (IU-IFCM) is proposed in this study, designed around the characteristics of interval uncertain data. First, interval data with p-dimensional features are mapped to real-valued data with 2p-dimensional features. Second, a sample distance measure is designed that takes into account the relationship between the interval median value and the interval size, so that interval samples can be clustered with the fuzzy c-means algorithm. Theoretical analysis and comparison experiments show that the proposed algorithm surpasses the compared algorithms by more than 10% on average in terms of the Partition Coefficient (PC) and Correct Rank (CR) values. These results indicate that the algorithm presented in this study achieves better clustering accuracy and provides a new solution for the classification of uncertain data in current big data environments.
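
The two steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' IU-IFCM implementation: the abstract does not give the distance formula, so the sketch assumes each interval is represented by its midpoint and half-width (giving 2p real features from p interval features) and uses a weighted squared Euclidean distance with a hypothetical weight `alpha` between the midpoint part and the size part. The function names, `alpha`, and the other parameters are illustrative assumptions.

```python
import numpy as np

def interval_to_features(lower, upper):
    """Map n x p interval bounds to n x 2p real features.

    Assumption: the first p columns are midpoints, the last p are half-widths.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    mid = (lower + upper) / 2.0
    half = (upper - lower) / 2.0
    return np.hstack([mid, half])

def sq_dist(X, V, p, alpha=0.5):
    """Assumed distance: alpha*||midpoint diff||^2 + (1-alpha)*||half-width diff||^2."""
    diff = X[:, None, :] - V[None, :, :]        # shape (n, c, 2p)
    d_mid = (diff[:, :, :p] ** 2).sum(axis=2)
    d_half = (diff[:, :, p:] ** 2).sum(axis=2)
    return alpha * d_mid + (1.0 - alpha) * d_half

def fcm(X, c, p, m=2.0, alpha=0.5, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on the 2p-dimensional features."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)           # each row of memberships sums to 1
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]  # centers: membership-weighted means
        D = np.maximum(sq_dist(X, V, p, alpha), 1e-12)  # guard against zero distance
        inv = D ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

def partition_coefficient(U):
    """PC = (1/n) * sum of squared memberships; values near 1 mean crisp clusters."""
    return float((U ** 2).sum() / U.shape[0])
```

For example, calling `interval_to_features(lower, upper)` on an n x 2 set of intervals yields an n x 4 array, which can be passed to `fcm(X, c=2, p=2)`; `partition_coefficient(U)` then gives a PC value in the range [1/c, 1]. Because the assumed weights are constant within each feature block, the membership-weighted mean remains the optimal center update, which is why the ordinary fuzzy c-means iteration can be reused here.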


    Figures(4)  / Tables(3)