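Imbalanced Classification Based on Weighted Regularization Collaborative Representation

LI Yanting, WANG Shuai, JIN Junwei, MA Jiangtao, CHEN Xueyan, CHEN Junlong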

Citation: LI Yanting, WANG Shuai, JIN Junwei, MA Jiangtao, CHEN Xueyan, CHEN Junlong. Imbalanced Classification Based on Weighted Regularization Collaborative Representation[J]. Journal of Electronics & Information Technology, 2023, 45(7): 2571-2579. doi: 10.11999/JEIT220753

doi: 10.11999/JEIT220753
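Funds: The National Natural Science Foundation of China (62106233, 62106068), The Science and Technology Research Project of Henan Province (222102210058, 222102210027, 202102210122)
Details
    Author biographies:

    LI Yanting: female, Ph.D., lecturer. Research interests: pattern recognition and artificial intelligence

    WANG Shuai: male, master's student. Research interests: pattern recognition and machine learning

    JIN Junwei: male, Ph.D., lecturer. Research interests: pattern recognition and artificial intelligence

    MA Jiangtao: male, Ph.D., associate professor. Research interests: knowledge graphs and artificial intelligence

    CHEN Xueyan: female, Ph.D., lecturer. Research interests: communication engineering and artificial intelligence

    CHEN Junlong: male, professor, doctoral supervisor. Research interests: broad learning, artificial intelligence, and related areas

    Corresponding author:

    JIN Junwei jinjunwei24@163.com

  • Chinese Library Classification (CLC) number: TP391.4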

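  • Abstract: Collaborative Representation-based Classifiers (CRC) and their variants have shown superior recognition performance in pattern recognition. Their success, however, depends heavily on a balanced class distribution, and a highly imbalanced class distribution can seriously degrade their effectiveness. To address this shortcoming, this paper introduces a regularization term induced by the complementary subspace into the collaborative representation framework, which makes the resulting regularized model more discriminative. Furthermore, to improve the recognition accuracy of minority classes on imbalanced datasets, a class-weight learning algorithm based on the nearest subspace is proposed, driven by the representation ability of each class of training samples. The algorithm obtains the weight of each class adaptively from the prior information in the original data and assigns larger weights to minority classes, so that the final classification results are fairer to them. The proposed model has a closed-form solution, which makes it computationally efficient. Experiments on authoritative public binary and multi-class imbalanced datasets show that the proposed method significantly outperforms other mainstream imbalanced classification algorithms.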
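To make the closed-form idea in the abstract concrete, the sketch below implements the standard baseline CRC only: a ridge-regularized least-squares code followed by a class-wise residual comparison. It is not the WRCR model, since the complementary-subspace regularizer and the learned class weights are not specified on this page. The function names `crc_fit` and `crc_predict`, the column-wise dictionary layout, and the default `lam` value are assumptions made for illustration.

```python
import numpy as np

def crc_fit(X, lam=0.01):
    """Precompute P = (X^T X + lam*I)^{-1} X^T for a dictionary X (features x samples)."""
    n = X.shape[1]
    # Solving the linear system avoids forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T)

def crc_predict(X, labels, P, y):
    """Assign y to the class with the smallest regularized reconstruction residual."""
    labels = np.asarray(labels)
    alpha = P @ y  # closed-form collaborative code over all training samples
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = labels == c
        # Reconstruct y using only the coefficients that belong to class c.
        err = np.linalg.norm(y - X[:, mask] @ alpha[mask])
        # Dividing by the coefficient norm favors classes that carry more of the code.
        residuals.append(err / (np.linalg.norm(alpha[mask]) + 1e-12))
    return classes[int(np.argmin(residuals))]
```

Because `P` depends only on the training data, it can be computed once and reused for every test sample, which is where the efficiency of a closed-form solution comes from.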
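  • Figure 1  Confusion matrices of CRC on two imbalanced datasets

    Figure 2  Share of the total reconstruction error contributed by each class's training set for a test sample

    Figure 3  Comparison of different CRC-based methods on 10 imbalanced datasets

    Figure 4  Confusion matrices of WRCR on two datasets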

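    Table 1  Details of the 16 imbalanced datasets

    | Dataset | Classes | Samples | Dimensions | Class distribution | Imbalance ratio |
    |---|---|---|---|---|---|
    | Wine | 3 | 178 | 13 | 59:71:48 | 1.48 |
    | Glass5 | 2 | 214 | 9 | 9:205 | 22.78 |
    | Glass6 | 2 | 214 | 9 | 29:185 | 6.38 |
    | Newthyroid1 | 2 | 215 | 5 | 35:180 | 5.14 |
    | Newthyroid | 3 | 215 | 5 | 150:35:30 | 5.00 |
    | Ecoli3 | 2 | 336 | 7 | 35:301 | 8.60 |
    | Ecoli | 8 | 336 | 7 | 143:77:2:2:35:20:5:52 | 71.51 |
    | Dermatology | 6 | 366 | 33 | 111:60:71:48:48:20 | 5.55 |
    | Penbased | 10 | 1100 | 16 | 115:114:114:106:114:106:105:115:105:106 | 1.10 |
    | Shuttle0 | 2 | 1829 | 9 | 123:1706 | 13.87 |
    | Ecoli0vs1 | 2 | 220 | 7 | 77:143 | 1.86 |
    | Balance-scale | 3 | 625 | 4 | 49:288:288 | 5.88 |
    | ShuttleC0vsC4 | 2 | 1829 | 9 | 123:1706 | 13.86 |
    | Glass4 | 2 | 214 | 9 | 13:201 | 15.46 |
    | Glass | 3 | 163 | 4 | 70:76:17 | 4.47 |
    | Glass016vs2 | 2 | 192 | 9 | 17:175 | 10.29 |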
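The imbalance ratios in Table 1 appear consistent with dividing the size of the largest class by the size of the smallest class (for Glass5, 205 / 9 ≈ 22.78). The page does not state the definition, so the helper below, with the hypothetical name `imbalance_ratio`, is a sketch under that assumption.

```python
def imbalance_ratio(class_counts):
    """Ratio of the largest class size to the smallest class size."""
    counts = list(class_counts)
    if not counts or min(counts) <= 0:
        raise ValueError("every class needs at least one sample")
    return max(counts) / min(counts)
```

For example, `imbalance_ratio([59, 71, 48])` gives 71 / 48, which rounds to the 1.48 listed for Wine.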

    Table 2  Running time (s) of different methods on the Glass5 dataset

    | Method | NSC | CRC | CCRC | WRCR |
    |---|---|---|---|---|
    | Running time (s) | 3.19 × 10⁻³ | 4.72 × 10⁻³ | 7.35 × 10⁻³ | 8.07 × 10⁻³ |

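    Table 3  F-measure (%) of WRCR versus classical imbalanced-learning algorithms on 16 datasets

    | Dataset | ADASYN | SMOTEENN | WELM | RUS | SMOTE | MWMOTE | EasyEnsemble | WRCR |
    |---|---|---|---|---|---|---|---|---|
    | Wine | 89.01 | 87.12 | 88.63 | 89.05 | 89.03 | 89.82 | 89.51 | 100.00 |
    | Glass5 | 77.44 | 77.86 | 64.31 | 87.15 | 68.72 | 79.22 | 88.42 | 100.00 |
    | Glass6 | 88.61 | 89.23 | 82.72 | 82.51 | 83.14 | 83.52 | 85.42 | 90.04 |
    | Newthyroid1 | 97.52 | 97.93 | 97.05 | 94.52 | 95.46 | 92.17 | 94.34 | 100.00 |
    | Newthyroid | 92.55 | 92.61 | 90.44 | 93.26 | 91.72 | 92.81 | 93.22 | 94.77 |
    | Ecoli3 | 87.61 | 86.63 | 88.62 | 84.13 | 87.46 | 81.13 | 88.22 | 98.35 |
    | Ecoli | 29.91 | 38.92 | 30.14 | 35.32 | 33.90 | 34.82 | 27.14 | 53.10 |
    | Dermatology | 92.81 | 89.91 | 91.33 | 92.37 | 92.24 | 92.11 | 78.72 | 96.25 |
    | Penbased | 95.63 | 97.52 | 97.85 | 97.31 | 98.40 | 95.82 | 90.52 | 98.40 |
    | Shuttle0 | 88.42 | 84.62 | 97.41 | 80.43 | 82.72 | 81.32 | 89.41 | 97.87 |
    | Ecoli0vs1 | 95.72 | 94.17 | 98.51 | 91.34 | 94.69 | 96.56 | 97.75 | 100.00 |
    | Balance-scale | 54.26 | 52.47 | 51.38 | 47.59 | 50.58 | 54.63 | 55.76 | 61.70 |
    | ShuttleC0vsC4 | 93.96 | 89.35 | 96.47 | 91.25 | 85.19 | 93.42 | 81.38 | 97.89 |
    | Glass4 | 90.33 | 93.66 | 91.34 | 92.48 | 90.33 | 94.16 | 94.42 | 96.18 |
    | Glass | 48.59 | 51.36 | 54.81 | 48.75 | 49.65 | 50.23 | 51.48 | 56.06 |
    | Glass016vs2 | 58.11 | 59.19 | 83.77 | 62.47 | 61.36 | 69.82 | 66.78 | 84.09 |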
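For readers unfamiliar with the metric in Table 3, the F-measure is commonly defined as the weighted harmonic mean of precision and recall for the class of interest. The sketch below uses that common definition; how the paper averages it over classes on multi-class datasets is not stated on this page, and the name `f_measure` and the `beta` parameter are illustrative assumptions.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # no positive sample was recovered at all
    b2 = beta ** 2
    # With beta = 1 this reduces to 2 * P * R / (P + R).
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Unlike plain accuracy, this score drops to zero when the minority class is never predicted correctly, which is why it is a common choice for imbalanced problems.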

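    Table 4  G-mean (%) of WRCR versus classical imbalanced-learning algorithms on 16 datasets

    | Dataset | ADASYN | SMOTEENN | WELM | RUS | SMOTE | MWMOTE | EasyEnsemble | WRCR |
    |---|---|---|---|---|---|---|---|---|
    | Wine | 84.11 | 80.63 | 94.51 | 83.15 | 83.41 | 84.53 | 88.62 | 100.00 |
    | Glass5 | 88.13 | 90.52 | 88.92 | 91.24 | 87.52 | 89.74 | 88.64 | 100.00 |
    | Glass6 | 88.64 | 89.22 | 82.73 | 82.53 | 83.16 | 83.01 | 85.41 | 83.33 |
    | Newthyroid1 | 95.65 | 98.23 | 97.44 | 96.82 | 95.07 | 94.42 | 94.33 | 100.00 |
    | Newthyroid | 90.53 | 90.42 | 89.91 | 87.23 | 91.74 | 92.43 | 89.14 | 93.63 |
    | Ecoli3 | 83.02 | 82.51 | 84.83 | 82.32 | 84.23 | 82.83 | 84.63 | 87.49 |
    | Ecoli | 62.31 | 46.54 | 38.92 | 36.74 | 60.05 | 60.22 | 33.86 | 50.07 |
    | Dermatology | 87.32 | 81.43 | 87.25 | 76.13 | 86.34 | 89.73 | 74.14 | 93.98 |
    | Penbased | 91.83 | 95.52 | 95.36 | 91.51 | 94.34 | 93.15 | 87.92 | 97.18 |
    | Shuttle0 | 87.61 | 97.21 | 97.41 | 97.65 | 84.81 | 85.20 | 92.41 | 97.87 |
    | Ecoli0vs1 | 91.54 | 90.34 | 98.55 | 89.23 | 91.38 | 94.76 | 94.84 | 100.00 |
    | Balance-scale | 52.83 | 54.37 | 50.65 | 48.98 | 54.76 | 52.42 | 55.78 | 61.68 |
    | ShuttleC0vsC4 | 92.51 | 86.76 | 92.36 | 90.18 | 83.57 | 91.83 | 87.38 | 97.87 |
    | Glass4 | 54.47 | 51.85 | 61.18 | 53.32 | 52.49 | 57.39 | 59.42 | 66.66 |
    | Glass | 42.36 | 40.46 | 39.47 | 36.53 | 39.76 | 38.76 | 41.04 | 44.01 |
    | Glass016vs2 | 45.48 | 47.69 | 47.89 | 45.83 | 49.67 | 51.28 | 53.89 | 66.66 |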
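The G-mean reported in Table 4 is usually computed as the geometric mean of the per-class recalls, which for two classes is the square root of sensitivity times specificity. The sketch below assumes that definition and a confusion matrix whose rows are true classes and whose columns are predicted classes; the page does not confirm the exact variant used, and the name `g_mean` is illustrative.

```python
import numpy as np

def g_mean(confusion):
    """Geometric mean of per-class recall; rows are true classes, columns are predictions."""
    confusion = np.asarray(confusion, dtype=float)
    # Assumes every class has at least one test sample (no all-zero row).
    recalls = np.diag(confusion) / confusion.sum(axis=1)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))
```

Because the recalls are multiplied, a single class with zero recall forces the score to zero, so a classifier cannot score well by ignoring the minority class.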

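    Table 5  G-mean (%) of WRCR versus state-of-the-art imbalanced-learning algorithms

    Methods compared, in column order: GDO, VW-ELM, GEP, GMBSCL, GSE, WRCR. Some rows list fewer than six values, and which methods they omit is not indicated.

    Glass5: 84.10, 97.51, 95.85, 91.50, 100.00
    Newthyroid1: 89.99, 99.52, 97.33, 100.00
    Ecoli0vs1: 95.16, 98.64, 98.32, 98.31, 97.58, 100.00
    Ecoli3: 88.67, 91.20, 92.57, 88.53, 98.35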
Publication history
  • Received:  2022-06-27
  • Revised:  2023-03-30
  • Published online:  2023-03-31
  • Issue date:  2023-07-10