
Group-Label-Specific Features Learning Based on Label-Density Classification Margin

Yibin WANG, Gensheng PEI, Yusheng CHENG

Citation: Yibin WANG, Gensheng PEI, Yusheng CHENG. Group-Label-Specific Features Learning Based on Label-Density Classification Margin[J]. Journal of Electronics & Information Technology, 2020, 42(5): 1179-1187. doi: 10.11999/JEIT190343


doi: 10.11999/JEIT190343
Funds: The Natural Science Foundation of Anhui Province
Article information
    Author biographies:

    Yibin WANG: male, born in 1970, professor; research interests include multi-label learning, machine learning, and software security

    Gensheng PEI: male, born in 1992, master's degree; research interests include machine learning, data mining, and statistics

    Yusheng CHENG: male, born in 1969, professor; research interests include data mining and machine learning

    Corresponding author:

    Yusheng CHENG, chengyshaq@163.com

  • Dataset sources:
    1) http://mulan.sourceforge.net/datasets-mlc.html
    2) http://cse.seu.edu.cn/PersonalPage/zhangml/files/Image.rar
    3) http://waikato.github.io/meka/datasets
  • CLC number: TP391

  • Abstract:

    Label-specific features learning avoids predicting all labels with the same feature set; it is a framework that extracts the features specific to each label for classification and is widely used in multi-label learning. However, when the label dimension is large or the label distribution density is unbalanced, existing label-specific-feature multi-label algorithms generally suffer from high time cost and low classification accuracy. To improve multi-label classification performance, this paper proposes Group-Label-Specific Features Learning based on a Label-Density Classification Margin (GLSFL-LDCM). First, a label correlation matrix is built from cosine similarity, and the labels are grouped by spectral clustering so that label-specific features are extracted per label group, which reduces the time cost of computing label-specific features for every label. Then, the density of each label is computed to update the label-space matrix: adding the label-density information to the original labels enlarges the margin between positive and negative labels, and this label-density classification margin effectively handles the unbalanced label distribution density. Finally, the group-label-specific features and the label-density matrix are fed into an extreme learning machine to obtain the final classification model. Comparative experiments fully verify the feasibility and stability of the proposed algorithm.
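    As a rough sketch of the grouping stage described above, the snippet below builds a cosine-similarity label correlation matrix and partitions the label columns with spectral clustering. It is illustrative only: the function name group_labels is made up here, and scikit-learn's SpectralClustering stands in for the paper's Eq. (3).

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def group_labels(Y, K):
    """Partition the q label columns of a binary label matrix Y (N x q)
    into K groups. Cosine similarity between label columns plays the role
    of the label correlation matrix LC; scikit-learn's spectral clustering
    stands in for the paper's Eq. (3)."""
    norms = np.linalg.norm(Y, axis=0, keepdims=True)        # column norms, (1, q)
    LC = (Y.T @ Y) / (norms.T @ norms + 1e-12)              # q x q cosine similarities
    ids = SpectralClustering(n_clusters=K,
                             affinity='precomputed').fit_predict(LC)
    return [np.flatnonzero(ids == k) for k in range(K)]     # G_1, ..., G_K

# Toy usage: 100 instances, 6 labels, 3 label groups
Y = (np.random.rand(100, 6) > 0.7).astype(float)
print(group_labels(Y, K=3))
```

    Extracting label-specific features once per group instead of once per label is what cuts the time cost reported in Table 5.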

  • Figure 1  Label-density margin surface

    Figure 2  Performance comparison of the algorithms

    Figure 3  Comparison of label-specific feature extraction coefficient matrices

    Table 1  Virtual dataset of the label space

    Instance |  Original labels  Y1   Y2   Y3   Y4 |  Density labels  Y1      Y2      Y3      Y4
    1        |                  +1   –1   –1   +1 |               +1.333  –1.273  –1.318  +1.278
    2        |                  +1   –1   –1   –1 |               +1.333  –1.273  –1.318  –1.227
    3        |                  –1   +1   –1   –1 |               –1.182  +1.222  –1.318  –1.227
    4        |                  +1   –1   –1   +1 |               +1.333  –1.273  –1.318  +1.278
    5        |                  –1   –1   +1   +1 |               –1.182  –1.273  +1.167  +1.278
    6        |                  +1   –1   +1   –1 |               +1.333  –1.273  +1.167  –1.227
    7        |                  +1   +1   –1   +1 |               +1.333  +1.222  –1.318  +1.278
    8        |                  –1   +1   –1   –1 |               –1.182  +1.222  –1.318  –1.227
    9        |                  +1   –1   –1   +1 |               +1.333  –1.273  –1.318  +1.278
    10       |                  –1   +1   +1   –1 |               –1.182  +1.222  +1.167  –1.227
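    The density labels above follow directly from the label counts. The sketch below reproduces every value in Table 1; note that this rule is inferred from the table itself and is assumed here in place of the paper's Eqs. (7) and (8): each +1 is raised by that label's share of all positive assignments, each –1 is lowered by its share of all negative assignments, so the gap between positive and negative labels widens.

```python
import numpy as np

def density_labels(Y):
    """Map a {+1, -1} label matrix Y (N x q) to density labels.
    Reconstruction inferred from Table 1 (assumed stand-in for
    the paper's Eqs. (7)-(8))."""
    pos = (Y > 0).sum(axis=0)          # positives per label column
    neg = (Y < 0).sum(axis=0)          # negatives per label column
    return np.where(Y > 0,
                    1 + pos / pos.sum(),        # e.g. Y1: 1 + 6/18 = 1.333
                    -(1 + neg / neg.sum()))     # e.g. Y1: -(1 + 4/22) = -1.182

# The ten instances of Table 1
Y = np.array([[+1,-1,-1,+1], [+1,-1,-1,-1], [-1,+1,-1,-1], [+1,-1,-1,+1],
              [-1,-1,+1,+1], [+1,-1,+1,-1], [+1,+1,-1,+1], [-1,+1,-1,-1],
              [+1,-1,-1,+1], [-1,+1,+1,-1]])
print(np.round(density_labels(Y), 3)[0])   # -> [ 1.333 -1.273 -1.318  1.278]
```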

    Table 2  The GLSFL-LDCM algorithm

     Input: training set $D = \{({x}_i, {Y}_i)\}_{i=1}^{N}$; test set ${D^*} = \{{x}_j^*\}_{j=1}^{N^*}$; RBF kernel parameter $\gamma$; penalty factor $C$; label-specific feature parameters $\alpha$, $\beta$, $\mu$; number of clusters $K$
     Output: predicted labels ${Y}^*$
     Training (on $D$):
     (1) Compute the cosine similarities with Eqs. (1) and (2) and build the label correlation matrix ${\rm LC}$
     (2) Group the labels by spectral clustering with Eq. (3): ${G} = [{G}_1, {G}_2, \cdots, {G}_K]$
     (3) Build the label-specific feature extraction matrix ${S}$ with Eqs. (5) and (6)
     (4) Update the label space with Eqs. (7) and (8) to obtain the label-density matrix ${\rm YD}$
     (5) For $k = 1, 2, \cdots, K$ do
          $\varOmega_{\rm ELM}^k = \varOmega_{\rm ELM}({x}(:, {S}^k \ne 0))$
          ${\rm YD}^k = {\rm YD}({G}_k)$
          ${\beta}^k = \left(\dfrac{{I}}{C} + \varOmega_{\rm ELM}^k\right)^{-1}{\rm YD}^k$
     Prediction (on $D^*$):
     (a) For $k = 1, 2, \cdots, K$ do
          ${G}_k^* = \varOmega_{\rm ELM}({x}^*(:, {S}^k \ne 0))\,{\beta}^k$
     (b) ${Y}^* = [{G}_1^*, {G}_2^*, \cdots, {G}_K^*]$
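    Steps (5), (a), and (b) are the closed-form kernel extreme learning machine solution applied once per label group. A minimal sketch, assuming an RBF kernel; the names (kelm_fit, kelm_predict, glsfl_ldcm) are illustrative, and the feature selection ${x}(:, {S}^k \ne 0)$ is represented as boolean column masks:

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Omega_ELM: RBF kernel matrix between row-instance matrices A and B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kelm_fit(X, YD, C, gamma):
    """beta = (I/C + Omega_ELM)^{-1} YD, step (5) of Table 2,
    computed with a linear solve rather than an explicit inverse."""
    Omega = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(len(X)) / C + Omega, YD)

def kelm_predict(X_test, X_train, beta, gamma):
    """G* = Omega_ELM(x*, x) beta, step (a) of Table 2."""
    return rbf_kernel(X_test, X_train, gamma) @ beta

def glsfl_ldcm(X, YD, X_test, groups, masks, C=1.0, gamma=1.0):
    """Train one kernel ELM per label group on that group's specific
    features, then concatenate the group predictions (steps (5), (a), (b)).
    groups: list of label-index arrays G_k; masks: boolean feature masks S_k."""
    Y_hat = np.zeros((len(X_test), YD.shape[1]))
    for Gk, Sk in zip(groups, masks):
        beta_k = kelm_fit(X[:, Sk], YD[:, Gk], C, gamma)
        Y_hat[:, Gk] = kelm_predict(X_test[:, Sk], X[:, Sk], beta_k, gamma)
    return Y_hat
```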

    Table 3  Description of the multi-label datasets

    Dataset     | #Instances | #Features | #Labels | Label cardinality | Domain
    Emotions 1) | 593        | 72        | 6       | 1.869             | MUSIC
    Genbase 1)  | 662        | 1186      | 27      | 1.252             | BIOLOGY
    Medical 1)  | 978        | 1449      | 45      | 1.245             | TEXT
    Enron 3)    | 1702       | 1001      | 53      | 4.275             | TEXT
    Image 2)    | 2000       | 294       | 5       | 1.236             | IMAGE
    Scene 1)    | 2407       | 294       | 6       | 1.074             | IMAGE
    Yeast 1)    | 2417       | 103       | 14      | 4.237             | BIOLOGY
    Slashdot 3) | 3782       | 1079      | 22      | 0.901             | TEXT

    Table 4  Experimental results of the comparison algorithms (●/○ indicates that GLSFL-LDCM is statistically superior/inferior to the corresponding algorithm on that dataset; no marker denotes a tie)

    HL↓
    Dataset      | ML-kNN         | LIFT           | FRS-LIFT       | FRS-SS-LIFT    | LLSF-DL        | GLSFL-LDCM
    Emotions     | 0.1998±0.0167● | 0.1854±0.0260● | 0.1798±0.0290● | 0.1809±0.0310● | 0.2035±0.0082● | 0.1782±0.0154
    Genbase      | 0.0043±0.0017● | 0.0011±0.0016● | 0.0015±0.0009● | 0.0017±0.0011● | 0.0008±0.0014● | 0.0006±0.0005
    Medical      | 0.0158±0.0015● | 0.0115±0.0013● | 0.0087±0.0014  | 0.0089±0.0013  | 0.0092±0.0004  | 0.0089±0.0021
    Enron        | 0.0482±0.0043● | 0.0365±0.0034○ | 0.0341±0.0032  | 0.0372±0.0034○ | 0.0369±0.0034○ | 0.0468±0.0021
    Image        | 0.1701±0.0141● | 0.1567±0.0136● | 0.1479±0.0103● | 0.1468±0.0097● | 0.1828±0.0152● | 0.1397±0.0133
    Scene        | 0.0852±0.0060● | 0.0772±0.0047● | 0.0740±0.0052● | 0.0751±0.0057● | 0.1008±0.0059● | 0.0682±0.0084
    Yeast        | 0.1934±0.0116● | 0.1919±0.0083● | 0.1875±0.0114● | 0.1869±0.0111● | 0.2019±0.0060● | 0.1855±0.0079
    Slashdot     | 0.0221±0.0010● | 0.0159±0.0009○ | 0.0159±0.0011○ | 0.0160±0.0011○ | 0.0158±0.0012  | 0.0196±0.0010
    win/tie/loss | 8/0/0          | 6/0/2          | 5/0/3          | 5/1/2          | 5/1/2          |

    OE↓
    Dataset      | ML-kNN         | LIFT           | FRS-LIFT       | FRS-SS-LIFT    | LLSF-DL        | GLSFL-LDCM
    Emotions     | 0.2798±0.0441● | 0.2291±0.0645● | 0.2155±0.0608  | 0.2223±0.0651● | 0.2583±0.0201● | 0.2157±0.0507
    Genbase      | 0.0121±0.0139● | 0.0015±0.0047  | 0.0015±0.0047  | 0.0030±0.0094● | 0.0000±0.0000  | 0.0015±0.0048
    Medical      | 0.2546±0.0262● | 0.1535±0.0258● | 0.1124±0.0279  | 0.1186±0.0231○ | 0.1285±0.0271● | 0.1226±0.0383
    Enron        | 0.5158±0.0417● | 0.4279±0.0456● | 0.3084±0.0444● | 0.3256±0.0437● | 0.2704±0.0321● | 0.2221±0.0227
    Image        | 0.3195±0.0332● | 0.2680±0.0256● | 0.2555±0.0334● | 0.2490±0.0226● | 0.3180±0.0326● | 0.2365±0.0224
    Scene        | 0.2185±0.0313● | 0.1924±0.0136● | 0.1841±0.0156● | 0.1836±0.0195● | 0.2323±0.0267● | 0.1562±0.0316
    Yeast        | 0.2251±0.0284● | 0.2177±0.0255● | 0.2147±0.0171● | 0.2085±0.0156● | 0.2267±0.0239● | 0.2072±0.0250
    Slashdot     | 0.0946±0.0143● | 0.0898±0.0134● | 0.0858±0.0162  | 0.0864±0.0138○ | 0.0887±0.0123● | 0.0874±0.0107
    win/tie/loss | 8/0/0          | 7/1/0          | 4/2/2          | 6/0/2          | 7/0/1          |

    RL↓
    Dataset      | ML-kNN         | LIFT           | FRS-LIFT       | FRS-SS-LIFT    | LLSF-DL        | GLSFL-LDCM
    Emotions     | 0.1629±0.0177● | 0.1421±0.0244● | 0.1401±0.0299● | 0.1406±0.0280● | 0.1819±0.0166● | 0.1375±0.0226
    Genbase      | 0.0062±0.0082● | 0.0034±0.0065● | 0.0043±0.0071● | 0.0051±0.0077● | 0.0071±0.0031● | 0.0017±0.0025
    Medical      | 0.0397±0.0093● | 0.0262±0.0072● | 0.0248±0.0108● | 0.0236±0.0074● | 0.0218±0.0080● | 0.0148±0.0096
    Enron        | 0.1638±0.0222● | 0.1352±0.0190● | 0.0953±0.0107● | 0.1046±0.0099● | 0.0927±0.0069● | 0.0735±0.0084
    Image        | 0.1765±0.0202● | 0.1425±0.0169● | 0.1378±0.0149● | 0.1323±0.0171● | 0.1695±0.0162● | 0.1294±0.0127
    Scene        | 0.0760±0.0100● | 0.0604±0.0047● | 0.0601±0.0061● | 0.0592±0.0072● | 0.0803±0.0133● | 0.0515±0.0093
    Yeast        | 0.1666±0.0149● | 0.1648±0.0121● | 0.1588±0.0150● | 0.1560±0.0138● | 0.1716±0.0145● | 0.1551±0.0100
    Slashdot     | 0.0497±0.0072● | 0.0418±0.0062● | 0.0289±0.0038● | 0.0311±0.0038● | 0.0307±0.0058● | 0.0126±0.0018
    win/tie/loss | 8/0/0          | 8/0/0          | 8/0/0          | 8/0/0          | 8/0/0          |

    AP↑
    Dataset      | ML-kNN         | LIFT           | FRS-LIFT       | FRS-SS-LIFT    | LLSF-DL        | GLSFL-LDCM
    Emotions     | 0.7980±0.0254● | 0.8236±0.0334● | 0.8280±0.0411● | 0.8268±0.0400● | 0.7504±0.0120● | 0.8316±0.0265
    Genbase      | 0.9873±0.0121● | 0.9958±0.0078● | 0.9944±0.0078● | 0.9935±0.0085● | 0.9928±0.0024● | 0.9962±0.0057
    Medical      | 0.8068±0.0248● | 0.8784±0.0145● | 0.9096±0.0176● | 0.9087±0.0155● | 0.9028±0.0172● | 0.9122±0.0281
    Enron        | 0.5134±0.0327● | 0.5620±0.0321● | 0.6611±0.0408● | 0.6481±0.0287● | 0.6632±0.0182● | 0.6923±0.0159
    Image        | 0.7900±0.0203● | 0.8240±0.0169● | 0.8314±0.0177● | 0.8364±0.0162● | 0.7943±0.0177● | 0.8444±0.0118
    Scene        | 0.8687±0.0164● | 0.8884±0.0081● | 0.8913±0.0084● | 0.8921±0.0101● | 0.8609±0.0182● | 0.9082±0.0173
    Yeast        | 0.7659±0.0194● | 0.7685±0.0148● | 0.7762±0.0172● | 0.7790±0.0167● | 0.7633±0.0160● | 0.7798±0.0140
    Slashdot     | 0.8835±0.0116● | 0.8927±0.0091● | 0.9045±0.0098● | 0.9038±0.0074● | 0.9017±0.0095● | 0.9247±0.0059
    win/tie/loss | 8/0/0          | 8/0/0          | 8/0/0          | 8/0/0          | 8/0/0          |
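    In Table 4, HL, OE, RL, and AP denote the standard multi-label metrics Hamming Loss, One-Error, Ranking Loss, and Average Precision (↓: lower is better; ↑: higher is better). For reference, a minimal sketch of the four metrics under their usual definitions, with Y the binary ground-truth matrix, P thresholded predictions, and F real-valued label scores (all names illustrative; tie-handling conventions vary slightly across implementations):

```python
import numpy as np

def hamming_loss(Y, P):
    """Fraction of instance-label pairs that are misclassified."""
    return float(np.mean(Y != P))

def one_error(Y, F):
    """Fraction of instances whose top-ranked label is not relevant."""
    top = np.argmax(F, axis=1)
    return float(np.mean(Y[np.arange(len(Y)), top] == 0))

def ranking_loss(Y, F):
    """Average fraction of (relevant, irrelevant) label pairs ranked wrongly."""
    losses = []
    for y, f in zip(Y, F):
        pos, neg = f[y == 1], f[y == 0]
        if len(pos) == 0 or len(neg) == 0:   # skip degenerate instances
            continue
        losses.append(np.mean(pos[:, None] <= neg[None, :]))
    return float(np.mean(losses))

def average_precision(Y, F):
    """Average, over relevant labels, of the precision at each one's rank."""
    aps = []
    for y, f in zip(Y, F):
        if y.sum() == 0:
            continue
        order = np.argsort(-f)                          # descending by score
        rank = np.empty_like(order)
        rank[order] = np.arange(1, len(f) + 1)          # rank of each label
        r = np.sort(rank[y == 1])                       # ranks of relevant labels
        aps.append(np.mean(np.arange(1, len(r) + 1) / r))
    return float(np.mean(aps))
```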

    Table 5  Time consumption comparison of the algorithms (s); algorithm columns ordered as in Table 4

    Dataset  | ML-kNN | LIFT | FRS-LIFT | FRS-SS-LIFT | LLSF-DL | GLSFL-LDCM
    Emotions | 0.2    | 0.4  | 54.0     | 8.7         | 0.1     | 0.1
    Genbase  | 1.0    | 2.9  | 15.0     | 1.7         | 0.9     | 0.2
    Medical  | 4.3    | 12.5 | 66.3     | 14.8        | 2.3     | 0.4
    Enron    | 6.5    | 48.1 | 1292.7   | 182.7       | 0.6     | 0.6
    Image    | 3.4    | 8.1  | 1805.2   | 320.5       | 0.1     | 0.2
    Scene    | 5.4    | 7.9  | 2174.1   | 404.2       | 0.1     | 0.2
    Yeast    | 3.5    | 44.3 | 13113.4  | 3297.7      | 0.2     | 0.3
    Slashdot | 34.1   | 84.5 | 11895.5  | 2650.0      | 1.1     | 0.8
    Average  | 7.3    | 26.1 | 3802.0   | 860.0       | 0.7     | 0.4

    Table 6  Model decomposition comparison experiment

    HL↓
    Dataset  | KELM          | LSFL-KELM     | GLSFL-KELM    | LDCM-KELM
    Emotions | 0.1840±0.0275 | 0.1837±0.0253 | 0.1824±0.0196 | 0.1802±0.0295
    Genbase  | 0.0010±0.0008 | 0.0008±0.0005 | 0.0006±0.0006 | 0.0007±0.0006
    Medical  | 0.0094±0.0030 | 0.0093±0.0017 | 0.0091±0.0016 | 0.0092±0.0019
    Scene    | 0.0706±0.0051 | 0.0693±0.0079 | 0.0683±0.0059 | 0.0682±0.0062

    AP↑
    Dataset  | KELM          | LSFL-KELM     | GLSFL-KELM    | LDCM-KELM
    Emotions | 0.8144±0.0369 | 0.8223±0.0252 | 0.8296±0.0278 | 0.8306±0.0429
    Genbase  | 0.9926±0.0046 | 0.9928±0.0048 | 0.9961±0.0046 | 0.9956±0.0038
    Medical  | 0.9077±0.0262 | 0.9092±0.0229 | 0.9124±0.0205 | 0.9126±0.0306
    Scene    | 0.9010±0.0127 | 0.9024±0.0186 | 0.9059±0.0132 | 0.9033±0.0152
Publication history
  • Received: 2019-05-18
  • Revised: 2019-09-30
  • Published online: 2020-01-29
  • Issue published: 2020-06-04

目录

    /

    返回文章
    返回