Multimodal Hypergraph Learning Guidance with Global Noise Enhancement for Sentiment Analysis under Missing Modality Information

HUANG Chen, LIU Huijie, ZHANG Yan, YANG Chao, SONG Jianhua

Citation: HUANG Chen, LIU Huijie, ZHANG Yan, YANG Chao, SONG Jianhua. Multimodal Hypergraph Learning Guidance with Global Noise Enhancement for Sentiment Analysis under Missing Modality Information[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250649


doi: 10.11999/JEIT250649 cstr: 32379.14.JEIT250649
Funds: Wuhan Knowledge Innovation Special Project (202311901251001), Hubei Provincial Science and Technology Plan Major Science and Technology Special Project (2024BAA008), The Key Projects of Science and Technology in Shenzhen (2020N061)
Article Information
    Author biographies:

    HUANG Chen: Male, Professor. His research interests include the Internet of Things, autonomous driving, brain-computer interfaces, machine learning, and big data analysis

    LIU Huijie: Male, Master's student. His research interests include machine learning, deep learning, brain-computer interfaces, and sentiment analysis

    ZHANG Yan: Male, Professor. His research interests include information security, big data analysis, and software defect detection

    YANG Chao: Male, Professor. His research interests include information security and intelligent computing

    SONG Jianhua: Female, Professor. Her research interests include information security and network security

    Corresponding author:

    ZHANG Yan, zhangyan@hubu.edu.cn

  • CLC number: TN911.7; TP391

  • Abstract: Multimodal Sentiment Analysis (MSA) exploits information from multiple modalities to comprehensively reveal human emotional states. When confronted with complex real-world scenarios, existing MSA studies still face two key challenges: (1) they overlook missing modality information in complex real-world scenarios, along with the resulting model robustness problem; (2) they lack mechanisms for learning rich high-order semantic associations among modalities and for guiding cross-modal information transfer. To overcome these problems, this paper proposes a Multimodal Hypergraph Learning Guidance method with Global Noise Enhancement (MHLGNE), which aims to improve MSA performance under missing modality information in complex real-world scenarios. Specifically, MHLGNE supplements missing modality information from a global perspective through a specially designed adaptive global noise sampling module, thereby strengthening model robustness and improving generalization. In addition, a multimodal hypergraph learning guidance module is proposed to learn rich high-order semantic associations among modalities and to guide cross-modal information transfer. Extensive experimental evaluations on public datasets demonstrate that MHLGNE performs excellently in overcoming these challenges.
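As a reading aid for the two components named in the abstract, the sketch below shows one common way such modules are realized in PyTorch: a learnable global Gaussian prior that imputes features for a missing modality, and a standard hypergraph convolution that propagates information along hyperedges connecting more than two nodes. All class names, shapes, and formulations here are illustrative assumptions, not the authors' implementation of MHLGNE.

```python
# Illustrative sketch only (PyTorch assumed). Module names, shapes, and the
# specific noise/hypergraph formulations are hypothetical reconstructions of
# the two components described in the abstract, not the authors' released code.
import torch
import torch.nn as nn


class AdaptiveGlobalNoiseSampling(nn.Module):
    """Impute a missing modality from learnable global statistics plus noise."""

    def __init__(self, dim: int):
        super().__init__()
        self.global_mean = nn.Parameter(torch.zeros(dim))    # global prior mean
        self.global_logvar = nn.Parameter(torch.zeros(dim))  # global prior log-variance
        self.refine = nn.Linear(dim, dim)                    # adapts the sample to the task

    def forward(self, x: torch.Tensor, present: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) modality features; present: (batch,) 1 if observed, 0 if missing
        eps = torch.randn_like(x)
        sampled = self.global_mean + eps * torch.exp(0.5 * self.global_logvar)
        imputed = self.refine(sampled)
        mask = present.unsqueeze(-1).float()
        return mask * x + (1.0 - mask) * imputed  # keep observed features, fill in the rest


class HypergraphConv(nn.Module):
    """One hypergraph convolution: X' = Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta (W = I here)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, in_dim); H: (num_nodes, num_edges) 0/1 incidence matrix
        dv = H.sum(dim=1).clamp(min=1.0)   # node degrees
        de = H.sum(dim=0).clamp(min=1.0)   # hyperedge degrees
        dv_inv_sqrt = torch.diag(dv.pow(-0.5))
        de_inv = torch.diag(de.pow(-1.0))
        agg = dv_inv_sqrt @ H @ de_inv @ H.t() @ dv_inv_sqrt @ self.theta(x)
        return torch.relu(agg)
```

In a design like this, the learnable global mean and variance act as a dataset-level prior from which features of an absent modality are sampled, while the incidence matrix H lets a single hyperedge join several modality nodes at once, which is what makes higher-order (beyond pairwise) semantic associations expressible.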
  • Figure 1  Intuitive illustration of existing MSA methods versus the proposed MHLGNE

    Figure 2  Overall architecture of MHLGNE

    Figure 3  Case study of MHLGNE on the SEED-V dataset

Table 1  Statistics of the experimental datasets

| Dataset | Training | Validation | Test | Total | Subjects | Modality / Dimension |
|---|---|---|---|---|---|---|
| SEED-IV | 4250 | 1836 | 2041 | 8127 | 15 (×3) | EEG / 310 (62×5) |
| SEED-V | 8234 | 2106 | 4102 | 14632 | 16 (×3) | EEG / 310 (62×5) |
| DREAMER | 6048 | 1416 | 1835 | 9299 | 23 (×3) | EEG / 70 (14×5) |
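As a reading aid for the "EEG / 310 (62×5)" entries in Table 1, the sketch below shows the assumed feature layout: 62 channels × 5 frequency-band features flattened into a 310-dimensional vector per sample (and 14×5 = 70 for DREAMER). The shapes and random data are illustrative assumptions, not the datasets' official loaders.

```python
import numpy as np

# Illustrative only: feature layout inferred from Table 1, not an official loader.
n_samples, n_channels, n_bands = 4250, 62, 5          # e.g. the SEED-IV training split
eeg = np.random.randn(n_samples, n_channels, n_bands)

# Flatten channel x band features into one vector per sample: 62 * 5 = 310.
eeg_flat = eeg.reshape(n_samples, n_channels * n_bands)
assert eeg_flat.shape == (4250, 310)

# DREAMER records 14 channels, giving 14 * 5 = 70 dimensions per sample.
```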

Table 2  Results of different methods on the MSA task on the SEED-IV, SEED-V, and DREAMER datasets under the complete-modality setting

| Method | SEED-IV Acc (%) | Pre (%) | Kappa | Efficiency (ms) | SEED-V Acc (%) | Pre (%) | Kappa | Efficiency (ms) | DREAMER Acc (%) | Pre (%) | F1 | Efficiency (ms) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BDAE-regressor | 73.60 | 72.93 | 0.75 | 12.791 | 70.03 | 69.24 | 0.62 | 14.302 | 78.25 | 79.74 | 0.72 | 9.183 |
| BDAE-cGAN | 75.72 | 74.20 | 0.66 | 28.684 | 73.60 | 73.82 | 0.61 | 32.491 | 80.23 | 79.93 | 0.63 | 20.104 |
| Uni-Code | 66.23 | 61.53 | 0.62 | 9.362 | 50.75 | 49.67 | 0.52 | 10.203 | 68.17 | 67.59 | 0.63 | 5.402 |
| ECO-FET | 70.03 | 72.34 | 0.74 | 49.278 | 60.07 | 60.42 | 0.51 | 51.361 | 73.09 | 75.43 | 0.73 | 37.621 |
| CTFN | 64.27 | 64.32 | 0.71 | 16.392 | 62.73 | 62.56 | 0.65 | 19.302 | 73.48 | 72.24 | 0.66 | 10.451 |
| EMMR | 68.16 | 68.13 | 0.60 | 190.350 | 65.04 | 64.35 | 0.68 | 203.166 | 71.62 | 71.27 | 0.65 | 182.580 |
| TFR-Net | 78.42 | 74.10 | 0.76 | 247.610 | 76.60 | 75.31 | 0.70 | 249.017 | 80.23 | 78.47 | 0.82 | 210.096 |
| MAET | 77.49 | 75.02 | 0.75 | 392.471 | 76.53 | 75.82 | 0.72 | 418.712 | 82.35 | 80.92 | 0.78 | 370.204 |
| CAETFN | 82.71 | 81.90 | 0.78 | 529.306 | 81.93 | 81.04 | 0.77 | 682.173 | 92.11 | 91.80 | 0.83 | 492.035 |
| HAS-Former | 84.36 | 83.74 | 0.82 | 91.025 | 83.02 | 82.61 | 0.78 | 204.723 | 92.08 | 92.85 | 0.87 | 83.904 |
| M2S | 84.70 | 85.47 | 0.83 | 102.394 | 82.77 | 83.52 | 0.78 | 172.480 | 93.74 | 94.16 | 0.87 | 92.480 |
| MEDA | 80.42 | 82.14 | 0.75 | 271.103 | 80.20 | 80.36 | 0.74 | 256.902 | 85.42 | 84.29 | 0.82 | 219.032 |
| MHLGNE | 87.96 | 86.70 | 0.89 | 362.402 | 85.12 | 85.40 | 0.84 | 375.192 | 94.32 | 94.02 | 0.88 | 306.271 |
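For reference, the metrics reported in Tables 2 and 3 (Acc, Pre, Kappa, and F1 on DREAMER) can be computed as in the following sketch; the use of scikit-learn and macro averaging is an assumption, and the labels shown are placeholders rather than results from the paper.

```python
# Sketch of how the reported metrics could be computed with scikit-learn;
# y_true/y_pred are toy placeholders, not data or results from the paper.
from sklearn.metrics import (accuracy_score, precision_score,
                             cohen_kappa_score, f1_score)

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

acc = accuracy_score(y_true, y_pred) * 100                    # Acc (%)
pre = precision_score(y_true, y_pred, average="macro") * 100  # Pre (%), macro average assumed
kappa = cohen_kappa_score(y_true, y_pred)                     # Kappa in [0, 1]
f1 = f1_score(y_true, y_pred, average="macro")                # F1, reported on DREAMER
print(f"Acc={acc:.2f}%, Pre={pre:.2f}%, Kappa={kappa:.2f}, F1={f1:.2f}")
```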

Table 3  Overall performance comparison of different methods on the MSA task on the SEED-IV, SEED-V, and DREAMER datasets under the random missing-modality setting

| Method | SEED-IV Acc (%) | Pre (%) | Kappa | Efficiency (ms) | SEED-V Acc (%) | Pre (%) | Kappa | Efficiency (ms) | DREAMER Acc (%) | Pre (%) | F1 | Efficiency (ms) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BDAE-regressor | 52.42 | 51.29 | 0.50 | 28.023 | 48.57 | 48.13 | 0.48 | 67.201 | 53.30 | 52.49 | 0.62 | 32.503 |
| BDAE-cGAN | 50.21 | 50.13 | 0.40 | 37.961 | 47.31 | 46.80 | 0.41 | 92.280 | 51.29 | 51.13 | 0.60 | 52.401 |
| Uni-Code | 51.86 | 51.43 | 0.41 | 20.274 | 50.20 | 48.97 | 0.41 | 49.014 | 58.20 | 57.90 | 0.67 | 28.194 |
| ECO-FET | 56.90 | 55.68 | 0.51 | 62.017 | 53.01 | 51.24 | 0.50 | 90.103 | 65.90 | 65.42 | 0.72 | 60.209 |
| CTFN | 58.71 | 57.11 | 0.52 | 20.420 | 55.44 | 53.76 | 0.50 | 58.091 | 68.21 | 66.20 | 0.74 | 42.501 |
| EMMR | 64.34 | 64.23 | 0.55 | 186.652 | 60.12 | 59.03 | 0.53 | 241.320 | 72.32 | 71.02 | 0.82 | 220.590 |
| TFR-Net | 68.20 | 68.19 | 0.58 | 260.726 | 62.40 | 60.20 | 0.57 | 293.651 | 74.25 | 73.97 | 0.78 | 248.102 |
| MAET | 70.09 | 70.25 | 0.60 | 411.102 | 61.37 | 60.82 | 0.54 | 453.204 | 72.45 | 72.10 | 0.75 | 391.472 |
| CAETFN | 74.16 | 73.60 | 0.62 | 540.230 | 64.92 | 64.15 | 0.60 | 729.608 | 78.03 | 77.82 | 0.79 | 521.070 |
| HAS-Former | 73.20 | 72.98 | 0.62 | 122.917 | 65.08 | 64.59 | 0.61 | 237.502 | 82.19 | 82.95 | 0.80 | 109.271 |
| M2S | 75.49 | 74.02 | 0.63 | 124.103 | 66.25 | 65.38 | 0.62 | 203.206 | 85.60 | 84.20 | 0.85 | 100.726 |
| MEDA | 71.30 | 70.84 | 0.61 | 291.410 | 65.42 | 64.92 | 0.62 | 291.352 | 82.42 | 81.91 | 0.78 | 232.910 |
| MHLGNE | 76.52 | 74.26 | 0.71 | 368.504 | 66.92 | 65.72 | 0.63 | 384.102 | 84.13 | 82.92 | 0.80 | 312.159 |
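The random missing-modality setting evaluated in Table 3 can be simulated along the lines of the sketch below, where each sample randomly drops some modalities but always keeps at least one; the masking probability and the keep-at-least-one rule are assumptions about the protocol, not the paper's exact setup.

```python
# Hypothetical simulation of random modality missing; not the paper's exact protocol.
import numpy as np

rng = np.random.default_rng(0)

def random_modality_mask(n_samples: int, n_modalities: int, p_missing: float = 0.5):
    """Return a (n_samples, n_modalities) 0/1 mask; every sample keeps at least one modality."""
    mask = (rng.random((n_samples, n_modalities)) > p_missing).astype(float)
    empty = mask.sum(axis=1) == 0                      # samples that lost every modality
    mask[empty, rng.integers(0, n_modalities, size=int(empty.sum()))] = 1.0
    return mask

# e.g. three modalities: EEG, visual, text
mask = random_modality_mask(n_samples=8, n_modalities=3)
print(mask)
```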

Table 4  Analysis of MHLGNE with completely missing modality information (%)

| Available modalities | SEED-IV Acc | Pre | Kappa | SEED-V Acc | Pre | Kappa | DREAMER Acc | Pre | F1 |
|---|---|---|---|---|---|---|---|---|---|
| EEG | 77.20 | 82.46 | 0.68 | 77.65 | 81.49 | 0.69 | 83.97 | 90.53 | 0.64 |
| Visual | 75.26 | 81.14 | 0.66 | 74.83 | 80.10 | 0.67 | 82.43 | 89.14 | 0.61 |
| Text | 79.63 | 83.40 | 0.73 | 79.90 | 82.21 | 0.72 | 86.45 | 91.42 | 0.68 |
| EEG + visual | 80.76 | 84.10 | 0.70 | 79.20 | 83.09 | 0.70 | 87.96 | 93.43 | 0.67 |
| Visual + text | 82.84 | 85.24 | 0.75 | 82.03 | 84.36 | 0.73 | 88.70 | 92.05 | 0.71 |
| EEG + text | 84.56 | 85.42 | 0.76 | 83.04 | 84.75 | 0.74 | 90.20 | 93.61 | 0.73 |
| All modalities (ours) | 87.96 | 86.70 | 0.89 | 85.12 | 85.40 | 0.84 | 94.32 | 94.02 | 0.88 |

Table 5  Ablation study of the key components of MHLGNE (%)

| Method | SEED-IV Acc | Pre | Kappa | SEED-V Acc | Pre | Kappa | DREAMER Acc | Pre | F1 |
|---|---|---|---|---|---|---|---|---|---|
| w/o adaptive global noise sampling | 82.49 | 85.92 | 0.72 | 80.01 | 79.85 | 0.78 | 87.90 | 92.73 | 0.73 |
| w/o multimodal hypergraph learning | 82.06 | 84.05 | 0.72 | 79.35 | 78.02 | 0.77 | 87.24 | 90.84 | 0.71 |
| w/o guided information transfer mechanism | 85.72 | 86.46 | 0.74 | 83.29 | 85.16 | 0.79 | 92.18 | 93.62 | 0.76 |
| Full model (ours) | 87.96 | 86.70 | 0.89 | 85.12 | 85.40 | 0.84 | 94.32 | 94.02 | 0.88 |
  • [1] LIU Jia, SONG Hong, CHEN Dapeng, et al. A multimodal sentiment analysis model enhanced with non-verbal information and contrastive learning[J]. Journal of Electronics & Information Technology, 2024, 46(8): 3372–3381. doi: 10.11999/JEIT231274.
    [2] WANG Pan, ZHOU Qiang, WU Yawen, et al. DLF: Disentangled-language-focused multimodal sentiment analysis[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 21180–21188. doi: 10.1609/aaai.v39i20.35416.
    [3] XU Qinfu, WEI Yiwei, WU Chunlei, et al. Towards multimodal sentiment analysis via hierarchical correlation modeling with semantic distribution constraints[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 21788–21796. doi: 10.1609/aaai.v39i20.35484.
    [4] XU Xi, LI Jianqiang, ZHU Zhichao, et al. A comprehensive review on synergy of multi-modal data and AI technologies in medical diagnosis[J]. Bioengineering, 2024, 11(3): 219. doi: 10.3390/bioengineering11030219.
    [5] LIU Huan, LOU Tianyu, ZHANG Yuzhe, et al. EEG-based multimodal emotion recognition: A machine learning perspective[J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73: 4003729. doi: 10.1109/TIM.2024.3369130.
    [6] LIU Zhicheng, BRAYTEE A, ANAISSI A, et al. Ensemble pretrained models for multimodal sentiment analysis using textual and video data fusion[C]. Proceedings of the ACM Web Conference 2024, Singapore, Singapore, 2024: 1841–1848. doi: 10.1145/3589335.3651971.
    [7] SUN Hao, NIU Ziwei, WANG Hongyi, et al. Multimodal sentiment analysis with mutual information-based disentangled representation learning[J]. IEEE Transactions on Affective Computing, 2025, 16(3): 1606–1617. doi: 10.1109/TAFFC.2025.3529732.
    [8] ZHAO Sicheng, YANG Zhenhua, SHI Henglin, et al. SDRS: Sentiment-aware disentangled representation shifting for multimodal sentiment analysis[J]. IEEE Transactions on Affective Computing, 2025, 16(3): 1802–1813. doi: 10.1109/TAFFC.2025.3539225.
    [9] LUO Yuanyi, LIU Wei, SUN Qiang, et al. TriagedMSA: Triaging sentimental disagreement in multimodal sentiment analysis[J]. IEEE Transactions on Affective Computing, 2025, 16(3): 1557–1569. doi: 10.1109/TAFFC.2024.3524789.
    [10] WANG Yuhao, LIU Yang, ZHENG Aihua, et al. Decoupled feature-based mixture of experts for multi-modal object re-identification[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 8141–8149. doi: 10.1609/aaai.v39i8.32878.
    [11] WU Sheng, HE Dongxiao, WANG Xiaobao, et al. Enriching multimodal sentiment analysis through textual emotional descriptions of visual-audio content[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 1601–1609. doi: 10.1609/aaai.v39i2.32152.
    [12] SUN Xin, REN Xiangyu, and XIE Xiaohao. A novel multimodal sentiment analysis model based on gated fusion and multi-task learning[C]. Proceedings of the ICASSP 2024–2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024: 8336–8340. doi: 10.1109/ICASSP48485.2024.10446040.
    [13] LI Meng, ZHU Zhenfang, LI Kefeng, et al. Diversity and balance: Multimodal sentiment analysis using multimodal-prefixed and cross-modal attention[J]. IEEE Transactions on Affective Computing, 2025, 16(1): 250–263. doi: 10.1109/TAFFC.2024.3430045.
    [14] LIU Zhicheng, BRAYTEE A, ANAISSI A, et al. Ensemble pretrained models for multimodal sentiment analysis using textual and video data fusion[C]. Proceedings of the ACM Web Conference 2024, Singapore, Singapore, 2024: 1841–1848. doi: 10.1145/3589335.3651971.
    [15] TANG Jiajia, LI Kang, JIN Xuanyu, et al. CTFN: Hierarchical learning for multimodal sentiment analysis using coupled-translation fusion network[C]. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021: 5301–5311. doi: 10.18653/v1/2021.acl-long.412.
    [16] ZENG Jiandian, ZHOU Jiantao, and LIU Tianyi. Mitigating inconsistencies in multimodal sentiment analysis under uncertain missing modalities[C]. Proceedings of 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 2022: 2924–2934. doi: 10.18653/v1/2022.emnlp-main.189.
    [17] LIU Yankai, CAI Jinyu, LU Baoliang, et al. Multi-to-single: Reducing multimodal dependency in emotion recognition through contrastive learning[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 1438–1446. doi: 10.1609/aaai.v39i2.32134.
    [18] TAO Chuanqi, LI Jiaming, ZANG Tianzi, et al. A multi-focus-driven multi-branch network for robust multimodal sentiment analysis[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 1547–1555. doi: 10.1609/aaai.v39i2.32146.
    [19] BALTRUŠAITIS T, ROBINSON P, and MORENCY L P. OpenFace: An open source facial behavior analysis toolkit[C]. Proceedings of 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, USA, IEEE, 2016: 1–10. doi: 10.1109/WACV.2016.7477553.
    [20] LIU Yinhan, OTT M, GOYAL N, et al. RoBERTa: A robustly optimized BERT pretraining approach[EB/OL]. https://arxiv.org/abs/1907.11692, 2019.
    [21] FANG Feiteng, BAI Yuelin, NI Shiwen, et al. Enhancing noise robustness of retrieval-augmented language models with adaptive adversarial training[C]. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 2024: 10028–10039. doi: 10.18653/v1/2024.acl-long.540.
    [22] CHEN Zhuo, GUO Lingbing, FANG Yin, et al. Rethinking uncertainly missing and ambiguous visual modality in multi-modal entity alignment[C]. Proceedings of the 22nd International Semantic Web Conference on the Semantic Web, Athens, Greece, 2023: 121–139. doi: 10.1007/978-3-031-47240-4_7.
    [23] GAO Min, ZHENG Haifeng, FENG Xinxin, et al. Multimodal fusion using multi-view domains for data heterogeneity in federated learning[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, Philadelphia, USA, 2025: 16736–16744. doi: 10.1609/aaai.v39i16.33839.
    [24] ZHOU Yan, FANG Qingkai, and FENG Yang. CMOT: Cross-modal Mixup via optimal transport for speech translation[C]. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, Canada, 2023: 7873–7887. doi: 10.18653/v1/2023.acl-long.436.
    [25] ZHENG Weilong, LIU Wei, LU Yifei, et al. EmotionMeter: A multimodal framework for recognizing human emotions[J]. IEEE Transactions on Cybernetics, 2019, 49(3): 1110–1122. doi: 10.1109/TCYB.2018.2797176.
    [26] LIU Wei, QIU Jielin, ZHENG Weilong, et al. Comparing recognition performance and robustness of multimodal deep learning models for multimodal emotion recognition[J]. IEEE Transactions on Cognitive and Developmental Systems, 2022, 14(2): 715–729. doi: 10.1109/TCDS.2021.3071170.
    [27] KATSIGIANNIS S and RAMZAN N. DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices[J]. IEEE Journal of Biomedical and Health Informatics, 2018, 22(1): 98–107. doi: 10.1109/JBHI.2017.2688239.
    [28] JIANG Huangfei, GUAN Xiya, ZHAO Weiye, et al. Generating multimodal features for emotion classification from eye movement signals[J]. Australian Journal of Intelligent Information Processing Systems, 2019, 15(3): 59–66.
    [29] YAN Xu, ZHAO Liming, and LU Baoliang. Simplifying multimodal emotion recognition with single eye movement modality[C]. Proceedings of the 29th ACM International Conference on Multimedia, 2021: 1057–1063. doi: 10.1145/3474085.3475701.
    [30] XIA Yan, HUANG Hai, ZHU Jieming, et al. Achieving cross modal generalization with multimodal unified representation[C]. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, USA, 2023: 2774.
    [31] JIANG Weibang, LI Ziyi, ZHENG Weilong, et al. Functional emotion transformer for EEG-assisted cross-modal emotion recognition[C]. Proceedings of the ICASSP 2024–2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024: 1841–1845. doi: 10.1109/ICASSP48485.2024.10446937.
    [32] YUAN Ziqi, LI Wei, XU Hua, et al. Transformer-based feature reconstruction network for robust multimodal sentiment analysis[C]. Proceedings of the 29th ACM International Conference on Multimedia, 2021: 4400–4407. doi: 10.1145/3474085.3475585.
    [33] JIANG Weibang, LIU Xuanhao, ZHENG Weilong, et al. Multimodal adaptive emotion transformer with flexible modality inputs on a novel dataset with continuous labels[C]. Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, Canada, 2023: 5975–5984. doi: 10.1145/3581783.3613797.
    [34] LI Jiabao, LIU Ruyi, MIAO Qiguang, et al. CAETFN: Context adaptively enhanced text-guided fusion network for multimodal sentiment analysis[J]. IEEE Transactions on Affective Computing, 2025. doi: 10.1109/TAFFC.2025.3590246.
    [35] HUANG Jiayang, VONG C M, LI Chen, et al. HSA-former: Hierarchical spatial aggregation transformer for EEG-based emotion recognition[J]. IEEE Transactions on Computational Social Systems, 2025. doi: 10.1109/TCSS.2025.3567298.
    [36] DENG Jiawen and REN Fuji. Multi-label emotion detection via emotion-specified feature extraction and emotion correlation learning[J]. IEEE Transactions on Affective Computing, 2023, 14(1): 475–486. doi: 10.1109/TAFFC.2020.3034215.
Publication History
  • Received: 2025-07-09
  • Revised: 2025-10-14
  • Published online: 2025-10-23
