高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于面部全局抑郁特征局部感知力增强和全局-局部语义相关性特征融合的抑郁强度识别

孙强 李正 何浪

孙强, 李正, 何浪. 基于面部全局抑郁特征局部感知力增强和全局-局部语义相关性特征融合的抑郁强度识别[J]. 电子与信息学报. doi: 10.11999/JEIT231330
引用本文: 孙强, 李正, 何浪. 基于面部全局抑郁特征局部感知力增强和全局-局部语义相关性特征融合的抑郁强度识别[J]. 电子与信息学报. doi: 10.11999/JEIT231330
SUN Qiang, LI Zheng, HE Lang. Depression Intensity Recognition Based on Perceptually Locally-enhanced Global Depression Features and Fused Global-local Semantic Correlation Features on Faces[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT231330
Citation: SUN Qiang, LI Zheng, HE Lang. Depression Intensity Recognition Based on Perceptually Locally-enhanced Global Depression Features and Fused Global-local Semantic Correlation Features on Faces[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT231330

基于面部全局抑郁特征局部感知力增强和全局-局部语义相关性特征融合的抑郁强度识别

doi: 10.11999/JEIT231330
基金项目: 国家自然科学基金(62370215),西安市科技计划项目(22GXFW0086),西安市碑林区科技计划项目(GX2243)
详细信息
    作者简介:

    孙强:男,博士,副教授,研究方向为情感计算、智慧气象与人机交互

    李正:男,硕士生,研究方向为人脸图像的抑郁症强度识别

    何浪:男,博士,副教授,研究方向为多模态抑郁症识别与分析、维度情感估计

    通讯作者:

    孙强 qsun@xaut.edu.cn

  • 中图分类号: TN957.52

Depression Intensity Recognition Based on Perceptually Locally-enhanced Global Depression Features and Fused Global-local Semantic Correlation Features on Faces

Funds: The National Natural Science Foundation of China (62370215), The Science and Technology Project of Xi’an City (22GXFW0086), The Science and Technology Project of Beilin District in Xi’an City (GX2243)
  • 摘要: 现有基于深度学习的大多数方法在实现患者抑郁程度自动识别的过程中,主要存在两大挑战:(1)难以利用深度模型自动地从面部表情有效学习到抑郁强度相关的全局上下文信息,(2)往往忽略抑郁强度相关的全局和局部信息之间的语义一致性。为此,该文提出一种全局抑郁特征局部感知力增强和全局-局部语义相关性特征融合(PLEGDF-FGLSCF)的抑郁强度识别深度模型。首先,设计了全局抑郁特征局部感知力增强(PLEGDF)模块,用于提取面部局部区域之间的语义相关性信息,促进不同局部区域与抑郁相关的信息之间的交互,从而增强局部抑郁特征驱动的全局抑郁特征表达力。然后,提出了全局-局部语义相关性特征融合(FGLSCF)模块,用于捕捉全局和局部语义信息之间的关联性,实现全局和局部抑郁特征之间的语义一致性描述。最后,在AVEC2013和AVEC2014数据集上,利用PLEGDF-FGLSCF模型获得的识别结果在均方根误差(RMSE)和平均绝对误差(MAE)指标上的值分别是7.75/5.96和7.49/5.99,优于大多数已有的基准模型,证实了该方法的合理性和有效性。
  • 图  1  PLEGDF-FGLSCF网络结构示意图

    图  2  PLEGDF模块结构示意图

    图  3  FGLSCF模块架构示意图

    图  4  在AVEC2013数据集上不同抑郁等级的原始图像和对应的激活图

    图  5  在AVEC2014数据集上不同抑郁等级的原始图像和对应的激活图

    图  6  模型在不同数据集上的测试结果

    图  7  模型在不同数据集上的混淆矩阵

    表  1  不同数量的PLEGDF在AVEC2013数据集上的比较结果

    方法 RMSE MAE
    ResNet50 8.57 6.44
    ResNet50+layer1嵌入PLEGDF 8.11 6.19
    ResNet50+layer1,2嵌入PLEGDF 7.91 6.07
    PLEGDF- FGLSCF 7.75 5.96
    下载: 导出CSV

    表  2  不同数量的PLEGDF在AVEC2014数据集上的比较结果

    方法 RMSE MAE
    ResNet50 8.34 6.37
    ResNet50+layer1嵌入PLEGDF 8.09 6.16
    ResNet50+layer1,2嵌入PLEGDF 7.88 6.10
    PLEGDF- FGLSCF 7.49 5.99
    下载: 导出CSV

    表  3  在AVEC2013数据集上的比较结果

    方法 RMSE MAE
    拼接全连接层 8.02 6.14
    不同分支识别分数求平均 8.25 6.28
    仅使用人脸分支 8.32 6.46
    仅使用眼睛区域分支 8.29 6.39
    PLEGDF-FGLSCF 7.75 5.96
    下载: 导出CSV

    表  4  在AVEC2014数据集上的比较结果

    方法 RMSE MAE
    拼接全连接层 8.49 6.27
    不同分支识别分数求平均 8.36 6.28
    仅使用人脸分支 8.28 6.16
    仅使用眼睛区域分支 8.07 6.21
    PLEGDF-FGLSCF 7.49 5.99
    下载: 导出CSV

    表  5  在 AVEC2013 数据集上识别抑郁强度的方案比较

    模型是否预训练 方法 RMSE MAE
    未预训练 Valstar等人[40] 13.61 10.88
    预训练 Zhu等人[18] 9.82 7.58
    预训练 Jazaery等人[30] 9.28 7.37
    预训练 Melo等人[24] 7.90 5.98
    预训练 Melo等人[25] 7.55 6.24
    预训练 Pan等人[26] 7.98 6.15
    预训练 Zhou等人[28] 8.28 6.20
    预训练 Melo等人[29] 8.25 6.30
    预训练 孙浩浩等人[34] 8.70 6.74
    未预训练 Uddin等人[19] 8.93 7.04
    未预训练 He等人[31] 8.39 6.59
    未预训练 Shang等人[35] 8.20 6.38
    未预训练 PLEGDF-FGLSCF 7.75 5.96
    下载: 导出CSV

    表  6  在AVEC2014 数据集上识别抑郁强度的方案比较

    模型是否预训练 方法 RMSE MAE
    未预训练 Valstar等人[41] 10.86 8.86
    预训练 Zhu等人[18] 9.55 7.47
    预训练 Jazaery等人[30] 9.20 7.22
    预训练 Melo等人[24] 7.61 5.82
    预训练 Melo等人[25] 7.65 6.06
    预训练 Pan等人[26] 7.75 6.00
    预训练 Zhou等人[28] 8.39 6.21
    预训练 Melo等人[29] 8.23 6.15
    预训练 孙浩浩等人[34] 8.56 6.65
    未预训练 Uddin等人[19] 8.78 6.86
    未预训练 He等人[31] 8.30 6.51
    未预训练 Shang等人[35] 7.84 6.08
    未预训练 PLEGDF-FGLSCF 7.49 5.99
    下载: 导出CSV

    表  7  对于无抑郁样本和不同抑郁程度样本模型产生的PCC结果

    抑郁程度 AVEC2013 AVEC2014
    无抑郁(0~13分) 0.575 0.497
    轻度抑郁(14~19分) 0.772 0.689
    中度抑郁(20~28分) 0.661 0.782
    重度抑郁(29~63分) 0.867 0.855
    下载: 导出CSV

    表  8  模型在跨数据集上的测试结果

    训练集 测试集 RMSE MAE
    实验1 AVEC2013 AVEC2014 7.84 5.79
    实验2 AVEC2014 AVEC2013 8.57 6.48
    实验3 AVEC2013 北风任务 7.78 5.96
    实验4 AVEC2013 自由任务 7.91 5.94
    实验5 北风任务 AVEC2013 8.76 6.65
    实验6 自由任务 AVEC2013 8.32 6.27
    实验7 北风任务 自由任务 7.99 6.02
    实验8 自由任务 北风任务 8.81 6.77
    下载: 导出CSV
  • [1] ETTMAN C K, ABDALLA S M, COHEN G H, et al. Prevalence of depression symptoms in us adults before and during the COVID-19 pandemic[J]. JAMA Network Open, 2020, 3(9): e2019686. doi: 10.1001/jamanetworkopen.2020.19686.
    [2] HYLAND P, SHEVLIN M, MCBRIDE O, et al. Anxiety and depression in the republic of Ireland during the COVID-19 pandemic[J]. Acta Psychiatrica Scandinavica, 2020, 142(3): 249–256. doi: 10.1111/acps.13219.
    [3] CHASE T N. Apathy in neuropsychiatric disease: Diagnosis, pathophysiology, and treatment[J]. Neurotoxicity Research, 2011, 19(2): 266–278. doi: 10.1007/s12640-010-9196-9.
    [4] 王文亚. 基于步态中骨架数据抑郁风险识别和面部图像人物识别的应用研究[D]. [硕士论文], 兰州大学, 2023.

    WANG Wenya. Application research on skeleton data depression risk recognition and facial image character recognition based on gait[D]. [Master dissertation], Lanzhou University, 2023.
    [5] ZIMMERMAN M, MARTINEZ J H, YOUNG D, et al. Severity classification on the Hamilton depression rating scale[J]. Journal of Affective Disorders, 2013, 150(2): 384–388. doi: 10.1016/j.jad.2013.04.028.
    [6] BECK A T, STEER R A, BALL R, et al. Comparison of beck depression inventories-IA and-II in psychiatric outpatients[J]. Journal of Personality Assessment, 1996, 67(3): 588–597. doi: 10.1207/s15327752jpa6703_13.
    [7] KROENKE K, STRINE T W, SPITZER R L, et al. The PHQ-8 as a measure of current depression in the general population[J]. Journal of Affective Disorders, 2009, 114(1/3): 163–173. doi: 10.1016/j.jad.2008.06.026.
    [8] THOMBS B, TURNER K A, and SHRIER I. Defining and evaluating overdiagnosis in mental health: A meta-research review[J]. Psychotherapy and Psychosomatics, 2019, 88(4): 193–202. doi: 10.1159/000501647.
    [9] 瞿伟, 谷珊珊. 抑郁症治疗研究新进展[J]. 第三军医大学学报, 2014, 36(11): 1113–1117. doi: 10.16016/j.1000-5404.2014.11.022.

    QU Wei and GU Shanshan. New progress in treatment of depression[J]. Journal of Third Military Medical University, 2014, 36(11): 1113–1117. doi: 10.16016/j.1000-5404.2014.11.022.
    [10] 赵健, 周莉芸, 武孟青, 等. 基于人工智能的抑郁症辅助诊断方法[J]. 西北大学学报:自然科学版, 2023, 53(3): 325–335. doi: 10.16152/j.cnki.xdxbzr.2023-03-002.

    ZHAO Jian, ZHOU Liyun, WU Mengqing, et al. AI-based assisted diagnostic methods for depression[J]. Journal of Northwest University:Natural Science Edition, 2023, 53(3): 325–335. doi: 10.16152/j.cnki.xdxbzr.2023-03-002.
    [11] 郭威彤. 利用深度学习从面部表情和语音识别抑郁症方法的研究[D]. [博士论文], 兰州大学, 2022. doi: 10.27204/d.cnki.glzhu.2022.003611.

    GUO Weitong. Research on deep learning-based depression recognition from facial expression and speech[D]. [Ph. D. dissertation], Lanzhou University, 2022. doi: 10.27204/d.cnki.glzhu.2022.003611.
    [12] 陈坤林, 胡德锋, 陈楠楠. 基于面部表情分析的抑郁症识别研究[J]. 计算机时代, 2023(10): 70–74. doi: 10.16644/j.cnki.cn33-1094/tp.2023.10.015.

    CHEN Kunlin, HU Defeng, and CHEN Nannan. Research on depression identification based on facial expression analysis[J]. Computer Era, 2023(10): 70–74. doi: 10.16644/j.cnki.cn33-1094/tp.2023.10.015.
    [13] MORTENSEN C D. Communication Theory[M]. 2nd ed. New York: Routledge, 2017: 193–200.
    [14] PAMPOUCHIDOU A, SIMOS P G, MARIAS K, et al. Automatic assessment of depression based on visual cues: A systematic review[J]. IEEE Transactions on Affective Computing, 2019, 10(4): 445–470. doi: 10.1109/taffc.2017.2724035.
    [15] GIRARD J M, COHN J F, MAHOOR M H, et al. Social risk and depression: Evidence from manual and automatic facial expression analysis[C]. The 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, Shanghai, China, 2013: 1–8. doi: 10.1109/FG.2013.6553748.
    [16] GIRARD J M, COHN J F, MAHOOR M H, et al. Nonverbal social withdrawal in depression: Evidence from manual and automatic analyses[J]. Image and Vision Computing, 2014, 32(10): 641–647. doi: 10.1016/j.imavis.2013.12.007.
    [17] GUR R C, ERWIN R J, GUR R E, et al. Facial emotion discrimination: II. Behavioral findings in depression[J]. Psychiatry Research, 1992, 42(3): 241–251. doi: 10.1016/0165-1781(92)90116-K.
    [18] ZHU Yu, SHANG Yuanyuan, SHAO Zhuhong, et al. Automated depression diagnosis based on deep networks to encode facial appearance and dynamics[J]. IEEE Transactions on Affective Computing, 2018, 9(4): 578–584. doi: 10.1109/TAFFC.2017.2650899.
    [19] UDDIN M A, JOOLEE J B, and LEE Y K. Depression level prediction using deep spatiotemporal features and multilayer Bi-LTSM[J]. IEEE Transactions on Affective Computing, 2022, 13(2): 864–870. doi: 10.1109/TAFFC.2020.2970418.
    [20] DE MELO W C, GRANGER E, and LOPEZ M B. Encoding temporal information for automatic depression recognition from facial analysis[C]. ICASSP 2020-2020 the IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, 2020: 1080–1084. doi: 10.1109/ICASSP40776.2020.9054375.
    [21] CHEN Qian, CHATURVEDI I, JI Shaoxiong, et al. Sequential fusion of facial appearance and dynamics for depression recognition[J]. Pattern Recognition Letters, 2021, 150: 115–121. doi: 10.1016/j.patrec.2021.07.005.
    [22] 何浪. 基于3D-CNN和时空注意力-卷积LSTM的抑郁症识别研究[J]. 首都师范大学学报:自然科学版, 2021, 42(2): 17–25. doi: 10.19789/j.1004-9398.2021.02.004.

    HE Lang. Automatic depression estimation using 3D-CNN and STA-ConvLSTM from videos[J]. Journal of Capital Normal University:Natural Sciences Edition, 2021, 42(2): 17–25. doi: 10.19789/j.1004-9398.2021.02.004.
    [23] NIU Mingyue, HE Lang, LI Ya, et al. Depressioner: Facial dynamic representation for automatic depression level prediction[J]. Expert Systems with Applications, 2022, 204: 117512. doi: 10.1016/j.eswa.2022.117512.
    [24] DE MELO W C, GRANGER E, and HADID A. A deep multiscale spatiotemporal network for assessing depression from facial dynamics[J]. IEEE Transactions on Affective Computing, 2022, 13(3): 1581–1592. doi: 10.1109/TAFFC.2020.3021755.
    [25] DE MELO W C, GRANGER E, and LÓPEZ M. MDN: A deep maximization-differentiation network for spatio-temporal depression detection[J]. IEEE Transactions on Affective Computing, 2023, 14(1): 578–590. doi: 10.1109/TAFFC.2021.3072579.
    [26] PAN Yuchen, SHANG Yuanyuan, LIU Tie, et al. Spatial-temporal attention network for depression recognition from facial videos[J]. Expert Systems with Applications, 2024, 237: 121410. doi: 10.1016/j.eswa.2023.121410.
    [27] 安昳, 曲珍, 许宁, 等. 面部动态特征描述的抑郁症识别[J]. 中国图象图形学报, 2020, 25(11): 2415–2427. doi: 10.11834/jig.200322.

    AN Yi, QU Zhen, XU Ning, et al. Automatic depression estimation using facial appearance[J]. Journal of Image and Graphics, 2020, 25(11): 2415–2427. doi: 10.11834/jig.200322.
    [28] ZHOU Xiuzhang, JIN Kai, SHANG Yuanyaun, et al. Visually interpretable representation learning for depression recognition from facial images[J]. IEEE Transactions on Affective Computing, 2020, 11(3): 542–552. doi: 10.1109/TAFFC.2018.2828819.
    [29] DE MELO W C, GRANGER E, and HADID A. Combining global and local convolutional 3D networks for detecting depression from facial expressions[C]. The 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition, Lille, France, 2019: 1–8. doi: 10.1109/FG.2019.8756568.
    [30] AL JAZAERY M and GUO Guodong. Video-based depression level analysis by encoding deep spatiotemporal features[J]. IEEE Transactions on Affective Computing, 2021, 12(1): 262–268. doi: 10.1109/TAFFC.2018.2870884.
    [31] HE Lang, CHAN J C W, and WANG Zhongmin. Automatic depression recognition using CNN with attention mechanism from videos[J]. Neurocomputing, 2021, 422: 165–175. doi: 10.1016/j.neucom.2020.10.015.
    [32] NIU Mingyue, TAO Jianhua, and LIU Bin. Multi-scale and multi-region facial discriminative representation for automatic depression level prediction[C]. The ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 2021: 1325–1329. doi: 10.1109/ICASSP39728.2021.9413504.
    [33] ACHARYA R and DASH S P. Automatic depression detection based on merged convolutional neural networks using facial features[C]. The 2022 IEEE International Conference on Signal Processing and Communications, Bangalore, India, 2022: 1–5. doi: 10.1109/SPCOM55316.2022.9840812.
    [34] 孙浩浩, 邵珠宏, 尚媛园, 等. 融合通道层注意力机制的多支路卷积网络抑郁症识别[J]. 中国图象图形学报, 2022, 27(11): 3292–3302. doi: 10.11834/jig.210397.

    SUN Haohao, SHAO Zhuhong, SHANG Yuanyuan, et al. Channel-wise attention mechanism-relevant multi-branches convolutional network-based depressive disorder recognition[J]. Journal of Image and Graphics, 2022, 27(11): 3292–3302. doi: 10.11834/jig.210397.
    [35] SHANG Yuanyuan, PAN Yuchen, JIANG Xiao, et al. LQGDNet: A local quaternion and global deep network for facial depression recognition[J]. IEEE Transactions on Affective Computing, 2023, 14(3): 2557–2563. doi: 10.1109/TAFFC.2021.3139651.
    [36] 江筱, 邵珠宏, 尚媛园, 等. 基于级联深度神经网络的抑郁症识别[J]. 计算机应用与软件, 2019, 36(10): 117–122,150. doi: 10.3969/j.issn.1000-386x.2019.10.021.

    JIANG Xiao, SHAO Zhuhong, SHANG Yuanyuan, et al. Depression recognition based on cascaded deep neural networks[J]. Computer Applications and Software, 2019, 36(10): 117–122,150. doi: 10.3969/j.issn.1000-386x.2019.10.021.
    [37] SANCHEZ A, VAZQUEZ C, MARKER C, et al. Attentional disengagement predicts stress recovery in depression: An eye-tracking study[J]. Journal of Abnormal Psychology, 2013, 122(2): 303–313. doi: 10.1037/a0031529.
    [38] EISENBARTH H and ALPERS G W. Happy mouth and sad eyes: Scanning emotional facial expressions[J]. Emotion, 2011, 11(4): 860–865. doi: 10.1037/a0022758.
    [39] SCHWARTZ G E, FAIR P L, SALT P, et al. Facial muscle patterning to affective imagery in depressed and nondepressed subjects[J]. Science, 1976, 192(4238): 489–491. doi: 10.1126/science.1257786.
    [40] VALSTAR M, SCHULLER B, SMITH K, et al. AVEC 2013: The continuous audio/visual emotion and depression recognition challenge[C]. The 3rd ACM International Workshop on Audio/Visual Emotion Challenge, Barcelona Spain, 2013: 3–10. doi: 10.1145/2512530.2512533.
    [41] VALSTAR M, SCHULLER B, SMITH K, et al. AVEC 2014: 3D dimensional affect and depression recognition challenge[C]. Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, Orlando, USA, 2014: 3–10. doi: 10.1145/2661806.2661807.
    [42] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]. The 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 618–626. doi: 10.1109/ICCV.2017.74.
    [43] LUO Wenjie, LI Yujia, URTASUN R, et al. Understanding the effective receptive field in deep convolutional neural networks[C]. The 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 4905–4913.
    [44] BORJI A and ITTI L. State-of-the-Art in visual attention modeling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 185–207. doi: 10.1109/TPAMI.2012.89.
    [45] BALTRUSAITIS T, ZADEH A, LIM Y C, et al. OpenFace 2.0: Facial behavior analysis toolkit[C]. 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition, Xi’an, China, 2018: 59–66. doi: 10.1109/FG.2018.00019.
    [46] ZHANG Kaipeng, ZHANG Zhanpeng, LI Zhifeng, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499–1503. doi: 10.1109/LSP.2016.2603342.
    [47] WANG Fei, JIANG Mengqing, QIAN Chen, et al. Residual attention network for image classification[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6450–6458. doi: 10.1109/CVPR.2017.683.
  • 加载中
图(7) / 表(8)
计量
  • 文章访问数:  70
  • HTML全文浏览量:  39
  • PDF下载量:  22
  • 被引次数: 0
出版历程
  • 收稿日期:  2023-12-01
  • 修回日期:  2024-02-26
  • 网络出版日期:  2024-03-08

目录

    /

    返回文章
    返回