Parkinson's Disease Detection Method Based on Cross-Language Acoustic Analysis

JI Wei, WANG Chuanyu, WU Di, LI Yun, ZHENG Huifen

Citation: JI Wei, WANG Chuanyu, WU Di, LI Yun, ZHENG Huifen. Parkinson's Disease Detection Method Based on Cross-Language Acoustic Analysis[J]. Journal of Electronics & Information Technology, 2024, 46(2): 546-554. doi: 10.11999/JEIT230981

doi: 10.11999/JEIT230981
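Funds: The Basic Scientific (Natural Science) Major Program of the Higher Education Institutions of Jiangsu Province, China (21KJA520003)
    About the authors:

    JI Wei: female, Ph.D., professor, master's supervisor. Research interests include the intersection of machine learning and signal processing, wireless communication, and communication signal processing.

    WANG Chuanyu: male, master's student. Research interest: the intersection of machine learning and signal processing.

    WU Di: male, master's student. Research interest: the intersection of machine learning and signal processing.

    LI Yun: male, Ph.D., professor, doctoral supervisor. Research interests include machine learning, feature selection, and information security.

    ZHENG Huifen: female, Ph.D., chief physician. Research interests: Parkinson's disease and related movement disorders.

    Corresponding author:

    LI Yun liyun@njupt.edu.cn

  • CLC number: TN911.7; TP391.4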

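  • Abstract: Speech-based detection of Parkinson's disease is non-interventional, low-cost, and non-invasive. Most publicly available Parkinson's disease speech datasets come from a single language; they are small, and their subjects show little variation in native-language pronunciation. A detection model trained on a single-language dataset therefore loses performance when it is applied to cross-language speech data. To avoid the influence of language differences and improve detection performance in cross-language scenarios, this paper draws on adversarial transfer learning and feature disentanglement and proposes a cross-language acoustic analysis model for Parkinson's disease (CLSAM). First, a Transformer encoder block based on multi-head self-attention is cascaded with a multi-layer neural network to form the feature extractor module, which preliminarily disentangles the raw Fbank speech features extracted from source-domain and target-domain speech into two vectors: a domain-invariant pathological information representation vector and a domain information representation vector. Second, a dual adversarial training module with inconsistent target tasks is designed to explicitly separate the domain-invariant pathological information from the domain information. Finally, the domain-invariant pathological information in the cross-language speech data is extracted and used for Parkinson's disease detection. The proposed method is validated with ten-fold cross-validation on the public MaxLittle Parkinson's disease speech dataset and on a self-collected Parkinson's disease speech dataset. Experimental results show that, compared with traditional machine learning methods and existing transfer learning algorithms, the proposed model achieves clearly higher detection accuracy, sensitivity, and F1 score in cross-language scenarios.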
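The ten-fold cross-validation mentioned in the abstract can be sketched as below. This is a minimal illustration rather than the authors' protocol: the function name `stratified_k_fold`, the round-robin fold assignment, and the toy label list are assumptions, and the page does not state whether the folds were stratified by class or split by subject.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_k_fold(
    labels: Sequence[int], k: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs with balanced classes."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each class is spread evenly over the
        # folds; the running offset keeps total fold sizes balanced as well.
        for position, idx in enumerate(indices):
            folds[(position + offset) % k].append(idx)
        offset += len(indices)

    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((train, test))
    return splits


# Toy labels: 1 = PD, 0 = HC (hypothetical recordings, not the paper's data).
toy_labels = [1] * 30 + [0] * 20
splits = stratified_k_fold(toy_labels, k=10)
```

Each sample lands in exactly one test fold, so averaging the per-fold metrics uses every recording once for evaluation. If several recordings come from the same subject, grouping by subject before splitting would be the safer choice.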
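  • Figure 1  Overall framework of the cross-language acoustic analysis model

    Figure 2  Transformer encoder block based on the multi-head self-attention mechanism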

    Algorithm 1  Cross-language Parkinson's disease detection algorithm based on adversarial transfer learning
     Input: source-domain dataset $ D_{\mathrm{s}} $ and target-domain dataset $ D_{\mathrm{t}} $
     Output: learnable parameters $ \tilde{\boldsymbol{\theta}}_{\mathrm{Te}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e2}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d2}} $
     Repeat
      // Feature learning phase
      For each batch of samples drawn from the source-domain data:
      Compute the loss $ L_{\mathrm{s}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e1}}, \boldsymbol{\theta}_{\mathrm{d1}}) $;
      Compute the loss $ L_{\mathrm{d}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e2}}, \boldsymbol{\theta}_{\mathrm{d2}}) $;
      Compute the loss $ L_{\mathrm{diff}}(\boldsymbol{V}_{\mathrm{s}}^{\mathrm{e}}; \boldsymbol{V}_{\mathrm{s}}^{\mathrm{d}}) $;
      Compute the gradients according to Eq. (10) and update $ \tilde{\boldsymbol{\theta}}_{\mathrm{Te}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e2}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d2}} $
      End
      For each batch of samples drawn from the target-domain data:
      Compute the loss $ L_{\mathrm{s}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e1}}, \boldsymbol{\theta}_{\mathrm{d1}}) $;
      Compute the loss $ L_{\mathrm{d}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e2}}, \boldsymbol{\theta}_{\mathrm{d2}}) $;
      Compute the loss $ L_{\mathrm{diff}}(\boldsymbol{V}_{\mathrm{t}}^{\mathrm{e}}; \boldsymbol{V}_{\mathrm{t}}^{\mathrm{d}}) $;
      Compute the gradients according to Eq. (10) and update $ \tilde{\boldsymbol{\theta}}_{\mathrm{Te}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e2}}, \tilde{\boldsymbol{\theta}}_{\mathrm{d2}} $
      End
      // Adversarial transfer phase
      For each sample from the source or target domain, with the parameters $ \boldsymbol{\theta}_{\mathrm{Te}} $, $ \boldsymbol{\theta}_{\mathrm{e1}} $, and $ \boldsymbol{\theta}_{\mathrm{d2}} $ held fixed:
      Compute the loss $ L_{\mathrm{s}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e1}}, \boldsymbol{\theta}_{\mathrm{d1}}) $;
      Compute the loss $ L_{\mathrm{d}}(\boldsymbol{\theta}_{\mathrm{Te}}, \boldsymbol{\theta}_{\mathrm{e2}}, \boldsymbol{\theta}_{\mathrm{d2}}) $;
      Compute the loss $ L_{\mathrm{diff}}(\boldsymbol{V}_{\mathrm{s}}^{\mathrm{e}}; \boldsymbol{V}_{\mathrm{s}}^{\mathrm{d}}) $ or $ L_{\mathrm{diff}}(\boldsymbol{V}_{\mathrm{t}}^{\mathrm{e}}; \boldsymbol{V}_{\mathrm{t}}^{\mathrm{d}}) $;
      Compute the gradients according to Eq. (11) and update $ \tilde{\boldsymbol{\theta}}_{\mathrm{d1}}, \tilde{\boldsymbol{\theta}}_{\mathrm{e2}} $
      End
     Until the model converges
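The two-phase schedule of Algorithm 1 can be sketched in plain Python to make explicit which parameter groups move in each phase. This is only a scheduling skeleton under stated assumptions: Eqs. (10) and (11) are not reproduced on this page, so `dummy_grad_fn` is a hypothetical placeholder for them, the group names are illustrative spellings of the algorithm's symbols, and the adversarial phase iterates over batches here although the algorithm is stated per sample.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Params = Dict[str, List[float]]
Batch = List[float]
# Hypothetical signature: (params, batch, domain, phase) -> gradients per group.
GradFn = Callable[[Params, Batch, str, str], Params]

ALL_GROUPS = ("theta_Te", "theta_e1", "theta_d1", "theta_e2", "theta_d2")
# Feature learning phase: every group is updated (Eq. (10) in the paper).
FEATURE_GROUPS = ALL_GROUPS
# Adversarial phase: theta_Te, theta_e1, theta_d2 stay fixed (Eq. (11)).
ADVERSARIAL_GROUPS = ("theta_d1", "theta_e2")


def sgd_step(params: Params, grads: Params, groups: Iterable[str], lr: float) -> None:
    """Apply one plain SGD update to the selected parameter groups only."""
    for name in groups:
        params[name] = [p - lr * g for p, g in zip(params[name], grads[name])]


def run_epoch(
    params: Params,
    source_batches: List[Batch],
    target_batches: List[Batch],
    grad_fn: GradFn,
    lr: float = 0.001,
) -> List[Tuple[str, str]]:
    """Run one pass of the two-phase schedule; return the (phase, domain) order."""
    order: List[Tuple[str, str]] = []
    domains = (("source", source_batches), ("target", target_batches))
    phases = (("feature", FEATURE_GROUPS), ("adversarial", ADVERSARIAL_GROUPS))
    for phase, groups in phases:
        for domain, batches in domains:
            for batch in batches:
                grads = grad_fn(params, batch, domain, phase)
                sgd_step(params, grads, groups, lr)
                order.append((phase, domain))
    return order


def dummy_grad_fn(params: Params, batch: Batch, domain: str, phase: str) -> Params:
    """Placeholder gradients of 1.0, standing in for Eqs. (10) and (11)."""
    return {name: [1.0] * len(values) for name, values in params.items()}


params = {name: [1.0] for name in ALL_GROUPS}
order = run_epoch(params, [[0.0]], [[0.0]], dummy_grad_fn, lr=0.1)
```

With one source batch and one target batch, every group receives two updates in the feature learning phase, and only `theta_d1` and `theta_e2` receive two more in the adversarial phase. In a real implementation the outer `Repeat ... Until` loop would call `run_epoch` until a convergence criterion is met.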

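    Table 1  Statistics of the MaxLittle dataset

    Sex                   Male        Male        Female      Female      Total       Total
    Subject group         PD          HC          PD          HC          PD          HC
    Number of subjects    22          4           11          6           33          10
    Mean age (variance)   67.2 (9.3)  61 (8.6)    67.2 (9.3)  61 (8.6)    67.2 (9.3)  61 (8.6)
    Age range             48~85       46~72       48~85       46~72       48~85       46~72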

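    Table 2  Statistics of the self-collected Parkinson's disease speech dataset

    Sex                               Male        Male        Female      Female      Total       Total
    Subject group                     PD          HC          PD          HC          PD          HC
    Number of subjects                49          8           19          9           68          17
    Mean age (variance)               69.3 (9.5)  66.5 (7.2)  69.8 (8.2)  65.3 (6.8)  69.4 (9.2)  65.9 (7.0)
    Age range                         46~88       58~77       56~84       53~74       46~88       53~77
    Mean disease duration (variance)  5.9 (3.6)   0           5.4 (3.1)   0           5.8 (3.4)   0
    HY stage                          1~4         0           1~4         0           1~4         0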

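    Table 3  Parameter settings of the CLSAM model

    Network parameter                                     Value
    X_s                                                   361×40
    X_t                                                   361×40
    Transformer encoder block Q, K, V vector dimension    64
    Transformer encoder block attention heads             2
    Transformer encoder block depth                       6
    Multi-layer feedforward neural network                [32,32]
    domain_vec                                            16
    p_vec                                                 16
    Domain discriminator network D1                       [32,16,2]
    Domain discriminator network D2                       [16,2]
    Parkinson's disease detection module E1               [16,2]
    Parkinson's disease detection module E2               [32,16,2]
    Epochs                                                120
    Learning rate                                         0.001
    Batch size                                            36
    Optimizer                                             SGD
    Dropout                                               0.1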
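For reuse in code, the settings in Table 3 can be collected into a single configuration object. The sketch below is only a convenience container: the class name `CLSAMConfig`, the field names, and the `steps_per_epoch` helper are illustrative assumptions, while the values are copied from the table without interpreting how the layer lists map to layer inputs and outputs.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CLSAMConfig:
    """Hyperparameters copied from Table 3; field names are illustrative."""

    input_shape: Tuple[int, int] = (361, 40)  # listed for both X_s and X_t
    qkv_dim: int = 64                          # Q, K, V vector dimension
    num_heads: int = 2
    encoder_depth: int = 6
    mlp_layers: Tuple[int, ...] = (32, 32)
    domain_vec_dim: int = 16
    p_vec_dim: int = 16
    d1_layers: Tuple[int, ...] = (32, 16, 2)   # domain discriminator D1
    d2_layers: Tuple[int, ...] = (16, 2)       # domain discriminator D2
    e1_layers: Tuple[int, ...] = (16, 2)       # detection module E1
    e2_layers: Tuple[int, ...] = (32, 16, 2)   # detection module E2
    epochs: int = 120
    learning_rate: float = 0.001
    batch_size: int = 36
    optimizer: str = "SGD"
    dropout: float = 0.1

    def steps_per_epoch(self, num_samples: int) -> int:
        """Number of mini-batches needed to cover num_samples once."""
        return math.ceil(num_samples / self.batch_size)


config = CLSAMConfig()
```

Freezing the dataclass keeps one experiment's settings from being changed by accident; a variant run can be created with `dataclasses.replace(config, dropout=0.2)`.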

    Table 4  Performance comparison with traditional machine learning models (%)

    Model       Acc.    Sen.    F1
    CLSAM       86.69   85.98   84.71
    RF(s)       79.86   77.41   78.88
    RF(t)       78.62   77.32   77.26
    RF(s-t)     76.81   75.25   74.75
    RF(t-s)     76.38   75.36   74.46
    RF(st)      79.15   78.35   78.18
    SVM(s)      79.52   77.53   78.35
    SVM(t)      77.34   77.61   78.15
    SVM(s-t)    75.72   74.46   74.26
    SVM(t-s)    75.35   73.45   72.86
    SVM(st)     78.95   76.86   75.68
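The Acc., Sen., and F1 columns reported above follow standard binary-classification definitions, sketched below with PD taken as the positive class. The function name `detection_metrics` and the toy label lists are assumptions for illustration; the numbers are not the paper's data.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and F1 in percent, with label 1 (PD) as positive."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity (recall): share of true PD cases that were detected.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"Acc.": 100 * accuracy, "Sen.": 100 * sensitivity, "F1": 100 * f1}


# Toy example: 5 PD and 5 HC recordings, one miss and one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = detection_metrics(y_true, y_pred)
```

Sensitivity matters most in a screening setting, because a missed PD case (false negative) is usually costlier than a false alarm.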

    Table 5  Performance comparison with transfer learning models (%)

    Model       Acc.    Sen.    F1
    CLSAM       86.69   85.98   84.71
    DAN         80.83   81.86   81.56
    DSAN        83.65   83.82   82.61
    DANN        82.78   82.98   82.81
    CADAN       84.10   83.22   83.56
    TFFN        85.64   84.58   83.89
    DSN         83.60   82.84   83.15

    Table 6  Ablation experiments (%)

    Model                                             Acc.    Sen.    F1
    CLSAM                                             86.69   85.98   84.71
    CLSAM (without dual adversarial training)         82.78   82.98   82.31
    CLSAM (without feature orthogonality constraint)  85.23   83.74   83.15
    CLSAM (with HSIC constraint)                      85.96   84.85   84.17
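The feature orthogonality constraint examined in the ablation can be illustrated with one common formulation, the squared Frobenius norm of the cross-product between the two representation batches, as used in domain separation networks. Whether the paper's L_diff takes exactly this form is an assumption, since its definition is not reproduced on this page; the function name `orthogonality_loss` and the toy vectors are likewise illustrative.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def orthogonality_loss(v_e: Matrix, v_d: Matrix) -> float:
    """Squared Frobenius norm of v_e^T v_d for two batches of row vectors.

    v_e holds one pathological-information vector per sample and v_d one
    domain-information vector per sample (assumed layout).
    """
    if len(v_e) != len(v_d) or len(v_e) == 0:
        raise ValueError("batches must be non-empty and the same size")

    dim_e, dim_d = len(v_e[0]), len(v_d[0])
    total = 0.0
    for i in range(dim_e):
        for j in range(dim_d):
            # Entry (i, j) of v_e^T v_d: inner product over the batch of
            # feature column i of v_e with feature column j of v_d.
            entry = sum(row_e[i] * row_d[j] for row_e, row_d in zip(v_e, v_d))
            total += entry * entry
    return total


# Columns of v_e are orthogonal to columns of v_d over this toy batch.
separated = orthogonality_loss([[1.0, 0.0], [-1.0, 0.0]], [[1.0, 2.0], [1.0, 2.0]])
# A single sample whose two vectors overlap gives a positive penalty.
entangled = orthogonality_loss([[1.0, 2.0]], [[3.0, 4.0]])
```

The loss is zero when the two representations share no linear component across the batch and grows as they overlap, which is what pushes the two vectors to encode different information.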
Publication history
  • Received: 2023-09-07
  • Revised: 2023-12-04
  • Available online: 2023-12-13
  • Published: 2024-02-29
