Parkinson's Disease Detection Method Based on Cross-Language Acoustic Analysis

JI Wei; WANG Chuanyu; WU Di; LI Yun; ZHENG Huifen

doi:10.11999/JEIT230981

Volume 46 Issue 2

Feb. 2024


JI Wei, WANG Chuanyu, WU Di, LI Yun, ZHENG Huifen. Parkinson's Disease Detection Method Based on Cross-Language Acoustic Analysis[J]. Journal of Electronics & Information Technology, 2024, 46(2): 546-554. doi: 10.11999/JEIT230981


cstr: 32379.14.JEIT230981

JI Wei¹,
WANG Chuanyu¹,
WU Di¹,
LI Yun²,
ZHENG Huifen³

1. School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
3. Affiliated Geriatric Hospital of Nanjing Medical University, Nanjing 210024, China

Funds: The Basic Scientific (Natural Science) Major Program of the Higher Education Institutions of Jiangsu Province, China (21KJA520003)

Received Date: 2023-09-07
Revised Date: 2023-12-04

Available Online: 2023-12-13

Publish Date: 2024-02-29

Abstract


Speech-based Parkinson’s disease detection is non-invasive and low-cost. Most publicly available Parkinson’s disease speech datasets are recorded in a single language, so they are limited in data volume and capture little variation in the pronunciation characteristics of the subjects’ native languages. A Parkinson’s disease detection model trained on a single-language dataset therefore suffers performance degradation when applied to cross-language speech data. To reduce the impact of language differences and improve detection performance in cross-language scenarios, this paper introduces the ideas of adversarial transfer learning and feature decoupling and proposes a Parkinson’s disease Cross-Language Speech Analysis Model (CLSAM). First, the model cascades a multi-head self-attention encoder and a multi-layer neural network to form a feature extractor module, which decouples the original Fbank speech features, extracted from source-domain and target-domain speech, into two vectors: a domain-invariant pathological information representation vector and a domain information representation vector. Second, a dual adversarial training module with inconsistent target tasks is designed to explicitly separate domain-invariant pathological information from domain information. Finally, the domain-invariant pathological information extracted from cross-language speech data is used for Parkinson’s disease detection. The effectiveness of the proposed method is verified by ten-fold cross-validation on the publicly available MaxLittle Parkinson’s disease speech dataset and on a self-collected Parkinson’s disease speech dataset. Experimental results show that, compared with traditional machine learning methods and existing transfer learning algorithms, the proposed model significantly improves accuracy, sensitivity, and F1 score in cross-language scenarios.
- Cross-language speech analysis,
- Parkinson’s disease,
- Adversarial transfer learning,
- Feature decoupling
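
The evaluation protocol named in the abstract (ten-fold cross-validation scored by accuracy, sensitivity, and F1) can be illustrated with a minimal sketch. This is not the authors' code: the function names `ten_fold_indices`, `detection_metrics`, and `cross_validate`, the `fit_predict` callback, and the convention that label 1 marks a Parkinson's disease sample are all assumptions made for illustration. The abstract does not say how source-language and target-language recordings are assigned to folds, so the sketch uses a plain shuffled split with no grouping by subject or language.

```python
import random


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every n_folds-th shuffled index goes to the same fold.
    return [indices[i::n_folds] for i in range(n_folds)]


def detection_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and F1, assuming label 1 = PD and 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity is recall on the PD class; guard against empty denominators.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


def cross_validate(features, labels, fit_predict, n_folds=10, seed=0):
    """Average the three metrics over the folds.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains any classifier and returns predicted labels for test_x.
    """
    totals = {"accuracy": 0.0, "sensitivity": 0.0, "f1": 0.0}
    for test_idx in ten_fold_indices(len(labels), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predictions = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        scores = detection_metrics([labels[i] for i in test_idx], predictions)
        for name in totals:
            totals[name] += scores[name]
    return {name: total / n_folds for name, total in totals.items()}
```

Any classifier can be plugged in through `fit_predict`. With small datasets a fold may contain no positive samples, in which case this sketch reports a sensitivity and F1 of 0.0 for that fold; a stratified or subject-level split would likely be preferable in practice.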

