Research on ECG Pathological Signal Classification Empowered by Diffusion Generative Data
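Summary: The electrocardiogram (ECG) is an important indicator of a person's health. Because ECG images have a complex composition and many features, recognition by eye is prone to error, so this paper proposes an ECG pathological signal classification algorithm based on data generation. First, a diffusion generative network (DGN) adds noise to real ECG signals, gradually transforming them into a distribution close to pure noise, which makes them easier for the model to process. To increase generation speed and reduce memory usage, the paper further proposes a knowledge-distillation-based distillation-diffusion generative network (KD-DGN), which outperforms the traditional DGN in memory usage and generation efficiency. The paper also discusses the memory usage and generation efficiency of KD-DGN and the accuracy of the ECG data, and examines the characteristics of the data generated after lightweight processing. Finally, by comparing the original MIT-BIH dataset with the extended dataset (MIT-BIH-PLUS) in classification models, the experimental results show that convolutional networks obtain more feature information from the DGN-generated extended dataset, which improves the recognition of ECG pathological signals.
Abstract: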
Objective: Electrocardiogram (ECG) signals are key indicators of human health. However, their complex composition and diverse features make visual recognition prone to errors. This study proposes a classification algorithm for ECG pathological signals based on data generation. A Diffusion Generative Network (DGN), also known as a diffusion model, progressively adds noise to real ECG signals until they approach a pure-noise distribution, thereby facilitating model processing. To improve generation speed and reduce memory usage, a Knowledge Distillation–Diffusion Generative Network (KD-DGN) is proposed, which uses less memory and generates data faster than the traditional DGN. This work compares the memory usage, generation efficiency, and classification accuracy of DGN and KD-DGN, and analyzes the characteristics of the data generated after lightweight processing. In addition, the classification performance obtained with the original MIT-BIH dataset and with an extended dataset (MIT-BIH-PLUS) is evaluated. Experimental results show that convolutional networks extract richer feature information from the extended dataset generated by DGN, leading to improved recognition of ECG pathological signals.

Methods: The generative network-based ECG signal generation algorithm is designed to enhance the performance of convolutional networks in ECG signal classification. The process begins with a Gaussian noise-based image perturbation algorithm, which obscures the original ECG data by introducing controlled randomness. This step simulates real-world variability, enabling the model to learn more robust representations. A diffusion generative algorithm is then applied to reconstruct the data, generating synthetic ECG signals that preserve the essential characteristics of the original categories despite the added noise. Because this reconstruction retains the underlying features of the ECG signals, the convolutional network can extract more informative features during classification. To improve efficiency, the approach incorporates knowledge distillation. A teacher–student framework is adopted in which a lightweight student model is trained from the original, more complex teacher model for ECG data generation. This strategy reduces computational requirements and accelerates data generation, making the method better suited to practical applications. Finally, two comparative experiments are designed to validate the effectiveness and accuracy of the proposed method. These experiments evaluate classification performance against existing approaches and provide quantitative evidence of its advantages in ECG signal processing.

Results and Discussions: The data generation algorithm yields ECG signals with a Signal-to-Noise Ratio (SNR) comparable to that of the original data, while presenting more discernible signal features. The student model constructed through knowledge distillation produces ECG samples with the same SNR as those generated by the teacher model, but with substantially reduced complexity. Specifically, the student model achieves a 50% reduction in size, 37% lower memory usage, and a 57% shorter runtime compared with the teacher model (Fig. 6). When the convolutional network is trained with data generated by the KD-DGN, its classification performance improves across all metrics compared with a convolutional network trained without KD-DGN. Precision reaches 97.4%, and the misidentification rate is reduced to approximately 3% (Fig. 9).

Conclusions: The DGN provides an effective data generation strategy for addressing the scarcity of ECG datasets. By supplying additional synthetic data, it enables convolutional networks to extract more diverse class-specific features, thereby improving recognition performance and reducing misidentification rates. Optimizing DGN with knowledge distillation further enhances efficiency while maintaining SNR equivalence with the original DGN. This optimization reduces computational cost, conserves machine resources, and supports simultaneous task execution. Moreover, it enables the generation of new data without loss, allowing convolutional networks to learn from larger datasets at lower cost. Overall, the proposed approach markedly improves the classification performance of convolutional networks on ECG signals. Future work will focus on further algorithmic optimization for real-world applications.
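
The forward noising step that the abstract attributes to the DGN (adding noise to a real ECG signal until it approaches a noise distribution) can be illustrated with a small sketch. It assumes the standard closed-form forward process of a denoising diffusion model with a linear variance schedule; the abstract does not state the schedule, the number of steps, or the signal length, so `linear_beta_schedule`, the 1000 steps, the 256 samples, and the synthetic `toy_beat` waveform are all hypothetical choices for illustration and not the authors' settings.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_t (assumed linear schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1)."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


def toy_beat(num_samples):
    """Hypothetical stand-in for one ECG beat: a narrow R-like peak plus a broad T-like bump."""
    beat = []
    for i in range(num_samples):
        u = i / (num_samples - 1)
        r_peak = math.exp(-((u - 0.4) ** 2) / (2 * 0.01 ** 2))
        t_wave = 0.3 * math.exp(-((u - 0.7) ** 2) / (2 * 0.05 ** 2))
        beat.append(r_peak + t_wave)
    return beat


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
beat = toy_beat(256)
early = forward_diffuse(beat, 10, alpha_bars, rng)    # lightly perturbed signal
late = forward_diffuse(beat, 999, alpha_bars, rng)    # close to pure Gaussian noise

print("alpha_bar at t=10: ", alpha_bars[10])
print("alpha_bar at t=999:", alpha_bars[999])
print("MSE to clean beat, early vs late:",
      mean_squared_error(beat, early), mean_squared_error(beat, late))
```

As `t` grows, `alpha_bar_t` shrinks toward zero, so the sample keeps less of the original beat and more of the Gaussian noise. A generative model of this kind is trained to reverse that corruption, and under the teacher–student setup described in the abstract a smaller student would be trained to reproduce the teacher's outputs.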