A Study on Lightweight Method of TCM Structured Large Model Based on Memory-Constrained Pruning

LU Jiafa, TANG Kai, ZHANG Guoming, YU Xiaofan, GU Wenqi, LI Zhuo

Citation: LU Jiafa, TANG Kai, ZHANG Guoming, YU Xiaofan, GU Wenqi, LI Zhuo. A Study on Lightweight Method of TCM Structured Large Model Based on Memory-Constrained Pruning[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250909


doi: 10.11999/JEIT250909 cstr: 32379.14.JEIT250909
Funds: National Science and Technology Major Project - Implementation Program for the Promotion of Chronic Disease Prevention and Control Technologies (SQ2024AAA030211)
Details
    Author biography:

    LU Jiafa: Male, Ph.D. candidate. His research interests include medical natural language processing and lightweight deep learning

    Corresponding author:

    YU Xiaofan, 20230053@njupt.edu.cn

  • CLC number: XXXXX

  • Abstract: With the rapid development of artificial intelligence, Large Language Models (LLMs) are being widely piloted in Traditional Chinese Medicine (TCM) healthcare. However, deploying large models in primary-level TCM hospitals faces two pain points: limited GPU resources and low utilization of unstructured TCM medical records. This paper therefore proposes a lightweight model for the intelligent structuring of TCM medical records. The proposed model not only applies knowledge distillation to obtain a lightweight text encoder but, more importantly, introduces a multimodal fusion module into the conventional text encoder to achieve lightweight representation of tongue-diagnosis images. Specifically, a memory-constrained lightweight method for multimodal representations is proposed: a Long Short-Term Memory (LSTM) network serves as the pruning decision-maker, analyzing long-range dependencies in historical information to learn and quantify the importance of feature connections in the multimodal representation. On this basis, a reinforcement learning method is introduced to back-update the parameters of the tongue-image feature extraction model, further improving the accuracy of the pruning decisions. Experiments use 10,500 de-identified TCM electronic medical records and tongue-image-associated texts from 21 tertiary Grade-A hospitals across multiple centers for training and validation. The proposed model achieves an F1-score of 91.7% with 3.8 GB of GPU memory and an inference speed of 22 rec/s, a 27.2% efficiency gain over BERT-Large with a 75% reduction in GPU memory. Ablation experiments show that dynamic batch cropping contributes a 75% memory saving against the BERT-Large baseline (62% in the model's own ablation comparison), and the TCM term-enhanced vocabulary improves the F1-score on rare entities by 6.2%.
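    The pruning pipeline described in the abstract (an LSTM decision-maker that scores feature connections from a history of channel statistics, followed by thresholding into a keep-mask) can be illustrated in PyTorch. This is a minimal sketch under stated assumptions: the module names, sizes, and the choice of per-channel L1 norms as the "historical information" are illustrative, not the authors' released implementation.

    import torch
    import torch.nn as nn

    class PruningDecisionLSTM(nn.Module):
        """Illustrative pruning decision-maker: an LSTM reads a history of
        per-channel statistics and emits one importance score per channel.
        (Hypothetical re-creation of the idea in the abstract.)"""

        def __init__(self, num_channels: int, hidden_size: int = 128):
            super().__init__()
            # The LSTM captures long-range dependencies across training steps.
            self.lstm = nn.LSTM(input_size=num_channels,
                                hidden_size=hidden_size, batch_first=True)
            # Map the final LSTM state to one score per feature connection.
            self.scorer = nn.Linear(hidden_size, num_channels)

        def forward(self, stats_history: torch.Tensor) -> torch.Tensor:
            # stats_history: (1, T, num_channels), e.g. per-channel L1 norms
            # collected over the last T steps (an assumed choice of statistic).
            _, (h_n, _) = self.lstm(stats_history)
            return torch.sigmoid(self.scorer(h_n[-1])).squeeze(0)  # (num_channels,)

    def prune_mask(scores: torch.Tensor, ratio: float) -> torch.Tensor:
        """Binary keep-mask that prunes the `ratio` lowest-scoring channels."""
        k = int(scores.numel() * (1.0 - ratio))
        mask = torch.zeros_like(scores, dtype=torch.bool)
        mask[torch.topk(scores, k).indices] = True
        return mask

    The reinforcement-learning step mentioned in the abstract would then treat the resulting accuracy/memory trade-off as a reward and back-update the tongue-image feature extractor; that outer loop is omitted here.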
  • Figure 1  Model architecture

    Figure 2  Framework of the memory-constrained lightweight multimodal fusion algorithm

    Figure 3  Comparison of visualization heatmaps. Panels (a)-(d) show the heatmaps of the original model, after pruning conv1, after pruning conv3, and after pruning all layers, respectively

    Table 1  Performance comparison of different models

    Model | F1-score (%) | GPU memory (GB) | Inference speed (rec/s)
    BERT-Large | 92.1 | 15.2 | 8
    Proposed model | 91.7 | 3.8 | 22
    CNN-BiLSTM | 86.3 | 2.1 | 35
    Proposed model (-AG) | 88.9 | 3.7 | 23
    Proposed model (-DBP) | 91.5 | 6.2 | 20
    T + EfficientNet-B0 [25] | 89.3 | 4.1 | 28
    MobileBERT + MobileNet-V2 [26] | 88.7 | 3.5 | 32
    Note: -AG denotes removal of the adaptive attention gating; -DBP denotes removal of the dynamic batch cropping. The 62% memory saving rate of dynamic batch cropping is relative to the same model with the strategy disabled (6.2 GB).
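    The paper does not spell out how Dynamic Batch Cropping (DBP) works internally; one common reading, sketched below purely as an assumption, is per-batch dynamic padding that crops every padded batch to the longest real sequence it contains, so memory scales with actual record length rather than the global maximum.

    import torch

    def dynamic_batch_crop(input_ids: torch.Tensor, attention_mask: torch.Tensor):
        # Hypothetical DBP realization: find the longest non-padding sequence
        # in this batch and crop both tensors to that length.
        max_len = int(attention_mask.sum(dim=1).max().item())
        return input_ids[:, :max_len], attention_mask[:, :max_len]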

    Table 2  F1-score by entity category (test set)

    Entity category | Specific entity | F1-score (%) | Data notes
    Anorectal | Current hematochezia color | 94.2 | Enumerated entity (pale red / bright red / dark red), unambiguous labels
    Anorectal | Whether the anal mass prolapses after defecation | 93.8 | Binary enumeration (yes/no), clear keyword rules
    Orthopedics | VAS score of the affected site | 95.6 | Numeric entity (integer), fixed format
    Orthopedics | Wrist circumference of the unaffected side (cm) | 95.1 | Decimal value, explicit unit
    Respiratory medicine | Sputum color | 92.5 | Enumerated (yellow/white/green, etc.), fuzzy descriptions occur (e.g. "yellow-white mixed")
    Respiratory medicine | Presence of fever | 93.3 | Binary enumeration (present/absent), explicit time range
    Gastroenterology | Current diarrhea frequency | 91.8 | Range enumeration (<4 times per day, etc.), fuzzy expressions must be handled (e.g. "5-8 times")
    Gastroenterology | Current presence of mucopurulent bloody stool | 92.1 | Ternary enumeration (present/absent/not mentioned), direct wording
    Syndrome-related | Tongue manifestation (e.g. "pale red tongue") | 88.7 | Descriptive text with ambiguity (e.g. "pale red, slightly dark")
    Syndrome-related | Pulse manifestation (e.g. "thin, choppy pulse") | 87.5 | Nested terminology (pulse + pathogenesis), abstract wording
    Note: Data come from the 2,100-record test set (drawn from the full dataset of 10,500 records); entity annotation follows the Terminology of TCM Diagnostics [11].

    Table 3  Impact of each module on performance

    Module | F1-score change | GPU memory change | Long-text processing stability
    Adaptive attention gating | +2.8% | ±0% | No effect
    Dynamic batch cropping | ±0.2% | -62% | Improved to 99.6%
    TCM term-enhanced vocabulary | +6.2% | ±0% | No effect
    Note: the memory saving rate is computed as $ \Delta_{\mathrm{mem}} = \dfrac{M_{\mathrm{base}} - M_{\mathrm{opt}}}{M_{\mathrm{base}}} \times 100\% $,
    where $ M_{\mathrm{base}} $ is the GPU memory without the optimization strategy (e.g. 6.2 GB without DBP) and $ M_{\mathrm{opt}} $ is the memory with it enabled (3.8 GB).
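    As a worked check of this formula against the BERT-Large baseline in Table 1 (15.2 GB reduced to 3.8 GB), reproducing the 75% figure quoted in the abstract:

    $ \Delta_{\mathrm{mem}} = \dfrac{15.2 - 3.8}{15.2} \times 100\% = 75\% $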

    Table 4  Comparison of pruning strategies

    Pruning strategy | F1-score (%) | GPU memory (GB) | Inference speed (rec/s)
    CNN-BiLSTM + no pruning | 91.7 | 6.2 | 18
    CNN-BiLSTM + L1-norm pruning | 91.2 | 4.8 | 20
    CNN-BiLSTM + memory-constrained pruning | 91.5 | 3.8 | 22
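    For contrast with the memory-constrained scorer above, the L1-norm baseline in Table 4 ranks each convolution filter by the L1 norm of its weights, following the criterion of Li et al. [14]. A minimal sketch (tensor shapes are illustrative):

    import torch

    def l1_filter_importance(conv_weight: torch.Tensor) -> torch.Tensor:
        # conv_weight: (out_channels, in_channels, kH, kW); one score per filter.
        return conv_weight.abs().sum(dim=(1, 2, 3))

    # Example: keep the 50% of filters with the largest L1 norm.
    scores = l1_filter_importance(torch.randn(64, 32, 3, 3))
    keep_idx = torch.topk(scores, k=32).indices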

    Table 5  Model performance under different pruning ratios

    Pruning ratio | F1-score (%) | GPU memory (GB) | Inference speed (rec/s)
    30% | 91.2 | 4.5 | 19
    40% | 90.9 | 4.0 | 21
    50% | 90.1 | 3.5 | 24
    60% | 89.5 | 3.1 | 27
    70% | 87.1 | 2.7 | 30

    Table 6  Model performance under different LSTM hidden sizes

    LSTM hidden size | F1-score (%) | Parameters (M)
    64 | 89.2 | 58.2
    128 | 89.5 | 59.1
    192 | 89.5 | 60.3
    256 | 89.4 | 61.8

    Table 7  Model performance under different numbers of cross-modal attention heads

    Attention heads | F1-score (%) | Syndrome-entity F1-score (%)
    4 | 89.1 | 86.5
    8 | 89.5 | 87.4
    12 | 89.6 | 87.5
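    Table 7 varies the number of heads in the cross-modal attention that fuses text and tongue-image features. The following hypothetical fusion block (dimensions and layout are assumptions, not the authors' code) shows where the head count enters:

    import torch
    import torch.nn as nn

    class CrossModalFusion(nn.Module):
        def __init__(self, dim: int = 256, num_heads: int = 8):
            super().__init__()
            # num_heads is the hyperparameter swept in Table 7.
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, text_feats: torch.Tensor, image_feats: torch.Tensor):
            # Text tokens (B, L_t, dim) attend to image patches (B, L_i, dim).
            fused, _ = self.attn(text_feats, image_feats, image_feats)
            return self.norm(text_feats + fused)  # residual connection + norm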
  • [1] National Health Commission, National Development and Reform Commission, Ministry of Education, et al. Report on the performance monitoring and analysis of national tertiary public TCM hospitals in 2023[EB/OL]. http://www.natcm.gov.cn/yizhengsi/zhengcewenjian/2025-03-28/36079.html, 2025. (Editor's note: no English translation of this reference was found online; please confirm and supplement.)
    [2] ZHANG Min, LI Jun, WANG Fang, et al. Entity recognition in electronic medical records based on CNN-BiLSTM[J]. Journal of Computer Applications, 2023, 43(5): 1567–1573. (Editor's note: this reference could not be verified online; please confirm.)
    [3] WANG J, LI Y, and ZHANG Q. Lightweight BERT for TCM entity recognition in primary hospitals[C]. International Conference on Biomedical and Health Informatics (BHI), 2023: 1–5. (Editor's note: this reference could not be verified online; please confirm.)
    [4] WANG Jianguo, LIU Min, and ZHANG Qiang. Progress in structured methods of TCM electronic medical records[J]. Journal of Beijing University of Traditional Chinese Medicine, 2022, 45(8): 789–795. (Editor's note: this reference could not be verified online; please confirm.)
    [5] DEVLIN J, CHANG Mingwei, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, USA, 2019: 4171–4186. doi: 10.18653/v1/N19-1423.
    [6] LI Hua, WANG Lei, ZHANG Ming, et al. Classification of abnormal morphology in TCM inspection diagnosis based on lightweight and polarized self-attention[J]. Digital Chinese Medicine, 2024, 3(4): 25–33. (Editor's note: this reference could not be verified online; please confirm.)
    [7] JIAO Xiaoqi, YIN Yichun, SHANG Lifeng, et al. TinyBERT: Distilling BERT for natural language understanding[C]. Findings of the Association for Computational Linguistics: EMNLP, Hong Kong, China, 2020: 4163–4174. doi: 10.18653/v1/2020.findings-emnlp.372. (Editor's note: the publication location of this reference could not be verified online; please confirm.)
    [8] LI Ming, ZHANG Hua, and WANG Hong. TCM entity recognition based on attention mechanism[J]. Journal of Electronics & Information Technology, 2023, 45(2): 312–318. (Editor's note: this reference could not be verified online; please confirm.)
    [9] RONNEBERGER O, FISCHER P, and BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]. 18th International Conference on Medical Image Computing and Computer-Assisted Intervention - MICCAI, Munich, Germany, 2015: 234–241. doi: 10.1007/978-3-319-24574-4_28.
    [10] China Information Association of Traditional Chinese Medicine. T/CIATCM 013-2019 Basic datasets for electronic medical records of traditional Chinese medicine[S]. 2019. (Editor's note: publication details of this standard could not be found online; please confirm and supplement.)
    [11] State Administration of Traditional Chinese Medicine. Terminology of TCM diagnostics[S]. Beijing: State Administration of Traditional Chinese Medicine, 2021. (Editor's note: this reference could not be verified online; please confirm.)
    [12] State Administration of Traditional Chinese Medicine. Specifications for prescriptions of Chinese medicine[S]. Beijing: State Administration of Traditional Chinese Medicine, 2010. (Editor's note: this reference could not be verified online; please confirm.)
    [13] FAN Xiaohui, ZHANG Junhua, et al. TCMChat: A LoRA-based generative large language model for traditional Chinese medicine[J]. Pharmacological Research, 2025, 112: 105986. (Editor's note: this reference could not be verified online; please confirm.)
    [14] LI Hao, KADAV A, DURDANOVIC I, et al. Pruning filters for efficient ConvNets[C]. 5th International Conference on Learning Representations, Toulon, France, 2017.
    [15] LIU Zhuang, SUN Mingjie, ZHOU Tinghui, et al. Rethinking the value of network pruning[C]. 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [16] KUNDU S, NAZEMI M, BEEREL P A, et al. DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs[C]. Proceedings of the 26th Asia and South Pacific Design Automation Conference, Tokyo, Japan, 2021: 344–350. doi: 10.1145/3394885.3431542.
    [17] JACOB B, KLIGYS S, CHEN Bo, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 2704–2713. doi: 10.1109/CVPR.2018.00286.
    [18] HAN Song, MAO Huizi, and DALLY W J. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding[J]. arXiv preprint arXiv: 1510.00149, 2015. doi: 10.48550/arXiv.1510.00149. (Editor's note: please check whether the reference type and format are correct.)
    [19] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
    [20] ZHAO Xiongjun, WANG Xiang, YU Fenglei, et al. UniMed: Multimodal multitask learning for medical predictions[C]. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, USA, 2022: 1399–1404. doi: 10.1109/BIBM55620.2022.9995044.
    [21] MA Xinyin, FANG Gongfan, and WANG Xinchao. LLM-pruner: On the structural pruning of large language models[C]. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, USA, 2023: 950.
    [22] POUDEL P, CHHETRI A, GYAWALI P, et al. Multimodal federated learning with missing modalities through feature imputation network[C]. 29th Annual Conference on Medical Image Understanding and Analysis, Leeds, UK, 2026: 289–299. doi: 10.1007/978-3-031-98688-8_20.
    [23] BACK J, AHN N, and KIM J. Magnitude attention-based dynamic pruning[J]. Expert Systems with Applications, 2025, 276: 126957. doi: 10.1016/j.eswa.2025.126957.
    [24] LIU Jiaxin, LIU Wei, LI Yongming, et al. Attention-based adaptive structured continuous sparse network pruning[J]. Neurocomputing, 2024, 590: 127698. doi: 10.1016/j.neucom.2024.127698.
    [25] TAN Mingxing and LE Q V. EfficientNet: Rethinking model scaling for convolutional neural networks[C]. 36th International Conference on Machine Learning, Long Beach, USA, 2019: 6105–6114.
    [26] WU Qinzhuo, XU Weikai, LIU Wei, et al. MobileVLM: A vision-language model for better intra- and inter-UI understanding[C]. Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, USA, 2024: 10231–10251. doi: 10.18653/v1/2024.findings-emnlp.599.
    [27] FU Zheren, ZHANG Lei, XIA Hou, et al. Linguistic-aware patch slimming framework for fine-grained cross-modal alignment[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 26297–26306. doi: 10.1109/CVPR52733.2024.02485.
    [28] PAN Zhengxin, WU Fangyu, and ZHANG Bailing. Fine-grained image-text matching by cross-modal hard aligning network[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 19275–19284. doi: 10.1109/CVPR52729.2023.01847.
Publication history
  • Revised: 2026-01-22
  • Accepted: 2026-01-22
  • Available online: 2026-02-11
