Hierarchical Fusion Multi-Instance Learning for Weakly Supervised Pathological Image Classification

CHEN Xiaohe, ZHANG Jiaang, LI Lingzhi, LI Guixiu, OU Zirong, BAO Yuehua, LIU Xinxin, YU Qiuchen, MA Yuhan, ZHAO Keyu, BAI Hua

Citation: CHEN Xiaohe, ZHANG Jiaang, LI Lingzhi, LI Guixiu, OU Zirong, BAO Yuehua, LIU Xinxin, YU Qiuchen, MA Yuhan, ZHAO Keyu, BAI Hua. Hierarchical Fusion Multi-Instance Learning for Weakly Supervised Pathological Image Classification[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250726

doi: 10.11999/JEIT250726 cstr: 32379.14.JEIT250726
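Funds: The Ningbo Major Research and Development Plan Project (2024Z228), The Health Major Science and Technology Planning Project of Zhejiang Province, China (WKJ-ZJ-2411), The Public Welfare Projects of Ningbo, China (2022S065), The Project of Ningbo Leading Medical & Health Discipline (2022-F23)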
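Details
    Author biographies:

    CHEN Xiaohe: male, professor; research interests: artificial intelligence, signal processing, biomedical informatics

    ZHANG Jiaang: male, master's student; research interest: biomedical informatics

    LI Lingzhi: female, attending physician; research interests: skin tumors, psoriasis

    LI Guixiu: female, resident physician; research interest: skin tumors

    OU Zirong: female, master's student; research interest: biomedical informatics

    BAO Yuehua: female, master's student; research interest: biomedical informatics

    LIU Xinxin: female, master's student; research interest: biomedical informatics

    YU Qiuchen: male, PhD student; research interest: artificial intelligence

    MA Yuhan: female, master's student; research interest: skin tumors

    ZHAO Keyu: male, resident physician; research interests: atopic dermatitis, malignant melanoma

    BAI Hua: female, senior engineer; research interest: intelligent medical equipment

    Corresponding author:

    LI Lingzhi lilingzhi912@163.com

  • CLC number: TN911.73; TP391.41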

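  • Abstract: Pathological image classification is essential for cancer diagnosis, but existing methods rely on random sampling and make insufficient use of multi-level information. To address this, this paper proposes a hierarchical fusion multi-instance learning method. First, the pathological image is tiled at different levels and features are extracted with ResNet-50. Second, to address inaccurate pseudo-bag labels and background noise, a pseudo-bag partition method based on an attention evaluation function is proposed: gated attention scores the importance of the low-resolution features, the features are divided into low-resolution pseudo-bags according to these scores, and the corresponding high-resolution pseudo-bags are obtained through index mapping. Finally, to address the insufficient use of multi-level information, a two-stage classification model is designed. The first stage performs a preliminary classification of the low-resolution pseudo-bags and, based on prediction confidence, selects highly discriminative key regions together with their corresponding high-resolution features. The second stage uses a cross-attention mechanism to fuse the selected low-resolution features with their corresponding high-resolution features, then concatenates the result with the high-resolution pseudo-bag features aggregated by gated attention, so that local detail and global information are analyzed together. During training, a dual-branch cross-entropy loss jointly optimizes the low-resolution preliminary classification task and the high-resolution final classification task. The model is evaluated on two public datasets, Camelyon16 and TCGA-LUNG, and one private skin cancer dataset, NBU-Skin. The results show that the method outperforms algorithms such as CLAM and TransMIL on both the multi-center public datasets and the private dataset; on NBU-Skin, five-fold cross-validation yields a mean accuracy of 90.5% and a mean AUC of 0.976. The method also performs stably across diseases and centers, offering a new approach to AI-assisted diagnosis of cancer pathology.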
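The pseudo-bag partition step in the abstract can be illustrated with a minimal pure-Python sketch. It assumes the standard gated-attention score, w · (tanh(Vx) ⊙ sigmoid(Ux)), and a round-robin deal of the score-ranked instances into pseudo-bags; the abstract does not state the exact partition rule, so that rule, the function names, and the parameter names `V`, `U`, `w`, `n_bags` are all hypothetical. The index mapping assumes each low-resolution patch owns 16 consecutive high-resolution features, which follows the append order of Algorithm 1.

```python
import math


def gated_attention_scores(feats, V, U, w):
    """Score each instance as w . (tanh(V x) * sigmoid(U x))."""
    def matvec(M, x):
        return [sum(m * v for m, v in zip(row, x)) for row in M]

    scores = []
    for x in feats:
        t = [math.tanh(a) for a in matvec(V, x)]
        g = [1.0 / (1.0 + math.exp(-b)) for b in matvec(U, x)]
        scores.append(sum(wi * ti * gi for wi, ti, gi in zip(w, t, g)))
    return scores


def partition_pseudo_bags(scores, n_bags):
    """ASSUMED rule: rank instances by score, then deal them round-robin,
    so every pseudo-bag receives a mix of high- and low-scoring instances."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return [order[b::n_bags] for b in range(n_bags)]


def map_to_high_res(low_bag, per_patch=16):
    """Map 0-based low-res indices to their high-res feature indices:
    patch k owns indices per_patch*k .. per_patch*k + per_patch - 1."""
    return [per_patch * k + j for k in low_bag for j in range(per_patch)]
```

Under these assumptions, `partition_pseudo_bags(scores, n_bags)` returns index lists that can be passed one by one to `map_to_high_res` to build the matching high-resolution pseudo-bags.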
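  • Figure 1  Example of pathological image storage

    Figure 2  Three multi-instance learning pipelines

    Figure 3  Framework of the multi-level-model-based weakly supervised classification algorithm for histopathological images

    Figure 4  Pseudo-bag partition method based on the attention evaluation function

    Figure 5  Comparison of confusion matrices

    Figure 6  Model prediction heatmaps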
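The second-stage fusion in the framework (cross-attention between a selected low-resolution feature and its high-resolution features, followed by concatenation with an aggregated high-resolution pseudo-bag feature) might look roughly like the sketch below. This is a simplified illustration, not the authors' layer: it uses single-head scaled dot-product attention with no learned query, key, or value projections, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.
    Returns the attended vector and the attention weights."""
    d = len(query)
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(logits)
    fused = [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
    return fused, weights


def fuse_and_concat(low_feat, high_feats, bag_feat):
    """Fuse one low-res feature (query) with its high-res features
    (keys and values), then concatenate the result with an aggregated
    high-res pseudo-bag feature (assumed to be computed elsewhere)."""
    fused, _ = cross_attention(low_feat, high_feats, high_feats)
    return fused + bag_feat
```

In a trained model the query, keys, and values would normally pass through learned linear projections first; they are omitted here to keep the data flow visible.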

    Algorithm 1  Multi-level corresponding feature extraction method

     Input: N pathological images $ \{W_{i}\}_{i=1}^{N} $, where $ W_{i} $ denotes the i-th pathological image
     and each pathological image has m patches at low resolution;
     pretrained network ResNet50
     Output: low-resolution feature set $ \{\boldsymbol{X}_{i,\text{low}}\}_{i=1}^{N} $, where
     $ \boldsymbol{X}_{i,\text{low}}=\{x_{i,1},x_{i,2},\cdots ,x_{i,m}\} $;
     high-resolution feature set $ \{\boldsymbol{X}_{i,\text{high}}\}_{i=1}^{N} $, where
     $ \boldsymbol{X}_{i,\text{high}}=\{h_{i,1},h_{i,2},\cdots ,h_{i,16m}\} $
     (1) for i←1 to N do
     (2)  Extract the tissue contour with the maximum between-class variance (Otsu) method and discard blank regions
     (3)  Tile $ W_{i} $ into 256×256 patches, obtaining the low-resolution coordinate
          $ \text{low\_coords}_{k}=(x_{k},y_{k}) $ of the k-th patch
     (4)  for k←1 to m do
     (5)   Extract patch $ P_{k,\text{low}} $ from $ W_{i} $ at coordinate $ \text{low\_coords}_{k}=(x_{k},y_{k}) $
     (6)   $ \boldsymbol{f}_{\text{low}}\leftarrow \mathrm{ResNet50}(P_{k,\text{low}}) $
     (7)   $ \boldsymbol{X}_{i,\text{low}}.\mathrm{append}(\boldsymbol{f}_{\text{low}}) $
     (8)   Compute the high-resolution coordinates of the k-th patch as
           $ \text{high\_coords}_{k}=\{(x_{k}\times 4+i\times 256,\ y_{k}\times 4+j\times 256)\mid i,j\in \{0,1,2,3\}\} $,
           each element of which is a coordinate $ (x_{\text{high}},y_{\text{high}}) $
     (9)   foreach $ (x_{\text{high}},y_{\text{high}})\in \text{high\_coords}_{k} $ do
     (10)    Extract patch $ P_{k,\text{high}} $ from $ W_{i} $ at coordinate $ (x_{\text{high}},y_{\text{high}}) $
     (11)    $ \boldsymbol{f}_{\text{high}}\leftarrow \mathrm{ResNet50}(P_{k,\text{high}}) $
     (12)    $ \boldsymbol{X}_{i,\text{high}}.\mathrm{append}(\boldsymbol{f}_{\text{high}}) $
     (13)   end foreach
     (14)  end for
     (15) end for
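The coordinate mapping in step (8) and the nested extraction loop can be sketched in Python as follows. The slide reader and the encoder are passed in as callables (`read_low`, `read_high`, `encoder`); these are hypothetical placeholders for a whole-slide image reader and a pretrained ResNet-50, and tissue segmentation (step 2) is assumed to have already produced `low_coords`.

```python
def high_res_coords(x_k, y_k, scale=4, patch=256):
    """Step (8): one low-res patch at (x_k, y_k) maps to a scale x scale
    grid of high-res patches, i.e. 16 coordinates for scale=4."""
    return [(x_k * scale + i * patch, y_k * scale + j * patch)
            for i in range(scale) for j in range(scale)]


def extract_features(low_coords, read_low, read_high, encoder):
    """Steps (4)-(14) for one slide: encode each low-res patch, then
    its 16 high-res patches, appending in that fixed order."""
    x_low, x_high = [], []
    for (x_k, y_k) in low_coords:
        x_low.append(encoder(read_low(x_k, y_k)))
        for (x_h, y_h) in high_res_coords(x_k, y_k):
            x_high.append(encoder(read_high(x_h, y_h)))
    return x_low, x_high
```

Because the high-resolution features are appended in a fixed order, the feature of low-resolution patch k (0-based) always corresponds to high-resolution features 16k through 16k+15, which is what makes a later index mapping between the two levels possible.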

    Table 1  Data distribution

    Dataset     Total  Class   Count  Training  Validation  Test
    NBU-Skin    326    BCC     166    228       33          65
                       MM      106
                       SCC     54
    TCGA-LUNG   1053   LUSC    541    738       105         210
                       LUAD    512
    Camelyon16  399    NORMAL  239    243       27          129
                       TUMOR   160

    Table 2  Comparison of model results

    Model                   Inference time  GPU memory  Camelyon16       NBU-Skin                  TCGA-LUNG
                                                        Accuracy  AUC    Accuracy     AUC          Accuracy     AUC
    Mean-Pooling [23]       0.229 ms        2148 MB     0.675     0.761  0.755±0.050  0.863±0.114  0.813±0.094  0.881±0.099
    Max-Pooling [23]        0.232 ms        2148 MB     0.587     0.599  0.797±0.018  0.881±0.044  0.801±0.031  0.860±0.034
    ABMIL [13]              0.468 ms        2276 MB     0.862     0.876  0.838±0.063  0.927±0.089  0.844±0.023  0.919±0.026
    Dsmil [17]              0.864 ms        2284 MB     0.836     0.862  0.727±0.027  0.809±0.107  0.783±0.041  0.856±0.040
    Dsmil + APBD            0.913 ms        2463 MB     0.850     0.919  0.847±0.039  0.925±0.044  0.844±0.011  0.918±0.016
    Dsmil + RankMix [18]    1.176 ms        2949 MB     0.855     0.897  0.823±0.086  0.922±0.042  0.849±0.017  0.917±0.023
    Dsmil + ReMix [20]      0.901 ms        3052 MB     0.829     0.905  0.818±0.055  0.908±0.047  0.833±0.037  0.915±0.020
    CLAM-SB [14]            0.456 ms        2584 MB     0.806     0.865  0.862±0.070  0.950±0.020  0.834±0.030  0.912±0.034
    CLAM-SB + MDDP [24]     0.820 ms        3394 MB     0.872     0.868  0.751±0.083  0.878±0.055  0.841±0.033  0.925±0.042
    CLAM-MB [14]            0.841 ms        2584 MB     0.782     0.770  0.865±0.050  0.953±0.032  0.840±0.043  0.912±0.019
    TransMIL [15]           6.620 ms        10114 MB    0.858     0.906  0.798±0.100  0.930±0.061  0.819±0.038  0.885±0.030
    DTFD(MaxS) [16]         1.833 ms        2140 MB     0.858     0.870  0.859±0.024  0.898±0.012  0.764±0.010  0.837±0.017
    DTFD(MaxMinS) [16]      1.911 ms        2336 MB     0.881     0.906  0.792±0.101  0.887±0.074  0.832±0.031  0.907±0.031
    DTFD(AFS) [16]          2.052 ms        2130 MB     0.881     0.901  0.786±0.101  0.891±0.081  0.849±0.036  0.927±0.019
    HFMIL (this paper)      2.249 ms        2356 MB     0.883     0.979  0.905±0.030  0.976±0.016  0.860±0.043  0.931±0.026

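    Table 3  Ablation results

    HFMIL ablation setting                            Camelyon16       NBU-Skin         TCGA-LUNG
    Multi-level feature fusion  Pseudo-bag partition  Accuracy  AUC    Accuracy  AUC    Accuracy  AUC
                                                      0.472     0.420  0.800     0.882  0.785     0.880
                                                      0.751     0.668  0.876     0.952  0.823     0.854
                                                      0.798     0.782  0.907     0.968  0.880     0.927
                                                      0.883     0.979  0.938     0.997  0.900     0.966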

    Table 4  Results of the top-k parameter sensitivity experiment

    top-k  Accuracy     AUC
    5%     0.907±0.055  0.975±0.013
    10%    0.901±0.058  0.971±0.011
    15%    0.905±0.030  0.976±0.016
    20%    0.900±0.037  0.972±0.015
    25%    0.892±0.066  0.966±0.023
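The top-k parameter in Table 4 is given as a percentage of instances. A minimal sketch of confidence-based selection under that reading is shown below; the rounding rule (ceiling, with at least one instance kept) and the names `select_top_k` and `percent` are assumptions, since the table does not specify them. The default of 15 is chosen only because the 15% row matches the NBU-Skin figures reported for HFMIL in Table 2.

```python
def select_top_k(confidences, percent=15):
    """Return the indices of the top `percent` percent of instances,
    ranked by confidence (highest first). Integer arithmetic is used
    for the ceiling to avoid floating-point rounding surprises."""
    n = len(confidences)
    k = max(1, (percent * n + 99) // 100)  # ceil(percent * n / 100), at least 1
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    return order[:k]
```

The returned indices refer to low-resolution instances, so they can be used to look up the corresponding high-resolution features.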
  • [1] HAN Bingfeng, ZHENG Rongshou, ZENG Hongmei, et al. Cancer incidence and mortality in China, 2022[J]. Journal of the National Cancer Center, 2024, 4(1): 47–53. doi: 10.1016/j.jncc.2024.01.006.
    [2] JIANG Mengqi, HAN Yuchen, and FU Xiaolong. Research progress on H-E stained whole slide image analysis by artificial intelligence in lung cancer[J]. China Oncology, 2024, 34(3): 306–315. doi: 10.19401/j.cnki.1007-3639.2024.03.009.
    [3] WANG Yumeng, LIU Zhenbing, and LIU Zaiyi. Privacy-preserving federated weakly-supervised learning for cancer subtyping on histopathology images[J/OL]. https://jeit.ac.cn/cn/article/doi/10.11999/JEIT250842, 2025.
    [4] JIN Huaiping, XUE Feiyue, LI Zhenhui, et al. Prognostic prediction of gastric cancer based on ensemble deep learning of pathological images[J]. Journal of Electronics & Information Technology, 2023, 45(7): 2623–2633. doi: 10.11999/JEIT220655.
    [5] FEI Manman, ZHANG Xin, CHEN Dongdong, et al. Whole slide cervical cancer classification via graph attention networks and contrastive learning[J]. Neurocomputing, 2025, 613: 128787. doi: 10.1016/j.neucom.2024.128787.
    [6] ZHANG Jiawei, SUN Zhanquan, WANG Kang, et al. Prognosis prediction based on liver histopathological image via graph deep learning and transformer[J]. Applied Soft Computing, 2024, 161: 111653. doi: 10.1016/j.asoc.2024.111653.
    [7] LI Mingze, ZHANG Bingbing, SUN Jian, et al. Weakly supervised breast cancer classification on WSI using transformer and graph attention network[J]. International Journal of Imaging Systems and Technology, 2024, 34(4): e23125. doi: 10.1002/ima.23125.
    [8] WANG Fuying, XIN Jiayi, ZHAO Weiqin, et al. TAD-graph: Enhancing whole slide image analysis via task-aware subgraph disentanglement[J]. IEEE Transactions on Medical Imaging, 2025, 44(6): 2683–2695. doi: 10.1109/TMI.2025.3545680.
    [9] WU Kun, JIANG Zhiguo, TANG Kunming, et al. Pan-cancer histopathology WSI pre-training with position-aware masked autoencoder[J]. IEEE Transactions on Medical Imaging, 2025, 44(4): 1610–1623. doi: 10.1109/TMI.2024.3513358.
    [10] ZHANG Yinhui, ZHANG Jinkai, HE Zifen, et al. Global perception and sparse feature associate image-level weakly supervised pathological image segmentation[J]. Journal of Electronics & Information Technology, 2024, 46(9): 3672–3682. doi: 10.11999/JEIT240364.
    [11] YAN Rui, LV Zhilong, YANG Zhidong, et al. Sparse and hierarchical transformer for survival analysis on whole slide images[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(1): 7–18. doi: 10.1109/JBHI.2023.3307584.
    [12] MA Yingfan, LUO Xiaoyuan, FU Kexue, et al. Transformer-based video-structure multi-instance learning for whole slide image classification[C]. Proceedings of the 38th AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2024: 14263–14271. doi: 10.1609/aaai.v38i13.29338.
    [13] ILSE M, TOMCZAK J, and WELLING M. Attention-based deep multiple instance learning[C]. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 2018: 2127–2136.
    [14] LU M Y, WILLIAMSON D F K, CHEN T Y, et al. Data-efficient and weakly supervised computational pathology on whole-slide images[J]. Nature Biomedical Engineering, 2021, 5(6): 555–570. doi: 10.1038/s41551-020-00682-w.
    [15] SHAO Zhuchen, BIAN Hao, CHEN Yang, et al. TransMIL: Transformer based correlated multiple instance learning for whole slide image classification[C]. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2021: 164.
    [16] ZHANG Hongrun, MENG Yanda, ZHAO Yitian, et al. DTFD-MIL: Double-tier feature distillation multiple instance learning for histopathology whole slide image classification[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 18780–18790. doi: 10.1109/CVPR52688.2022.01824.
    [17] LI Bin, LI Yin, and ELICEIRI K W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 14313–14323. doi: 10.1109/CVPR46437.2021.01409.
    [18] CHEN Y C and LU C S. RankMix: Data augmentation for weakly supervised learning of classifying whole slide images with diverse sizes and imbalanced categories[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 23936–23945. doi: 10.1109/CVPR52729.2023.02292.
    [19] LIU Pei, JI Luping, ZHANG Xinyu, et al. Pseudo-bag mixup augmentation for multiple instance learning-based whole slide image classification[J]. IEEE Transactions on Medical Imaging, 2024, 43(5): 1841–1852. doi: 10.1109/TMI.2024.3351213.
    [20] YANG Jiawei, CHEN Hanbo, ZHAO Yu, et al. ReMix: A general and efficient framework for multiple instance learning based whole slide image classification[C]. Proceedings of the 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, Singapore, Singapore, 2022: 35–45. doi: 10.1007/978-3-031-16434-7_4.
    [21] BEJNORDI B E, VETA M, VAN DIEST P J, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer[J]. JAMA, 2017, 318(22): 2199–2210. doi: 10.1001/jama.2017.14585.
    [22] TOMCZAK K, CZERWIŃSKA P, and WIZNEROWICZ M. The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge[J]. Contemporary Oncology, 2015, 19(1A): A68–A77. doi: 10.5114/wo.2014.47136.
    [23] ZHOU S K, RUECKERT D, and FICHTINGER G. Handbook of Medical Image Computing and Computer Assisted Intervention[M]. London: Academic Press, 2020: 521–546.
    [24] LOU Wei, LI Guanbin, WAN Xiang, et al. Multi-modal denoising diffusion pre-training for whole-slide image classification[C]. Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 2024: 10804–10813. doi: 10.1145/3664647.3680882.
Publication history
  • Revised:  2026-01-22
  • Accepted:  2026-01-22
  • Published online:  2026-02-12
