C2 Transformer U-Net: A Medical Image Segmentation Model for Cross-modality and Contextual Semantics

ZHOU Tao, HOU Senbao, LU Huiling, LIU Yuncan, DANG Pei

Citation: ZHOU Tao, HOU Senbao, LU Huiling, LIU Yuncan, DANG Pei. C2 Transformer U-Net: A Medical Image Segmentation Model for Cross-modality and Contextual Semantics[J]. Journal of Electronics & Information Technology, 2023, 45(5): 1807-1816. doi: 10.11999/JEIT220445


doi: 10.11999/JEIT220445
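Funding: National Natural Science Foundation of China (62062003), Key Research and Development Program of Ningxia Autonomous Region (2020BEB04022), Natural Science Foundation of Ningxia (2022AAC03149), North Minzu University Research Start-up Project for Introduced Talents (2020KYQD08)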
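Details
    About the authors:

    ZHOU Tao: male, Ph.D., professor, doctoral supervisor. His main research interests include computer-aided diagnosis, medical image analysis and processing, and pattern recognition.

    HOU Senbao: male, master's student. His research interest is intelligent image and graphics processing.

    LU Huiling: female, associate professor. Her research interests are medical image analysis and processing, and machine learning.

    LIU Yuncan: female, master's student. Her research interest is intelligent image and graphics processing.

    DANG Pei: female, master's student. Her research interest is intelligent image and graphics processing.

    Corresponding author:

    HOU Senbao hsb378093739@163.com

  • CLC number: TN911.73; TP391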

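  • Abstract: Cross-modality medical images provide more semantic information about the same lesion. U-Net mainly segments single-modality images and does not fully consider cross-modality and contextual semantic correlations. To address this, the paper proposes C2 Transformer U-Net, a medical image segmentation model oriented to cross-modality and contextual semantics. The main ideas are as follows. First, a backbone and auxiliary U-Net structure is proposed in the encoder to extract the semantic information of the different modalities. Second, a multi-modal context semantic awareness processor (MCAP) is designed to extract the cross-modality semantic information of the same lesion effectively; in the skip connections, the two modality images of the backbone network are added and then passed to the Transformer decoder, which strengthens the model's ability to represent the lesion. Third, pre-activation residual units and the Transformer architecture are used in the encoder-decoder: on the one hand they extract the contextual feature information of the lesion, and on the other hand they make the network pay more attention to lesion location while it fully uses low-level and high-level features. Finally, the algorithm is validated on a clinical multi-modality lung medical image dataset. Comparative experiments show that the Acc, Pre, Recall, Dice, Voe, and Rvd of the proposed model for lung lesion segmentation are 97.95%, 94.94%, 94.31%, 96.98%, 92.57%, and 93.35%, respectively. For lung lesions with complex shapes, the model achieves high accuracy and relatively low redundancy, and overall it outperforms existing state-of-the-art methods.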
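The skip-connection step described in the abstract (add the features of the two modalities, then hand the result to a Transformer stage) can be illustrated with a very small sketch. This is not the authors' implementation: the function names `fuse_by_addition` and `self_attention` are hypothetical, the features are assumed to be already flattened into a list of tokens, and the attention uses a single head with identity query/key/value projections instead of learned multi-head projections.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_by_addition(feat_a, feat_b):
    """Element-wise sum of two modality feature maps (tokens x channels).

    Assumption: both maps have the same shape and are spatially aligned.
    """
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(feat_a, feat_b)]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a real Transformer block would use learned projections and several heads.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all value tokens.
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out

# Hypothetical 2-token, 2-channel features from two modalities.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b = [[0.5, 0.5], [0.5, 0.5]]
fused = fuse_by_addition(modality_a, modality_b)
attended = self_attention(fused)
```

Because each attention output is a weighted average over all tokens, every position in `attended` carries information from the whole fused feature map, which is the sense in which the Transformer stage adds contextual information.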
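  • Figure 1  C2 Transformer U-Net network architecture

    Figure 2  Multi-modal context awareness processor

    Figure 3  Transformer multi-head attention encoder-decoder branch

    Figure 4  Radar chart and visualized segmentation results of segmentation networks with different encoders (cross-modality semantic correlation)

    Figure 5  Radar chart and visualized segmentation results of different segmentation networks

    Figure 6  Radar chart and visualized segmentation results for contextual semantic correlation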

    Table 1  Definitions of the evaluation metrics

    Metric    Definition
    Acc       $\mathrm{Acc} = \dfrac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN} + \mathrm{TN}}$
    Pre       $\mathrm{Pre} = \dfrac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$
    Recall    $\mathrm{Recall} = \dfrac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$
    Dice      $\mathrm{Dice} = \dfrac{2 \times \left| P \cap G \right|}{\left| P \right| + \left| G \right|}$
    Voe       $\mathrm{Voe} = \mathrm{abs}\left(1 - \left| \dfrac{P \cap G}{P \cup G} \right|\right)$
    Rvd       $\mathrm{Rvd} = \dfrac{\mathrm{abs}(P - G)}{G}$
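The metric definitions in Table 1 can be computed directly from a predicted mask P and a ground-truth mask G. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `segmentation_metrics` is hypothetical, masks are flat sequences of 0/1 labels, both masks contain at least one foreground pixel, and P and G in the Voe and Rvd formulas are read as pixel counts (so P - G is the difference in foreground area). Values are returned as fractions, not percentages.

```python
def segmentation_metrics(pred, truth):
    """Compute the Table 1 metrics for two flat binary masks of equal length.

    pred  : predicted labels P (0 = background, 1 = lesion)
    truth : ground-truth labels G (0 = background, 1 = lesion)
    Assumption: both masks are non-empty, so no denominator is zero.
    """
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, g in pairs if p == 1 and g == 1)
    tn = sum(1 for p, g in pairs if p == 0 and g == 0)
    fp = sum(1 for p, g in pairs if p == 1 and g == 0)
    fn = sum(1 for p, g in pairs if p == 0 and g == 1)

    p_size = tp + fp            # |P|: predicted foreground pixels
    g_size = tp + fn            # |G|: ground-truth foreground pixels
    intersection = tp           # |P ∩ G|
    union = tp + fp + fn        # |P ∪ G|

    return {
        "Acc": (tp + tn) / (tp + fp + fn + tn),
        "Pre": tp / (tp + fp),
        "Recall": tp / (tp + fn),
        "Dice": 2 * intersection / (p_size + g_size),
        "Voe": abs(1 - intersection / union),
        "Rvd": abs(p_size - g_size) / g_size,
    }

# Hypothetical 8-pixel example: 2 true positives, 2 false positives,
# 1 false negative and 3 true negatives.
example = segmentation_metrics(
    pred=[1, 1, 1, 1, 0, 0, 0, 0],
    truth=[1, 1, 0, 0, 1, 0, 0, 0],
)
```

Computing every metric from the four confusion counts keeps the definitions consistent with each other, since |P|, |G|, the intersection and the union are all sums of TP, FP and FN.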

    Table 2  Segmentation results of different encoders for cross-modality semantic correlation (%)

    Model        Acc     Pre     Recall  Dice    Voe     Rvd
    U-Net[16]    90.23   90.38   90.33   90.28   90.97   92.09
    Y-Net[17]    90.16   90.09   90.18   90.09   91.42   92.45
    Proposed     97.95   94.94   94.31   96.98   92.57   93.35

    Table 3  Segmentation results of different segmentation networks (%)

    Model                 Acc     Pre     Recall  Dice    Voe     Rvd
    SegNet[18]            89.23   89.38   88.33   87.28   79.97   81.13
    WNet[19]              90.16   89.49   91.28   88.59   82.08   83.45
    Attention UNet[20]    91.30   90.94   91.31   89.98   84.57   84.35
    ResUNet[21]           91.23   90.08   90.32   89.01   83.45   84.02
    SEResUNet[22]         92.38   92.17   92.07   92.20   90.93   91.04
    UTNet[23]             94.58   93.86   93.44   92.83   92.07   93.20
    Proposed              97.95   94.94   94.31   96.98   92.57   93.35

    Table 4  Segmentation results for contextual semantic correlation (%)

    Model        Acc     Pre     Recall  Dice    Voe     Rvd
    RMUNet       93.26   90.69   91.14   92.68   89.45   91.10
    RTMUNet      94.59   92.80   92.55   93.50   89.87   91.37
    RTMMUNet     95.18   93.13   92.69   94.01   90.02   92.04
    RTMMSUNet    96.62   93.60   93.15   94.40   90.09   92.05
    Proposed     97.95   94.94   94.31   96.98   92.57   93.35
  • [1] DALCA A V, GUTTAG J, and SABUNCU M R. Anatomical priors in convolutional networks for unsupervised biomedical segmentation[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 9290–9299.
    [2] ZHOU Tao, LU Huiling, YANG Zaoli, et al. The ensemble deep learning model for novel COVID-19 on CT images[J]. Applied Soft Computing, 2021, 98: 106885. doi: 10.1016/j.asoc.2020.106885
    [3] JAMES A P and DASARATHY B V. Medical image fusion: A survey of the state of the art[J]. Information Fusion, 2014, 19: 4–19. doi: 10.1016/j.inffus.2013.12.002
    [4] LI Haoming, JIANG Huiyan, LI Siqi, et al. DenseX-Net: An end-to-end model for lymphoma segmentation in whole-body PET/CT Images[J]. IEEE Access, 2020, 8: 8004–8018. doi: 10.1109/ACCESS.2019.2963254
    [5] HUSSEIN S, GREEN A, WATANE A, et al. Automatic segmentation and quantification of white and brown adipose tissues from PET/CT Scans[J]. IEEE Transactions on Medical Imaging, 2017, 36(3): 734–744. doi: 10.1109/TMI.2016.2636188
    [6] MU Wei, CHEN Zhe, SHEN Wei, et al. A segmentation algorithm for quantitative analysis of heterogeneous tumors of the cervix with 18F-FDG PET/CT[J]. IEEE Transactions on Biomedical Engineering, 2015, 62(10): 2465–2479. doi: 10.1109/TBME.2015.2433397
    [7] ZHOU Tao, DONG Yali, LU Huiling, et al. APU-Net: An attention mechanism parallel U-Net for lung tumor segmentation[J]. BioMed Research International, 2022, 2022: 5303651. doi: 10.1155/2022/5303651
    [8] CUI Hui, WANG Xiuying, LIN W, et al. Primary lung tumor segmentation from PET-CT volumes with spatial-topological constraint[J]. International Journal of Computer Assisted Radiology and Surgery, 2016, 11(1): 19–29. doi: 10.1007/s11548-015-1231-0
    [9] ZHAO Hengshuang, SHI Jianping, QI Xiaojuan, et al. Pyramid scene parsing network[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6230–6239.
    [10] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Identity mappings in deep residual networks[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 630–645.
    [11] HAN Guang, ZHU Mengcheng, ZHAO Xuechen, et al. Method based on the cross-layer attention mechanism and multiscale perception for safety helmet-wearing detection[J]. Computers and Electrical Engineering, 2021, 95: 107458. doi: 10.1016/j.compeleceng.2021.107458
    [12] WANG Sinong, LI B Z, KHABSA M, et al. Linformer: Self-attention with linear complexity[EB/OL]. https://arxiv.org/abs/2006.04768, 2020.
    [13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. The 31st International Conference on Neural Information Processing Systems (NIPS'17), Long Beach, USA, 2017: 6000–6010.
    [14] BELLO I, ZOPH B, LE Q, et al. Attention augmented convolutional networks[C]. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 2019: 3285–3294.
    [15] PARMAR N, VASWANI A, USZKOREIT J, et al. Image transformer[C]. The 35th International Conference on Machine Learning, Stockholm, Sweden, 2018: 4052–4061.
    [16] RONNEBERGER O, FISCHER P, and BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]. The 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Munich, Germany, 2015: 234–241.
    [17] LAN Hengrong, JIANG Daohuai, YANG Changchun, et al. Y-Net: Hybrid deep learning image reconstruction for photoacoustic tomography in vivo[J]. Photoacoustics, 2020, 20: 100197. doi: 10.1016/j.pacs.2020.100197
    [18] BADRINARAYANAN V, KENDALL A, and CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481–2495. doi: 10.1109/TPAMI.2016.2644615
    [19] XU Lina, TETTEH G, LIPKOVA J, et al. Automated whole-body bone lesion detection for multiple myeloma on 68Ga-pentixafor PET/CT imaging using deep learning methods[J]. Contrast Media & Molecular Imaging, 2018, 2018: 2391925. doi: 10.1155/2018/2391925
    [20] OKTAY O, SCHLEMPER J, LE FOLGOC L, et al. Attention U-Net: Learning where to look for the pancreas[EB/OL]. https://arxiv.org/abs/1804.03999, 2018.
    [21] LIU Jin, KANG Yanqin, QIANG Jun, et al. Low-dose CT imaging via cascaded ResUnet with spectrum loss[J]. Methods, 2022, 202: 78–87. doi: 10.1016/j.ymeth.2021.05.005
    [22] CAO Zheng, YU Bohan, LEI Biwen, et al. Cascaded SE-ResUnet for segmentation of thoracic organs at risk[J]. Neurocomputing, 2021, 453: 357–368. doi: 10.1016/j.neucom.2020.08.086
    [23] GAO Yunhe, ZHOU Mu, and METAXAS D. UTNet: A hybrid transformer architecture for medical image segmentation[EB/OL]. https://arxiv.org/abs/2107.00781, 2021.
Publication history
  • Received:  2022-04-14
  • Revised:  2022-08-24
  • Accepted:  2022-08-25
  • Published online:  2022-08-30
  • Issue date:  2023-05-10