Advanced Search
Volume 46 Issue 2
Feb.  2024
Turn off MathJax
Article Contents
ZHOU Tao, YE Xinyu, LIU Fengzhen, LU Huiling. CL-YOLOv5: PET/CT Lung Cancer Detection With Cross-modal Lightweight YOLOv5 Model[J]. Journal of Electronics & Information Technology, 2024, 46(2): 624-632. doi: 10.11999/JEIT230052
Citation: ZHOU Tao, YE Xinyu, LIU Fengzhen, LU Huiling. CL-YOLOv5: PET/CT Lung Cancer Detection With Cross-modal Lightweight YOLOv5 Model[J]. Journal of Electronics & Information Technology, 2024, 46(2): 624-632. doi: 10.11999/JEIT230052

CL-YOLOv5: PET/CT Lung Cancer Detection With Cross-modal Lightweight YOLOv5 Model

doi: 10.11999/JEIT230052
Funds:  The National Natural Science Foundation of China (62062003), Ningxia Natural Science Foundation Project (2022AAC03149), Key Research and Development Projects of Ningxia Autonomous Region(2020BEB04022)
  • Received Date: 2023-02-14
  • Rev Recd Date: 2023-05-05
  • Available Online: 2023-05-16
  • Publish Date: 2024-02-29
  • Multimodal medical images can provide more semantic information at the same lesion. To address the problems that cross-modal semantic features are not fully considered and model complexity is too high, a Cross-modal Lightweight YOLOv5(CL-YOLOv5) lung cancer detection model is proposed. Firstly, three-branch network is proposed to learn semantic information of Positron Emission Tomography (PET), Computed Tomography (CT) and PET/CT; Secondly, Cross-modal Interactive Enhancement block is designed to fully learn multimodal semantic correlation, cosine reweighted Transformer efficiently learns global feature relationship, interactive enhancement network extracts lesion features; Finally, dual-branch lightweight block is proposed, ACtivate Or Not (ACON) bottleneck structure reduces parameters while increasing network depth and robustness, the other branch is densely connected recursive re-parametric convolution with maximized feature transfer, recursive spatial interaction efficiently learning multimodal features. In lung cancer PET/CT multimodal dataset, the model in this paper achieves 94.76% mAP optimal performance and 3238 s highest efficiency, 0.81 M parameters are obtained, which is 7.7 times and 5.3 times lower than YOLOv5s and EfficientDet-d0, overall outperforms existing state-of-the-art methods in multimodal comparative experiments. In multi-modal comparison experiment, it is generally better than the existing advanced methods, further verification by ablation experiments and heat map visualization ablation experiment.
  • loading
  • [1]
    MIRANDA D, THENKANIDIYOOR V, and DINESH D A. Review on approaches to concept detection in medical images[J]. Biocybernetics and Biomedical Engineering, 2022, 42(2): 453–462. doi: 10.1016/j.bbe.2022.02.012.
    [2]
    周涛, 刘赟璨, 陆惠玲, 等. ResNet及其在医学图像处理领域的应用: 研究进展与挑战[J]. 电子与信息学报, 2022, 44(1): 149–167. doi: 10.11999/JEIT210914.

    ZHOU Tao, LIU Yuncan, LU Huiling, et al. ResNet and its application to medical image processing: Research progress and challenges[J]. Journal of Electronics &Information Technology, 2022, 44(1): 149–167. doi: 10.11999/JEIT210914.
    [3]
    YU Guanghua, CHANG Qinyao, LV Wengyu, et al. PP-PicoDet: A better real-time object detector on mobile devices[J]. arXiv: 2111.00902, 2021.
    [4]
    HURTIK P, MOLEK V, HULA J, et al. Poly-YOLO: Higher speed, more precise detection and instance segmentation for YOLOv3[J]. Neural Computing and Applications, 2022, 34(10): 8275–8290. doi: 10.1007/s00521-021-05978-9.
    [5]
    WANG C Y, BOCHKOVSKIY A, and LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[J]. arXiv: 2207.02696, 2022.
    [6]
    刘政怡, 段群涛, 石松, 等. 基于多模态特征融合监督的RGB-D图像显著性检测[J]. 电子与信息学报, 2020, 42(4): 997–1004. doi: 10.11999/JEIT190297.

    LIU Zhenyi, DUAN Quntao, SHI Song, et al. RGB-D image saliency detection based on multi-modal feature-fused supervision[J]. Journal of Electronics &Information Technology, 2020, 42(4): 997–1004. doi: 10.11999/JEIT190297.
    [7]
    ASVADI A, GARROTE L, PREMEBIDA C, et al. Real-time deep convnet-based vehicle detection using 3d-lidar reflection intensity data[C]. ROBOT 2017: Third Iberian Robotics Conference, Sevilla, Spain, 2017: 475–486.
    [8]
    YADAV R, VIERLING A, and BERNS K. Radar+ RGB fusion for robust object detection in autonomous vehicle[C]. The 2020 IEEE International Conference on Image Processing, Abu Dhabi, United Arab Emirates, 2020: 1986–1990.
    [9]
    QIAN Kun, ZHU Shilin, ZHANG Xinyu, et al. Robust multimodal vehicle detection in foggy weather using complementary Lidar and radar signals[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 444–453.
    [10]
    CHEN Yiting, SHI Jinghao, YE Zelin, et al. Multimodal object detection via probabilistic ensembling[C]. 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 139–158.
    [11]
    HERMESSI H, MOURALI O, and ZAGROUBA E. Multimodal medical image fusion review: Theoretical background and recent advances[J]. Signal Processing, 2021, 183: 108036. doi: 10.1016/j.sigpro.2021.108036.
    [12]
    MOKNI R, GARGOURI N, DAMAK A, et al. An automatic computer-aided diagnosis system based on the multimodal fusion of breast cancer (MF-CAD)[J]. Biomedical Signal Processing and Control, 2021, 69: 102914. doi: 10.1016/j.bspc.2021.102914.
    [13]
    RUBINSTEIN E, SALHOV M, NIDAM-LESHEM M, et al. Unsupervised tumor detection in dynamic PET/CT imaging of the prostate[J]. Medical Image Analysis, 2019, 55: 27–40. doi: 10.1016/j.media.2019.04.001.
    [14]
    MING Yue, DONG Xiying, ZHAO Jihuai, et al. Deep learning-based multimodal image analysis for cervical cancer detection[J]. Methods, 2022, 205: 46–52. doi: 10.1016/j.ymeth.2022.05.004.
    [15]
    QIN Ruoxi, WANG Zhenzhen, JIANG Lingyun, et al. Fine-grained lung cancer classification from PET and CT images based on multidimensional attention mechanism[J]. Complexity, 2020, 2020: 6153657. doi: 10.1155/2020/6153657.
    [16]
    DIRKS I, KEYAERTS M, NEYNS B, et al. Computer-aided detection and segmentation of malignant melanoma lesions on whole-body 18F-FDG PET/CT using an interpretable deep learning approach[J]. Computer Methods and Programs in Biomedicine, 2022, 221: 106902. doi: 10.1016/j.cmpb.2022.106902.
    [17]
    CAO Siyuan, YU Beinan, LUO Lun, et al. PCNet: A structure similarity enhancement method for multispectral and multimodal image registration[J]. Information Fusion, 2023, 94: 200–214. doi: 10.1016/j.inffus.2023.02.004.
    [18]
    TAN Mingxing, PANG Ruoming, and LE Q V. EfficientDet: Scalable and efficient object detection[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 10778–10787.
    [19]
    LI Chuyi, LI Lulu, GENG Yifei, et al. YOLOv6 v3.0: A full-scale reloading[J]. arXiv: 2301.05586, 2023.
    [20]
    LI Dongyang and ZHAI Junyong. A real-time vehicle window positioning system based on nanodet[C]. 2022 Chinese Intelligent Systems Conference, Singapore, 2022: 697–705.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Figures(10)  / Tables(3)

    Article Metrics

    Article views (990) PDF downloads(151) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return