Study of Coarse-to-Fine Class Activation Mapping Algorithms Based on Contrastive Layer-wise Relevance Propagation

SUN Hui, SHI Yulong, WANG Rui

Citation: SUN Hui, SHI Yulong, WANG Rui. Study of Coarse-to-Fine Class Activation Mapping Algorithms Based on Contrastive Layer-wise Relevance Propagation[J]. Journal of Electronics & Information Technology, 2023, 45(4): 1454-1463. doi: 10.11999/JEIT220113


doi: 10.11999/JEIT220113
Funds: The Natural Science Foundation of Tianjin (18JCYBJC42300)
    About the authors:

    SUN Hui: male, lecturer; main research interests include wireless sensor networks, smart airports, airport bird repelling, cognitive radio, and multi-agent systems

    SHI Yulong: male, master's student; research interests include image processing, airport bird repelling, system identification, and wireless sensor networks

    WANG Rui: female, professor; main research interests include airport bird repelling, distributed systems, wireless sensor networks, chaotic systems, multi-agent systems, and system identification

    Corresponding author: WANG Rui, ruiwang@cauc.edu.cn

  • CLC number: TP183

  • Abstract: Deep learning algorithms, represented by Convolutional Neural Networks (CNNs), rely heavily on model nonlinearity and tuning techniques and thus generally behave as black boxes in practical applications, which severely limits their further adoption in security-sensitive fields. To address this, a Coarse-to-Fine Class Activation Mapping (CF-CAM) algorithm is proposed to diagnose the decision behavior of deep neural networks. The algorithm re-establishes the relationship between feature maps and model decisions: Contrastive Layer-wise Relevance Propagation (CLRP) is used to obtain the contribution of each location in the feature maps to the network decision and to generate spatial relevance masks that localize the regions important to the model's decision; each mask is then linearly blended with a blurred version of the input image and fed back into the network to obtain a target score for the corresponding feature map, yielding a coarse-to-fine explanation of the deep neural network in both the spatial and channel domains. Experimental results show that CF-CAM achieves significant improvements in faithfulness and localization performance over competing methods. In addition, CF-CAM is applied as a data augmentation strategy to a fine-grained bird classification task, where learning from hard samples effectively improves recognition accuracy, further validating the effectiveness and superiority of the CF-CAM algorithm.
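    Concretely, with the notation of Algorithm 1 below ($A_k$ is the $k$-th feature map of the target layer, $R_k^c$ its contrastive relevance for class $c$, $I_b$ the blurred baseline, and $\odot$ the element-wise product), the saliency map is assembled as

    $$\begin{aligned} M_k^c &= \text{upsample}(R_k^c A_k), \\ I' &= I \odot M_k^c + I_b \odot (1 - M_k^c), \\ \alpha_k^c &= f^c(I') - f^c(I_b), \\ L_{\text{CF-CAM}}^c &= \sum_{k=0}^{C-1} \alpha_k^c M_k^c. \end{aligned}$$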
  • Figure 1  Flowchart of the CF-CAM algorithm

    Figure 2  Comparison of CF-CAM results

    Figure 3  Class-discriminability results of CF-CAM

    Figure 4  Multi-target visualization results of CF-CAM

    Figure 5  Parameter-sensitivity check results of the CF-CAM model

    Figure 6  Model-diagnosis results of CF-CAM

    Figure 7  Saliency-map-based data augmentation process

    Algorithm 1  The CF-CAM algorithm
     Input: Image $I$, Class $c$, Model $f(x)$, target layer $l$, Gaussian blur parameters: ksize, sigma
     (1) Initialization: $L_{\text{CF-CAM}}^c \leftarrow 0$, $\alpha^c \leftarrow [\ ]$, Baseline Input $I_b = \text{Gaussian\_blur2d}(I, \text{ksize}, \text{sigma})$;
     (2) Get feature maps $A_k$ of target layer $l$, where $C$ is the number of channels in $A_k$, and relevance weights $R^c$;
     (3) for $k$ in $[0, 1, \cdots, C-1]$ do
        $M_k^c = \text{upsample}(R_k^c A_k)$;
        $I' = I \odot M_k^c + I_b \odot (1 - M_k^c)$;
        $\alpha_k^c = f^c(I') - f^c(I_b)$;
        $L_{\text{CF-CAM}}^c = L_{\text{CF-CAM}}^c + \alpha_k^c M_k^c$;
       end
     (4) Return $L_{\text{CF-CAM}}^c$
     Output: Saliency map $L_{\text{CF-CAM}}^c$
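    For concreteness, a minimal PyTorch sketch of the Algorithm 1 loop follows. It assumes the target-layer feature maps $A_k$ and the CLRP channel relevances $R^c$ have already been extracted (the CLRP backward pass itself is not shown); the softmax scoring, the min-max normalization of each mask, and the default blur parameters are assumptions made for this sketch, not steps spelled out in Algorithm 1.

        import torch
        import torch.nn.functional as F
        from torchvision.transforms import GaussianBlur

        def cf_cam(model, image, target_class, feature_maps, relevance, ksize=51, sigma=50.0):
            # image:        (1, 3, H, W) input tensor I
            # feature_maps: (1, C, h, w) target-layer activations A
            # relevance:    (C,) CLRP channel relevances R^c (assumed precomputed)
            _, _, H, W = image.shape
            baseline = GaussianBlur(ksize, sigma)(image)            # baseline input I_b
            saliency = torch.zeros(1, 1, H, W, device=image.device)
            with torch.no_grad():
                base_score = F.softmax(model(baseline), dim=1)[0, target_class]
                for k in range(feature_maps.shape[1]):
                    # M_k^c: upsampled relevance-weighted feature map, scaled to [0, 1]
                    m = relevance[k] * feature_maps[:, k:k + 1]
                    m = F.interpolate(m, size=(H, W), mode='bilinear', align_corners=False)
                    m = (m - m.min()) / (m.max() - m.min() + 1e-8)
                    blended = image * m + baseline * (1 - m)        # I'
                    score = F.softmax(model(blended), dim=1)[0, target_class]
                    saliency += (score - base_score) * m            # alpha_k^c * M_k^c
            return saliency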

    Table 1  Faithfulness evaluation results of CF-CAM (%)

              RISE   Grad-CAM   Grad-CAM++   Score-CAM   Relevance-CAM   CF-CAM
    A.D.      57.4   46.3       43.9         41.4        45.2            39.8
    A.I.      8.7    15.2       18.6         20.5        17.5            21.3
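    A.D. and A.I. in Table 1 are presumably the standard Average Drop and Average Increase faithfulness metrics introduced with Grad-CAM++ [16] (lower A.D. and higher A.I. are better); under that assumption, they would be computed roughly as follows:

        import numpy as np

        def average_drop_increase(full_scores, masked_scores):
            # full_scores:   confidences Y_i^c on the original images
            # masked_scores: confidences O_i^c on the saliency-masked images
            y = np.asarray(full_scores, dtype=float)
            o = np.asarray(masked_scores, dtype=float)
            avg_drop = 100.0 * np.mean(np.maximum(0.0, y - o) / y)   # A.D.
            avg_increase = 100.0 * np.mean(o > y)                    # A.I.
            return avg_drop, avg_increase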

    Table 2  Localization performance evaluation results of CF-CAM

    Method        RISE   Grad-CAM   Grad-CAM++   Score-CAM   Relevance-CAM   CF-CAM
    Proportion    40.5   52.3       54.6         61.8        53.9            62.7
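    The "Proportion" row reads like the energy-based pointing game used by Score-CAM [18]: the share of saliency energy that falls inside the ground-truth bounding box. A sketch under that assumption:

        import numpy as np

        def energy_proportion(saliency, bbox):
            # saliency: non-negative (H, W) saliency map
            # bbox:     ground-truth box (x1, y1, x2, y2) in pixel coordinates
            s = np.maximum(np.asarray(saliency, dtype=float), 0.0)
            x1, y1, x2, y2 = bbox
            inside = s[y1:y2, x1:x2].sum()
            return inside / (s.sum() + 1e-12)   # fraction of energy inside the box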
  • [1] SHI Zenglin, YE Yangdong, WU Yunpeng, et al. Crowd counting using rank-based spatial pyramid pooling network[J]. Acta Automatica Sinica, 2016, 42(6): 866–874. doi: 10.16383/j.aas.2016.c150663
    [2] FU Xiaowei, YANG Xuefei, CHEN Fang, et al. An adaptive medical ultrasound images despeckling method based on deep learning[J]. Journal of Electronics & Information Technology, 2020, 42(7): 1782–1789. doi: 10.11999/JEIT190580
    [3] PU Fangling, DING Chujiang, CHAO Zeyi, et al. Water-quality classification of inland lakes using Landsat8 images by convolutional neural networks[J]. Remote Sensing, 2019, 11(14): 1674. doi: 10.3390/rs11141674
    [4] SAMBASIVAM G and OPIYO G D. A predictive machine learning application in agriculture: Cassava disease detection and classification with imbalanced dataset using convolutional neural networks[J]. Egyptian Informatics Journal, 2021, 22(1): 27–34. doi: 10.1016/j.eij.2020.02.007
    [5] ZEILER M D and FERGUS R. Visualizing and understanding convolutional networks[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 818–833.
    [6] ZHOU Bolei, KHOSLA A, LAPEDRIZA A, et al. Object detectors emerge in deep scene CNNs[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [7] PETSIUK V, DAS A, and SAENKO K. RISE: Randomized input sampling for explanation of black-box models[C]. British Machine Vision Conference 2018, Newcastle, UK, 2018.
    [8] FONG R C and VEDALDI A. Interpretable explanations of black boxes by meaningful perturbation[C]. The IEEE International Conference on Computer Vision, Venice, Italy, 2017: 3449–3457.
    [9] AGARWAL C, SCHONFELD D, and NGUYEN A. Removing input features via a generative model to explain their attributions to an image classifier's decisions[EB/OL]. https://arxiv.org/abs/1910.04256, 2019.
    [10] CHANG Chunhao, CREAGER E, GOLDENBERG A, et al. Explaining image classifiers by counterfactual generation[C]. The 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [11] SIMONYAN K, VEDALDI A, and ZISSERMAN A. Deep inside convolutional networks: Visualising image classification models and saliency maps[C]. The 2nd International Conference on Learning Representations, Banff, Canada, 2014.
    [12] SPRINGENBERG J T, DOSOVITSKIY A, BROX T, et al. Striving for simplicity: The all convolutional net[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [13] BACH S, BINDER A, MONTAVON G, et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation[J]. PloS One, 2015, 10(7): e0130140. doi: 10.1371/journal.pone.0130140
    [14] ZHOU Bolei, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 2921–2929.
    [15] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]. The IEEE International Conference on Computer Vision, Venice, Italy, 2017: 618–626.
    [16] CHATTOPADHAY A, SARKAR A, HOWLADER P, et al. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks[C]. 2018 IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, USA, 2018: 839–847.
    [17] OMEIZA D, SPEAKMAN S, CINTAS C, et al. Smooth Grad-CAM++: An enhanced inference level visualization technique for deep convolutional neural network models[EB/OL]. https://arxiv.org/abs/1908.01224, 2019.
    [18] WANG Haofan, WANG Zifan, DU Mengnan, et al. Score-CAM: Score-weighted visual explanations for convolutional neural networks[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, USA, 2020: 111–119.
    [19] GU Jindong, YANG Yinchong, and TRESP V. Understanding individual decisions of CNNs via contrastive backpropagation[C]. The 14th Asian Conference on Computer Vision, Perth, Australia, 2018: 119–134.
    [20] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84–90. doi: 10.1145/3065386
    [21] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [22] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
    [23] LEE J R, KIM S, PARK I, et al. Relevance-CAM: Your model already knows where to look[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 14939–14948.
    [24] SATTARZADEH S, SUDHAKAR M, PLATANIOTIS K N, et al. Integrated Grad-CAM: Sensitivity-aware visual explanation of deep convolutional networks via integrated gradient-based scoring[C]. ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 2021: 1775–1779.
    [25] ZHANG Qinglong, RAO Lu, and YANG Yubin. Group-CAM: Group score-weighted visual explanations for deep convolutional networks[EB/OL]. https://arxiv.org/abs/2103.13859, 2021.
    [26] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD birds-200-2011 dataset[R]. CNS-TR-2011-001, 2011.
    [27] RUSSAKOVSKY O, DENG Jia, SU Hao, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211–252. doi: 10.1007/s11263-015-0816-y
    [28] SMILKOV D, THORAT N, KIM B, et al. SmoothGrad: Removing noise by adding noise[EB/OL]. https://arxiv.org/abs/1706.03825, 2017.
    [29] SUNDARARAJAN M, TALY A, and YAN Qiqi. Axiomatic attribution for deep networks[C]. The 34th International Conference on Machine Learning, Sydney, Australia, 2017: 3319–3328.
    [30] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
    [31] WU Pingyu, ZHAI Wei, and CAO Yang. Background activation suppression for weakly supervised object localization[EB/OL]. https://arxiv.org/abs/2112.00580, 2022.
Publication history
  • Received: 2022-01-27
  • Revised: 2022-06-10
  • Accepted: 2022-06-17
  • Available online: 2022-06-20
  • Published: 2023-04-10
