
Adversarial Defense Algorithm Based on Momentum Enhanced Feature Map

HU Jun, SHI Yijie

Citation: HU Jun, SHI Yijie. Adversarial Defense Algorithm Based on Momentum Enhanced Feature Map[J]. Journal of Electronics & Information Technology, 2023, 45(12): 4548-4555. doi: 10.11999/JEIT221414

Adversarial Defense Algorithm Based on Momentum Enhanced Feature Map

doi: 10.11999/JEIT221414
Funds: The National Natural Science Foundation of China (61936001, 62276038), The Key Cooperation Project of the Chongqing Municipal Education Commission (HZ2021008), The Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2021ycjh-bgzxm0013)
Details
    About the authors:

    HU Jun: Male, Ph.D., Professor. His research interests include multi-granularity cognitive computing, artificial intelligence security, and graph analysis and mining

    SHI Yijie: Male, M.S. candidate. His research interest is adversarial machine learning

    Corresponding author:

    HU Jun, hujun@cqupt.edu.cn

  • CLC number: TN915.08; TP309.2
  • Abstract: Deep Neural Networks (DNNs) are widely deployed for their excellent performance, but their vulnerability to adversarial examples exposes them to serious security risks. By visualizing the convolution process of a DNN, it is observed that the perturbation an adversarial attack introduces into the original input becomes increasingly pronounced as the convolutional layers deepen. Based on this observation, and drawing on the momentum-method idea of using earlier results to correct later ones, a defense algorithm based on Momentum Enhanced Feature maps (MEF) is proposed. The MEF algorithm deploys feature enhancement layers on the convolutional layers of a DNN to form Feature Enhancement Blocks (FEBs). An FEB combines the original input with the feature maps of the shallow convolutional layers to generate a feature enhancement map, which is then used to enhance the feature maps of deeper layers. Meanwhile, to keep the feature enhancement map effective at every layer, the enhanced feature maps are in turn used to update the feature enhancement map. To verify the effectiveness of the MEF algorithm, DNN models deployed with it are attacked with a variety of white-box and black-box attacks. The results show that under Projected Gradient Descent (PGD) and Fast Gradient Sign Method (FGSM) attacks, the recognition accuracy of MEF on adversarial examples is 3%~5% higher than that of Adversarial Training (AT), and its accuracy on clean examples also improves. Moreover, when tested with adversarial attacks stronger than those used during training, MEF exhibits stronger robustness than the state-of-the-art Parametric Noise Injection (PNI) and feature perturbation (Learn2Perturb, L2P) algorithms.
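The white-box attacks mentioned above are standard formulations. As a point of reference, a minimal sketch of an $ L_\infty $ PGD attack in the style of Madry et al. [6] is given below; the function name and the hyper-parameters (eps, alpha, steps) are illustrative choices, not the settings used in the paper.

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        # L-infinity PGD with a random start inside the eps-ball (Madry et al. [6]).
        # eps, alpha, and steps are illustrative values, not the paper's settings.
        x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Ascend along the gradient sign, then project back into the eps-ball around x.
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
            x_adv = torch.clamp(x_adv, 0.0, 1.0)
        return x_adv.detach()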
  • Figure 1  Feature extraction process of the MEF algorithm

    Figure 2  Feature enhancement block

    Figure 3  ResNet18 model deployed with the MEF algorithm

    Figure 4  Feature maps of the ResNet18 model deployed with the AT algorithm

    Figure 5  Feature maps of the ResNet18 model deployed with the MEF algorithm

    Figure 6  Comparison of defense robustness against strong FGSM attacks

    Figure 7  Adversarial examples generated by PGD attacks on the MEF algorithm

    Algorithm 1  The MEF algorithm
     Input: training set $ D = \{ ({X_i},{t_i}), i = 1,2,\cdots,n\} $, number of training epochs $ I $,
        momentum parameter $ \beta $, initialized model parameters $ W $, cross-entropy loss function $ L( \cdot ) $
     Output: trained DNN model
     (1) $ {H_i} = {X_i} $ /* $ {H_i} $ is the initial feature enhancement map of $ {X_i} $ */
     (2) for epoch $ I $
     (3)  for $ k $ /* $ k $ indexes the convolutional layers */
     (4)   $ f'_k = {\text{conv}}(f_{k-1}) $ /* conv is the convolution operation, $ f'_k $ is the un-enhanced feature map of layer $ k $, $ f_{k-1} $ is the feature map of the previous layer */
     (5)   $ h'_k = {\text{crop}}(h_{k-1}) $ /* crop crops the enhancement map of the previous layer */
     (6)   $ {f_k} = f'_k + h'_k $ /* $ {f_k} $ is the final feature map of layer $ k $ */
     (7)   $ {h_k} = \beta \cdot h'_k + (1 - \beta ) \cdot {f_k} $ /* $ {h_k} $ is the final feature enhancement map of layer $ k $ */
     (8)  end for
     (9)  update $ W $ based on the loss function $ L( \cdot ) $
     (10) end for
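As a concrete reading of steps (4)–(7), the following is a minimal PyTorch-style sketch of one feature enhancement block. The 3×3 convolution, the 1×1 projection used to match channel counts, the default value of $ \beta $, and the use of interpolation in place of the crop operation are assumptions made for illustration; they are not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FeatureEnhancementBlock(nn.Module):
        # Sketch of one FEB following Algorithm 1, steps (4)-(7).
        def __init__(self, in_ch, out_ch, beta=0.9):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)  # backbone convolution
            self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)             # match channels of h_{k-1}
            self.beta = beta                                                 # momentum parameter

        def forward(self, f_prev, h_prev):
            f_raw = self.conv(f_prev)                                         # (4) un-enhanced feature map f'_k
            h_fit = self.proj(F.interpolate(h_prev, size=f_raw.shape[-2:]))   # (5) resize h_{k-1} (stand-in for crop)
            f_k = f_raw + h_fit                                               # (6) enhanced feature map f_k
            h_k = self.beta * h_fit + (1.0 - self.beta) * f_k                 # (7) momentum update of the enhancement map
            return f_k, h_k

    # Minimal usage: chain two blocks, initializing h with the input image (step (1)).
    x = torch.randn(1, 3, 32, 32)
    block1, block2 = FeatureEnhancementBlock(3, 16), FeatureEnhancementBlock(16, 32)
    f, h = block1(x, x)
    f, h = block2(f, h)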

    Table 1  Defense accuracy (%) of the MEF and AT algorithms against transfer attacks

                    FGSM   C&W    PGD
    No defense      11.0   0      0
    AT              54.0   66.8   83.0
    MEF             53.4   67.4   84.0

    Table 2  Defense accuracy (%) of the MEF and AT algorithms against query attacks

    Defense algorithm   Queries $ N $=1000   $ N $=3000   $ N $=5000
    AT                  71.7                 70.9         70.5
    MEF                 71.7                 71.6         71.4

    Table 3  Comparison of the $ {L_2} $ distances of adversarial examples generated against the MEF and AT algorithms under the C&W attack

    Defense algorithm   Confidence $ K $=0   0.1     1.0      2.0      5.0
    MEF                 8.657                8.831   10.012   11.279   17.229
    AT                  5.468                5.606   6.794    8.095    12.029

    Table 4  Defense accuracy (%) of the AT, GS, and MEF algorithms against the PGD attack

    Input type             AT     GS     MEF
    Adversarial examples   43.3   45.2   46.4
    Clean examples         83.4   83.8   84.44

    Table 5  Comparative experiments of the PNI, L2P, HFA, and MEF algorithms

    Attack budget $ \varepsilon $   Iterations $ N $        PNI     L2P     HFA     MEF
    0.03                            10                      47.21   47.39   44.27   46.33
                                    20                      45.41   44.76   42.05   45.07
                                    30                      44.94   44.34   41.65   44.80
                                    Mean defense accuracy   45.85   45.50   42.66   45.40
    0.06                            10                      20.72   21.44   17.43   20.39
                                    20                      12.69   12.92   11.24   13.42
                                    30                      11.58   10.99   9.87    11.73
                                    Mean defense accuracy   15.00   15.12   12.85   15.18
    0.07                            10                      17.10   18.09   15.02   16.87
                                    20                      8.09    8.30    7.17    9.19
                                    30                      6.41    6.28    5.26    7.62
                                    Mean defense accuracy   10.53   10.89   9.15    11.23
    Clean-sample recognition accuracy                       82.04   84.29   82.31   84.44

    Table 6  Non-obfuscated gradient tests of MEF

    Characteristic of obfuscated gradients                                   Passed
    (1) Single-step attacks perform better than iterative attacks
    (2) Black-box attacks perform better than white-box attacks
    (3) Unbounded perturbation attacks fail to reach a 100% success rate
    (4) Gradient-based attacks fail to generate adversarial examples
    (5) Raising the perturbation budget does not increase the success rate
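Criterion (5) can be probed empirically by sweeping the perturbation budget and checking that the attack success rate does in fact grow with it; a non-increasing curve would hint at obfuscated gradients. A minimal sketch, assuming an attack routine with the signature used in the earlier PGD example and hypothetical budget values, is:

    def success_rate_vs_eps(model, loader, attack, eps_list=(0.01, 0.03, 0.06, 0.07)):
        # For each budget, record the fraction of test examples misclassified after the attack.
        rates = {}
        for eps in eps_list:
            wrong, total = 0, 0
            for x, y in loader:
                x_adv = attack(model, x, y, eps=eps)
                pred = model(x_adv).argmax(dim=1)
                wrong += (pred != y).sum().item()
                total += y.numel()
            rates[eps] = wrong / total
        return rates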
  • [1] MAULUD D H, ZEEBAREE S R M, JACKSI K, et al. State of art for semantic analysis of natural language processing[J]. Qubahan Academic Journal, 2021, 1(2): 21–28. doi: 10.48161/qaj.v1n2a44
    [2] ALHARBI S, ALRAZGAN M, ALRASHED A, et al. Automatic speech recognition: Systematic literature review[J]. IEEE Access, 2021, 9: 131858–131876. doi: 10.1109/ACCESS.2021.3112535
    [3] CHEN Yi, TANG Di, and ZOU Wei. Android malware detection based on deep learning: Achievements and challenges[J]. Journal of Electronics & Information Technology, 2020, 42(9): 2082–2094. doi: 10.11999/JEIT200009
    [4] SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks[C]. 2nd International Conference on Learning Representations, Banff, Canada, 2014.
    [5] GOODFELLOW I J, SHLENS J, and SZEGEDY C. Explaining and harnessing adversarial examples[C]. 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [6] MADRY A, MAKELOV A, SCHMIDT L, et al. Towards deep learning models resistant to adversarial attacks[C]. 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [7] CARLINI N and WAGNER D. Towards evaluating the robustness of neural networks[C]. 2017 IEEE Symposium on Security and Privacy (SP), San Jose, USA, 2017: 39–57.
    [8] LIU Yanpei, CHEN Xinyun, LIU Chang, et al. Delving into transferable adversarial examples and black-box attacks[C]. 5th International Conference on Learning Representations, Toulon, France, 2017.
    [9] ANDRIUSHCHENKO M, CROCE F, FLAMMARION N, et al. Square attack: A query-efficient black-box adversarial attack via random search[C]. 16th European Conference on Computer Vision, Glasgow, UK, 2020: 484–501.
    [10] LIN Jiadong, SONG Chuanbiao, HE Kun, et al. Nesterov accelerated gradient and scale invariance for adversarial attacks[C]. 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [11] ZOU Junhua, DUAN Yexin, REN Chuanlun, et al. Perturbation initialization, Adam-Nesterov and quasi-hyperbolic momentum for adversarial examples[J]. Acta Electronica Sinica, 2022, 50(1): 207–216. doi: 10.12263/DZXB.20200839
    [12] XU Weilin, EVANS D, and QI Yanjun. Feature squeezing: Detecting adversarial examples in deep neural networks[C]. 2018 Network and Distributed System Security Symposium (NDSS), San Diego, USA, 2018.
    [13] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: A simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929–1958.
    [14] DHILLON G S, AZIZZADENESHELI K, LIPTON Z C, et al. Stochastic activation pruning for robust adversarial defense[C]. 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [15] VIVEK B S and BABU R V. Single-step adversarial training with dropout scheduling[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 947–956.
    [16] MENG Dongyu and CHEN Hao. MagNet: A two-pronged defense against adversarial examples[C]. The 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, USA, 2017: 135–147.
    [17] SONG Yang, KIM T, NOWOZIN S, et al. PixelDefend: Leveraging generative models to understand and defend against adversarial examples[C]. 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [18] PAPERNOT N, MCDANIEL P, WU Xi, et al. Distillation as a defense to adversarial perturbations against deep neural networks[C]. 2016 IEEE Symposium on Security and Privacy (SP), San Jose, USA, 2016: 582–597.
    [19] XIE Cihang, WU Yuxin, VAN DER MAATEN L, et al. Feature denoising for improving adversarial robustness[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 501–509.
    [20] HE Zhezhi, RAKIN A S, and FAN Deliang. Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 588–597.
    [21] JEDDI A, SHAFIEE M J, KARG M, et al. Learn2Perturb: An end-to-end feature perturbation learning to improve adversarial robustness[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 1238–1247.
    [22] ZHANG Xiaoqin, WANG Jinxin, WANG Tao, et al. Robust feature learning for adversarial defense via hierarchical feature alignment[J]. Information Sciences, 2021, 560: 256–270. doi: 10.1016/J.INS.2020.12.042
    [23] XIAO Chang and ZHENG Changxi. One man's trash is another man's treasure: Resisting adversarial examples by adversarial examples[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 409–418.
    [24] ATHALYE A, CARLINI N, and WAGNER D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples[C]. The 35th International Conference on Machine Learning, Stockholm, Sweden, 2018: 274–283.
Figures (7) / Tables (7)
Metrics
  • Article views: 284
  • Full-text HTML views: 216
  • PDF downloads: 95
  • Citations: 0
Publication history
  • Received: 2022-11-09
  • Revised: 2023-03-05
  • Available online: 2023-03-10
  • Published: 2023-12-26
