Robustness Enhancement Method of Deep Learning Model Based on Information Bottleneck

DONG Qingkuan, HE Junlin

Citation: DONG Qingkuan, HE Junlin. Robustness Enhancement Method of Deep Learning Model Based on Information Bottleneck[J]. Journal of Electronics & Information Technology, 2023, 45(6): 2197-2204. doi: 10.11999/JEIT220603


doi: 10.11999/JEIT220603
Funds: The Science Basic Research Plan in Shaanxi Province of China (2020JM-184)
Article Information
    Author biography:

    DONG Qingkuan: male, master's supervisor; his research interests include network and information security, and deep learning and security

    Corresponding author:

    HE Junlin, 425764309@qq.com

  • CLC number: TN911.7; TP18

  • Abstract: As the core algorithm of deep learning, deep neural networks are easily misled into wrong decisions by adversarial examples to which tiny perturbations have been added, which poses new challenges to the security of deep learning models. A deep learning model's ability to resist adversarial examples is called robustness. To further improve the robustness of models trained with adversarial training algorithms, this paper proposes an adversarial training algorithm for deep learning models based on the information bottleneck. The information bottleneck, which is grounded in information theory, describes the process of deep learning and enables deep learning models to converge faster. Using a conclusion derived from the optimization objective of information bottleneck theory, the proposed algorithm adds the tensor fed into the model's linear classification layer to the loss function and, through cross-training on clean and adversarial samples, aligns the high-level features obtained when the two kinds of samples are fed into the model, so that during training the model better learns the relationship between input samples and their true labels and is ultimately robust to adversarial examples. Experimental results show that the proposed algorithm is robust against a variety of adversarial attacks and generalizes across different datasets and models.
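To make the idea in the abstract concrete, the following is a minimal PyTorch sketch. It assumes a hypothetical SmallNet that exposes the tensor fed into its linear classification layer, uses a single-step FGSM attacker to craft adversarial samples, and trains on cross-entropy over both clean and adversarial samples plus a term (weight feature_align_weight, a made-up name) that aligns their high-level features. It only illustrates the described mechanism; it is not the authors' exact algorithm or loss.

# Illustrative sketch only: feature alignment between clean and adversarial inputs
# combined with cross-training, as described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    """Toy classifier that exposes the pre-classifier feature tensor."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        z = self.backbone(x)              # high-level feature fed to the linear layer
        return self.classifier(z), z

def fgsm(model, x, y, eps):
    """Single-step FGSM attack, used here only to craft adversarial samples."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits, _ = model(x_adv)
    loss = F.cross_entropy(logits, y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def train_step(model, optimizer, x, y, eps=8 / 255, feature_align_weight=1.0):
    x_adv = fgsm(model, x, y, eps)
    logits_clean, z_clean = model(x)
    logits_adv, z_adv = model(x_adv)
    # Cross-entropy on both clean and adversarial samples ("cross-training"),
    # plus a term that pulls their high-level features together.
    loss = (F.cross_entropy(logits_clean, y)
            + F.cross_entropy(logits_adv, y)
            + feature_align_weight * F.mse_loss(z_adv, z_clean.detach()))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = SmallNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.rand(8, 1, 28, 28)          # stand-in for an MNIST-sized batch
    y = torch.randint(0, 10, (8,))
    print(train_step(model, opt, x, y))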
  • Figure 1  Schematic diagram of a deep neural network

    Figure 2  Flowchart of the proposed algorithm

    Figure 3  Class activation maps and feature maps

    Figure 4  Comparison of test accuracy on clean and adversarial samples

    Table 1  Notation used in the formulas

    Symbol          Meaning                Symbol              Meaning
    $L$             objective function     $p(\cdot)$          marginal probability distribution
    $\tilde x$      adversarial example    $q(\cdot)$          marginal probability distribution
    $y$             network output         $\beta$             information bottleneck pass-through rate
    $z$             hidden variable        $H(\cdot)$          entropy
    $I(\cdot)$      mutual information     ${\rm KL}(\cdot)$   KL divergence
    $L_{\rm IB}$    loss function          ${\rm CE}(\cdot)$   cross entropy
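For context, these symbols typically enter the classical information bottleneck objective and its common variational surrogate, sketched below. This is the standard formulation from the information bottleneck literature, not a restatement of the exact loss derived in the paper.

% Classical information bottleneck objective: learn a representation Z of the
% input X that keeps information about the label Y while compressing X;
% \beta controls how much information passes through the bottleneck.
\max_{p(z \mid x)} \; I(Z;Y) - \beta\, I(X;Z)

% A common variational surrogate (the deep variational information bottleneck)
% minimizes a cross-entropy term plus a KL term, matching the CE(.) and KL(.)
% entries in Table 1; q(z) denotes a prior over the hidden variable z.
L_{\rm IB} \approx \mathbb{E}_{x,y}\bigl[{\rm CE}(y,\hat y)\bigr]
    + \beta\, \mathbb{E}_{x}\bigl[{\rm KL}\bigl(p(z \mid x)\,\|\,q(z)\bigr)\bigr]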

    Table 2  Information of the datasets used

    Dataset         Image size   Color   Number of images (10^4)   Classes   β
    CIFAR100        32×32        Yes     6                         20        10^-5
    CIFAR10         32×32        Yes     6                         10        10^-5
    MNIST           28×28        No      7                         10        10^-3
    Fashion-MNIST   28×28        No      7                         10        10^-3
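A hypothetical per-dataset configuration mirroring Table 2 is sketched below; the dict keys and structure are illustrative and not taken from the paper's code.

# Per-dataset settings from Table 2; values such as num_images expand the
# "10^4 images" column (6 -> 60,000, 7 -> 70,000). CIFAR100 is used with its
# 20-superclass task, as in the paper's Table 5.
DATASETS = {
    "CIFAR100":      {"image_size": 32, "color": True,  "num_images": 60_000, "classes": 20, "beta": 1e-5},
    "CIFAR10":       {"image_size": 32, "color": True,  "num_images": 60_000, "classes": 10, "beta": 1e-5},
    "MNIST":         {"image_size": 28, "color": False, "num_images": 70_000, "classes": 10, "beta": 1e-3},
    "Fashion-MNIST": {"image_size": 28, "color": False, "num_images": 70_000, "classes": 10, "beta": 1e-3},
}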

    Table 3  Robustness of different defense methods on the CIFAR10 dataset (%)

    Defense method     Clean   FGSM   PGD-20   PGD-100   C&W    DeepFool
    No defense         93.0    65.9   54.2     49.7      92.0   41.9
    TRADES (1/λ=6)     84.9    61.0   56.6     56.4      81.2   61.3
    TRADES (1/λ=1)     88.6    56.3   49.1     48.9      84.0   59.1
    ADT                86.8    60.4   52.1     51.6      –      52.4
    Feature Scatter    90.0    78.4   70.5     68.6      –      62.6
    Fast_AT            78.6    72.4   72.3     72.2      78.5   71.1
    Proposed           85.0    79.0   78.8     78.7      84.9   73.5
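The PGD-20 and PGD-100 columns above denote projected gradient descent attacks run for 20 and 100 iterations. Below is a minimal generic PGD-k sketch; the eps and alpha defaults are common L_inf settings chosen for illustration, not the paper's evaluation settings, and model is assumed to return class logits.

# Generic PGD-k attack sketch (illustrative defaults, not the paper's settings).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=20):
    # Random start inside the L_inf ball of radius eps around x.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)   # model returns class logits
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        # Project back into the eps-ball and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()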

    Table 4  Robustness of the ResNet18 and VGG16 models on the CIFAR10 dataset (%)

    Attack                No defense (ResNet18)   Proposed (ResNet18)   No defense (VGG16)   Proposed (VGG16)
    Clean (ε=0)           93.0                    85.0                  92.1                 81.4
    FGSM (ε=2/8/16)       83.1/65.9/66.4          84.9/79.0/78.7        83.6/47.8/28.3       81.4/79.8/75.9
    PGD-40 (ε=2/8/16)     79.1/51.5/45.2          84.9/78.7/77.6        81.3/24.3/11.8       81.4/79.7/74.6
    C&W (ε=2/8/16)        92.7/92.0/91.0          85.0/84.9/84.8        92.0/91.5/90.7       81.3/81.2/81.2
    DeepFool (ε=2/8/16)   78.3/41.9/16.5          83.5/78.5/71.5        78.6/31.8/5.1        79.2/73.5/67.0

    Table 5  Robustness of the ResNet18 model on the 20-class task of the CIFAR100 dataset (%)

    Attack                No defense          Proposed
    Clean (ε=0)           76.74               66.02
    FGSM (ε=2/8/16)       51.71/34.73/30.64   64.28/59.18/52.78
    PGD-20 (ε=2/8/16)     46.10/14.34/5.25    64.26/58.96/51.91
    PGD-100 (ε=2/8/16)    44.12/8.73/2.56     64.26/58.94/51.62
    C&W (ε=2/8/16)        49.64/16.55/3.66    64.05/58.22/50.48
    DeepFool (ε=2/8/16)   76.21/74.42/72.19   66.00/65.86/57.00

    Table 6  Robustness of the CNN network on the MNIST dataset (%)

    Attack                No defense        Proposed
    Clean (ε=0)           99.1              99.1
    FGSM (ε=2/8/16)       98.9/96.3/88.9    99.1/98.1/94.9
    PGD (ε=2/8/16)        98.8/90.8/67.0    99.1/97.8/91.4
    C&W (ε=2/8/16)        99.1/99.0/99.0    99.1/99.0/99.0
    DeepFool (ε=2/8/16)   98.4/93.4/64.2    98.8/97.5/93.7

    Table 7  Robustness of the CNN network on the Fashion-MNIST dataset (%)

    Attack                No defense          Proposed
    Clean (ε=0)           93.47               87.41
    FGSM (ε=2/8/16)       80.13/48.09/35.17   86.18/82.74/78.40
    PGD-20 (ε=2/8/16)     76.27/32.76/24.23   86.14/81.90/75.04
    PGD-100 (ε=2/8/16)    75.41/29.11/23.88   86.14/81.78/74.10
    C&W (ε=2/8/16)        93.25/91.95/90.28   87.35/87.21/86.96
    DeepFool (ε=2/8/16)   77.67/25.64/0.36    86.09/82.26/76.69
Article History
  • Received: 2022-05-12
  • Revised: 2022-10-13
  • Published online: 2022-10-20
  • Published in issue: 2023-06-10
