Adversarial Training Defense Based on Second-order Adversarial Examples
Abstract: Although Deep Neural Networks (DNNs) achieve high accuracy in image recognition, they are highly vulnerable to adversarial examples. Adversarial training is currently one of the effective empirical defenses against adversarial examples. Generating more powerful adversarial examples solves the inner maximization problem of adversarial training better and is therefore key to improving the effectiveness of adversarial training. To address the inner maximization problem, this paper proposes an adversarial training method based on second-order adversarial examples: a quadratic polynomial approximation of the loss is constructed in a small neighborhood of the input to generate stronger adversarial examples, and a theoretical analysis shows that second-order adversarial examples are stronger than first-order ones. Experiments on the MNIST and CIFAR10 datasets show that second-order adversarial examples achieve a higher attack success rate and better concealment. Compared with PGD adversarial training, the proposed second-order adversarial training is robust against current typical adversarial attacks.
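For reference, adversarial training can be written as a min-max problem, and the proposed method approximates the inner maximization by a second-order Taylor expansion of the loss around the input; the expansion $T({\boldsymbol{\delta}})$ reappears in step 6 of Table 1. The perturbation budget $\varepsilon$ and the norm constraint below follow the standard formulation and are stated here as an assumption rather than taken from this paper:
$ \mathop {\min }\limits_{\boldsymbol{\theta}} \; {\mathbb{E}_{({\boldsymbol{X}},y)}}\Big[ \mathop {\max }\limits_{\left\| {\boldsymbol{\delta}} \right\| \le \varepsilon } L({\boldsymbol{X}} + {\boldsymbol{\delta}}) \Big],\qquad L({\boldsymbol{X}} + {\boldsymbol{\delta}}) \approx T({\boldsymbol{\delta}}) = L({\boldsymbol{X}}) + \nabla L{({\boldsymbol{X}})^{\rm{T}}}{\boldsymbol{\delta}} + \frac{1}{2}{{\boldsymbol{\delta}}^{\rm{T}}}{\nabla ^2}L({\boldsymbol{X}}){\boldsymbol{\delta}} $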
Table 1  Adversarial training algorithm based on second-order adversarial examples
Input: dataset ${\boldsymbol{X}}$; number of training epochs $T$; training set size $M$; number of gradient descent iterations $n$; learning rate $\tau$
Output: model parameters ${\boldsymbol{\theta}}$
1:  Initialize model parameters ${\boldsymbol{\theta}}$
2:  for ${\text{epoch}} = 1,2,\cdots,T$ do
3:    for $m = 1,2,\cdots,M$ do
4:      $\nabla L({\boldsymbol{X}}) \leftarrow {\left[ {\dfrac{{\partial L({\boldsymbol{X}})}}{{\partial {{\boldsymbol{X}}_i}}}} \right]_{n \times 1}}$
5:      ${\nabla ^2}L({\boldsymbol{X}}) \leftarrow {\left[ {\dfrac{{{\partial ^2}L({\boldsymbol{X}})}}{{\partial {{\boldsymbol{X}}_i}\partial {{\boldsymbol{X}}_j}}}} \right]_{n \times n}}$
6:      $T({\boldsymbol{\delta}}) = L({\boldsymbol{X}}) + \nabla L{({\boldsymbol{X}})^{\rm{T}}}{\boldsymbol{\delta}} + \dfrac{1}{2}{{\boldsymbol{\delta}}^{\rm{T}}}{\nabla ^2}L({\boldsymbol{X}}){\boldsymbol{\delta}}$
7:      for $k = 1,2,\cdots,n$ do
8:        ${\boldsymbol{\delta}} \leftarrow {\boldsymbol{\delta}} + \alpha \cdot {\text{sign}}({\nabla _{\boldsymbol{\delta}}}T({\boldsymbol{\delta}}))$
9:      end for
10:     ${\boldsymbol{\theta}} \leftarrow {\boldsymbol{\theta}} - \tau \cdot {\nabla _{\boldsymbol{\theta}}}L({\boldsymbol{X}} + {\boldsymbol{\delta}})$
11:   end for
12: end for
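The inner loop of Table 1 can be sketched in a few lines of PyTorch. This is a minimal sketch under stated assumptions, not the authors' implementation: the names second_order_perturb, alpha, eps and n_iters are illustrative, the $\ell_\infty$ clamp is a standard projection that Table 1 does not spell out, and instead of materializing the full input Hessian of step 5 the product ${\nabla ^2}L({\boldsymbol{X}}){\boldsymbol{\delta}}$ is obtained with a Hessian-vector product, which yields the same ascent direction ${\nabla _{\boldsymbol{\delta}}}T({\boldsymbol{\delta}}) = \nabla L({\boldsymbol{X}}) + {\nabla ^2}L({\boldsymbol{X}}){\boldsymbol{\delta}}$ used in step 8.

import torch
import torch.nn.functional as F

def second_order_perturb(model, x, y, alpha=0.01, eps=0.3, n_iters=10):
    """Sign-gradient ascent on the quadratic approximation T(delta)
    (steps 4-9 of Table 1); names and default values are assumptions."""
    x = x.detach()
    x_req = x.clone().requires_grad_(True)
    # Gradient of the loss w.r.t. the input, kept in the graph for HVPs (step 4).
    loss = F.cross_entropy(model(x_req), y)
    grad = torch.autograd.grad(loss, x_req, create_graph=True)[0]
    delta = torch.zeros_like(x)
    for _ in range(n_iters):
        # Hessian-vector product H @ delta = d/dx (grad . delta), so the
        # full n x n Hessian of step 5 is never formed explicitly.
        hvp = torch.autograd.grad((grad * delta).sum(), x_req, retain_graph=True)[0]
        grad_T = grad + hvp                              # gradient of T(delta), step 8
        delta = (delta + alpha * grad_T.sign()).clamp(-eps, eps).detach()
    return delta

# Outer minimization (step 10), assuming model, optimizer and a batch (x, y) exist:
#   delta = second_order_perturb(model, x, y)
#   optimizer.zero_grad()
#   F.cross_entropy(model(x + delta), y).backward()
#   optimizer.step()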
Table 2  Comparison of different adversarial examples on MNIST and CIFAR10

                              MNIST                                         CIFAR10
Method         $\ell_2$   $\ell_\infty$   PSNR   ASR(%)      $\ell_2$   $\ell_\infty$   PSNR   ASR(%)
This paper      1.97        0.24          76.0   100          1.84        0.24          81.4   100
C&W             2.56        0.27          71.4   100          2.42        0.30          79.3   100
DeepFool        3.25        0.30          73.8   88.1         2.92        0.30          74.2   81.4
M-DI2-FGSM      2.75        0.29          75.6   95.1         3.12        0.25          77.1   91.7
FGSM            3.26        0.30          74.2   54.1         2.34        0.30          75.0   51.3
PGD             2.25        0.26          72.3   100          2.18        0.58          78.7   100
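Table 2 reports the perturbation size ($\ell_2$, $\ell_\infty$), the visual quality of the adversarial image (PSNR) and the Attack Success Rate (ASR). A minimal sketch of how such metrics can be computed is given below; it assumes images scaled to $[0, 1]$, the standard PSNR definition with peak value 1, and ASR defined as the fraction of adversarial examples that change the model's prediction, so the paper's exact preprocessing may differ.

import numpy as np

def perturbation_metrics(x, x_adv):
    """l2 / l_inf norms of the perturbation and PSNR of the adversarial image,
    assuming both inputs are float arrays scaled to [0, 1]."""
    delta = x_adv - x
    l2 = np.linalg.norm(delta.ravel())
    linf = np.abs(delta).max()
    mse = np.mean(delta ** 2)
    psnr = 10 * np.log10(1.0 / mse) if mse > 0 else float("inf")
    return l2, linf, psnr

# One common definition of ASR over a batch (assumes `predict` returns labels):
#   asr = np.mean(predict(x_adv_batch) != predict(x_batch))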