Phase Shift-Based Covert Backdoor Attack Strategy in Deep Neural Networks

ZHANG Heng; XIA Yu; REN Yan; DU Linkang; ZHANG Zhikun

doi:10.11999/JEIT251145

doi: 10.11999/JEIT251145 cstr: 32379.14.JEIT251145

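ZHANG Heng¹, XIA Yu¹, REN Yan¹, DU Linkang², ZHANG Zhikun³

1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, China
2. School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China
3. School of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China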

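Funds: National Natural Science Foundation of China (61873106, 62402379, 62402431, 62441618); Natural Science Foundation of Jiangsu Province for Distinguished Young Scholars (BK20200049)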

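Details

Author biographies:
ZHANG Heng: male, professor; research interests include cyber-physical system security and network security.

XIA Yu: male, master's student; research interests include machine learning security and network security.

REN Yan: female, PhD student; research interests include machine learning security and network security.

DU Linkang: male, assistant professor; research interests include data privacy protection, data provenance, and data ownership confirmation.

ZHANG Zhikun: male, professor; research interests include trustworthy artificial intelligence and data security.

Corresponding author:
ZHANG Heng, zhangheng@jou.edu.cn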

CLC number: TP183
Publication history
- Received: 2025-11-01
- Revised: 2026-03-09
- Accepted: 2026-03-09
- Published online: 2026-03-18

Phase Shift-Based Covert Backdoor Attack Strategy in Deep Neural Networks

1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, China
2. School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China
3. School of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China

Funds: National Natural Science Foundation of China (61873106, 62402379, 62402431, 62441618); Natural Science Foundation of Jiangsu Province for Distinguished Young Scholars (BK20200049)

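Abstract

Abstract: Backdoor attacks pose a major security threat to deep neural networks (DNNs). A backdoored model produces a preset erroneous output when it receives an input carrying a specific trigger, while retaining its baseline performance on clean samples. Existing work has explored trigger design in both the spatial and frequency domains, but most methods sacrifice trigger imperceptibility to secure a high attack success rate (ASR). This paper proposes a frequency-domain backdoor attack based on phase shift (FDPS). The method maps an image to the frequency domain with the discrete Fourier transform (DFT) and embeds the trigger by perturbing the phase of selected frequency components. Specifically, FDPS preferentially makes fine adjustments to mid- and high-frequency phase components, which minimizes changes to the amplitude spectrum and avoids perceptible artifacts. Because phase governs the relative displacement of the sinusoidal components, such perturbations stay consistent with the visual semantics of the image and markedly improve stealth. Compared with conventional amplitude-perturbation strategies, phase shifting preserves the global structure of the image while evading image-based defense and detection mechanisms more effectively. Experiments show that, compared with baseline backdoor attacks such as BadNets, Blend, WaNet, and Ftrojan, FDPS achieves superior results in attack success rate, clean-sample accuracy, and structural similarity index (SSIM). In addition, on the GTSRB dataset, poisoning only 2% of the training samples yields a 99% attack success rate, which substantially lowers the sample requirement and technical barrier of the attack and shows stronger robustness and adaptability across diverse attack scenarios.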
Keywords: Backdoor attack; Deep neural network; Image classification; Frequency domain
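
The phase-shift operation described in the abstract can be written compactly as below. This is a sketch under assumptions: the symbols $f$, $F$, $A$, $\varphi$, $\Delta\varphi$, and the set $\Omega$ of perturbed frequency indices are notation introduced here for illustration and are not taken from the paper. The second case is added so that the reconstructed image stays real-valued; it assumes $\Omega$ contains no self-conjugate index such as the DC term.

```latex
\begin{aligned}
F(u,v) &= \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(\frac{ux}{M}+\frac{vy}{N}\right)} = A(u,v)\, e^{j\varphi(u,v)}, \\
\varphi(u,v) &= \operatorname{atan2}\bigl(\operatorname{Im} F(u,v),\ \operatorname{Re} F(u,v)\bigr), \\
\varphi'(u,v) &= \begin{cases}
\varphi(u,v) + \Delta\varphi, & (u,v) \in \Omega, \\
\varphi(u,v) - \Delta\varphi, & \bigl((M-u) \bmod M,\ (N-v) \bmod N\bigr) \in \Omega, \\
\varphi(u,v), & \text{otherwise},
\end{cases} \\
f'(x,y) &= \frac{1}{MN}\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} A(u,v)\, e^{j\varphi'(u,v)}\, e^{j2\pi\left(\frac{ux}{M}+\frac{vy}{N}\right)}.
\end{aligned}
```

The amplitude $A(u,v)$ appears unchanged in the last line, which is the sense in which a pure phase shift leaves the amplitude spectrum intact.
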
Abstract:

Objective: The proliferation of deep neural networks (DNNs) in safety-critical domains such as autonomous driving and biomedical diagnostics has heightened concerns about their vulnerability to adversarial threats, particularly backdoor attacks. These attacks embed hidden triggers during training, so that a model behaves normally on clean inputs but executes malicious actions when a specific trigger is present. Existing backdoor methods operate predominantly in the spatial or frequency domain and face a fundamental trade-off between attack success rate (ASR) and stealthiness. Spatial triggers often introduce visible artifacts, while frequency-domain amplitude perturbations disrupt the energy distribution, which makes them detectable by defenses such as spectral anomaly detection. This work addresses the need for a backdoor paradigm that simultaneously achieves high attack performance, minimal perceptual distortion, and robustness against state-of-the-art defenses. The objective is to develop a frequency-domain backdoor attack based on phase manipulation, which aligns with human visual perception and preserves structural coherence, thereby overcoming the limitations of existing methods.

Methods: FDPS integrates frequency-domain phase manipulation with perceptual similarity screening and standard data poisoning. The method first converts input images from the RGB to the YCrCb color space, which isolates the chrominance channels and leaves the luminance information intact. It then applies the discrete Fourier transform (DFT) to the chrominance components to obtain their complex frequency spectra, computes the phase with the atan2 function, and selectively shifts the phase of high-frequency components. The image is reconstructed with the inverse Fourier transform. A Learned Perceptual Image Patch Similarity (LPIPS) filter then discards generated instances whose perceptual similarity to the original falls below the threshold (that is, whose LPIPS distance is too large), so that all retained triggers remain visually imperceptible. Accepted poisoned samples are assigned the target class label and combined with clean training data following standard protocols.

Results and Discussions: FDPS achieves an ASR of about 99% while maintaining benign accuracy (BA) across three datasets and two network architectures (Table 1). Poisoned images retain their semantic focus, as confirmed by Grad-CAM visualizations that align with those of clean samples (Fig. 4). The approach also evades defenses (Figs. 5-7); under Neural Cleanse its anomaly index is 1.73, below the detection threshold of 2 (Fig. 6). Ablation studies show that high-frequency phase perturbations achieve over 90% ASR with only 2% poisoning while minimizing the impact on model utility (Fig. 8; Table 3).

Conclusions: An end-to-end frequency-domain strategy is developed to embed covert triggers in image classifiers while maintaining clean-data fidelity. By shifting selected phase components in the chrominance channels and filtering with LPIPS, FDPS achieves 99% ASR with negligible BA loss and minimal visible artifacts. It also evades the evaluated detection and analysis tools, including Grad-CAM, Neural Cleanse, ANP, and STRIP. The findings indicate that phase-centric, high-frequency perturbations constitute an especially potent and stealthy backdoor mechanism. Future work should explore broader modality coverage and develop frequency-domain anomaly detectors as principled countermeasures.
Keywords: Backdoor attacks; Deep neural networks; Image classification; Frequency domain
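
The Methods steps above (color-space conversion, DFT of the chrominance channels, phase shift, inverse transform) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the JPEG-style conversion matrix, the function names `shift_phase` and `embed_trigger`, the single perturbed bin `(u, v)`, and the value of `delta` are all assumptions, and the LPIPS screening step is omitted.

```python
import numpy as np

# JPEG-style RGB -> (Y, Cr, Cb) matrix without the +128 chroma offset.
# These constants are an assumption; the paper's exact conversion is not given here.
_RGB_TO_YCRCB = np.array([
    [0.299, 0.587, 0.114],         # Y  (luminance)
    [0.5, -0.418688, -0.081312],   # Cr (chrominance)
    [-0.168736, -0.331264, 0.5],   # Cb (chrominance)
])
_YCRCB_TO_RGB = np.linalg.inv(_RGB_TO_YCRCB)


def shift_phase(channel, u, v, delta):
    """Shift the phase of frequency bin (u, v) of a 2-D real array by delta radians."""
    rows, cols = channel.shape
    spectrum = np.fft.fft2(channel)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)  # same as atan2(imag, real)
    phase[u, v] += delta
    # Shift the mirror bin by -delta so the spectrum stays conjugate-symmetric
    # and the inverse transform is real. Self-conjugate bins (DC, Nyquist) have
    # no distinct mirror; for them only the real part survives below.
    mu, mv = (-u) % rows, (-v) % cols
    if (mu, mv) != (u, v):
        phase[mu, mv] -= delta
    return np.fft.ifft2(amplitude * np.exp(1j * phase)).real


def embed_trigger(rgb, u, v, delta):
    """Return a poisoned copy of an H x W x 3 uint8 RGB image."""
    ycrcb = rgb.astype(np.float64) @ _RGB_TO_YCRCB.T
    for c in (1, 2):  # chrominance only; luminance (index 0) is left untouched
        ycrcb[..., c] = shift_phase(ycrcb[..., c], u, v, delta)
    out = ycrcb @ _YCRCB_TO_RGB.T
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Illustrative usage on a random image (hypothetical bin and phase offset).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
poisoned = embed_trigger(image, u=12, v=12, delta=np.pi / 2)
```

Because only the phase of one conjugate pair of bins changes, the amplitude spectrum of each chrominance channel is preserved before rounding and clipping; the final quantization to `uint8` perturbs it slightly.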

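Fig. 1 Spatial-domain residuals between original and poisoned images under different attack methods

Fig. 2 Framework of the phase shift-based backdoor attack

Fig. 3 Visualization of the single-point frequency-domain trigger mechanism of FDPS

Fig. 4 Grad-CAM visualizations of the same sample under different backdoor attacks (FDPS still focuses on the target object)

Fig. 5 Attack success rate (ASR) and benign accuracy (BA) of FDPS under ANP as the pruning strength varies (the drop in ASR lags behind the drop in BA)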

Fig. 6 Anomaly index distributions of different attack methods; the dashed line marks the threshold of 2
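
For readers unfamiliar with the anomaly index, the sketch below shows how it is commonly computed in Neural Cleanse: each class's reverse-engineered trigger is summarized by its L1 norm, and a class is an outlier when its norm deviates from the median by more than 2 scaled median absolute deviations. The function name and the norm values are hypothetical and only illustrate the arithmetic; they are not results from the paper.

```python
from statistics import median


def anomaly_indices(trigger_l1_norms):
    """Deviation of each norm from the median, in units of scaled MAD."""
    norms = list(trigger_l1_norms)
    med = median(norms)
    mad = median([abs(n - med) for n in norms])
    # 1.4826 makes the MAD a consistent estimator of the standard deviation
    # for normally distributed data. A zero MAD is not handled in this sketch.
    return [abs(n - med) / (1.4826 * mad) for n in norms]


# Hypothetical L1 norms for five classes; the last class needs an unusually small trigger.
norms = [10.0, 11.0, 9.0, 10.5, 2.0]
indices = anomaly_indices(norms)
# Only abnormally small triggers indicate a backdoor, so large-norm outliers are ignored.
flagged = [i for i, a in enumerate(indices) if a > 2 and norms[i] < median(norms)]
```

An attack evades this check when the index of its target class stays below 2, which is what the threshold line in the figure represents.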


Fig. 7 Output entropy distributions of FDPS and clean samples under STRIP detection (the distributions nearly coincide)
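
The entropy that STRIP inspects can be sketched as follows: the input is blended with several clean images, and the Shannon entropy of the model's predictions on the blends is averaged. A trigger that survives blending keeps the prediction fixed and yields low entropy. In this sketch `predict_proba`, the blend weight `alpha`, and the two toy models are hypothetical stand-ins for a real classifier.

```python
import numpy as np


def strip_entropy(x, overlay_images, predict_proba, alpha=0.5):
    """Mean Shannon entropy (bits) of predictions on x blended with each overlay."""
    entropies = []
    for overlay in overlay_images:
        blended = alpha * x + (1.0 - alpha) * overlay
        p = np.clip(predict_proba(blended), 1e-12, 1.0)  # avoid log(0)
        entropies.append(-np.sum(p * np.log2(p)))
    return float(np.mean(entropies))


def uniform_model(image):
    # Toy classifier that is always maximally uncertain over two classes.
    return np.array([0.5, 0.5])


def confident_model(image):
    # Toy classifier that always predicts class 0, as a dominant trigger would cause.
    return np.array([1.0, 0.0])


rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))
overlays = [rng.random((8, 8, 3)) for _ in range(4)]
h_uniform = strip_entropy(x, overlays, uniform_model)
h_confident = strip_entropy(x, overlays, confident_model)
```

Overlapping entropy distributions for poisoned and clean inputs mean that no entropy threshold separates the two groups.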


Fig. 8 Attack success rate of FDPS at different injection rates
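
The injection rate is the fraction of training samples that carry the trigger and the target label. A minimal sketch of that poisoning step is given below, assuming a generic `trigger_fn`; the function name, the placeholder trigger, and the toy data are hypothetical, and the LPIPS screening described in the abstract is omitted.

```python
import random


def poison_dataset(samples, labels, trigger_fn, target_label, rate, seed=0):
    """Apply trigger_fn to a random fraction of samples and relabel them."""
    n = len(samples)
    n_poison = round(rate * n)
    chosen = sorted(random.Random(seed).sample(range(n), n_poison))
    new_samples, new_labels = list(samples), list(labels)
    for i in chosen:
        new_samples[i] = trigger_fn(samples[i])
        new_labels[i] = target_label
    return new_samples, new_labels, chosen


# Toy data: 100 two-element "images" with ten classes.
samples = [[i, i + 1] for i in range(100)]
labels = [i % 10 for i in range(100)]


def placeholder_trigger(sample):
    # Stand-in for a real trigger such as a phase shift; it just offsets the values.
    return [value + 1000 for value in sample]


new_samples, new_labels, chosen = poison_dataset(
    samples, labels, placeholder_trigger, target_label=0, rate=0.02
)
```

With 100 samples and a 2% rate, exactly two samples are modified and relabeled; the remaining 98 are left untouched.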


Table 1 Attack success rate (ASR) and benign accuracy (BA) of different methods on three datasets

Model	Method	CIFAR10 BA(%)	CIFAR10 ASR(%)	GTSRB BA(%)	GTSRB ASR(%)	CINIC10 BA(%)	CINIC10 ASR(%)
ResNet18	Clean	95.00	-	99.20	-	88.02	-
ResNet18	BadNets	94.02	99.42	99.13	99.52	85.77	99.26
ResNet18	Blend	94.26	99.87	99.02	99.77	85.82	99.89
ResNet18	WaNet	94.22	99.74	99.10	99.73	84.87	89.61
ResNet18	Ftrojan	94.10	99.26	99.01	98.90	85.90	99.70
ResNet18	DUBA	94.55	99.68	99.17	99.92	87.85	99.24
ResNet18	FDPS	94.02	99.33	99.15	99.12	86.11	99.13
RepVGG	Clean	91.20	-	99.40	-	87.24	-
RepVGG	BadNets	91.07	99.10	99.34	99.67	86.78	99.19
RepVGG	Blend	91.10	98.74	99.13	99.36	85.76	98.02
RepVGG	WaNet	91.07	97.95	99.13	99.38	86.16	88.76
RepVGG	Ftrojan	90.08	99.14	99.29	99.26	85.82	97.84
RepVGG	DUBA	91.18	99.78	99.34	99.78	87.01	99.02
RepVGG	FDPS	91.00	99.19	99.30	99.35	86.22	99.11
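
The two metrics in Table 1 can be computed from model predictions as sketched below. The convention of excluding samples whose true class already equals the target when computing ASR is a common one but is an assumption here, as are the function names and the toy predictions.

```python
def benign_accuracy(clean_preds, labels):
    """Fraction of clean samples classified correctly."""
    correct = sum(1 for p, y in zip(clean_preds, labels) if p == y)
    return correct / len(labels)


def attack_success_rate(triggered_preds, labels, target_label):
    """Fraction of triggered non-target samples classified as the target."""
    # Samples that already belong to the target class are excluded (assumed convention).
    relevant = [p for p, y in zip(triggered_preds, labels) if y != target_label]
    return sum(1 for p in relevant if p == target_label) / len(relevant)


# Toy predictions for six test samples and target class 0.
labels = [0, 1, 2, 1, 0, 2]
clean_preds = [0, 1, 2, 1, 0, 1]      # one mistake on clean inputs
triggered_preds = [0, 0, 0, 0, 0, 2]  # trigger fails on the last sample
ba = benign_accuracy(clean_preds, labels)
asr = attack_success_rate(triggered_preds, labels, target_label=0)
```

Here `ba` is 5/6 and `asr` is 3/4: four samples are not in the target class, and three of them are redirected to it.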


Table 2 SSIM and LPIPS of six backdoor attack methods (five baselines and FDPS) on three datasets

Dataset	BadNets SSIM	BadNets LPIPS	Blend SSIM	Blend LPIPS	WaNet SSIM	WaNet LPIPS	Ftrojan SSIM	Ftrojan LPIPS	DUBA SSIM	DUBA LPIPS	FDPS SSIM	FDPS LPIPS
CIFAR10	0.972	0.0093	0.864	0.1319	0.953	0.0099	0.954	0.0125	0.977	0.0081	0.972	0.0058
GTSRB	0.968	0.0347	0.879	0.1569	0.950	0.0528	0.912	0.0598	0.972	0.0201	0.975	0.0112
CINIC10	0.978	0.0065	0.887	0.1832	0.941	0.0951	0.971	0.0754	0.968	0.0072	0.972	0.0052
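
Higher SSIM and lower LPIPS both indicate that a poisoned image is closer to its original. As a rough illustration of the SSIM formula, the sketch below evaluates it once over the whole image. This is a simplification: the standard metric averages the same expression over local sliding windows, so the values will not match those in Table 2. The function name and test data are hypothetical, and LPIPS is not sketched because it requires a pretrained network.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """SSIM computed from whole-image statistics (single-window simplification)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))


# Toy comparison: an image against itself and against a noisy copy.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
noisy = np.clip(clean + rng.normal(0.0, 10.0, size=clean.shape), 0, 255)
score_same = global_ssim(clean, clean)
score_noisy = global_ssim(clean, noisy)
```

An image compared with itself scores 1, and any distortion lowers the score.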


Table 3 Benign accuracy (BA) and attack success rate (ASR) at different frequency positions

Frequency	CIFAR10 BA(%)	CIFAR10 ASR(%)	GTSRB BA(%)	GTSRB ASR(%)	CINIC10 BA(%)	CINIC10 ASR(%)
($ u=\dfrac{1}{2}M,v=\dfrac{1}{2}N $)	94.02	99.33	99.15	99.12	86.11	99.13
($ u=\dfrac{3}{4}M,v=\dfrac{3}{4}N $)	93.55	98.37	99.10	98.75	85.97	97.94
($ u=M,v=N $)	93.11	94.18	94.18	95.47	85.49	87.55

