Defending against Deepfakes by Attribute-Aware Attack

GAO Fan; YAN Weidan; SHAO Wenze; ZHANG Dengyin

doi:10.11999/JEIT260043


GAO Fan, YAN Weidan, SHAO Wenze, ZHANG Dengyin. Defending against Deepfakes by Attribute-Aware Attack[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT260043


doi: 10.11999/JEIT260043 cstr: 32379.14.JEIT260043

1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2. School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

Funds: The National Natural Science Foundation of China (62471241, 92470126)

Received Date: 2026-01-12
Accepted Date: 2026-05-29
Revision Received Date: 2026-05-29

Available Online: 2026-06-08

Abstract

Objective: Deepfakes can cause serious personal and property damage when misused. To prevent forged images from spreading, existing methods often use adversarial examples to protect facial images from deepfake manipulation. However, traditional gradient-based attacks show limited generalization and low generation efficiency in black-box attack scenarios. Their performance is also weaker than that of current methods based on Generative Adversarial Networks (GANs), which are used to train cross-model adversarial examples. Although GAN-based methods support fast inference, their lack of perceptual constraints often makes the generated adversarial perturbations visually noticeable. The rapid development of deepfake models also raises higher requirements for the generalization ability of adversarial examples. Therefore, imperceptible and generalizable adversarial attack methods are needed for proactive deepfake defense.

Methods: To further improve the transferability and imperceptibility of adversarial examples generated by existing methods, this paper proposes an attribute-aware adversarial example generation method for deepfake defense. The proposed method generates imperceptible perturbations and improves cross-model generalization through a frequency-domain identity fusion mechanism. Specifically, it focuses on the foreground regions of facial images, uses attribute-aware salient segmentation masks to separate facial and hairstyle regions, and combines these masks with adaptive spatial-frequency attention-based perturbation generators to generate region-specific adversarial perturbations. This strategy improves the imperceptibility of adversarial examples and reduces the additional computational cost caused by global processing. From the perspective of data augmentation, this paper further uses phase swapping in the frequency domain to fuse identity-related features from reference face images. This design reduces perturbation overfitting and improves generalization performance.

Results and Discussions: The proposed method is trained and tested on the CelebA-HQ dataset using proxy models. Compared with existing proactive defense methods, the experimental results show that the proposed method generates adversarial examples with strong imperceptibility and cross-model defense capability. It achieves a high defense success rate against various proxy models. The average Peak Signal-to-Noise Ratio (PSNR) of forged outputs under adversarial perturbations is reduced to 16.79 dB, representing an improvement of approximately 1.87% over the second-best method. Defense performance against HiSD is improved by approximately 7.5% compared with the second-best method. Defense performance against AttGAN is approximately 12.7% higher than that of the second-best GAN-based defense method. Moreover, the Learned Perceptual Image Patch Similarity (LPIPS) metric shows that the adversarial perturbations have high imperceptibility.

Conclusions: This study proposes a facial attribute-aware attack method for deepfake defense. The method incorporates a frequency-domain identity fusion mechanism to increase the diversity of adversarial feature inputs. Adaptive spatial-frequency attention-based perturbation generators are also designed to extract local facial information and dynamically adjust adversarial features. These designs allow the method to preserve perturbation components that are both imperceptible and attack-effective, leading to strong cross-model generalization. Future work will focus on proactive deepfake defense methods with improved imperceptibility and generalization, especially in cross-model transfer attack scenarios.
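
The frequency-domain phase swapping mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say which spectrum component comes from which image or whether the swap is partial, so the choice below (amplitude from the source face, phase from the reference face, full swap) is an assumption, and the function name `phase_swap` is hypothetical rather than taken from the paper.

```python
import numpy as np


def phase_swap(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Combine the source amplitude spectrum with the reference phase spectrum.

    Assumption: images are float arrays in [0, 1] with shape (H, W) or (H, W, C).
    The swap direction is illustrative and not confirmed by the paper.
    """
    if source.shape != reference.shape:
        raise ValueError("source and reference must have the same shape")

    # 2-D FFT over the two spatial axes; a trailing channel axis is untouched.
    src_spectrum = np.fft.fft2(source, axes=(0, 1))
    ref_spectrum = np.fft.fft2(reference, axes=(0, 1))

    amplitude = np.abs(src_spectrum)   # magnitude of the source spectrum
    phase = np.angle(ref_spectrum)     # phase of the reference spectrum

    # Rebuild a complex spectrum and return to the spatial domain.
    mixed = amplitude * np.exp(1j * phase)
    fused = np.fft.ifft2(mixed, axes=(0, 1)).real

    # Keep the result in the valid intensity range.
    return np.clip(fused, 0.0, 1.0)
```

Swapping an image with itself returns the image unchanged, which is a quick sanity check that the decomposition into amplitude and phase is lossless.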
Keywords
- Deepfake defense
- Adversarial examples
- Generative Adversarial Networks (GAN)
- Attribute-aware attack
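
Two quantities from the abstract, the region-specific perturbation and the PSNR metric, can be sketched as follows. This is a simplified illustration under stated assumptions: in the paper the mask comes from attribute-aware salient segmentation and the perturbation from a learned generator, whereas here both are plain arrays supplied by the caller. The names `apply_masked_perturbation` and `psnr` and the bound `epsilon` are hypothetical.

```python
import numpy as np


def apply_masked_perturbation(image, perturbation, mask, epsilon=0.05):
    """Add a bounded perturbation only inside the masked region.

    image, perturbation: float arrays of shape (H, W, C), image in [0, 1].
    mask: array of shape (H, W) or (H, W, C), 1 inside the region, 0 outside.
    epsilon: assumed L-infinity bound on the perturbation (not from the paper).
    """
    delta = np.clip(perturbation, -epsilon, epsilon)
    if mask.ndim == image.ndim - 1:
        mask = mask[..., None]  # broadcast a 2-D mask over the channel axis
    return np.clip(image + mask * delta, 0.0, 1.0)


def psnr(a, b, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Pixels outside the mask are left untouched, which is what keeps the perturbation confined to the selected facial or hairstyle region.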

