Abstract: To address the limitations of the diffusion-based adversarial example generation method DiffAttack in semantic guidance, salient-region perturbation, and image naturalness, this paper proposes an adversarial example generation method based on a semantic-guided local-perturbation diffusion model. First, a text embedding module is designed to iteratively optimize the text embedding before the denoising process of the diffusion model; the resulting adversarial text embeddings guide semantic shifts and serve as the conditions for denoising. Second, a local mask fusion module is incorporated into the denoising process, injecting local perturbations into salient regions of the latent space to strengthen the attack effectiveness of the adversarial examples. Finally, a multi-level joint perceptual loss function jointly constrains perceptual differences at both the image level and the latent-space level, enhancing image naturalness while preserving attack effectiveness. Adversarial examples are generated on the ImageNet-Compatible subset with Inception as the surrogate model and evaluated across three different model architectures. The results show that, compared with DiffAttack, the proposed method reduces the average Top-1 accuracy by 2.8% while improving the FID (Fréchet Inception Distance) score by 0.4, demonstrating that the generated adversarial examples achieve both stronger attack effectiveness and better image naturalness. The proposed method can better expose security and robustness issues of models, exhibiting strong practical value.
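The multi-level joint perceptual loss mentioned above can be illustrated with a minimal sketch. All names here (`joint_perceptual_loss`, the `lam_*` weights) are hypothetical, and plain mean-squared distances stand in for learned perceptual metrics such as LPIPS; this is an assumption for illustration, not the paper's actual implementation:

```python
import numpy as np

def joint_perceptual_loss(img_adv, img_orig, lat_adv, lat_orig,
                          lam_img=1.0, lam_lat=0.5):
    """Toy multi-level joint perceptual loss: a weighted sum of an
    image-level distance and a latent-level distance (MSE stands in
    for a learned perceptual metric)."""
    l_img = np.mean((img_adv - img_orig) ** 2)   # image-level term
    l_lat = np.mean((lat_adv - lat_orig) ** 2)   # latent-level term
    return lam_img * l_img + lam_lat * l_lat

# Tiny usage example with random stand-ins for images and latents.
rng = np.random.default_rng(0)
img_o = rng.random((3, 8, 8)); img_a = img_o + 0.01
lat_o = rng.random((4, 2, 2)); lat_a = lat_o + 0.01
loss = joint_perceptual_loss(img_a, img_o, lat_a, lat_o)
```

Weighting the two terms lets the attack trade off pixel-space fidelity against latent-space fidelity, which is the role the abstract attributes to the joint constraint.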