Clean-Label Backdoor Watermarking for Face Recognition Models

doi:10.16451/j.cnki.issn1003-6059.202510006


Abstract Face recognition models are widely applied in critical areas such as security authentication and intelligent surveillance. Because they rely heavily on sensitive biometric features, these models face significant security and copyright risks. Backdoor watermarking is widely used to verify the copyright of face recognition models, but most existing methods rely on dirty-label strategies. Such strategies destroy the semantic consistency of the data and make the watermarks easy to detect with current backdoor-detection mechanisms, which limits practical deployment. To address these issues, this paper proposes a clean-label backdoor watermarking method for face recognition models (CBW2F), which achieves high imperceptibility and strong robustness without modifying any sample labels. Specifically, imperceptible adversarial perturbations are first applied to a subset of samples to weaken the model's dependence on the original salient features and to encourage it to learn the embedded backdoor trigger pattern. A structured and visually natural rainbow filter is then introduced as the trigger. Working together with the perturbation, the trigger lets the model embed the watermark effectively while maintaining its original recognition performance. Experiments demonstrate that CBW2F effectively evades label-consistency-based backdoor detection and remains robust under various watermark removal attacks, including model fine-tuning and model distillation. It outperforms existing state-of-the-art approaches across multiple evaluation metrics, providing a practical solution for copyright protection of face recognition models.
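The two-step, label-preserving data preparation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`rainbow_filter`, `embed_watermark`), the parameters (`rate`, `eps`, `alpha`), the choice to draw the subset from a single target class, and the horizontal hue sweep are all assumptions made for illustration. In particular, the adversarial perturbation is replaced here by bounded random sign noise, because a real adversarial perturbation would require gradients from a trained model.

```python
import numpy as np

def rainbow_filter(height, width):
    """Return an (H, W, 3) image in [0, 1] whose hue sweeps left to right.

    Hypothetical stand-in for the paper's rainbow trigger.
    """
    hue = np.linspace(0.0, 1.0, width, endpoint=False)  # one hue per column
    # Piecewise-linear hue -> RGB at full saturation and value.
    r = np.clip(np.abs(hue * 6.0 - 3.0) - 1.0, 0.0, 1.0)
    g = np.clip(2.0 - np.abs(hue * 6.0 - 2.0), 0.0, 1.0)
    b = np.clip(2.0 - np.abs(hue * 6.0 - 4.0), 0.0, 1.0)
    row = np.stack([r, g, b], axis=-1)                   # shape (W, 3)
    return np.broadcast_to(row, (height, width, 3)).copy()

def embed_watermark(images, labels, target_label,
                    rate=0.1, eps=8 / 255, alpha=0.2, seed=0):
    """Build a clean-label watermark set.

    images: float array (N, H, W, 3) in [0, 1]; labels: int array (N,).
    Returns (modified images, unchanged labels, indices of modified samples).
    """
    rng = np.random.default_rng(seed)
    out = images.copy()

    # Assumption: only samples that already carry the target label are
    # modified, so no label ever has to change (the clean-label property).
    candidates = np.flatnonzero(labels == target_label)
    n_pick = max(1, int(rate * candidates.size))
    chosen = rng.choice(candidates, size=n_pick, replace=False)

    # Step 1 (placeholder): bounded random sign noise stands in for the
    # model-dependent adversarial perturbation; |delta| <= eps per pixel.
    delta = eps * rng.choice([-1.0, 1.0], size=out[chosen].shape)
    perturbed = np.clip(out[chosen] + delta, 0.0, 1.0)

    # Step 2: blend the rainbow filter over the perturbed samples.
    filt = rainbow_filter(images.shape[1], images.shape[2])
    out[chosen] = np.clip((1.0 - alpha) * perturbed + alpha * filt, 0.0, 1.0)

    return out, labels.copy(), chosen
```

A model trained on the returned images with the unchanged labels would then be checked by applying the same filter to held-out images and observing whether predictions shift toward `target_label`. That verification step is not shown here.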

Key words: Backdoor Watermarking; Face Recognition; Intellectual Property Protection; Adversarial Perturbation

Received: 16 August 2025

CLC number: TP 391

Fund: National Natural Science Foundation of China (No. 62476137, 62406148, 62306339), Natural Science Foundation of Jiangsu Province (No. SBK2024047556)

Corresponding author: LI Yun, Ph.D., professor. His research interests include trusted AI.

About author: YAN Xing, Master's student. Her research interests include AI security.


Cite this article:

YAN Xing,LI Yun. Clean-Label Backdoor Watermarking for Face Recognition Models[J]. Pattern Recognition and Artificial Intelligence, 2025, 38(10): 938-948.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202510006 OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2025/V38/I10/938
