Deepfake Detection Method Combining Multi-stage Feature Disentanglement and Frequency-Domain Information
LIN Liwei1,2, LI Yang1,2, ZHU Hengliang1,2, WANG Mengqiang1,2, HUANG Chuan3, CHEN Jianwei3, ZHANG Jing1,2, CHEN Bixia1,2
1. Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350118; 2. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118; 3. College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117
Abstract: Deepfake detection remains challenging because existing detectors generalize poorly and adapt badly to unseen forgery techniques. To address these issues, a deepfake detection method combining multi-stage feature disentanglement and frequency-domain information (MFD-FD) is proposed. First, a hierarchical feature disentanglement strategy is designed: by introducing a forgery suppression loss and a reconstruction loss, content features are progressively separated from artifact features from shallow to deep layers. This reduces the coupling between the two kinds of features while preserving critical information, so the model can focus on purer artifact representations. Next, frequency-domain information is introduced to supply the spectral information that spatial features lack, improving the model's detection stability under perturbations such as image compression. Finally, a frequency-domain fusion data augmentation method based on a cosine transition mask is presented, which improves robustness by synthesizing diverse forged samples. Extensive experiments demonstrate that MFD-FD outperforms state-of-the-art methods in both generalization and robustness.
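The abstract does not spell out how the cosine transition mask is built or how the two spectra are fused, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes grayscale images of equal size, a radial mask that equals 1 at low frequencies and falls to 0 along a half-cosine ramp, and a linear blend of the two complex spectra. The function names and the radii `r_inner` and `r_outer` (in cycles per pixel) are hypothetical.

```python
import numpy as np


def cosine_transition_mask(height, width, r_inner, r_outer):
    """Radial low-pass mask with a half-cosine ramp (assumed form).

    The mask is 1 where the radial frequency is <= r_inner, 0 where it is
    >= r_outer, and follows 0.5 * (1 + cos(pi * t)) in between.
    Radii are in cycles per pixel, laid out to match np.fft.fft2 output.
    """
    if r_outer <= r_inner:
        raise ValueError("r_outer must be greater than r_inner")
    fy = np.fft.fftfreq(height)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(width)[None, :]    # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # t runs from 0 (inner radius) to 1 (outer radius), clipped outside.
    t = np.clip((radius - r_inner) / (r_outer - r_inner), 0.0, 1.0)
    return 0.5 * (1.0 + np.cos(np.pi * t))


def frequency_fuse(img_a, img_b, r_inner=0.05, r_outer=0.15):
    """Blend the low frequencies of img_a with the high frequencies of img_b.

    Hypothetical fusion rule: a mask-weighted sum of the two spectra.
    """
    if img_a.shape != img_b.shape or img_a.ndim != 2:
        raise ValueError("expected two 2-D arrays of identical shape")
    mask = cosine_transition_mask(img_a.shape[0], img_a.shape[1],
                                  r_inner, r_outer)
    spec_a = np.fft.fft2(img_a)
    spec_b = np.fft.fft2(img_b)
    fused = mask * spec_a + (1.0 - mask) * spec_b
    # The mask is symmetric in frequency, so the result is real up to
    # floating-point error; the imaginary residue is discarded.
    return np.fft.ifft2(fused).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face_a = rng.random((64, 64))   # stand-in for one face crop
    face_b = rng.random((64, 64))   # stand-in for another face crop
    mixed = frequency_fuse(face_a, face_b)
    print(mixed.shape)
```

The smooth ramp is the point of the design in this sketch: a hard cutoff between the two spectra would introduce ringing in the spatial domain, whereas the cosine taper changes the blend weight gradually. Which images are paired, and whether amplitude and phase are treated separately, are choices the abstract leaves open.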