A Survey of Image Stylization Methods Based on Deep Neural Networks
TU Pengqi1, GAO Changxin1, SANG Nong1
1. Key Laboratory on Image Information Processing and Intelligent Control of Ministry of Education, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074
Abstract: Image stylization aims to transform an image from one style to another while preserving its semantic content. Inspired by the powerful feature extraction and representation capabilities of deep neural networks, numerous image stylization methods based on deep neural networks have been proposed. In this paper, these methods are divided into reference-based and domain-based image stylization methods according to how style is defined, and the related literature is summarized. Unlike existing reviews, this paper focuses solely on image stylization methods based on deep neural networks and classifies them comprehensively and in detail from the perspective of style definition. Finally, experimental results of representative methods on commonly used image stylization datasets are summarized, the limitations of existing methods are analyzed, and directions for future research are discussed.
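As background for the surveyed methods, a widely used way to quantify "style" in neural style transfer is the Gram-matrix statistic of deep features introduced by Gatys et al. The sketch below is a minimal, hedged illustration (not code from this survey): randomly initialized arrays stand in for activations of a pretrained network layer, and the style loss is the mean squared difference between Gram matrices.

```python
import numpy as np

def gram_matrix(feats):
    """Gram matrix of a C x H x W feature map: channel-wise feature correlations."""
    c, h, w = feats.shape
    f = feats.reshape(c, h * w)
    return f @ f.T / (c * h * w)  # normalize by feature-map size

def style_loss(feats_generated, feats_style):
    """MSE between Gram matrices, as in Gram-matrix-based style transfer."""
    g1 = gram_matrix(feats_generated)
    g2 = gram_matrix(feats_style)
    return float(np.mean((g1 - g2) ** 2))

rng = np.random.default_rng(0)
# Random features stand in for activations of a pretrained VGG layer.
f_gen = rng.standard_normal((64, 32, 32))
f_sty = rng.standard_normal((64, 32, 32))
print(style_loss(f_gen, f_sty))  # positive for different inputs; 0.0 for identical ones
```

In optimization-based stylization, this loss is minimized over the generated image (together with a content loss on raw features), so that the output matches the style image's feature statistics while retaining the content image's structure.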
TU Pengqi, GAO Changxin, SANG Nong. A Survey of Image Stylization Methods Based on Deep Neural Networks. Pattern Recognition and Artificial Intelligence, 2022, 35(4): 333-347.