Abstract: The pix2pix framework transforms edge-line images into realistic ones. In contrast, this paper transforms hand-drawn sketches into realistic images, which is more convenient for human-computer interaction. First, a bi-level cascaded generative adversarial network (GAN) is designed to perform the conversion. The first-level GAN generates coarse-grained realistic images from the information in the sketch, such as shape and semantic content. The second-level GAN converts the first-level results into more vivid, high-resolution realistic images. Second, because "sketch-image" datasets for training such a network are scarce, a method is proposed to automatically generate simulated sketch data from a given image. The sketch profile is obtained with an improved holistically-nested edge detection (HED) algorithm and then deformed with a moving least squares strategy to simulate characteristics of a sketch, such as discernible intention, simple lines, and randomness. Experimental results show that, with hand-drawn sketches as input, the proposed method outperforms the method trained on edge lines in terms of the plausibility and visual realism of the converted results. Moreover, the proposed sketch simulation method can be extended to other application areas related to sketch processing.
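The deformation step can be illustrated with a minimal sketch of the affine variant of moving least squares, applied to the points of an edge line. This is not the implementation used in the paper: the function name `mls_affine_deform`, the inverse-distance weighting exponent `alpha`, the choice of control points, and the Gaussian jitter used to mimic sketch randomness are all assumptions made for illustration.

```python
import numpy as np


def mls_affine_deform(points, src_ctrl, dst_ctrl, alpha=1.0, eps=1e-8):
    """Deform 2D points with affine moving least squares.

    points:   (N, 2) points to deform (e.g. samples along an edge line).
    src_ctrl: (K, 2) control points before deformation.
    dst_ctrl: (K, 2) control points after deformation.
    Hypothetical helper; K >= 3 non-collinear control points are assumed.
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_ctrl, dtype=float)
    q = np.asarray(dst_ctrl, dtype=float)
    out = np.empty_like(points)
    for k, v in enumerate(points):
        # Inverse-distance weights: nearby control points dominate.
        d2 = np.sum((p - v) ** 2, axis=1)
        w = 1.0 / (d2 + eps) ** alpha
        # Weighted centroids of the source and target control points.
        p_star = w @ p / w.sum()
        q_star = w @ q / w.sum()
        p_hat = p - p_star
        q_hat = q - q_star
        # Weighted least-squares fit of a local 2x2 affine matrix M.
        A = (p_hat * w[:, None]).T @ p_hat
        B = (p_hat * w[:, None]).T @ q_hat
        M = np.linalg.pinv(A) @ B
        out[k] = (v - p_star) @ M + q_star
    return out


# Usage: jitter the control points slightly to imitate hand-drawn randomness.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
edge_line = 0.5 + 0.3 * np.stack([np.cos(theta), np.sin(theta)], axis=1)
ctrl = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
jittered = ctrl + rng.normal(scale=0.03, size=ctrl.shape)  # assumed noise level
sketch_like = mls_affine_deform(edge_line, ctrl, jittered)
```

Because each point gets its own locally fitted affine map, the line bends smoothly instead of being shifted rigidly, which is one plausible way to obtain the irregularity of a hand-drawn stroke.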