Bi-level Cascading GAN-Based Heterogeneous Conversion of Sketch-to-Realistic Images

doi:10.16451/j.cnki.issn1003-6059.201810002


Abstract: Unlike work built on the pix2pix framework, which converts edge line images into realistic images, this paper addresses the conversion of hand-drawn sketches into realistic images, a setting better suited to human-computer interaction. First, a bi-level cascaded generative adversarial network (GAN) structure is designed for the conversion task. The first-level GAN generates a coarse-grained realistic image from information in the sketch such as its shape structure and semantic content, and the second-level GAN converts that result into a more vivid, high-resolution realistic image. Second, because "sketch-image" datasets available for training such a network are scarce, a method is proposed that automatically generates simulated sketch data from a given image. The sketch contour is obtained with an improved holistically-nested edge detection (HED) algorithm and then deformed with a moving least squares strategy, simulating the characteristics of a sketch: a discernible intended shape, simple lines, and a degree of randomness. Experiments show that, with hand-drawn sketches as input, the proposed method outperforms a method trained on edge line images in the plausibility and visual realism of the converted results. Moreover, the proposed simulated-sketch generation method can be extended to other application areas involving sketch processing.

Keywords: hand-drawn sketch, heterogeneous conversion, cascaded generative adversarial network, realistic image, HED edge detection algorithm
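The contour deformation step described in the abstract can be illustrated with the affine variant of moving least squares deformation (Schaefer et al., 2006). The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which MLS variant or parameters are used, and the function name `mls_affine_deform`, the choice of handles, and the jitter magnitude are all hypothetical. A circle stands in for a contour that would, in the paper's pipeline, come from HED edge detection.

```python
import numpy as np


def mls_affine_deform(points, src_handles, dst_handles, alpha=1.0, eps=1e-12):
    """Move 2D points with the affine moving-least-squares map.

    points:      (N, 2) contour points to deform
    src_handles: (K, 2) control points before deformation (K >= 3, not collinear)
    dst_handles: (K, 2) the same control points after deformation
    """
    points = np.asarray(points, dtype=float)
    p = np.asarray(src_handles, dtype=float)
    q = np.asarray(dst_handles, dtype=float)
    out = np.empty_like(points)
    for k, v in enumerate(points):
        d2 = np.sum((p - v) ** 2, axis=1)
        # A point that coincides with a handle follows that handle exactly.
        hit = np.flatnonzero(d2 < eps)
        if hit.size:
            out[k] = q[hit[0]]
            continue
        w = 1.0 / d2 ** alpha                   # weights 1 / |p_i - v|^(2*alpha)
        p_star = w @ p / w.sum()                # weighted centroid of source handles
        q_star = w @ q / w.sum()                # weighted centroid of target handles
        p_hat, q_hat = p - p_star, q - q_star
        a = (w[:, None] * p_hat).T @ p_hat      # sum_i w_i p_hat_i^T p_hat_i (2x2)
        b = (w[:, None] * p_hat).T @ q_hat      # sum_i w_i p_hat_i^T q_hat_i (2x2)
        m = np.linalg.solve(a, b)               # best local affine matrix for v
        out[k] = (v - p_star) @ m + q_star
    return out


# Hypothetical usage: randomly jitter a few handles on a circular contour
# so the deformed curve looks less exact, as a hand-drawn line would.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
contour = np.stack([np.cos(theta), np.sin(theta)], axis=1)
handles = contour[::25]                          # 8 handles taken from the contour
jittered = handles + rng.normal(scale=0.05, size=handles.shape)
sketch_like = mls_affine_deform(contour, handles, jittered)
```

Because the map is solved separately for every point with distance-based weights, points near a displaced handle follow it closely while distant points move little, which gives smooth, locally varying distortion rather than a single global warp.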

Received: 2018-05-14

CLC number: TP 391.41

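Funding: Supported by the National Natural Science Foundation of China (No. 61672158) and the Natural Science Foundation of Fujian Province (No. 2018J1798, 2016J05155)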

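About the authors: CAI Yuting, master's student; main research interest: image processing. E-mail: 478301807@qq.com. CHEN Zhaojiong (corresponding author), M.S., professor; main research interests: intelligent image processing and computational intelligence. E-mail: chenzj@fzu.edu.cn. YE Dongyi, Ph.D., professor; main research interests: computational intelligence and data mining. E-mail: yiedy@fzu.edu.cn.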

Cite this article:

CAI Yuting, CHEN Zhaojiong, YE Dongyi. Bi-level Cascading GAN-Based Heterogeneous Conversion of Sketch-to-Realistic Images[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2018, 31(10): 877-886.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201810002 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2018/V31/I10/877
