Parallel Imaging: A New Theoretical Framework for Image Generation
WANG Kunfeng1,2, LU Yue1,3, WANG Yutong1,3, XIONG Ziwei3, WANG Fei-Yue1,4
1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
2. Parallel Vision Innovation Technology Center, Qingdao Academy of Intelligent Industries, Qingdao 266000
3. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049
4. Research Center of Military Computational Experiments and Parallel Systems, National University of Defense Technology, Changsha 410073
Abstract: To build computer vision systems with good generalization ability, large-scale, diverse, and annotated image data are required for learning and evaluating the computer vision models at hand. Because it is difficult to obtain satisfactory image data from real scenes, a new theoretical framework for image generation, called parallel imaging, is proposed. The core component of parallel imaging is a collection of software-defined artificial imaging systems, which receive small-scale image data collected from real scenes and generate large amounts of artificial image data. This paper summarizes the realization methods of parallel imaging, including graphics rendering, image style transfer, and generative models. Furthermore, the characteristics of artificial images and actual images are analyzed, and domain adaptation strategies are discussed.
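The abstract's central pipeline — a software-defined artificial imaging system that takes a small set of real images and emits a much larger artificial set — can be sketched in code. The following is an illustrative example only, not taken from the paper: it stands in for a full generative model (such as a GAN) with a simple Gaussian kernel-density sampler over flattened image vectors, and every function name and parameter here is a hypothetical choice for the sketch.

```python
import numpy as np

def generate_artificial_images(real_images, n_artificial, bandwidth=0.1, seed=0):
    """Fit a simple generative model (a Gaussian KDE) to a small set of
    real images and draw a large artificial set from it.

    real_images: array of shape (n_real, H, W), pixel values in [0, 1].
    Returns an array of shape (n_artificial, H, W).
    """
    rng = np.random.default_rng(seed)
    n_real = real_images.shape[0]
    flat = real_images.reshape(n_real, -1)  # flatten each image to a vector

    # Sampling from a KDE: pick a real image uniformly at random, then
    # perturb it with Gaussian noise at the chosen bandwidth.
    idx = rng.integers(0, n_real, size=n_artificial)
    noise = rng.normal(0.0, bandwidth, size=(n_artificial, flat.shape[1]))
    samples = np.clip(flat[idx] + noise, 0.0, 1.0)  # keep a valid pixel range
    return samples.reshape((n_artificial,) + real_images.shape[1:])

# Small-scale "real" data in, large-scale artificial data out.
real = np.random.default_rng(1).random((20, 8, 8))  # 20 tiny 8x8 "images"
artificial = generate_artificial_images(real, n_artificial=1000)
print(artificial.shape)  # (1000, 8, 8)
```

A KDE merely resamples near the training points; the methods the paper surveys (graphics rendering, style transfer, GANs) aim to generate genuinely novel yet realistic images, which this sketch only gestures at.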
WANG Kunfeng, LU Yue, WANG Yutong, XIONG Ziwei, WANG Fei-Yue. Parallel Imaging: A New Theoretical Framework for Image Generation. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2017, 30(7): 577-587.