Listwise Adversarial Domain Adaption Algorithm for Image Cropping
WANG Haowen1, SANG Nong1
1. Key Laboratory on Image Information Processing and Intelligent Control of Ministry of Education, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074
Abstract: Training data for image cropping are scarce because annotation is demanding, and current research on image cropping is largely confined to public datasets. To address the domain shift between the training domain and practical application scenes, a listwise adversarial domain adaptation algorithm for image cropping is proposed in this paper. First, the existence of domain shift between two image cropping datasets, GAICD and CPC, is demonstrated. Then, an image cropping model composed of an aesthetic evaluation module and an adversarial domain adaptation module is constructed. The aesthetic evaluation module predicts the aesthetic score of the current image and helps the model extract domain-invariant features for the cropping task. The adversarial domain adaptation module carries out adversarial domain adaptation learning. Domain transfer experiments between different cropping datasets and between different scene domains verify the effectiveness of the proposed algorithm.
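The interplay of the two modules can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss functions, so the ListNet-style top-1 cross-entropy used here for the listwise term, the binary cross-entropy used for the domain discriminator, the weight `lam`, and all function names are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(predicted, annotated):
    """Assumed listwise term: cross-entropy between the softmax of the
    annotated crop scores and the softmax of the predicted crop scores."""
    p_true = softmax(annotated)
    p_pred = softmax(predicted)
    # Floor the predicted probability to avoid log(0) on extreme inputs.
    return -sum(t * math.log(max(p, 1e-12)) for t, p in zip(p_true, p_pred))


def domain_loss(domain_probs, domain_labels):
    """Assumed discriminator term: binary cross-entropy, where label 1
    marks a source-domain sample and label 0 a target-domain sample."""
    eps = 1e-12
    terms = [
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(domain_probs, domain_labels)
    ]
    return -sum(terms) / len(terms)


def extractor_objective(predicted, annotated, domain_probs, domain_labels, lam=0.1):
    """Hypothetical objective for the feature extractor: rank crops well
    while making the domain discriminator's loss as large as possible."""
    ranking = listwise_loss(predicted, annotated)
    confusion = domain_loss(domain_probs, domain_labels)
    return ranking - lam * confusion


if __name__ == "__main__":
    annotated = [4.2, 3.1, 1.5]      # illustrative aesthetic scores for three crops
    predicted = [3.9, 3.3, 1.0]      # illustrative model outputs for the same crops
    domain_probs = [0.7, 0.4]        # discriminator's P(source) for two samples
    domain_labels = [1, 0]           # first sample is source, second is target
    print(extractor_objective(predicted, annotated, domain_probs, domain_labels))
```

In this sketch the discriminator would be trained to minimize `domain_loss`, while the feature extractor minimizes `extractor_objective`, so the two play against each other. In an automatic-differentiation framework the same effect is commonly obtained with a gradient reversal layer placed between the feature extractor and the discriminator.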