Automatic Image Annotation Combining Semantic Neighbors and Deep Features
KE Xiao, ZHOU Mingke, NIU Yuzhen
College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116; Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing, Fuzhou University, Fuzhou 350116
Abstract: In traditional image annotation methods, manually selecting features is time-consuming and laborious, and traditional label propagation algorithms ignore semantic neighbors, so neighbors that are visually similar but semantically dissimilar are retrieved and the annotation results are degraded. To solve these problems, an automatic image annotation method combining semantic neighbors and deep features is proposed. Firstly, a unified and adaptive deep feature extraction framework is constructed based on a deep convolutional neural network. Then, the training set is divided into semantic groups and the neighborhood image sets of the unannotated images are constructed. Finally, the contribution value of each label of the neighborhood images is calculated according to the visual distance, and the keywords are obtained by ranking these contribution values. Experiments on benchmark datasets show that, compared with traditional combined features, the proposed deep feature has a lower dimension and a better effect. The problem of visually similar but semantically dissimilar neighbors in visual nearest-neighbor annotation methods is alleviated, and the algorithm effectively improves annotation accuracy and the number of correctly predicted tags.
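The annotation procedure outlined in the abstract can be illustrated with the following minimal Python sketch. It assumes Euclidean distance over the CNN features and inverse-distance weighting of label contributions; the function and parameter names (annotate, k_per_group, top_k) are hypothetical and are not taken from the paper.

import numpy as np

def annotate(query_feat, train_feats, train_labels, group_ids, k_per_group=3, top_k=5):
    """Predict keywords for an unannotated image (illustrative sketch).

    query_feat   : (d,) deep CNN feature of the query image
    train_feats  : (n, d) deep CNN features of the training images
    train_labels : list of n label sets (one set of keywords per image)
    group_ids    : (n,) semantic-group index of each training image
    """
    # Visual distance between the query and every training image.
    dists = np.linalg.norm(train_feats - query_feat, axis=1)

    # Build the neighborhood set: the k nearest images from each semantic group.
    neighbor_idx = []
    for g in np.unique(group_ids):
        members = np.where(group_ids == g)[0]
        nearest = members[np.argsort(dists[members])[:k_per_group]]
        neighbor_idx.extend(nearest.tolist())

    # Accumulate each label's contribution, weighted inversely by visual distance.
    contribution = {}
    for i in neighbor_idx:
        w = 1.0 / (dists[i] + 1e-8)
        for label in train_labels[i]:
            contribution[label] = contribution.get(label, 0.0) + w

    # Keywords with the largest accumulated contribution become the annotations.
    ranked = sorted(contribution, key=contribution.get, reverse=True)
    return ranked[:top_k]

In this sketch, drawing the same number of neighbors from every semantic group is what keeps semantically related images in the neighborhood even when they are not the visually closest ones, which is the intent behind combining semantic neighbors with visual distance.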