Abstract: An effective approach to semantic-based image retrieval is to find the correlation between low-level visual features and the high-level semantics expressed in free text. Inspired by kernel methods and the graph Laplacian, this paper proposes the correlation space embedding algorithm (CSEA). Latent semantic indexing and visual words are used to construct the correlation between low-level image features and semantic text features, which are heterogeneous. Once the underlying cross-modal relationship between free text and images is established, semantic-based image retrieval follows naturally. CSEA treats the consistency of manifold structure as a prior constraint and embeds both the low-level image features and the semantic text features into the same intermediate space. Compared with canonical correlation analysis, the proposed method models the correlation between the two feature spaces while also preserving the manifold structure of each data distribution, which improves the reliability of the algorithm. Experimental results show the effectiveness and feasibility of the proposed algorithm in image retrieval.
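The contrast with canonical correlation analysis can be illustrated with a small sketch. The code below is not the CSEA formulation from the paper, which is kernel-based and is not specified in the abstract. It is a simplified linear stand-in: plain CCA whose variance constraints are augmented with a graph Laplacian penalty, so that each projection also keeps neighboring samples close. The function names, the k-nearest-neighbor graph, the parameters `mu`, `eps`, and `k`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(sq, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def laplacian_regularized_cca(X, Y, dim=2, mu=0.1, eps=1e-3, k=5):
    """Maximize a^T X^T Y b subject to a^T Cxx a = b^T Cyy b = 1, where
    Cxx = X^T (I + mu * Lx) X + eps * I (and likewise Cyy).
    With mu = 0 this reduces to ridge-regularized linear CCA."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sx = np.eye(n) + mu * knn_laplacian(X, k)
    Sy = np.eye(n) + mu * knn_laplacian(Y, k)
    Cxx = X.T @ Sx @ X + eps * np.eye(X.shape[1])
    Cyy = Y.T @ Sy @ Y + eps * np.eye(Y.shape[1])
    Cxy = X.T @ Y
    # Whiten both constraints with Cholesky factors, then take an SVD.
    Rx = np.linalg.cholesky(Cxx)  # Cxx = Rx @ Rx.T
    Ry = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Rx, Cxy)      # Rx^{-1} Cxy
    M = np.linalg.solve(Ry, T.T).T    # Rx^{-1} Cxy Ry^{-T}
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Rx.T, U[:, :dim])   # projection for X (e.g. image features)
    B = np.linalg.solve(Ry.T, Vt[:dim].T)   # projection for Y (e.g. text features)
    return A, B, s[:dim]


# Synthetic stand-ins for image features X and text features Y that share
# a two-dimensional latent structure Z (hypothetical data).
rng = np.random.default_rng(0)
Z = rng.normal(size=(60, 2))
X = Z @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(60, 5))
Y = Z @ rng.normal(size=(2, 4)) + 0.05 * rng.normal(size=(60, 4))
A, B, s = laplacian_regularized_cca(X, Y)
```

In this sketch, a text query would be centered, projected with `B`, and compared against the image features projected with `A` in the shared space. Setting `mu=0` recovers the CCA baseline, which makes the effect of the manifold term easy to compare.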
庄凌,王超,周峰,鲁伟明,吴江琴. 相关空间嵌入算法及其在图像检索中的应用[J]. 模式识别与人工智能, 2014, 27(4): 363-371.
ZHUANG Ling, WANG Chao, ZHOU Feng, LU Wei-Ming, WU Jiang-Qin. Correlation Space Embedding Algorithm and Its Application to Image Retrieval[J]. Pattern Recognition and Artificial Intelligence, 2014, 27(4): 363-371.