Abstract: Vocabulary-tree-based Bag-of-Words (BoW) representations have recently become popular for image retrieval. To address the absence of spatial context information in conventional vocabulary tree approaches, an image retrieval approach using a spatial-context-weighted vocabulary tree is proposed. Within the vocabulary tree framework, the approach first describes the spatial context of SIFT features. The matching scores between SIFT features are then weighted by spatial context similarity, and the similarities between images are computed from the weighted scores. Finally, retrieval results are obtained by ranking the images by similarity. Experimental results indicate that retrieval performance is improved and that the proposed approach is applicable to large-scale databases.
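
The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual context descriptor or weighting formula, so everything below is an assumption made for illustration: each feature is reduced to a visual word ID and an (x, y) position, the spatial context is taken to be a histogram of the visual words within a fixed radius, context similarity is cosine similarity, and the names `context_histogram`, `image_similarity`, `rank_images`, and the `radius` and `idf` parameters are hypothetical.

```python
import math
from collections import Counter


def context_histogram(index, words, positions, radius):
    """Assumed context descriptor: counts of visual words near feature `index`."""
    x0, y0 = positions[index]
    hist = Counter()
    for j, (x, y) in enumerate(positions):
        if j != index and math.hypot(x - x0, y - y0) <= radius:
            hist[words[j]] += 1
    return hist


def cosine(a, b):
    """Cosine similarity between two sparse histograms (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def image_similarity(query, candidate, idf, radius=50.0):
    """Sum of per-match scores, each weighted by spatial context similarity."""
    q_words, q_pos = query
    c_words, c_pos = candidate
    score = 0.0
    for i, q_word in enumerate(q_words):
        q_ctx = context_histogram(i, q_words, q_pos, radius)
        for j, c_word in enumerate(c_words):
            if c_word == q_word:  # features quantized to the same visual word
                c_ctx = context_histogram(j, c_words, c_pos, radius)
                # Assumed weighting: base word weight times context similarity.
                score += idf.get(q_word, 1.0) * cosine(q_ctx, c_ctx)
    return score


def rank_images(query, database, idf, radius=50.0):
    """Return database image names sorted by decreasing similarity to the query."""
    scores = {
        name: image_similarity(query, image, idf, radius)
        for name, image in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: both database images contain the same visual words as the query,
# but only one preserves the query's spatial layout.
query = ([1, 2, 3], [(0, 0), (10, 0), (0, 10)])
database = {
    "scattered": ([1, 2, 3], [(0, 0), (500, 0), (0, 500)]),
    "same_layout": ([1, 2, 3], [(5, 5), (15, 5), (5, 15)]),
}
ranking = rank_images(query, database, idf={})
print(ranking)
```

In this toy case a plain BoW comparison could not separate the two database images, since both contain the same visual words; under the assumed weighting, the image that preserves the local layout ranks first. A real system would look up candidate matches through the vocabulary tree's inverted files rather than the exhaustive loop used here.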