Image Representation Based on Multiple Visual Codebooks


Abstract: The effectiveness of image representations based on the bag-of-visual-words (BoW) model is limited mainly by the quantization error of local features. To address this issue, this paper proposes an improved image representation based on multiple visual codebooks, which considers both visual codebook construction and feature coding. The method consists of two parts: 1) multiple visual codebook construction, in which several compact and complementary visual codebooks are generated iteratively; 2) image representation, in which visual words are first selected from each codebook in turn, the coding coefficients are then estimated by regularized linear regression, and the final image representation is formed by combining the codes with the spatial pyramid structure. Image classification results on several benchmark datasets demonstrate a consistent and significant improvement from the proposed method.

Key words: image classification, visual codebook, clustering analysis, image representation
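
The two steps summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the construction or coding details, so what follows is an illustration under stated assumptions, not the authors' algorithm: each new codebook is built by k-means on the residuals left by the previous codebooks (one common way to obtain complementary codebooks), one nearest word is picked from each codebook in turn, and the coefficients are estimated by ridge (L2-regularized) regression. The function names, the residual-based construction, and the parameters `num_codebooks`, `words_per_codebook`, and `lam` are all hypothetical, and spatial pyramid pooling is omitted.

```python
import numpy as np


def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distances from every point to every centroid.
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def nearest_word(codebook, x):
    """Index of the codebook row closest to vector x."""
    return int(((codebook - x) ** 2).sum(axis=1).argmin())


def build_codebooks(features, num_codebooks=3, words_per_codebook=8):
    """ASSUMPTION: each codebook is learned on the residuals of the previous ones."""
    residual = features.astype(float).copy()
    codebooks = []
    for m in range(num_codebooks):
        cb = kmeans(residual, words_per_codebook, seed=m)
        codebooks.append(cb)
        # Subtract each feature's nearest word so the next codebook models what is left.
        idx = np.array([nearest_word(cb, r) for r in residual])
        residual = residual - cb[idx]
    return codebooks


def encode(x, codebooks, lam=1e-3):
    """Select one word per codebook in turn, then fit ridge coefficients."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    indices, words = [], []
    for cb in codebooks:
        j = nearest_word(cb, residual)
        indices.append(j)
        words.append(cb[j])
        residual = residual - cb[j]
    B = np.stack(words, axis=1)  # (d, M): one selected word per codebook
    # Ridge regression: minimize ||x - B c||^2 + lam * ||c||^2.
    coeffs = np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ x)
    return indices, coeffs
```

Calling `build_codebooks` on an `(n, d)` array of local descriptors and then `encode` on a single descriptor returns one word index per codebook together with the fitted coefficients. In a full pipeline these per-descriptor codes would still need to be pooled over spatial pyramid cells to form the image-level representation.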

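Received: 2012-08-20

Funding: Supported by the National Natural Science Foundation of China (No. 61172158)

About the authors: SONG Yan (corresponding author), male, born in 1972, Ph.D., lecturer. His main research interest is multimedia information processing. E-mail: songy@ustc.edu.cn. JIANG Bing, male, born in 1987, Ph.D. candidate. His main research interest is multimedia information processing. DAI Li-Rong, male, born in 1962, Ph.D., professor. His main research interests are digital signal processing and pattern recognition.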

Cite this article:

SONG Yan, JIANG Bing, DAI Li-Rong. Image Representation Based on Multiple Visual Codebooks[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2013, 26(10): 909-915.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2013/V26/I10/909
