Instance-Level Sketch-Based Image Retrieval Based on Two Stream Multi-granularity Local Alignment Network

doi:10.16451/j.cnki.issn1003-6059.202308003


Abstract: Instance-level sketch-based image retrieval aims to retrieve images using sketches. Sketches and real images differ substantially in modality, and their features are misaligned. Existing methods neither reduce this modality gap effectively nor capture information at more than a single granularity, so they cannot align features effectively. To address these issues, a Two Stream Multi-granularity Local Alignment Network (TSMLA) is proposed. A two-stream feature extractor extracts both modality-shared and modality-specific local features, and both kinds of features are used to compute the distance between a sketch and a real image, which reduces the difference between the modalities. In addition, a multi-granularity local alignment module pools the distance matrix at different granularities so that local features are aligned at different scales, further addressing feature misalignment. TSMLA makes full use of the information in sketches and real images and exploits the connections between features of different granularities. Experiments on multiple datasets validate the effectiveness of TSMLA.

Key words: Sketch-Based Image Retrieval, Feature Extraction, Feature Fusion, Cross-Modal Retrieval
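The abstract describes computing a sketch-to-image distance from two kinds of local features and then pooling the resulting distance matrix at several granularities. The sketch below illustrates that general idea only and is not the authors' implementation. Everything the abstract does not state is an assumption made for illustration: the function names, the use of Euclidean distance, adding the shared-stream and specific-stream distance matrices, non-overlapping average pooling, and matching each sketch block to its closest image block.

```python
import math


def local_distance_matrix(sketch_parts, image_parts):
    """Euclidean distance between every sketch part and every image part."""
    return [[math.dist(s, p) for p in image_parts] for s in sketch_parts]


def avg_pool(matrix, k):
    """Non-overlapping k x k average pooling (assumes sizes divisible by k)."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows, k):
        pooled_row = []
        for c in range(0, cols, k):
            window = [matrix[i][j]
                      for i in range(r, r + k)
                      for j in range(c, c + k)]
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


def aligned_distance(matrix):
    """Match each sketch block to its closest image block, then average."""
    return sum(min(row) for row in matrix) / len(matrix)


def multi_granularity_distance(shared_s, shared_p, specific_s, specific_p,
                               kernel_sizes=(1, 2)):
    """Hypothetical combination: sum both streams, then sum over granularities."""
    d_shared = local_distance_matrix(shared_s, shared_p)
    d_specific = local_distance_matrix(specific_s, specific_p)
    # Assumption: the two streams contribute equally to the local distance.
    combined = [[a + b for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(d_shared, d_specific)]
    # k = 1 keeps the finest granularity; larger k compares coarser blocks.
    return sum(aligned_distance(avg_pool(combined, k)) for k in kernel_sizes)


# Toy example: four local parts per modality, two feature dimensions each.
parts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shuffled = list(reversed(parts))
fine_only = multi_granularity_distance(parts, shuffled, parts, shuffled,
                                       kernel_sizes=(1,))
```

In this toy case the image parts are the sketch parts in reversed order, so at the finest granularity every sketch part still finds an exact match. This shows why local matching tolerates spatial misalignment, while coarser pooling windows compare larger regions as a whole.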

Received: 2023-06-08

CLC number: TP389.1

Funding: National Key Research and Development Program of China (No. 2022YFB3104700); National Natural Science Foundation of China (No. 61976158, 61976160, 62076182)

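Corresponding author: MIAO Duoqian, Ph.D., professor. His research interests include machine learning, data mining, big data analysis, granular computing, artificial intelligence, and text and image processing. E-mail: dqmiao@tongji.edu.cn.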

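About the authors: HAN Xuekun, master's student. Research interests include image retrieval, sketch recognition, and machine learning. E-mail: 2132958@tongji.edu.cn. ZHANG Hongyun, Ph.D., associate professor. Research interests include principal curve algorithms, granular computing, and fuzzy sets. E-mail: zhanghongyun@tongji.edu.cn. ZHANG Qixian, Ph.D. candidate. Research interests include person search, computer vision, and object detection. E-mail: zhangqx@tongji.edu.cn.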

Cite this article:

HAN Xuekun, MIAO Duoqian, ZHANG Hongyun, ZHANG Qixian. Instance-Level Sketch-Based Image Retrieval Based on Two Stream Multi-granularity Local Alignment Network. Pattern Recognition and Artificial Intelligence, 2023, 36(8): 701-711.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202308003 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2023/V36/I8/701
