Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2022, Vol. 35, Issue 3: 195-206. DOI: 10.16451/j.cnki.issn1003-6059.202203001
Cross-Media Fine-Grained Representation Learning Based on Multi-modal Graph and Adversarial Hash Attention Network
LIANG Meiyu1, WANG Xiaoxiao1, DU Junping1
1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract: In cross-media data search, the features of different media types are heterogeneous and separated by a semantic gap, and social network data often exhibit semantic sparsity and diversity. To address these problems, a cross-media fine-grained representation learning model based on a multi-modal graph and an adversarial hash attention network (CMFAH) is proposed to obtain a unified cross-media semantic representation, and it is applied to social network cross-media search. Firstly, an image-word association graph is constructed, and both direct and implicit semantic associations between images and text words are mined via a graph random-walk strategy to expand the semantic relations. Then, a cross-media fine-grained feature learning network based on a cross-media collaborative attention mechanism is constructed, and the fine-grained semantic associations between images and texts are learned collaboratively through mutually guided cross-media attention. Finally, a cross-media adversarial hash network is constructed, and an efficient and compact unified cross-media hash semantic representation is obtained by jointly performing cross-media fine-grained semantic association learning and adversarial hash learning. Experimental results show that CMFAH achieves better cross-media search performance on two public benchmark cross-media datasets.
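The semantic-expansion step described in the abstract — walking an image-word association graph to reach implicit associations beyond an image's directly linked words — can be sketched as follows. This is a toy illustration only: the graph, node names, and walk length are invented for demonstration and do not come from the paper.

```python
import random
from collections import defaultdict

# Hypothetical toy image-word association graph: nodes are image IDs and
# words; an edge links an image to a word that co-occurs with it.
edges = [
    ("img1", "dog"), ("img1", "park"),
    ("img2", "dog"), ("img2", "frisbee"),
    ("img3", "park"), ("img3", "tree"),
]

graph = defaultdict(list)
for img, word in edges:
    graph[img].append(word)   # image -> word
    graph[word].append(img)   # word -> image (graph is bipartite)

def random_walk(graph, start, length, seed=0):
    """Walk the bipartite graph, alternating image and word nodes.

    Nodes visited on the same walk are treated as semantically
    associated, even when they share no direct edge with the start node.
    """
    rng = random.Random(seed)
    node, path = start, [start]
    for _ in range(length):
        node = rng.choice(graph[node])
        path.append(node)
    return path

# Words reached from "img1" beyond its direct neighbours ("dog", "park")
# count as implicit associations, e.g. "frisbee" via the shared word "dog".
walk = random_walk(graph, "img1", length=6)
```

Repeating such walks from every image node and counting co-visits would yield the expanded (direct plus implicit) association set that the fine-grained feature learning stage consumes.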
Key words: Cross-Media Representation Learning    Adversarial Hash Attention Network    Fine-Grained Representation Learning    Cross-Media Collaborative Attention Mechanism    Cross-Media Search
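The hashing side of the model maps continuous cross-media features to compact binary codes. A common way to realize this (sketched below in NumPy) is to relax the codes with tanh during training and take signs at retrieval time; the projection matrix, feature dimensions, and code length here are invented, and the paper's network and adversarial loss are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_codes(features, W):
    """Project features and binarize them into compact ±1 hash codes.

    tanh gives a differentiable relaxation usable during training;
    sign produces the final binary codes used for retrieval.
    """
    relaxed = np.tanh(features @ W)   # values in (-1, 1)
    return np.sign(relaxed)           # final ±1 codes

# Hypothetical 8-dim image/text features projected to 4-bit codes.
W = rng.standard_normal((8, 4))
img_feat = rng.standard_normal((5, 8))
txt_feat = rng.standard_normal((5, 8))

img_codes = hash_codes(img_feat, W)
txt_codes = hash_codes(txt_feat, W)

def hamming(a, b):
    """Hamming distance between ±1 codes: number of differing bits."""
    return int(np.sum(a != b))
```

In the full model, an adversarial discriminator would additionally try to tell image codes from text codes, pushing the shared projection toward modality-invariant representations; that training loop is omitted here.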
Received: 2021-04-28
CLC Number: TP391
Fund: Supported by National Key Research and Development Program of China (No.2018YFB1402600), National Natural Science Foundation of China (No.61877006, 62192784) and Chinese Association for Artificial Intelligence-Huawei MindSpore Academic Award Fund (No.S2021264)
Corresponding author: DU Junping, Ph.D., professor. Research interests include artificial intelligence, machine learning and pattern recognition. E-mail: junpingdu@126.com.
About the authors: LIANG Meiyu, Ph.D., associate professor. Research interests include artificial intelligence, data mining, multimedia information processing and computer vision. E-mail: meiyu1210@bupt.edu.cn.
WANG Xiaoxiao, master. Research interests include cross-media semantic learning and search, and deep learning. E-mail: buxiaoyy6437@163.com.
Cite this article:
LIANG Meiyu, WANG Xiaoxiao, DU Junping. Cross-Media Fine-Grained Representation Learning Based on Multi-modal Graph and Adversarial Hash Attention Network. Pattern Recognition and Artificial Intelligence, 2022, 35(3): 195-206.
Link to this article:
http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202203001 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2022/V35/I3/195