Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence  2022, Vol. 35 Issue (3): 195-206    DOI: 10.16451/j.cnki.issn1003-6059.202203001
Papers and Reports
Cross-Media Fine-Grained Representation Learning Based on Multi-modal Graph and Adversarial Hash Attention Network
LIANG Meiyu1, WANG Xiaoxiao1, DU Junping1
1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science(National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876

Abstract  In cross-media search, data of different media types suffer from feature heterogeneity and the semantic gap, and social network data often exhibit semantic sparsity and diversity. To address these problems, a cross-media fine-grained representation learning model based on a multi-modal graph and an adversarial hash attention network (CMFAH) is proposed to obtain a unified cross-media semantic representation, and it is applied to social network cross-media search. Firstly, an image-word association graph is constructed, and direct and implicit semantic associations between images and text words are mined via a graph random walk strategy to expand the semantic relationships. Then, a cross-media fine-grained feature learning network based on cross-media collaborative attention is built, and the fine-grained semantic associations between images and texts are learned collaboratively through the cross-media attention mechanism. Finally, a cross-media adversarial hash network is constructed, and an efficient and compact unified cross-media hash semantic representation is obtained by jointly performing cross-media fine-grained semantic association learning and adversarial hash learning. Experimental results show that CMFAH achieves better cross-media search performance on two benchmark cross-media datasets.
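To make the second and third components of the abstract more concrete, the following is a minimal sketch of a cross-media attention module followed by a relaxed hash layer, assuming PyTorch. All module names, feature dimensions, the dot-product attention form, and the tanh-based hash relaxation are illustrative assumptions for exposition, not the authors' CMFAH implementation (which additionally involves the multi-modal graph and adversarial learning).

```python
import torch
import torch.nn as nn


class CrossMediaAttentionHash(nn.Module):
    """Illustrative sketch: cross-media attention between image regions and
    text words, followed by a relaxed hash layer that yields a unified
    binary-like code per modality. Dimensions and structure are assumptions."""

    def __init__(self, img_dim=2048, txt_dim=300, common_dim=512, hash_bits=64):
        super().__init__()
        # Project both modalities into a common semantic space.
        self.img_proj = nn.Linear(img_dim, common_dim)
        self.txt_proj = nn.Linear(txt_dim, common_dim)
        # Hash layers map attended features to K-bit relaxed codes.
        self.img_hash = nn.Linear(common_dim, hash_bits)
        self.txt_hash = nn.Linear(common_dim, hash_bits)

    def forward(self, img_regions, txt_words):
        # img_regions: (B, R, img_dim)  region-level image features
        # txt_words:   (B, W, txt_dim)  word-level text features
        img = self.img_proj(img_regions)   # (B, R, D)
        txt = self.txt_proj(txt_words)     # (B, W, D)
        scale = img.size(-1) ** 0.5

        # Cross-media attention: image regions attend to text words and
        # vice versa, capturing fine-grained image-word associations.
        attn_i2t = torch.softmax(img @ txt.transpose(1, 2) / scale, dim=-1)  # (B, R, W)
        img_ctx = attn_i2t @ txt           # text-aware image features (B, R, D)
        attn_t2i = torch.softmax(txt @ img.transpose(1, 2) / scale, dim=-1)  # (B, W, R)
        txt_ctx = attn_t2i @ img           # image-aware text features (B, W, D)

        # Pool to instance-level representations.
        img_feat = (img + img_ctx).mean(dim=1)   # (B, D)
        txt_feat = (txt + txt_ctx).mean(dim=1)   # (B, D)

        # Relaxed hash codes in (-1, 1); sign() gives binary codes at retrieval time.
        img_code = torch.tanh(self.img_hash(img_feat))
        txt_code = torch.tanh(self.txt_hash(txt_feat))
        return img_code, txt_code


if __name__ == "__main__":
    model = CrossMediaAttentionHash()
    img = torch.randn(4, 36, 2048)   # e.g., 36 detected regions per image
    txt = torch.randn(4, 20, 300)    # e.g., 20 word embeddings per sentence
    ic, tc = model(img, txt)
    print(ic.shape, tc.shape)        # torch.Size([4, 64]) torch.Size([4, 64])
```

In such a design, retrieval is performed by binarizing the relaxed codes with sign() and ranking candidates by Hamming distance, which is what makes the unified representation compact and efficient for cross-media search.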
Key words: Cross-Media Representation Learning; Adversarial Hash Attention Network; Fine-Grained Representation Learning; Cross-Media Collaborative Attention Mechanism; Cross-Media Search
Received: 28 April 2021     
CLC Number: TP 391
Fund: National Key Research and Development Program of China (No. 2018YFB1402600), National Natural Science Foundation of China (No. 61877006, 62192784), CAAI-Huawei MindSpore Open Fund (No. S2021264)
Corresponding Author: DU Junping, Ph.D., professor. Her research interests include artificial intelligence, machine learning and pattern recognition.
About authors: LIANG Meiyu, Ph.D., associate professor. Her research interests include artificial intelligence, data mining, multimedia information processing and computer vision.
WANG Xiaoxiao, master. Her research interests include cross-media semantic learning and search, and deep learning.
Cite this article:   
LIANG Meiyu, WANG Xiaoxiao, DU Junping. Cross-Media Fine-Grained Representation Learning Based on Multi-modal Graph and Adversarial Hash Attention Network[J]. Pattern Recognition and Artificial Intelligence, 2022, 35(3): 195-206.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202203001      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2022/V35/I3/195