模式识别与人工智能
Pattern Recognition and Artificial Intelligence
2021 Vol. 34, Issue 11, Published 2021-11-25

Deep Learning Design and Application
969 Information Diffusion Prediction Based on Cascade Spatial-Temporal Feature
LIANG Shaobin, CHEN Zhihao, WEI Jingjing, WU Yunbing, LIAO Xiangwen
Existing information diffusion prediction methods model cascade sequences and topological structure independently. Consequently, it is difficult to learn an interactive representation of cascade temporal and structural features in the embedding space, and the dynamic evolution of information diffusion is portrayed insufficiently. To address this problem, an information diffusion prediction method based on cascade spatial-temporal features is proposed. Heterogeneous graphs are constructed from the social network and the diffusion paths. The structural context of nodes in the heterogeneous graphs and the social network is learned by a graph neural network, while the cascade temporal feature is captured by a gated recurrent unit. For microscopic diffusion prediction, the cascade spatial-temporal feature is constructed by fusing the structural context and the temporal feature. Experimental results on Twitter and Memes datasets demonstrate that the proposed method improves prediction performance to a certain extent.
2021 Vol. 34 (11): 969-978 [Abstract] ( 537 ) [HTML 1KB] [ PDF 866KB] ( 378 )
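As a rough sketch of the fusion idea described above (all names, dimensions, and weights below are illustrative, not the authors' code): a gated recurrent unit accumulates the cascade's temporal feature along the diffusion path, and a separately learned structural context (e.g. from a graph neural network) is concatenated onto it to form the spatial-temporal feature.

```python
import numpy as np

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One gated-recurrent-unit update: h_t = (1 - z) * h + z * h_tilde."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(x @ Wz + h @ Uz)            # update gate
    r = sigmoid(x @ Wr + h @ Ur)            # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1.0 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d = 8                                        # embedding size (illustrative)
W = {k: rng.normal(scale=0.1, size=(d, d)) for k in "zrh"}
U = {k: rng.normal(scale=0.1, size=(d, d)) for k in "zrh"}

cascade = rng.normal(size=(5, d))            # 5 node embeddings along the diffusion path
h = np.zeros(d)                              # temporal feature accumulated by the GRU
for x in cascade:
    h = gru_step(h, x, W["z"], U["z"], W["r"], U["r"], W["h"], U["h"])

struct_ctx = rng.normal(size=d)              # structural context, e.g. from a GNN
spatio_temporal = np.concatenate([h, struct_ctx])  # fused cascade spatial-temporal feature
print(spatio_temporal.shape)                 # (16,)
```

Concatenation is only one possible fusion operator; the paper learns the interaction between the two feature spaces rather than fixing it.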
979 Gaussian Denoising Method with Tunable Noise Level Based on Dilated Convolutional Neural Network
JIN Yifan, YU Lei, FEI Shumin
Artifacts on sharp edges are easily generated when a dilated convolutional neural network (CNN) is utilized in existing deep learning based image denoising methods. Moreover, multiple specific denoising models have to be trained to deal with different noise levels. To address these problems, a Gaussian denoising method with a tunable noise level based on a dilated CNN is proposed. A noise level map is employed to make the noise level tunable. Besides, an improved dilated CNN and a reversible downsampling technique are employed, alleviating the artifacts on sharp edges caused by traditional dilated CNNs. The downsampled sub-images and the corresponding noise level map are input into the nonlinear mapping model, which is trained by the improved CNN with a reduced dilation rate. Experiments show that the proposed method gains GPU acceleration and the ability to adjust the noise level, with artifacts on sharp edges reduced and more image details retained.
2021 Vol. 34 (11): 979-989 [Abstract] ( 364 ) [HTML 1KB] [ PDF 3371KB] ( 283 )
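The input construction described above can be sketched as follows (a minimal illustration, not the authors' implementation; the image size, scale factor, and layout are assumptions): reversible downsampling splits the noisy image into sub-images without losing pixels, and a constant map encoding the noise standard deviation is stacked on as an extra channel, which is what makes the noise level tunable at test time.

```python
import numpy as np

def pixel_unshuffle(img, s=2):
    """Reversible downsampling: split an HxW image into s*s smaller sub-images."""
    h, w = img.shape
    return img.reshape(h // s, s, w // s, s).transpose(1, 3, 0, 2).reshape(s * s, h // s, w // s)

rng = np.random.default_rng(1)
clean = rng.random((8, 8))
sigma = 25.0 / 255.0                                    # tunable noise level
noisy = clean + rng.normal(scale=sigma, size=clean.shape)

subs = pixel_unshuffle(noisy)                           # 4 sub-images of size 4x4
level_map = np.full((1,) + subs.shape[1:], sigma)       # constant noise level map
net_input = np.concatenate([subs, level_map], axis=0)   # channels fed to the CNN
print(net_input.shape)                                  # (5, 4, 4)
```

At inference, changing `sigma` in the level map steers the same trained model toward stronger or weaker denoising.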
990 Image Dehazing Based on Generative Adversarial Network
HUANG Shuying, WANG Bin, LI Hongxia, YANG Yong, HU Wei
Compared with image dehazing methods based on image enhancement or physical models, current deep learning based dehazing methods improve computational efficiency to a certain extent. Nevertheless, incomplete dehazing and color distortion still occur in complex scenes. Considering the different perception of global and local features by the human eye, an image dehazing algorithm based on generative adversarial networks is proposed. Firstly, a multi-scale generator network is designed, taking the full-size image and segmented image blocks as input to extract the global contour information and local detail information of the image. Then, a feature fusion module is constructed to fuse the global and local information, and the authenticity of the generated dehazed image is judged by the discriminator network. To make the generated dehazed image closer to the corresponding real haze-free image, a multivariate joint loss function combining the dark channel prior loss, the adversarial loss, the structural similarity loss and the smooth L1 loss is designed to train the network. Experimental results show that the proposed algorithm is superior to some state-of-the-art dehazing algorithms.
2021 Vol. 34 (11): 990-1003 [Abstract] ( 765 ) [HTML 1KB] [ PDF 5484KB] ( 524 )
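Of the four loss terms above, the dark channel prior loss is the most dehazing-specific. A minimal sketch of the idea (patch size, images, and the L1 form are illustrative assumptions, not the paper's exact formulation): the dark channel of a haze-free image is close to zero, so penalizing the distance between the dark channels of the output and the reference pushes residual haze out.

```python
import numpy as np

def dark_channel(img, patch=3):
    """Dark channel: per-pixel minimum over RGB, then minimum over a local patch."""
    mins = img.min(axis=2)                           # (H, W) channel-wise minimum
    h, w = mins.shape
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def dark_channel_loss(dehazed, clear):
    """L1 distance between dark channels; haze raises the dark channel."""
    return np.abs(dark_channel(dehazed) - dark_channel(clear)).mean()

rng = np.random.default_rng(2)
clear = rng.random((6, 6, 3))
hazy = 0.6 * clear + 0.4                             # simple atmospheric veil
print(dark_channel_loss(hazy, clear) > 0.0)          # True: the veil lifts the dark channel
```

In training, the same comparison is made between the generator's output and the ground-truth haze-free image.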
1004 Deep Snake with 2D-Circular Convolution and Difficulty Sensitive Contour-IoU Loss
LI Hao, YUAN Guanglin, LI Congli, QIN Xiaoyan, ZHU Hong
The initial bounding box is deformed to the object contour end-to-end by Deep Snake, and the performance of instance segmentation is significantly improved. However, the method is sensitive to the initial bounding box and regresses the contour parameters independently. To address these issues, Deep Snake with 2D-circular convolution and a difficulty-sensitive contour intersection-over-union (contour-IoU) loss is proposed. Firstly, 2D-circular convolution is designed based on the spatial context information of the contour to reduce the sensitivity to the initial bounding box. Secondly, a difficulty-sensitive contour-IoU loss function is proposed according to the geometric meaning of the definite integral and the difficulty of the sample, regressing the contour parameters as a whole unit. Finally, instance segmentation is accomplished with the proposed 2D-circular convolution and difficulty-sensitive contour-IoU loss function. Experiments on the Cityscapes, KINS and SBD datasets show that the proposed method achieves better segmentation accuracy.
2021 Vol. 34 (11): 1004-1016 [Abstract] ( 278 ) [HTML 1KB] [ PDF 6282KB] ( 237 )
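The wrap-around idea behind circular convolution on a closed contour can be sketched as follows (this shows Deep Snake's original 1D form, which the paper extends to 2D; the contour values and kernel are illustrative): because a contour is a closed loop, the convolution's neighbourhood wraps from the last vertex back to the first.

```python
import numpy as np

def circular_conv1d(contour, kernel):
    """1D circular convolution over a closed contour: neighbours wrap around."""
    n, k = len(contour), len(kernel)
    half = k // 2
    out = np.zeros_like(contour, dtype=float)
    for i in range(n):
        for j in range(k):
            out[i] += kernel[j] * contour[(i + j - half) % n]  # wrap at the ends
    return out

contour = np.array([0.0, 1.0, 2.0, 3.0])                 # 4 contour vertex features
smooth = circular_conv1d(contour, np.array([1/3, 1/3, 1/3]))
print(smooth)  # first vertex averages its wrap-around neighbours: (3 + 0 + 1) / 3
```

The 2D variant in the paper additionally aggregates spatial context around each vertex rather than only along the loop.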
1017 Multi-modal and Multi-label Emotion Detection for Comics Based on Two-Stream Network
LIN Zhentao, ZENG Bi, PAN Zhihao, WEN Song
Comics are widely applied in social media to metaphorize social phenomena and express emotion. To solve the problem of label ambiguity in multi-modal and multi-label emotion detection for comic scenes, a multi-modal and multi-label emotion detection model for comics based on a two-stream network is proposed. The backbone of the method is a two-stream structure, with a Transformer model as the image backbone network to extract image features and a RoBERTa pre-trained model as the text backbone network to extract text features. Inter-modal information is compared using cosine similarity and combined with a self-attention mechanism to merge the image and text features. The improved cosine similarity is combined with a cosine self-attention mechanism and a multi-head self-attention mechanism (COS-MHSA) to extract high-level image features. Finally, the multi-modal features of the high-level features and COS-MHSA are fused. The effectiveness of the proposed method is verified on the EmoRecCom dataset, and the emotion detection results are presented visually.
2021 Vol. 34 (11): 1017-1027 [Abstract] ( 323 ) [HTML 1KB] [ PDF 3921KB] ( 254 )
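A minimal sketch of attention driven by cosine similarity, the ingredient COS-MHSA builds on (single-head, with illustrative token counts and dimensions; the paper's multi-head variant and its exact normalization are not reproduced here): queries and keys are length-normalized, so attention scores depend on direction rather than magnitude.

```python
import numpy as np

def cosine_attention(q, k, v, eps=1e-8):
    """Attention using cosine similarity instead of the scaled dot product."""
    qn = q / (np.linalg.norm(q, axis=-1, keepdims=True) + eps)
    kn = k / (np.linalg.norm(k, axis=-1, keepdims=True) + eps)
    scores = qn @ kn.T                                   # cosine similarity in [-1, 1]
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ v                                   # weighted mix of values

rng = np.random.default_rng(3)
img_feat = rng.normal(size=(4, 8))    # 4 image tokens
txt_feat = rng.normal(size=(6, 8))    # 6 text tokens
fused = cosine_attention(img_feat, txt_feat, txt_feat)   # image attends to text
print(fused.shape)                    # (4, 8)
```

Using image tokens as queries against text keys/values, as here, is one way the two modalities can be compared and merged.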
1028 End-to-End Infrared and Visible Image Fusion Method Based on GhostNet
CHENG Chunyang, WU Xiaojun, XU Tianyang
Most existing deep learning based infrared and visible image fusion methods rely on manual fusion strategies in the fusion layer. Therefore, they are incapable of designing an appropriate fusion strategy for a specific image fusion task. To overcome this problem, an end-to-end infrared and visible image fusion method based on GhostNet is proposed. The Ghost module is employed to replace the ordinary convolution layers in the network architecture, yielding a lightweight model. The constraint of the loss function makes the network learn adaptive image features for the fusion task, so feature extraction and fusion are accomplished at the same time. In addition, a perceptual loss is introduced into the design of the loss function, so the deep semantic information of the source images is utilized in image fusion as well. The source images are concatenated in the channel dimension and then fed into the deep network. A densely connected encoder is applied to extract deep features of the source images, and the fusion result is obtained through reconstruction by the decoder. Experiments show that the proposed method is superior in subjective comparison and objective image quality evaluation metrics.
2021 Vol. 34 (11): 1028-1037 [Abstract] ( 325 ) [HTML 1KB] [ PDF 1948KB] ( 255 )
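The Ghost module's compute saving can be sketched as follows (a simplified illustration: GhostNet's cheap operation is a depthwise 3x3 convolution, whereas here a per-channel scaling stands in for it, and all shapes are illustrative): a dense convolution produces only a few "primary" channels, and cheap per-channel transforms generate the remaining "ghost" channels.

```python
import numpy as np

def ghost_module(x, primary_w, cheap_w):
    """Ghost module sketch: few dense 'primary' features + cheap 'ghost' features."""
    primary = np.einsum("chw,oc->ohw", x, primary_w)  # 1x1 conv, few output channels
    ghost = cheap_w[:, None, None] * primary          # cheap op stand-in: per-channel scaling
    return np.concatenate([primary, ghost], axis=0)   # same channel count, fewer multiplies

rng = np.random.default_rng(4)
x = rng.normal(size=(8, 5, 5))             # 8 input channels
primary_w = rng.normal(size=(4, 8))        # produce only 4 primary channels densely
cheap_w = rng.normal(size=4)               # one cheap transform per primary channel
y = ghost_module(x, primary_w, cheap_w)
print(y.shape)                             # (8, 5, 5): 8 outputs from half the dense compute
```

Swapping ordinary convolutions for such modules is what makes the fusion network lightweight without shrinking its feature width.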
1038 Lightweight Model Construction Based on Neural Architecture Search
YAO Xiao, SHI Yewei, HUO Guanying, XU Ning
Traditional deep neural networks cannot be deployed on edge devices with limited computing capacity due to their numerous parameters and high computation cost. In this paper, a lightweight network based on neural architecture search is designed to solve this problem. Convolution units with different group settings are regarded as the search space, and neural architecture search is utilized to obtain both the group structure and the overall architecture of the network. Meanwhile, a cycle annealing search strategy is put forward to solve the multi-objective optimization problem of neural architecture search, taking both the accuracy and the computation cost of the model into consideration. Experiments show that the proposed network model achieves better performance than state-of-the-art methods.
2021 Vol. 34 (11): 1038-1048 [Abstract] ( 440 ) [HTML 1KB] [ PDF 790KB] ( 531 )
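One generic way to pose such an accuracy-versus-cost trade-off with a cyclically annealed weight is sketched below (this is an illustration of the general idea only; the paper's actual cycle annealing schedule and objective are not specified in the abstract, and `lam_max`, `cycle_len`, and the cosine schedule are assumptions):

```python
import math

def scalarized_score(accuracy, flops, max_flops, lam):
    """Scalarize the two objectives: accuracy minus a weighted, normalized cost."""
    return accuracy - lam * (flops / max_flops)

def cycle_annealed_lambda(step, cycle_len=100, lam_max=0.5):
    """Cost weight annealed cyclically: restarts high, decays within each cycle."""
    t = (step % cycle_len) / cycle_len
    return lam_max * 0.5 * (1 + math.cos(math.pi * t))  # lam_max -> ~0, then restart

# Early in a cycle the cost term dominates the score; late, accuracy dominates.
early = scalarized_score(0.90, 600, 1000, cycle_annealed_lambda(0))
late = scalarized_score(0.90, 600, 1000, cycle_annealed_lambda(99))
print(early < late)  # True
```

Cycling the weight lets the search revisit cheap-but-weak and strong-but-costly regions of the architecture space instead of committing to one trade-off.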
1049 Feature Fusion Learning Network for Aspect-Level Sentiment Classification
CHEN Jinguang, ZHAO Yinge, MA Lili
In aspect-level sentiment classification, existing methods are weak at enhancing aspect-term information and exploiting local feature information. To settle this problem, a feature fusion learning network (FFLN) is proposed. Firstly, comments are processed into text, aspect and text-aspect pairs as input. After vector representations of the input are obtained from a bidirectional encoder representations from transformers (BERT) model, an attention encoder is utilized to obtain the hidden states of the context and aspect terms and to extract semantic information. Then, based on the hidden-state features, an aspect-specific text vector representation is generated by an aspect-specific transformation component to integrate aspect-term information into the context representation. Finally, local features are extracted from the aspect-specific text vector by a context position weighting module. The final representation features are obtained by fusion learning of the global and local features, and sentiment classification is conducted. Experiments on classical English datasets and Chinese review datasets show that FFLN improves the classification effect.
2021 Vol. 34 (11): 1049-1057 [Abstract] ( 310 ) [HTML 1KB] [ PDF 627KB] ( 283 )
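A common way such a context position weighting module works is sketched below (an illustration of the general distance-based weighting idea, not FFLN's exact formula; the linear decay and `srd_max` threshold are assumptions): tokens near the aspect term keep full weight, and weights decay with distance, so local features around the aspect dominate.

```python
import numpy as np

def position_weights(n_tokens, aspect_start, aspect_len, srd_max=10):
    """Down-weight context tokens by their distance to the aspect term."""
    w = np.ones(n_tokens)
    for i in range(n_tokens):
        if i < aspect_start:
            d = aspect_start - i                       # tokens before the aspect
        elif i >= aspect_start + aspect_len:
            d = i - (aspect_start + aspect_len - 1)    # tokens after the aspect
        else:
            d = 0                                      # inside the aspect term
        w[i] = max(0.0, 1.0 - d / srd_max)
    return w

# "the battery life is great" with aspect "battery life" at tokens 1-2
w = position_weights(5, aspect_start=1, aspect_len=2)
print(w)  # aspect tokens get weight 1.0; weights decay with distance
```

Multiplying the hidden states by these weights before pooling is what localizes the extracted features around the aspect term.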
1058 U-Net Based Feature Interaction Segmentation Method
SUN Junding, HUI Zhenkun, TANG Chaosheng, WU Xiaosheng
To address the mis-segmentation and missed segmentation of small targets in liver segmentation, a U-Net based feature interaction segmentation method is proposed, using ResNet34 as the backbone network. To achieve non-local interactions between different scales, a transformer-based feature interaction pyramid module is designed as the bridge of the network to obtain feature maps with richer contextual information. A multi-scale attention mechanism is designed to replace the skip connections in U-Net, attending to small targets in the image and sufficiently acquiring the contextual information of the target layer. Experiments on the public LiTS dataset and a dataset consisting of 3Dircadb and CHAOS demonstrate that the proposed method achieves good segmentation results.
2021 Vol. 34 (11): 1058-1068 [Abstract] ( 420 ) [HTML 1KB] [ PDF 2174KB] ( 327 )
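The idea of replacing a plain U-Net skip connection with attention can be sketched as follows (a generic attention-gate illustration, not the paper's multi-scale design; the additive gating and all shapes are assumptions): the decoder's context decides, per position, how much of the encoder feature passes through, which helps small targets survive to the output.

```python
import numpy as np

def attention_gate(encoder_feat, decoder_feat):
    """Gate the skip connection: decoder context modulates the encoder features."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    gate = sigmoid(encoder_feat + decoder_feat)  # additive attention coefficients in (0, 1)
    return gate * encoder_feat                   # suppressed or emphasised per position

rng = np.random.default_rng(5)
enc = rng.normal(size=(4, 8, 8))   # encoder feature map (skip branch)
dec = rng.normal(size=(4, 8, 8))   # upsampled decoder feature map at the same scale
out = attention_gate(enc, dec)
print(out.shape)                   # (4, 8, 8), same as a plain skip connection
```

A plain skip connection corresponds to `gate` being all ones; learning the gate lets irrelevant background be suppressed before concatenation.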
Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press
 
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No. 350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax: 0551-65591176 Email: bjb@iim.ac.cn
Supported by Beijing Magtech  Email: support@magtech.com.cn