Pattern Recognition and Artificial Intelligence

模式识别与人工智能

Wednesday, Aug. 13, 2025

	2020 Vol. 33 Issue 10, Published 2020-10-25

	867	Cross-Modal Person Re-identification Based on Local Heterogeneous Collaborative Dual-Path Network
		ZHENG Aihua, ZENG Xiaoqiang, JIANG Bo, HUANG Yan, TANG Jin
		Existing cross-modal person re-identification (Re-ID) methods ignore the coordinated fusion between modalities during learning. This paper proposes a cross-modal person Re-ID strategy based on a local heterogeneous collaborative dual-path network. First, the dual-path network extracts the global features of each modality for local refinement, mining the structured local information of pedestrians. Then, the local information of the different modalities is correlated with the label and prediction information to achieve cooperative adaptive fusion and learn more discriminative features. The effectiveness of the proposed method is demonstrated through comprehensive
		2020 Vol. 33 (10): 867-878

	879	Multi-type Features Network for Person Re-identification
		WANG Peng, SONG Xiaoning, WU Xiaojun, YU Dongjun
		Attention mechanisms are effective in person re-identification, but the performance obtained by combining different types of attention, such as spatial attention and self-attention, still needs improvement. An improved convolutional block attention module (CBAM-Pro) is proposed, and a multi-type features network (MTFN) is built on it. Features of different domains of interest are extracted by integrating CBAM-Pro with the self-attention mechanism, and local features of different granularities are introduced concurrently so that they jointly perform person re-identification. Experiments on existing general benchmark datasets verify the validity and reliability of MTFN.
		2020 Vol. 33 (10): 879-888

	889	Target Detector with Channel Attention and Residual Learning
		CHU Jun, ZHU Xiaoyang, LENG Lu, MIAO Jun
		The feature pyramids of existing target detectors cannot fully exploit the information in feature maps of different scales, which makes these detectors unsuitable for low-resolution images and small targets. To solve this problem, a target detector with a channel attention mechanism and residual learning blocks is proposed. First, a global channel attention mechanism is introduced so that the network learns the weights of the different channel features in the feature map, effectively enhancing the global feature information. Then, lightweight residual blocks are exploited to highlight small changes in features and improve the detection performance for small targets in low-resolution images. In addition, deep features are merged into the shallow feature maps used for prediction to improve the detection accuracy for small targets. Experimental results on standard test datasets show that the proposed detector is suitable for low-resolution images and obtains better detection results for small targets.
		2020 Vol. 33 (10): 889-897
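
The channel attention idea in the abstract above can be illustrated with a minimal squeeze-and-scale sketch. This is not the authors' module: the function name `channel_attention`, the one-scalar-per-channel `weights` parameter, and the sigmoid gate are assumptions chosen to keep the example small, whereas a trained network would learn a richer mapping.

```python
import math


def channel_attention(feature_map, weights=None):
    """Reweight the channels of a C x H x W feature map given as nested lists.

    Hypothetical sketch: `weights` stands in for parameters that a real
    network would learn during training.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    descriptors = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    if weights is None:
        weights = [1.0] * len(descriptors)
    # Gate: a sigmoid maps each weighted descriptor to a value in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-w * d)) for w, d in zip(weights, descriptors)]
    # Scale: every value in a channel is multiplied by that channel's gate.
    scaled = [
        [[g * value for value in row] for row in channel]
        for g, channel in zip(gates, feature_map)
    ]
    return scaled, gates


features = [
    [[0.0, 0.0], [0.0, 0.0]],  # an uninformative channel
    [[2.0, 2.0], [2.0, 2.0]],  # a strongly activated channel
]
scaled, gates = channel_attention(features)
```

Here the strongly activated channel receives the larger gate, so its responses are preserved more than those of the flat channel.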

	898	Deep Networks Detection Algorithm Fusing Multiple Dilated Convolution Operator and Multi-level Characteristics
		ZHANG Xinliang, XIE Heng, ZHAO Yunji, WANG Wanru, WEI Shengqiang
		Using only sequential convolution operations in deep networks leaves the feature layers short of detailed target information and global characteristics, which reduces detection accuracy, especially for small objects. In this paper, a deep network detection algorithm fusing a multiple dilated convolution (MDC) operator and multi-level features is proposed based on the residual network structure. The convolution kernel is composed of 5 different receptive fields, and 8 different semantic feature maps can be generated. The MDC operator is introduced into the feature extraction block to build a new feature layer. Transposed convolution is employed to increase the dimension of the detection layer and to splice the multi-level feature layers together, so that the original features of the targets are retained to the greatest extent in the newly generated detection layer. Finally, the detection model is completed with non-maximum suppression. The experimental results show that the proposed model with multi-level features and the MDC operator effectively improves the mean average precision and the detection performance for small targets.
		2020 Vol. 33 (10): 898-905
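
To make the link between dilation and receptive field concrete, the sketch below computes the effective size of a dilated kernel and applies a one-dimensional dilated convolution. The function names and the dilation rates 1 to 5 are illustrative assumptions; the abstract does not state which rates the MDC operator uses.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are `dilation` samples apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D dilated convolution, written as cross-correlation."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Each kernel tap reads the input `dilation` positions after the previous tap.
        output.append(
            sum(kernel[j] * signal[start + j * dilation] for j in range(len(kernel)))
        )
    return output


# Assumed rates for illustration: a 3-tap kernel at dilation 1..5.
spans = [effective_kernel_size(3, d) for d in range(1, 6)]
dense = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1)
sparse = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
```

The same three weights cover 3 input samples at dilation 1 and 11 samples at dilation 5, which is how parallel dilated branches can see several receptive fields without adding parameters.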

	906	Salient Object Detection Based on Stack Edge-Aware Module
		YANG Jiaxin, HU Xiao, XIANG Junjiang
		To improve the poor edge perception of existing salient object detection algorithms, a salient object detection algorithm based on a stacked edge-aware module is proposed to make effective use of high-level semantic information and low-level texture information. A multi-scale network serves as the backbone to extract multi-scale, multi-target salient features. In the stacked edge-aware module, the high-level and low-level information of the image are combined asymmetrically to enhance the region of the salient object, and the network then outputs the salient object detection results. Experiments on five public datasets indicate that the proposed algorithm yields better detection results in both objective evaluation metrics and subjective visual quality.
		2020 Vol. 33 (10): 906-916

	917	Real-Time Road Element Detection Based on Keypoints Estimation
		LIU Xianmei, JING Yahong, TIAN Feng, LIU Fang
		To address the high cost of manually designing neural network structures, the heavy computation of anchor-box-based classification and regression, and the weak detection ability for small targets, a real-time road element detection model based on keypoint estimation is proposed. The NAS-based EfficientNet-B3 is employed as the feature extraction network, and an improved bi-directional feature pyramid network (BiFPN) serves as the feature fusion network. Keypoint estimation replaces anchor boxes for the classification and regression tasks. Experiments on the BDD100K dataset show that the proposed model achieves good precision in real-time detection and high precision for small objects.
		2020 Vol. 33 (10): 917-925
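
Anchor-free detectors of this kind typically read object centers off a predicted heatmap. The sketch below shows one common way to do that, keeping cells that exceed a threshold and are local maxima of their 3x3 neighbourhood. It is an assumption about the general technique, not the decoding used in the paper, and `extract_keypoints` and `threshold` are hypothetical names.

```python
def extract_keypoints(heatmap, threshold=0.5):
    """Return (x, y, score) for every local maximum at or above `threshold`.

    `heatmap` is an H x W nested list of scores. Ties on a plateau are all
    kept, which a production decoder would usually resolve differently.
    """
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Collect the up to 8 neighbours that lie inside the map.
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((x, y, score))
    return peaks


heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
peaks = extract_keypoints(heatmap)
```

The 0.6 response is suppressed because it sits next to the stronger 0.9 peak, so no separate non-maximum suppression over boxes is needed in this sketch.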

	926	Bridge Surface Crack Detection Algorithm Based on YOLOv3 and Attention Mechanism
		CAI Fenghuang, ZHANG Yuexin, HUANG Jie
		To detect bridge surface cracks quickly and accurately so that they can be repaired in time, a bridge surface crack detection algorithm based on an improved YOLOv3 (Crack-YOLO) is proposed. Crack-YOLO combines depthwise separable convolutions with an attention mechanism to detect bridge surface cracks in real time. The standard convolutions of YOLOv3 are replaced with depthwise separable convolutions to reduce the number of network parameters, and the inverted residual block of MobileNet V2 is introduced to counter the loss of precision that depthwise separable convolution causes. Both channel attention and spatial attention are taken into account through the convolutional block attention module so that features are learned selectively. The experimental results show that Crack-YOLO detects cracks on the bridge surface in real time. Compared with YOLOv3, Crack-YOLO has smaller model weights and achieves higher detection accuracy at a higher detection speed.
		2020 Vol. 33 (10): 926-933
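
The parameter saving from depthwise separable convolution follows from simple counting, sketched below with biases ignored. The layer shape (a 3x3 kernel mapping 256 to 512 channels) is an arbitrary example, not a layer taken from Crack-YOLO.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution: one k x k filter per (input, output) pair."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Example layer shape chosen for illustration only.
standard = standard_conv_params(3, 256, 512)
separable = depthwise_separable_params(3, 256, 512)
ratio = separable / standard
```

The ratio equals 1/out_channels + 1/kernel_size², so for 3x3 kernels the separable form needs a little more than one ninth of the weights.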

	934	Point Cloud Recognition Based on Local Surface Feature Histogram
		LU Jun, HUA Bowen, ZHU Bo
		For fast recognition of 3D point clouds, a point cloud recognition algorithm based on a local surface feature histogram is proposed. First, a cyclic voxel filtering algorithm filters point clouds of different resolutions down to a specified resolution. Second, points with distinct local characteristics are selected as key points by a key point search algorithm based on the maximum mean curvature in the neighborhood. The feature descriptor of each key point is calculated from the relationship between the center of gravity of the neighboring points and the normal and distance of each point on the neighborhood surface. Then, features are matched according to the spatial relationship between adjacent key points and the Euclidean distance between feature descriptors. Finally, a multithreaded recognition framework is adopted to speed up online recognition. The experimental results show that the recognition speed is high.
		2020 Vol. 33 (10): 934-943
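
A single pass of voxel filtering, the building block of the first step above, can be sketched as follows. The cyclic scheme that drives the cloud to a specified resolution is not described in the abstract and is not reproduced here; `voxel_downsample` and `voxel_size` are hypothetical names.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same cubic voxel by their centroid."""
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel coordinates; floor keeps negative coordinates consistent.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))
    # One centroid per occupied voxel, in order of first occurrence.
    return [
        tuple(sum(axis) / len(members) for axis in zip(*members))
        for members in buckets.values()
    ]


cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.2, 0.0, 0.0)]
filtered = voxel_downsample(cloud, voxel_size=1.0)
```

A larger `voxel_size` merges more points per cell and therefore lowers the resolution of the output cloud.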

	944	Multi-person Human Pose Estimation Based on Deformable Convolution
		ZHAO Yunxiao, QIAN Yuhua, WANG Keqi
		Deep neural networks for human pose estimation all sample at fixed positions of the feature map, which makes it difficult to model the geometric transformations of human pose. As a result, the networks generalize poorly when the size, pose and shooting angle of the human instance vary. To solve this problem, multi-person human pose estimation based on deformable convolution is proposed. Exploiting the strong ability of deformable convolution to model geometric transformations of targets, a feature extraction module is designed to maintain detection accuracy under geometric changes of human key points. To further improve the performance of the network, the loss is calculated between the prediction of the model and the ground truth generated by a two-dimensional Gaussian model, and the model is trained iteratively. The proposed model detects human key points effectively under complex conditions such as changes in shooting angle, attachment and person scale. The experiment shows that the proposed model effectively improves the accuracy of human key point detection.
		2020 Vol. 33 (10): 944-950
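
The ground truth generated by a two-dimensional Gaussian model can be sketched as a heatmap that peaks at the annotated key point. The abstract does not name the loss function, so the mean squared error below is an assumption, as are the function names and the default `sigma`.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma=1.0):
    """H x W heatmap with an unnormalised 2D Gaussian centred on (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def mse_loss(prediction, target):
    """Mean squared error between two heatmaps of equal shape (assumed loss)."""
    count = len(prediction) * len(prediction[0])
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(prediction, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / count


target = gaussian_heatmap(5, 5, cx=2, cy=2)
blank = [[0.0] * 5 for _ in range(5)]
loss_when_blank = mse_loss(blank, target)
```

Compared with a single labelled pixel, the smooth target gives near-miss predictions partial credit, which usually makes heatmap regression easier to train.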

	951	Video-Based Temporal Enhanced Action Recognition
		ZHANG Haobo, FU Dongmei, ZHOU Ke
		To address spatio-temporal modeling in video action recognition, a temporal enhanced action recognition algorithm based on fused spatio-temporal features is proposed under the deep learning framework. To lower the cost of video-level temporal modeling, a sparse sampling strategy is employed to adapt to changes in video duration. In the recognition stage, the temporal difference between adjacent feature maps is calculated to enhance motion information at the feature level. A combination of the residual structure and the temporal enhanced structure is introduced to further improve the representation ability of the network. Experimental results show that the proposed algorithm obtains higher accuracy on the UCF101 and HMDB51 datasets and achieves better results in an actual industrial operation recognition scenario with a smaller network scale.
		2020 Vol. 33 (10): 951-958
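
Two steps in the abstract above lend themselves to a short sketch: sparse sampling, which picks a fixed number of frames whatever the video length, and the temporal difference between adjacent feature maps. Sampling the centre frame of each equal-length segment is an assumption about the strategy, and the feature maps are flattened to plain lists for brevity.

```python
def sparse_sample_indices(num_frames, num_segments):
    """Index of the centre frame of each of `num_segments` equal segments.

    Assumes num_frames >= num_segments.
    """
    segment_length = num_frames / num_segments
    return [
        int(segment_length * i + segment_length / 2) for i in range(num_segments)
    ]


def temporal_difference(features):
    """Element-wise difference between each feature vector and its predecessor."""
    return [
        [later - earlier for earlier, later in zip(previous, current)]
        for previous, current in zip(features, features[1:])
    ]


short_clip = sparse_sample_indices(8, 4)
long_clip = sparse_sample_indices(100, 4)
motion = temporal_difference([[1, 2], [3, 5], [6, 9]])
```

Both clips yield four frames, so the cost of temporal modeling stays constant as duration changes, and the difference features are zero wherever nothing moves between sampled frames.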

Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press

Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No. 350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax: 0551-65591176 Email: bjb@iim.ac.cn