1. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei 230601; 2. Key Laboratory of Industrial Image Processing and Analysis of Anhui Province, Science and Technology Department of Anhui Province, Hefei 230039; 3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Abstract: Existing cross-modal person re-identification (Re-ID) methods ignore the coordinated fusion between modalities during learning. This paper proposes a cross-modal Re-ID strategy based on a local heterogeneous collaborative dual-path network. First, the dual-path network extracts the global features of each modality, which are then locally refined to mine the structured local information of pedestrians. Then, the local information of the different modalities is correlated with the label and prediction information to achieve cooperative adaptive fusion and learn more discriminative features. The effectiveness of the proposed method is demonstrated through comprehensive