Cross-Channel Feature-Enhanced Graph Convolutional Network for Skeleton-Based Action Recognition
WU Zhize1, CHEN Sheng1, TAN Ming1,2, SUN Fei1, YANG Jing1
1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601; 2. School of Mechanical and Electrical Engineering, Hefei Technology College, Hefei 238010
Abstract: Traditional graph convolutional networks for skeleton-based action recognition operate locally, so they struggle to model long-range joint relationships and long-term temporal information and fail to capture subtle differences between actions. To address this problem, a cross-channel feature-enhanced graph convolutional network (CFE-GCN) is proposed for skeleton-based action recognition, comprising a dual part-wise grouping graph convolution (DPG-GC) module, a cross-stage partial dense connections (CS-PDC) module, and a multi-scale temporal convolution (MS-TC) module. The DPG-GC module models the human body joints with a grouping strategy to extract multi-granularity features and capture subtle local differences between joints. The CS-PDC module establishes associations between nodes and the preceding network layers, enriching the early information and capturing potential long-term relationships between moving joints, so that contextual features are learned more comprehensively. The MS-TC module applies temporal convolutions with different receptive fields to capture both short-term and long-term dependencies in the temporal domain. Experiments show that CFE-GCN achieves superior performance on multiple benchmark datasets.
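The idea behind temporal convolution with different receptive fields can be illustrated with a minimal sketch. The abstract does not specify how the MS-TC module is built, so the code below assumes one common choice, parallel branches of dilated 1D convolutions over the frames of a single joint channel. All function names, the kernel, and the dilation rates are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by one dilated temporal kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_temporal_conv(
    x: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """Zero-padded 1D convolution along time; output length equals input length."""
    if len(kernel) % 2 != 1:
        raise ValueError("kernel size must be odd to keep the output centered")
    half = (len(kernel) // 2) * dilation
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - half + i * dilation  # frame index touched by tap i
            if 0 <= idx < len(x):          # frames outside the clip count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_temporal_conv(
    x: Sequence[float],
    kernel: Sequence[float],
    dilations: Sequence[int] = (1, 2, 4),  # assumed rates, not from the paper
) -> List[List[float]]:
    """Run one branch per dilation rate; each branch sees a different time span."""
    return [dilated_temporal_conv(x, kernel, d) for d in dilations]


# A single impulse at frame 4 of a 9-frame clip shows how far each branch reaches.
impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
branches = multi_scale_temporal_conv(impulse, kernel=[1.0, 1.0, 1.0])
for d, branch in zip((1, 2, 4), branches):
    touched = [t for t, v in enumerate(branch) if v != 0.0]
    print(f"dilation={d} receptive_field={receptive_field(3, d)} frames={touched}")
```

With a kernel of size 3, the branch with dilation 1 covers 3 consecutive frames (short-term motion), while the branch with dilation 4 spans 9 frames (longer-term motion). In a full network the branch outputs would typically be concatenated or summed along the channel dimension; that fusion step is left out here.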