Cross-Channel Feature-Enhanced Graph Convolutional Network for Skeleton-Based Action Recognition
WU Zhize¹, CHEN Sheng¹, TAN Ming¹,², SUN Fei¹, YANG Jing¹
1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601; 2. School of Mechanical and Electrical Engineering, Hefei Technology College, Hefei 238010
Abstract Traditional graph convolutional networks for skeleton-based action recognition operate locally, so they struggle to model long-range joint relationships and long-term temporal information and fail to capture subtle variations between actions. To address this problem, a cross-channel feature-enhanced graph convolutional network (CFE-GCN) for skeleton-based action recognition is proposed. It comprises a dual part-wise grouping graph convolution (DPG-GC) module, a cross-stage partial dense connections (CS-PDC) module and a multi-scale temporal convolution (MS-TC) module. The DPG-GC module models the human body joints with a grouping strategy to extract multi-granularity features and capture subtle local differences between joints. The CS-PDC module establishes associations between nodes and the preceding network layers, enriching the early information and capturing potential long-term relationships between moving joints, so that contextual features are learned more comprehensively. The MS-TC module performs temporal convolution with different receptive fields to capture both short-term and long-term dependencies in the temporal domain. Experiments show that CFE-GCN achieves superior performance on multiple benchmark datasets.
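The idea of temporal convolution with different receptive fields can be illustrated with a minimal sketch. This is not the MS-TC module itself: the abstract does not say how the receptive fields are varied, so the use of dilation, the kernel size of 3, the dilation rates (1, 2, 4) and all function names below are assumptions chosen for illustration, and a single scalar signal stands in for the joint feature channels.

```python
def dilated_temporal_conv(seq, kernel, dilation):
    """1-D temporal convolution with zero padding so output length equals input length.

    Assumes an odd-length kernel centred on the current frame.
    """
    pad = dilation * (len(kernel) - 1) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + i * dilation - pad  # frame sampled by kernel tap i
            if 0 <= idx < len(seq):       # frames outside the clip count as zero
                acc += w * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of frames spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def multi_scale_temporal_conv(seq, kernel, dilations=(1, 2, 4)):
    """Run one branch per dilation (hypothetical rates) and concatenate per frame."""
    branches = [dilated_temporal_conv(seq, kernel, d) for d in dilations]
    return [list(frame) for frame in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # a single motion event at frame 4
    features = multi_scale_temporal_conv(impulse, [1, 1, 1])
    for d in (1, 2, 4):
        print("dilation", d, "spans", receptive_field(3, d), "frames")
    print(features[0], features[4])
```

In this sketch the small-dilation branch only responds to frames adjacent to the event, while the large-dilation branch already responds at the first frame of the clip, which is the sense in which branches with different receptive fields cover short-term and long-term dependencies.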
Received: 28 June 2024
Fund: National Natural Science Foundation of China (No.62406095), Natural Science Foundation of Anhui Province (No.2308085MF213), Key Research Plan of Anhui Province (No.2022K07020011), University Scientific Research Innovation Team Project of Anhui Province (No.2022AH010095)
Corresponding Authors:
YANG Jing, Ph.D., associate professor. Her research interests include deep learning, image processing and graph neural networks.
About the authors: WU Zhize, Ph.D., associate professor. His research interests include deep learning-driven image and video processing and analysis. CHEN Sheng, Master student. His research interests include graph neural networks and skeleton-based action recognition. TAN Ming, Master, professor. His research interests include pattern recognition and computer vision. SUN Fei, Master, professor. His research interests include pattern recognition and computer vision.