Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin University of Technology, Tianjin 300384
Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384
Abstract: Much attention is currently paid to action description algorithms based on depth data. However, there is still no robust, efficient and discriminative feature representation for depth data. To address this problem, a human action description algorithm based on dense depth spatio-temporal interest points is proposed. Multi-scale dense spatio-temporal interest points are first selected from the depth data and then tracked, and the trajectories of these points are recorded. Finally, the trajectory information is used to represent human action. Evaluated on the DHA, MSR Action 3D and UTKinect depth action datasets, the proposed algorithm shows better performance than several state-of-the-art algorithms.
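The pipeline summarized in the abstract (dense multi-scale point selection, point tracking, trajectory-based action description) can be illustrated with a minimal sketch. The Python snippet below assumes the depth frames have already been converted to 8-bit single-channel images; it samples a dense grid of points and tracks them with Farnebäck optical flow via cv2.calcOpticalFlowFarneback. The function name dense_trajectories, the grid step, the fixed track length and the displacement normalization are illustrative choices for this sketch, not the paper's exact implementation.

```python
import cv2
import numpy as np


def dense_trajectories(depth_frames, step=8, track_len=15):
    """Sample a dense grid of points on the first depth frame, track them
    with Farneback optical flow, and return normalized trajectory descriptors.

    NOTE: illustrative sketch only; parameters and normalization are assumptions.
    depth_frames: list of 8-bit single-channel (H, W) numpy arrays.
    """
    prev = depth_frames[0]
    h, w = prev.shape

    # Dense grid sampling (a stand-in for multi-scale dense interest-point selection)
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    points = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    tracks = [[tuple(p)] for p in points]

    for frame in depth_frames[1:]:
        # Dense optical flow between consecutive depth frames (Farneback method)
        flow = cv2.calcOpticalFlowFarneback(
            prev, frame, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

        new_points = []
        for track, (x, y) in zip(tracks, points):
            xi = int(np.clip(round(x), 0, w - 1))
            yi = int(np.clip(round(y), 0, h - 1))
            dx, dy = flow[yi, xi]          # displacement at the tracked location
            x, y = x + dx, y + dy
            track.append((x, y))
            new_points.append((x, y))
        points = np.array(new_points, dtype=np.float32)
        prev = frame

    # Keep a fixed-length prefix of each track and normalize the displacements
    # so the descriptor captures the shape of the motion, not absolute position
    descriptors = []
    for track in tracks:
        t = np.array(track[:track_len + 1], dtype=np.float32)
        disp = np.diff(t, axis=0)
        norm = np.linalg.norm(disp, axis=1).sum()
        if norm > 1e-6:
            descriptors.append((disp / norm).ravel())
    return np.array(descriptors)
```

In a bag-of-features setting, descriptors of this kind would then be quantized into a codebook and pooled per video to obtain the final action representation; that stage is omitted here.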