1. Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, Tianjin 300384
2. Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384
3. Microsoft, One Microsoft Way, Redmond, WA 98052 USA
Human behavior recognition is an active research topic in computer vision. However, most existing algorithms use only RGB or only depth video sequences, and few combine the two modalities. Because depth images and RGB images each have their own advantages and carry complementary information, this paper studies the characteristics of both and proposes two robust descriptors together with several schemes for fusing them. Support vector machine classifiers with different kernels are then used for classification. Extensive experiments on the challenging DHA dataset show that the proposed descriptors achieve higher accuracy than state-of-the-art algorithms. Moreover, combining depth and RGB information improves performance over either descriptor used alone. The proposed descriptors are also robust, discriminative, and stable.
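The abstract does not spell out the descriptors, the fusion schemes, or the kernels, so the following is only a minimal sketch of two common ways a depth descriptor and an RGB descriptor could be combined before or after an SVM. All function names, the L2 normalization step, the `depth_weight` parameter, and the choice of linear and RBF kernels are assumptions made for illustration; they are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a descriptor to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def early_fusion(depth_desc: Sequence[float], rgb_desc: Sequence[float]) -> Vector:
    """Feature-level fusion (assumed scheme): normalize each descriptor, then concatenate."""
    return l2_normalize(depth_desc) + l2_normalize(rgb_desc)


def late_fusion(depth_scores: Sequence[float],
                rgb_scores: Sequence[float],
                depth_weight: float = 0.5) -> Vector:
    """Decision-level fusion (assumed scheme): weighted sum of per-class scores."""
    if len(depth_scores) != len(rgb_scores):
        raise ValueError("score vectors must cover the same classes")
    w = depth_weight
    return [w * d + (1.0 - w) * r for d, r in zip(depth_scores, rgb_scores)]


def linear_kernel(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def gram_matrix(samples: Sequence[Sequence[float]],
                kernel: Callable[[Sequence[float], Sequence[float]], float]) -> List[Vector]:
    """Pairwise kernel values between all samples, as used by kernel SVMs."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Toy descriptors standing in for one depth and one RGB feature vector.
    fused = early_fusion([3.0, 4.0], [0.0, 2.0])
    print(fused)
    print(gram_matrix([fused, early_fusion([1.0, 0.0], [1.0, 1.0])], rbf_kernel))
```

A Gram matrix built this way could be handed to an SVM implementation that accepts precomputed kernels (for example, scikit-learn's `SVC(kernel="precomputed")`), which makes it straightforward to compare kernels on the same fused features.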