Abstract: Action recognition has recently become a research hotspot in video surveillance, virtual reality, human-computer interaction, and other fields. In this paper, action recognition is treated as a process of detecting action data, called the symbols of an action message, and then classifying distinct actions on the basis of action feature extraction and reception. On this basis, an overview of vision-based full-body action recognition techniques is presented, covering moving object detection, action feature extraction, and action feature perception, and the corresponding methods are classified. Finally, research trends in action recognition are discussed.
LI Rui-Feng, WANG Liang-Liang, WANG Ke. A Survey of Human Body Action Recognition [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2014, 27(1): 35-48 (in Chinese).