Abstract: To recognize sign language in video accurately, an algorithm based on depth image CamShift (DI_CamShift) and speeded-up robust features with bag of words (SURF-BoW) is proposed. A Kinect serves as the capture device, providing both the color video and the depth image information of sign language gestures. First, the principal-axis direction angle and the centroid of the depth image are calculated, and the search window is adjusted accordingly to track the gesture. Next, an Otsu algorithm based on the depth integral image is used for gesture segmentation, and SURF features are extracted. Finally, a SURF-BoW representation is built as the sign language feature, and a support vector machine (SVM) is used for recognition. The best recognition rate for a single manual alphabet letter reaches 99.37%, and the average recognition rate is 96.24%.
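The tracking step depends on the centroid and the principal-axis direction angle of the depth image. The following is a minimal sketch of how these two quantities might be computed from image moments, assuming NumPy; the function name `centroid_and_orientation` is hypothetical, and the code shows the standard moment-based formulation commonly used with CamShift, not necessarily the exact DI_CamShift computation.

```python
import numpy as np


def centroid_and_orientation(depth):
    """Return (cx, cy, theta) for a 2-D array of non-negative weights.

    cx, cy : centroid in pixel coordinates (x = column, y = row)
    theta  : principal-axis angle in radians, measured from the x axis
    """
    depth = np.asarray(depth, dtype=np.float64)
    rows, cols = depth.shape
    ys, xs = np.mgrid[0:rows, 0:cols]

    # Zeroth-order moment: total weight inside the search window.
    m00 = depth.sum()
    if m00 == 0:
        raise ValueError("search window contains no weight")

    # First-order moments give the centroid.
    cx = (xs * depth).sum() / m00
    cy = (ys * depth).sum() / m00

    # Normalized second-order central moments describe the spread.
    mu20 = ((xs - cx) ** 2 * depth).sum() / m00
    mu02 = ((ys - cy) ** 2 * depth).sum() / m00
    mu11 = ((xs - cx) * (ys - cy) * depth).sum() / m00

    # Orientation of the principal axis of the weight distribution.
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta


# Usage: a diagonal stroke in a 5x5 window is centered at (2, 2)
# and oriented at 45 degrees to the x axis.
cx, cy, theta = centroid_and_orientation(np.eye(5))
print(cx, cy, np.degrees(theta))
```

In a tracking loop, the centroid would recenter the search window and the angle would orient it on each frame; how the window size is updated is left out of this sketch.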
杨全,彭进业. 基于深度信息和SURF-BoW的中国手语识别算法[J]. 模式识别与人工智能, 2014, 27(8): 741-749.
YANG Quan, PENG Jin-Ye. Chinese Sign Language Recognition Method Based on Depth Image Information and SURF-BoW. Pattern Recognition and Artificial Intelligence, 2014, 27(8): 741-749.