3D Object Recognition via Convolutional-Recursive Neural Network and Kernel Extreme Learning Machine
LIU Yangyang, ZHANG Jun, GAO Xinjian, ZHANG Xudong, GAO Jun
School of Computer and Information, Hefei University of Technology, Hefei 230009
Abstract To address poor depth quality and non-linear classification in large-scale RGB-D datasets, a 3D object recognition method is proposed based on a convolutional-recursive neural network (CNN-RNN) and a kernel extreme learning machine (KELM). First, a depth coding algorithm is introduced to correct numerical losses and noise in the original depth cue and to unify the point cloud to a standard angle; the original depth and the encoded depth are then fused into a new depth cue. Second, multi-cue hierarchical features are learned with the CNN-RNN, and a two-way spatial pyramid pooling method is applied to each cue. Finally, a KELM is constructed as the classifier to recognize 3D objects. Experimental results demonstrate that the proposed method improves both 3D object recognition accuracy and classification efficiency.
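The final classification stage can be illustrated with a minimal sketch of a kernel extreme learning machine, assuming the standard closed-form solution in which the output weights are beta = (I/C + K)^-1 T, with K the kernel matrix over the training features and T the one-hot class targets. The RBF kernel, the names `KernelELM`, `C` and `gamma`, their values, and the toy 2-D data are all assumptions made for illustration; the abstract does not state the authors' kernel, hyperparameters, or implementation.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


class KernelELM:
    """Illustrative kernel extreme learning machine (not the authors' code)."""

    def __init__(self, C=100.0, gamma=0.5):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        self.classes_ = np.unique(y)
        # One-hot target matrix T with one column per class.
        T = (np.asarray(y)[:, None] == self.classes_[None, :]).astype(float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: beta = (I / C + K)^-1 T.
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, T)
        return self

    def predict(self, X):
        K_test = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        scores = K_test @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


# Toy stand-in for learned features: two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
y_toy = np.array([0] * 20 + [1] * 20)
model = KernelELM().fit(X_toy, y_toy)
predictions = model.predict(np.array([[0.1, -0.1], [2.9, 3.2]]))
```

Training reduces to a single linear solve, with no iterative weight updates, which is one plausible reason such a classifier can be efficient. In the pipeline described above, the rows of `X` would be the pooled multi-cue features rather than toy points.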
Received: 27 July 2017
Fund: Supported by the National Natural Science Foundation of China (No. 61403116), the China Postdoctoral Science Foundation (No. 2014M560507), and the Fundamental Research Funds for the Central Universities (No. JZ2016HGBZ0762, JZ2016HGTB0721)
About the authors: (LIU Yangyang, born in 1992, master's student. His research interests include intelligent information processing.) (ZHANG Jun (corresponding author), born in 1984, Ph.D., associate professor. Her research interests include computer vision, pattern recognition and cognitive science.) (GAO Xinjian, born in 1990, Ph.D. candidate. His research interests include intelligent information processing.) (ZHANG Xudong, born in 1966, Ph.D., professor. His research interests include intelligent information processing and pattern recognition.) (GAO Jun, born in 1963, Ph.D., professor. His research interests include intelligent information processing and pattern recognition.)