Abstract: In scene classification based on convolutional neural networks, training on a small dataset tends to cause over-fitting, owing to the large number of training iterations and poor convergence. To mitigate this effect, a scene classification algorithm with an adaptive learning rate and sample training mode is proposed. Within the convolutional neural network framework, the learning rate is adjusted adaptively according to the variation of the error function during training. When the error function changes only slightly, the learning rate of the batch is left unchanged. When the error function changes more markedly, the change in the learning rate is inversely proportional to the change in the error function. Meanwhile, the sample training mode is switched according to the network output, so that inaccurately recognized images receive additional training. Experimental results on the Scene-15 and Cifar-10 datasets show that the proposed method improves the convergence of the network and its classification accuracy, especially for complex scenes such as indoor scenes.
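The two rules described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the threshold `tol`, the proportionality constant `k`, the clamping bounds `lr_min` and `lr_max`, the direction of the adjustment (the rate shrinks when the error grows and grows when the error falls), and the function names themselves. It should not be read as the authors' implementation.

```python
def adapt_learning_rate(lr, prev_error, curr_error, tol=1e-3, k=1e-6,
                        lr_min=1e-5, lr_max=1.0):
    """Return the learning rate to use for the next batch.

    tol, k, lr_min and lr_max are hypothetical parameters chosen for
    illustration; they do not come from the paper.
    """
    delta_e = curr_error - prev_error
    if abs(delta_e) <= tol:
        # The error changed only slightly: keep the learning rate unchanged.
        return lr
    # The size of the adjustment is inversely proportional to the size
    # of the change in the error function.
    step = k / abs(delta_e)
    # Assumed direction: decrease the rate if the error rose, increase it
    # if the error fell.
    new_lr = lr - step if delta_e > 0 else lr + step
    # Clamp so that a very small error change cannot produce an extreme rate.
    return min(max(new_lr, lr_min), lr_max)


def select_hard_samples(predicted, labels):
    """Return the indices of samples whose predicted label is wrong.

    These are the images that would be trained on again with emphasis.
    """
    return [i for i, (p, y) in enumerate(zip(predicted, labels)) if p != y]
```

In a training loop, `adapt_learning_rate` would be called once per batch with the previous and current error values, and `select_hard_samples` would be applied to the network output to decide which images to revisit. How often the training mode is switched is not stated in the abstract and is left open here.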
储珺, 苏亚伟, 王璐. 自适应调节学习率和样本训练方式的场景分类[J]. 模式识别与人工智能, 2018, 31(7): 625-633.
CHU Jun, SU Yawei, WANG Lu. Scene Classification with Adaptive Learning Rate and Sample Training Mode. Pattern Recognition and Artificial Intelligence, 2018, 31(7): 625-633.