Abstract: The key to convolutional neural network based image classification is extracting distinctive, important features. To focus on crucial features and enhance the generalization ability of the model, a double-branch multi-attention sharpness-aware classification network (DAMSNet) is proposed. Built on the ResNet-34 residual network, the convolutional kernel size of the input layer is modified and the max pooling layer is removed to reduce the loss of original image features. A double-branch multi-attention module is then designed and embedded into the residual branches to extract global and local contextual information in both the channel and spatial domains. In addition, the sharpness-aware minimization (SAM) algorithm is introduced and combined with the stochastic gradient descent optimizer to minimize the loss value and the loss sharpness simultaneously, seeking neighborhoods of parameters with uniformly low loss and thereby improving the generalization ability of the network. Experiments on the CIFAR-10, CIFAR-100 and SVHN datasets demonstrate that DAMSNet achieves high classification accuracy and effectively enhances the generalization ability of the network.
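As a rough illustration of how SAM can be wrapped around an SGD optimizer as described above, the sketch below performs the two forward/backward passes of one training step: the weights are first perturbed along the normalized gradient direction by a radius `rho`, the gradient is recomputed at the perturbed point, and SGD then updates the original weights with that gradient. This is a minimal sketch, not the authors' implementation; `model`, `criterion`, `optimizer`, the batch `(x, y)`, and the name `sam_sgd_step` are assumptions introduced here for illustration.

```python
# Illustrative sketch of one SAM + SGD training step (PyTorch).
# Assumes `model`, `criterion`, `optimizer` (torch.optim.SGD) and a batch (x, y) exist.
import torch

def sam_sgd_step(model, criterion, x, y, optimizer, rho=0.05):
    # First forward/backward pass: gradient at the current weights.
    loss = criterion(model(x), y)
    loss.backward()

    # Ascent step e = rho * g / ||g||: move to the worst-case neighbor.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)  # perturb weights toward higher loss
            eps.append(e)

    # Second forward/backward pass: gradient at the perturbed weights.
    optimizer.zero_grad()
    criterion(model(x), y).backward()

    # Undo the perturbation, then let SGD update with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

In this two-pass scheme the extra cost is one additional forward/backward pass per step; the plain SGD update rule itself is unchanged, only the gradient it consumes is taken at the perturbed point.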