Abstract: With the widespread application of deep learning, facial expression recognition technology has developed rapidly. However, extracting multi-scale features and making efficient use of key features remain challenges for facial expression recognition networks. To address these problems, pyramidal convolution is employed to extract multi-scale features effectively, and a spatial-channel attention mechanism is introduced to strengthen the representation of key features. On this basis, an expression recognition network built on a residual attention mechanism and pyramidal convolution is constructed to improve recognition accuracy. A multi-task convolutional neural network is used for face detection, face cropping and face alignment, and the preprocessed images are then fed to the feature extraction network. The network is trained with a combination of Softmax Loss and Center Loss, which reduces the variation among samples of the same expression and enlarges the distance between different expressions. Experiments on the Fer2013 and CK+ datasets show that the proposed network achieves high accuracy with a small number of parameters, making it better suited to real-world expression recognition applications.
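The joint training objective mentioned above can be illustrated with a minimal sketch. It assumes the standard form of center loss (half the squared distance between a feature vector and the center of its class) added to the softmax cross-entropy with a weighting factor. The function names, the weight `lam` and its default value, and the averaging over the batch are all illustrative assumptions; the abstract does not specify them.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax loss for one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def center_loss(feature, center):
    """Center loss for one sample: half the squared distance to its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(features, logits, labels, centers, lam=0.01):
    """Batch mean of softmax loss + lam * center loss.

    lam is a hypothetical weighting factor; its value is not given in the abstract.
    centers[y] is the current center of class y (assumed to be maintained elsewhere).
    """
    total = 0.0
    for feat, z, y in zip(features, logits, labels):
        total += softmax_cross_entropy(z, y) + lam * center_loss(feat, centers[y])
    return total / len(labels)
```

In this sketch the softmax term separates the classes, while the center term pulls each feature toward the center of its own class, which is the intra-class compaction effect the abstract describes. A larger `lam` puts more emphasis on compact clusters.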