Label-Guided Dual-Attention Deep Neural Network Model
PENG Zhanwang1, ZHU Xiaofei1, GUO Jiafeng2
1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054; 2. Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
Abstract: Because some datasets do not include the textual information of labels, existing explicit-interaction classification models cannot explicitly compute the semantic relationship between text words and labels. To address this problem, a label-guided dual-attention deep neural network model is proposed. First, an automatic method for generating category label descriptions based on inverse label frequency is introduced, and a specific description is generated for each label. The generated descriptions are then used to explicitly compute the semantic relationship between text words and labels. On this basis, a text encoder learns review text representations with contextual information, and a label-guided dual-attention network learns a self-attention-based text representation and a label-attention-based text representation, respectively. An adaptive gating mechanism then fuses the two representations to obtain the final text representation. Finally, a two-layer feedforward neural network serves as the classifier for sentiment classification. Experiments on three publicly available real-world datasets show that the proposed model achieves better classification performance.
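The dual-attention-and-gating pipeline described in the abstract can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the dimensions, the random projections standing in for learned parameters, and the mean-pooling of the label-attention view are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical dimensions: n words, k labels, hidden size d.
n, k, d = 6, 3, 4
rng = np.random.default_rng(0)
H = rng.standard_normal((n, d))   # contextual word representations from the text encoder
C = rng.standard_normal((k, d))   # label representations built from generated label descriptions

# Self-attention branch: words attend to words.
A_self = softmax(H @ H.T / np.sqrt(d), axis=-1)      # (n, n), rows sum to 1
R_self = A_self @ H                                  # (n, d) self-attention view

# Label-attention branch: labels attend to words.
A_label = softmax(C @ H.T / np.sqrt(d), axis=-1)     # (k, n)
R_label = (A_label @ H).mean(axis=0, keepdims=True)  # (1, d) pooled label-aware view
R_label = np.repeat(R_label, n, axis=0)              # broadcast to every word position

# Adaptive gate: a sigmoid gate (random here, learned in the model) decides,
# per position and dimension, how much of each view enters the final representation.
W_g = rng.standard_normal((2 * d, d))
g = 1.0 / (1.0 + np.exp(-np.concatenate([R_self, R_label], axis=-1) @ W_g))
R_final = g * R_self + (1.0 - g) * R_label           # (n, d) fused text representation

print(R_final.shape)  # (6, 4)
```

In the full model, `R_final` would be pooled into a single document vector and passed to the two-layer feedforward classifier; the sketch stops at the fused per-word representation.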