1. School of Information Engineering, Huzhou University, Huzhou 313000;
2. Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou University, Huzhou 313000;
3. School of Computer Science and Technology, Zhejiang Normal University, Jinhua 321004;
4. School of Science and Engineering, Huzhou College, Huzhou 313000
To address the insufficient exploitation of inter-node information and the overfitting problem in existing short text classification methods based on heterogeneous graph neural networks, a method for semi-supervised short text classification with a gated double-layer heterogeneous graph attention network (GDHG) is proposed. GDHG consists of two layers: a node attention mechanism and a gated heterogeneous graph attention network. First, the node attention mechanism is employed to train attention coefficients for different types of nodes, and these coefficients are then fed into the gated heterogeneous graph attention network to obtain the gated double-layer attention. Next, the gated double-layer attention is multiplied by the different states of the nodes to produce the aggregated node features. Finally, the short texts are classified with the softmax function. GDHG exploits the information-forgetting mechanism of the node attention mechanism and the gated heterogeneous graph attention network to aggregate node information, thereby capturing effective information from neighboring nodes, mining the hidden information of different neighbors, and improving the ability to aggregate information from remote nodes. Experiments on four short text datasets, Twitter, MR, Snippets and AGNews, demonstrate the superior performance of GDHG.
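To make the two-stage design concrete, the following is a minimal, illustrative PyTorch sketch of the pipeline the abstract describes: type-specific node attention coefficients are computed first, then a sigmoid forgetting gate blends the attended neighbor information with each node's own state, and the result is classified with softmax. All class, function, and parameter names here (GDHGLayer, type_proj, gate, and so on) are hypothetical illustrations, not the authors' implementation; the paper's exact per-type attention and gating equations may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GDHGLayer(nn.Module):
    """Illustrative sketch only: type-aware node attention followed by a
    gated (forgetting) aggregation, as outlined in the abstract."""

    def __init__(self, in_dim, hid_dim, num_classes, num_types):
        super().__init__()
        # One linear projection per node type (e.g., document, entity, topic).
        self.type_proj = nn.ModuleList(
            nn.Linear(in_dim, hid_dim) for _ in range(num_types))
        self.attn = nn.Linear(2 * hid_dim, 1)        # node-level attention scorer
        self.gate = nn.Linear(2 * hid_dim, hid_dim)  # information-forgetting gate
        self.cls = nn.Linear(hid_dim, num_classes)   # classifier head

    def forward(self, x, adj, node_type):
        # x: (n, in_dim) node features; adj: (n, n) adjacency with self-loops;
        # node_type: length-n list of integer type ids.
        h = torch.stack([self.type_proj[t](x[i])
                         for i, t in enumerate(node_type)])

        # Stage 1: attention coefficients over neighbors (GAT-style scoring).
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = F.leaky_relu(self.attn(pair)).squeeze(-1)
        scores = scores.masked_fill(adj == 0, float('-inf'))
        alpha = F.softmax(scores, dim=-1)            # (n, n) coefficients

        # Stage 2: gated aggregation -- the sigmoid gate decides how much
        # attended neighbor information each node keeps versus forgets.
        m = alpha @ h                                # attended neighbor summary
        z = torch.sigmoid(self.gate(torch.cat([h, m], dim=-1)))
        h_new = z * m + (1 - z) * h                  # gated double-layer attention

        # Final step: softmax classification over the aggregated features.
        return F.log_softmax(self.cls(h_new), dim=-1)
```

In the semi-supervised setting the abstract describes, a full model would presumably stack such layers over the heterogeneous text graph and train with a cross-entropy loss computed only on the labeled document nodes.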