Abstract: Traditional single-label feature extraction methods cannot effectively solve the problem of multi-label text classification. To address this problem, a deep topic feature extraction model (DTFEM), a dual model combining latent Dirichlet allocation (LDA) and long short-term memory (LSTM), is proposed in this paper. LDA and LSTM are employed as two parallel channels: LDA models the global features of the text, while LSTM models its local features. DTFEM thus expresses the global and local features of the text simultaneously and effectively combines supervised and unsupervised learning to extract text features at different levels. Experimental results show that DTFEM is superior to traditional text feature extraction models and clearly improves the performance indicators of multi-label text classification tasks.
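The abstract does not give the network configuration, so the sketch below only illustrates the dual-channel idea it describes: an LDA document-topic vector (global, unsupervised) concatenated with an LSTM encoding of the word sequence (local, supervised), feeding a sigmoid layer for multi-label prediction. The hyperparameters (topic count, embedding size, LSTM units) and the concatenation-based fusion are assumptions for illustration, not the authors' settings; scikit-learn and Keras stand in for whatever toolkit the paper used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from tensorflow.keras import layers, Model

def lda_topic_features(texts, num_topics=64):
    """Unsupervised global channel: fit LDA on bag-of-words counts and
    return each document's topic distribution (documents x num_topics)."""
    counts = CountVectorizer(max_features=20000).fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=num_topics, random_state=0)
    return lda.fit_transform(counts)

def build_dual_channel_model(vocab_size=20000, seq_len=200,
                             num_topics=64, lstm_units=128, num_labels=10):
    # Channel 1 (unsupervised, global): the precomputed LDA
    # document-topic vector, fed in as dense features.
    topic_in = layers.Input(shape=(num_topics,), name="lda_topics")

    # Channel 2 (supervised, local): word-id sequence -> embedding -> LSTM.
    seq_in = layers.Input(shape=(seq_len,), name="word_ids")
    x = layers.Embedding(vocab_size, 100)(seq_in)
    x = layers.LSTM(lstm_units)(x)

    # Fuse global topic features with local sequential features.
    merged = layers.Concatenate()([topic_in, x])

    # Sigmoid outputs + binary cross-entropy: the standard multi-label
    # setup, so each label is predicted independently.
    out = layers.Dense(num_labels, activation="sigmoid")(merged)

    model = Model(inputs=[topic_in, seq_in], outputs=out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```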
CHEN Wenshi, LIU Xinhui, LU Mingyu. Feature Extraction of Deep Topic Model for Multi-label Text Classification. Pattern Recognition and Artificial Intelligence, 2019, 32(9): 785-792.