Interpretable Document Classification Based on a Generative-Discriminative Hybrid Model
WANG Qiang1,2,3, CHEN Zhihao1,2,3, XU Qing1,2,3, BAO Liang1,2,3, LIAO Xiangwen1,2,3
1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116; 2. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou 350116; 3. Digital Fujian Institute of Financial Big Data, Fuzhou University, Fuzhou 350116
Abstract: Existing interpretable document classification methods neither mine text information deeply nor account for the semantic relationships between words and their context and between sentences and their context. Therefore, an interpretable document classification method based on a generative-discriminative hybrid model is proposed. A hierarchical attention mechanism is introduced into the document encoder to obtain document representations rich in contextual semantic information, so that more accurate classification results and explanatory information are generated and the insufficient text-information mining of existing models is remedied. Experiments on the PCMag and Skytrax review datasets show that the proposed method achieves better document classification performance, generates more accurate explanatory information, and improves overall performance.
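The hierarchical attention encoder mentioned in the abstract can be illustrated with a minimal NumPy sketch: word-level attention pools each sentence's word vectors into a sentence vector, and sentence-level attention pools those into a single document representation. All names, dimensions, and the random-context-vector setup below are illustrative assumptions, not the paper's implementation; in practice the context vectors and the underlying word/sentence encoders are learned jointly with the classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(h, u):
    """Attend over the rows of h (seq_len, dim) using context vector u (dim,).

    Returns a weighted sum of the rows; the weights also serve as an
    importance signal over positions (words or sentences)."""
    scores = softmax(h @ u)   # one normalized weight per position
    return scores @ h         # weighted sum -> (dim,)

def encode_document(doc, u_word, u_sent):
    """doc: list of sentences, each an array of shape (n_words, dim).

    Word-level attention yields one vector per sentence; sentence-level
    attention then pools those vectors into one document representation."""
    sent_vecs = np.stack([attention_pool(s, u_word) for s in doc])
    return attention_pool(sent_vecs, u_sent)

dim = 8
# Two sentences with 5 and 3 word vectors (hypothetical embeddings).
doc = [rng.standard_normal((5, dim)), rng.standard_normal((3, dim))]
u_word = rng.standard_normal(dim)   # learned parameters in the real model
u_sent = rng.standard_normal(dim)
d = encode_document(doc, u_word, u_sent)
print(d.shape)  # (8,)
```

The per-position attention weights are what make this encoder useful for interpretability: they indicate which words and sentences contributed most to the document representation and, hence, to the classification decision.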