Dual View Contrastive Learning Networks for Multi-hop Reading Comprehension
CHEN Jinwen1,2, CHEN Yuzhong1,2
1. College of Computer and Data Science, Fuzhou University, Fuzhou 350108; 2. Fujian Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou 350108
Abstract: Multi-hop reading comprehension is an important task in machine reading comprehension. It aims to construct a multi-hop reasoning chain across multiple documents in order to answer questions that require combining evidence from several of them. Graph neural networks are widely applied to this task, but existing approaches have two shortcomings: they capture insufficient contextual mutual information for the multi-document reasoning chain, and they introduce noise because some candidate answers are mistakenly judged correct solely on the basis of their similarity to the question. To address these issues, dual view contrastive learning networks (DVCGN) for multi-hop reading comprehension are proposed. First, a heterogeneous graph-based node-level contrastive learning method is employed. Positive and negative sample pairs are generated at the node level, and both node-level and feature-level corruptions are applied to the heterogeneous graph to construct dual views. The two corrupted views are updated iteratively through a graph attention network. DVCGN maximizes the similarity of node representations across the dual views, which allows it to learn node representations that carry rich contextual semantic information and that accurately model both the current node and its relationship with the remaining nodes in the reasoning chain. Consequently, multi-granularity contextual information is effectively distinguished from interfering information, and richer mutual information is constructed for the reasoning chain. Second, a question-guided graph node pruning method is proposed. It leverages question information to filter answer entity nodes, narrowing the range of candidate answers and mitigating the noise caused by similar expressions in evidence sentences. Finally, experimental results on the HotpotQA dataset demonstrate the superior performance of DVCGN.
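The dual-view idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss function, the corruption scheme, or any hyperparameters, so the InfoNCE-style objective, the random feature masking, the temperature value, and all function names below are assumptions chosen for illustration. The graph attention update and the node-level corruption are omitted.

```python
import math
import random


def corrupt_features(nodes, drop_prob, rng):
    """Feature-level corruption (assumed scheme): zero out each feature
    dimension for all nodes with probability drop_prob."""
    dim = len(nodes[0])
    mask = [0.0 if rng.random() < drop_prob else 1.0 for _ in range(dim)]
    return [[x * m for x, m in zip(vec, mask)] for vec in nodes]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def node_contrastive_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style node-level loss (an assumed objective).

    Node i in view A and node i in view B form the positive pair;
    every other node in view B serves as a negative for node i.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        top = max(logits)  # subtract the max for numerical stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical node embeddings: 4 nodes with 8 features each.
    nodes = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
    view_a = corrupt_features(nodes, 0.2, rng)
    view_b = corrupt_features(nodes, 0.2, rng)
    print(node_contrastive_loss(view_a, view_b))
```

Minimizing this loss pulls the two views of the same node together and pushes different nodes apart, which is one common way to realize "maximizing the similarity of node representations in dual views".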