Abstract: Existing Chinese event extraction methods model the dependencies between an event trigger word and its arguments inadequately. This weakens information interaction within an event and degrades argument extraction, especially when argument roles overlap. To address this issue, this paper proposes a Chinese Event Extraction Method Based on Graph Attention and Table Pointer Network (ATCEE). First, pre-trained character vectors and part-of-speech tagging vectors are fused as input features, and a bidirectional long short-term memory network produces enhanced semantic features of the event text. Next, a character-level dependency syntax graph is constructed and fed into a multi-layer graph attention network to capture long-range dependencies among the constituents of the text. A table filling strategy then fuses these features, further strengthening the dependencies between a trigger word and all of its corresponding arguments. Finally, the learned table features are passed to a fully connected layer and a table pointer network layer for joint extraction of trigger words and arguments; decoding argument boundaries with the table pointer network improves the identification of long argument entities. Experimental results indicate that ATCEE significantly outperforms previous event extraction methods on two Chinese benchmark datasets, ACE2005 and DuEE1.0, and that the character-level dependency features and the table filling strategy resolve the argument role overlap problem to some extent. The source code of ATCEE is available at https://github.com/event6/ATCEE.
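The boundary decoding step mentioned above can be illustrated with a minimal sketch. It is not taken from the ATCEE source code. It assumes that, for one trigger and one argument role, the pointer layer outputs a start probability and an end probability for every character, and that each confident start is paired with the nearest confident end. The function name `decode_spans`, the threshold of 0.5, the `max_len` limit, and the example sentence are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def decode_spans(start_probs: List[float],
                 end_probs: List[float],
                 threshold: float = 0.5,
                 max_len: int = 50) -> List[Tuple[int, int]]:
    """Pair each confident start position with the nearest confident end.

    Assumption: this mirrors a common pointer-style decoding scheme; the
    actual ATCEE decoder may differ. Returned spans are inclusive
    (start, end) character indices.
    """
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        # Look for the first end position at or after the start,
        # within max_len characters.
        for j in range(i, min(i + max_len, len(end_probs))):
            if end_probs[j] >= threshold:
                spans.append((i, j))
                break
    return spans


def peaked(n: int, idx: int, high: float = 0.9, low: float = 0.1) -> List[float]:
    """Hypothetical helper: a probability row that peaks at one position."""
    return [high if k == idx else low for k in range(n)]


# Hypothetical example: trigger "逮捕" (arrest), argument role "Place".
text = "警方在上海市浦东新区逮捕了嫌疑人"
start = peaked(len(text), 3)   # the argument starts at "上"
end = peaked(len(text), 9)     # the argument ends at "区"

place_spans = decode_spans(start, end)
place_text = [text[s:e + 1] for s, e in place_spans]
print(place_spans, place_text)  # [(3, 9)] ['上海市浦东新区']
```

Because only two boundary decisions are needed per argument, a seven-character argument such as the one above is recovered as a single span, whereas per-character tagging would need seven consistent labels. This is one plausible reading of why boundary pointers help with long arguments.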
LIU Wei, MA Yawei, PENG Yan, LI Weimin. Chinese Event Extraction Method Based on Graph Attention and Table Pointer Network (基于图注意力和表指针网络的中文事件抽取方法)[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2023, 36(5): 459-470.