Abstract: To explore whether parsing accuracy can be increased by linguistic means, a Chinese dependency parsing experiment is conducted using MaltParser and a self-built treebank. A detailed analysis of the parsing results yields suggestions for improving the parser's performance, and these suggestions are used to guide modifications to the treebank's annotation scheme. Experimental results show that the unlabeled attachment score increases by 5.5% and the labeled attachment score rises by 7.5%.
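The two metrics reported above can be illustrated with a minimal sketch. It assumes the standard definitions: the unlabeled attachment score (UAS) is the fraction of tokens assigned the correct head, and the labeled attachment score (LAS) is the fraction assigned both the correct head and the correct dependency label. The function name, the token representation, and the toy labels (`subj`, `obj`, `atr`, `adv`) are hypothetical and are not taken from the treebank or from MaltParser.

```python
from typing import List, Tuple

# Hypothetical representation: one (head_index, dependency_label) pair per token.
# Head index 0 denotes the artificial root of the sentence.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], predicted: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one or more sentences flattened into token lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of tokens")
    if not gold:
        raise ValueError("at least one token is required")

    # UAS counts tokens whose predicted head matches the gold head.
    head_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, predicted) if g_head == p_head)
    # LAS additionally requires the dependency label to match.
    labeled_hits = sum(1 for g, p in zip(gold, predicted) if g == p)

    total = len(gold)
    return head_hits / total, labeled_hits / total


# Toy four-token sentence with made-up labels.
gold = [(2, "subj"), (0, "root"), (4, "atr"), (2, "obj")]
predicted = [(2, "subj"), (0, "root"), (2, "atr"), (2, "adv")]

uas, las = attachment_scores(gold, predicted)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")  # prints "UAS = 0.75, LAS = 0.50"
```

In this toy case the third token has the wrong head, and the fourth has the right head but the wrong label, so LAS is lower than UAS. Because LAS can never exceed UAS on the same data, a change to the annotation scheme that simplifies label distinctions may raise LAS more than UAS.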