A Mongolian-English Word Alignment Approach Based on Discriminative Model
ZHANG Guan-Hong1, Odbal2, GONG Zheng3
1. Key Laboratory of Network and Intelligent Information Processing, Department of Computer Science and Technology, Hefei University, Hefei 230601
2. Research Center for Biomimetic Sensing and Control, Institutes of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031
3. College of Computer Science, Inner Mongolia University, Hohhot 010021
Abstract Word alignment is an essential problem in natural language processing. A discriminative word alignment method based on the linear CRF model is proposed for the Mongolian-English language pair. To account for the differences between Mongolian and English, morphological, lexical and part-of-speech features are incorporated into the CRF model, and a dual-layer CRF word alignment model is constructed. In the first layer, the sentence is split into chunks and the chunks are aligned. In the second layer, the words within the chunks are aligned using the CRF word alignment model. Experimental results on a Mongolian-English task show that the proposed method improves word alignment performance.
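
The two-layer control flow described in the abstract (align chunks first, then align words inside each aligned chunk pair) can be sketched roughly as follows. This is only an illustration under stated assumptions: the trained CRF is replaced by a hypothetical lookup scorer `word_score`, the greedy best-match decoding is a stand-in for CRF inference, and the tokens `s1`, `t1`, etc. are placeholders rather than real Mongolian or English data. None of these names come from the paper.

```python
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[object, object], float]


def align_greedy(src: list, tgt: list, score: Scorer) -> List[Tuple[int, Optional[int]]]:
    """Link each source item to its best-scoring target index, or None if no score is positive."""
    links = []
    for i, s in enumerate(src):
        best_j, best = None, 0.0
        for j, t in enumerate(tgt):
            value = score(s, t)
            if value > best:
                best_j, best = j, value
        links.append((i, best_j))
    return links


def chunk_offsets(chunks: List[List[str]]) -> List[int]:
    """Sentence-level position of the first word of each chunk."""
    offsets, total = [], 0
    for chunk in chunks:
        offsets.append(total)
        total += len(chunk)
    return offsets


def two_layer_align(src_chunks: List[List[str]],
                    tgt_chunks: List[List[str]],
                    word_score: Scorer) -> List[Tuple[int, int]]:
    """Layer 1 aligns chunks; layer 2 aligns words inside each aligned chunk pair.

    Assumes every chunk is non-empty.
    """
    # Hypothetical chunk score: average of each source word's best match in the target chunk.
    def chunk_score(sc: List[str], tc: List[str]) -> float:
        return sum(max(word_score(s, t) for t in tc) for s in sc) / len(sc)

    src_off, tgt_off = chunk_offsets(src_chunks), chunk_offsets(tgt_chunks)
    result = []
    for ci, cj in align_greedy(src_chunks, tgt_chunks, chunk_score):
        if cj is None:
            continue  # chunk left unaligned
        for wi, wj in align_greedy(src_chunks[ci], tgt_chunks[cj], word_score):
            if wj is not None:
                result.append((src_off[ci] + wi, tgt_off[cj] + wj))
    return result


# Placeholder data: a toy lexicon standing in for the trained model's scores.
lexicon = {("s1", "t1"): 0.9, ("s2", "t2"): 0.8, ("s3", "t3"): 0.7}
links = two_layer_align(
    [["s1", "s2"], ["s3"]],
    [["t3"], ["t1", "t2"]],
    lambda s, t: lexicon.get((s, t), 0.0),
)
print(links)  # [(0, 1), (1, 2), (2, 0)]
```

Restricting the second layer to words inside an aligned chunk pair shrinks the search space for word links, which is one plausible motivation for a layered design; the paper's actual features and decoding are not reproduced here.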
Received: 07 September 2010