Semi-supervised Neural Machine Translation Based on Sentence-Level BLEU Metric Data Selection
YE Shaolin, GUO Wu
National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027
|
|
Abstract A language model trained on monolingual data improves the performance of statistical machine translation, but monolingual corpora cannot be exploited as effectively by neural machine translation. To address this problem, a semi-supervised neural machine translation model based on sentence-level bilingual evaluation understudy (BLEU) metric data selection is proposed. Candidate translations for the unlabeled data are first generated by statistical machine translation and neural machine translation models, respectively. The candidate translations are then filtered by sentence-level BLEU, and the selected translations are added to the labeled dataset for semi-supervised joint training. Experimental results demonstrate that the proposed algorithm exploits unlabeled data effectively. On the NIST Chinese-English translation tasks, the proposed method achieves a clear improvement over a baseline system trained only on the labeled data.
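The selection step described above can be sketched in Python. This is a minimal illustration, not the paper's implementation: it assumes that the NMT candidate is scored with smoothed sentence-level BLEU against the SMT candidate used as a pseudo-reference, and that pairs scoring above a threshold are kept; the function names, the add-one smoothing, and the threshold value are all illustrative assumptions.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Smoothed sentence-level BLEU (add-one on each n-gram precision,
    standard brevity penalty). Inputs are whitespace-tokenized strings."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        c_ngrams = ngrams(cand, n)
        r_ngrams = ngrams(ref, n)
        overlap = sum(min(c, r_ngrams[g]) for g, c in c_ngrams.items())
        total = sum(c_ngrams.values())
        # add-one smoothing avoids zero precisions on short sentences
        log_prec_sum += math.log((overlap + 1) / (total + 1))
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity_penalty * math.exp(log_prec_sum / max_n)

def select_pseudo_parallel(sources, nmt_hyps, smt_hyps, threshold=0.3):
    """Keep (source, NMT hypothesis) pairs whose NMT output agrees with
    the SMT output (pseudo-reference) above a sentence-BLEU threshold."""
    selected = []
    for src, nmt, smt in zip(sources, nmt_hyps, smt_hyps):
        if sentence_bleu(nmt, smt) >= threshold:
            selected.append((src, nmt))
    return selected
```

The surviving pairs would then be appended to the labeled corpus before the next round of joint training; the threshold trades pseudo-label quality against coverage.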
|
Received: 12 May 2017
|
|
Fund: Supported by National Key Research and Development Program of China (No.2016YFB1001303)
About author: YE Shaolin, born in 1993, master student. His research interests include machine translation. GUO Wu (corresponding author), born in 1973, Ph.D., associate professor. His research interests include speech signal processing and natural language processing.
|
|
|
|
|
|