Domain Machine Translation Method with Dynamic Incorporation of k-Nearest Neighbor Knowledge
HUANG Yuxin1,2, SHEN Tao1,2, JIANG Shuting1,2, ZENG Hao1,2, LAI Hua1,2
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504; 2. Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming University of Science and Technology, Kunming 650504
Abstract Domain machine translation methods based on k-nearest neighbor retrieval improve translation quality by incorporating translation knowledge retrieved from a translation knowledge base. Existing methods enhance translation performance by fusing the decoder prediction distribution with k-nearest neighbor knowledge. However, inaccurate retrieved k-nearest neighbor knowledge may interfere with the prediction results of the model. To address this issue, a domain machine translation method with dynamic incorporation of k-nearest neighbor knowledge is proposed. The confidence of the decoder output distribution is first assessed. Combined with a gating mechanism, the proposed method dynamically decides whether to incorporate the k-nearest neighbor retrieval results, thereby flexibly adjusting the degree to which k-nearest neighbor knowledge is incorporated. An adaptive k-value module is introduced to reduce the interference caused by incorrect k-nearest neighbor knowledge. In addition, a distribution-guided loss is designed to steer the model output gradually toward the target distribution. On four domain-specific German-English machine translation datasets, the proposed method achieves improvements in translation quality.
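The confidence-gated fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gate network `gate_net`, the confidence threshold `tau`, and all tensor shapes are assumptions, and the decoder's maximum softmax probability is used as a stand-in confidence measure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_knn(decoder_logits, knn_distribution, gate_net, tau=0.5):
    """Dynamically fuse the decoder distribution with a retrieved
    kNN distribution, gated by the decoder's own confidence.

    decoder_logits:   (batch, vocab) raw decoder scores
    knn_distribution: (batch, vocab) distribution built from retrieved
                      k-nearest-neighbor (key, value) pairs
    gate_net:         small network mapping confidence -> gate logit
                      (hypothetical component)
    tau:              confidence threshold above which retrieval is skipped
    """
    p_nmt = F.softmax(decoder_logits, dim=-1)
    # Decoder confidence: probability of its top prediction.
    confidence = p_nmt.max(dim=-1, keepdim=True).values
    # Soft gate in [0, 1]: lower confidence -> lean more on kNN knowledge.
    lam = torch.sigmoid(gate_net(confidence))
    # Hard decision: if the decoder is already confident, bypass
    # retrieval entirely so noisy neighbors cannot interfere.
    lam = lam * (confidence < tau).float()
    # Convex combination keeps the result a valid distribution.
    return (1.0 - lam) * p_nmt + lam * knn_distribution
```

Because the output is a convex combination of two probability distributions, it remains a valid distribution, and setting the gate to zero recovers the plain decoder prediction.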
Fund: Supported by National Natural Science Foundation of China (No.62366027, 62166023, 62266027), Major Basic Research Project of Yunnan Province (No.202401BC070021), Science and Technology Major Project of Yunnan Province (No.202303AP140008, 202302AD080003), "Double First-Class" Science and Technology Project of Kunming University of Science and Technology (No.202402AG050007).
Corresponding Author:
LAI Hua, Master, associate professor. His research interests include intelligent information processing.
About authors:
HUANG Yuxin, Ph.D., associate professor. His research interests include natural language processing and text summarization.
SHEN Tao, Master student. His research interests include natural language processing and machine translation.
JIANG Shuting, Ph.D. candidate. Her research interests include natural language processing and neural machine translation.
ZENG Hao, Master student. His research interests include natural language processing and machine translation.