Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
Pattern Recognition and Artificial Intelligence  2023, Vol. 36 Issue (10): 931-941    DOI: 10.16451/j.cnki.issn1003-6059.202310006
Researches and Applications
Siamese Contrastive Network Based Multilingual Parallel Sentence Pair Extraction between Chinese and Southeast Asian Languages
ZHOU Yuanzhuo1,2, MAO Cunli1,2, SHEN Zheng1,2, ZHANG Siqi1,2, YU Zhengtao1,2, WANG Zhenhan1,2
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504;
2. Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming University of Science and Technology, Kunming 650504

Abstract  Parallel sentence pair extraction performs poorly on low-resource Southeast Asian languages, primarily because scarce training corpora leave the extraction models with weak representation capability. To address this, a siamese contrastive network based method for multilingual parallel sentence pair extraction between Chinese and Southeast Asian languages is proposed, improving the model structure, the training strategy, and the data. First, a siamese contrastive network framework is adopted, integrating contrastive learning into the siamese network to strengthen the representation of parallel sentence pairs. Second, joint training with similar languages is introduced so that knowledge is shared across them and the learning ability of the model improves. Finally, parallel sentence pairs between Chinese and mixed Southeast Asian languages are constructed by multilingual word replacement, providing richer sample information for training. Experiments on Chinese-Thai and Chinese-Lao datasets show that the proposed method improves the performance of parallel sentence pair extraction.
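The two data and training ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss function, the lexicon, or the encoder, so everything here is an assumption for illustration rather than the authors' implementation: the InfoNCE-style in-batch loss, the `temperature` value, the function names `replace_words` and `in_batch_contrastive_loss`, and the placeholder tokens such as `th_water` and `lo_water` are all hypothetical.

```python
import math
import random


def replace_words(tokens, lexicon, rate, rng):
    """Swap some tokens for an equivalent from a related language.

    `lexicon` is a hypothetical bilingual dictionary mapping a token to
    a list of candidate replacements; `rate` is the replacement probability.
    """
    mixed = []
    for tok in tokens:
        candidates = lexicon.get(tok)
        if candidates and rng.random() < rate:
            mixed.append(rng.choice(candidates))
        else:
            mixed.append(tok)
    return mixed


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(src_vecs, tgt_vecs, temperature=0.05):
    """InfoNCE-style loss (an assumed choice, not taken from the paper).

    Row i of `src_vecs` is treated as parallel to row i of `tgt_vecs`;
    every other row in the batch serves as a negative.
    """
    total = 0.0
    for i, src in enumerate(src_vecs):
        logits = [cosine(src, tgt) / temperature for tgt in tgt_vecs]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denom - logits[i]
    return total / len(src_vecs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder tokens stand in for real Thai and Lao words.
    lexicon = {"th_water": ["lo_water"], "th_rice": ["lo_rice"]}
    print(replace_words(["th_water", "th_is", "th_rice"], lexicon, 0.5, rng))

    # Toy sentence vectors standing in for the two encoder branches.
    zh = [[1.0, 0.0], [0.0, 1.0]]
    sea = [[0.9, 0.1], [0.2, 0.8]]
    print(in_batch_contrastive_loss(zh, sea))
```

In a siamese setup both branches would share one encoder, and the loss would be minimized with a deep learning framework; plain Python lists are used here only to keep the sketch dependency-free.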
Key words: Parallel Sentence Pair Extraction; Contrastive Learning; Joint Training; Siamese Network
Received: 06 September 2023     
CLC Number (Chinese Library Classification): TP391.1
Fund: National Natural Science Foundation of China (No. 62166023, U21B2027, 61972186), Major Science and Technology Projects of Yunnan Province (No. 202103AA080015, 202203AA080004, 202302AD080003), Yunnan Fundamental Research Projects (No. 202301AT070471)
Corresponding Author: MAO Cunli, Ph.D., professor. His research interests include natural language processing, information retrieval and machine translation.
About authors: ZHOU Yuanzhuo, Master's student. His research interests include natural language processing and machine translation. SHEN Zheng, Master's student. His research interests include natural language processing and machine translation. ZHANG Siqi, Ph.D. candidate. Her research interests include natural language processing and machine translation. YU Zhengtao, Ph.D., professor. His research interests include natural language processing, information retrieval and machine translation. WANG Zhenhan, Ph.D. candidate. His research interests include natural language processing and machine translation.
Cite this article:   
ZHOU Yuanzhuo, MAO Cunli, SHEN Zheng, et al. Siamese Contrastive Network Based Multilingual Parallel Sentence Pair Extraction between Chinese and Southeast Asian Languages[J]. Pattern Recognition and Artificial Intelligence, 2023, 36(10): 931-941.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202310006      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2023/V36/I10/931