Pattern Recognition and Artificial Intelligence  2024, Vol. 37 Issue (1): 85-94    DOI: 10.16451/j.cnki.issn1003-6059.202401007
Multi-input Fusion Spelling Error Correction Model Based on Contrast Optimization
WU Yaoyao1,2,3, HUANG Ruizhang1,2,3, BAI Ruina1,2,3, CAO Junhang1,2,3, ZHAO Jianhui1,2,3
1. Engineering Research Center of Text Computing and Cognitive Intelligence of the Ministry of Education, Guizhou University, Guiyang 550025;
2. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025;
3. College of Computer Science and Technology, Guizhou University, Guiyang 550025

Abstract  Chinese spelling correction is essential in text editing. Most existing Chinese spelling error correction models take a single input, which limits both the semantic information available to the model and the quality of its correction results. This paper proposes MIF-SECCO, a multi-input fusion spelling error correction method based on contrast optimization. MIF-SECCO consists of two stages: multi-input semantic learning, and contrastive learning-driven semantic fusion error correction. In the first stage, preliminary error correction results from multiple single-input models are integrated to provide sufficient complementary semantic information for semantic fusion. In the second stage, the complementary sentence semantics are optimized with contrastive learning to keep the model from over-correcting sentences. The erroneous sentence is then re-corrected by fusing these complementary semantics, which mitigates the limitations of single-model correction results. Experimental results on the public datasets SIGHAN13, SIGHAN14 and SIGHAN15 demonstrate that MIF-SECCO effectively improves error correction performance.
Key words: Chinese Spelling Error Correction; Multi-input Semantic Learning; Complementary Semantic Fusion; Contrastive Learning Optimization
Received: 06 September 2023     
ZTFLH: TP391.1  
Fund: National Natural Science Foundation of China (No.62066007), Key Technology Research and Development Program of Guizhou Province (No.2022277)
Corresponding Authors: HUANG Ruizhang, Ph.D., professor. Her research interests include natural language understanding, data fusion analysis, text mining and knowledge discovery.
About author: WU Yaoyao, Master student. Her research interests include natural language processing. BAI Ruina, Ph.D. candidate. Her research interests include text mining and machine learning. CAO Junhang, Master student. Her research interests include natural language processing. ZHAO Jianhui, Master student. His research interests include natural language processing.
