Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
  2015, Vol. 28 Issue (3): 209-213    DOI: 10.16451/j.cnki.issn1003-6059.201503003
Papers and Reports
Speech Recognition Based on Deep Neural Networks on Tibetan Corpus
YUAN Sheng-Long, GUO Wu, DAI Li-Rong
National Engineering Laboratory for Speech and Language Information Processing, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027

Abstract  Large-vocabulary continuous speech recognition for conversational telephone speech in Tibetan is addressed for the first time in this paper. Because Tibetan is a minority language, the major difficulty in Tibetan speech recognition is the shortage of training data. The Tibetan acoustic model is trained with deep neural networks (DNN). To compensate for the data shortage, DNN models trained on other, majority languages are used as the initial networks of the target Tibetan DNN model. In addition, phonetic questions written by phonetics experts are unavailable for Tibetan because the required phonetic knowledge is lacking. To reduce the number of tri-phone hidden Markov models (HMM) in Tibetan speech recognition, phonetic questions generated automatically in a data-driven manner are used to tie the tri-phone HMMs. Different clusterings of the tri-phone states are tested, and the Gaussian mixture model (GMM) system reaches a word accuracy of about 30.86% on the test corpus. When the acoustic model is trained with DNN, three DNN models trained on different large corpora are adopted. The experimental results show that the proposed methods improve recognition performance, with a word accuracy of about 43.26% on the test corpus.
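The cross-lingual initialization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the helper names (`init_layer`, `build_dnn`, `init_from_source`), and the choice to copy the hidden layers while re-initializing the output layer for the Tibetan tied states are all assumptions made for illustration, since the abstract only says that DNN models of other languages serve as the initial networks.

```python
import random

def init_layer(n_in, n_out, rng):
    """Return a randomly initialised layer (hypothetical helper)."""
    scale = 0.1
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}

def build_dnn(layer_sizes, seed):
    """Build a feed-forward network as a list of layers (hypothetical helper)."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

def init_from_source(source_dnn, n_target_states, seed=0):
    """Copy the hidden layers of a source-language DNN and attach a new,
    randomly initialised output layer sized for the target tied states.
    Assumption: the output layer is replaced because the two languages
    do not share the same set of tied tri-phone states."""
    rng = random.Random(seed)
    hidden = [{"W": [row[:] for row in layer["W"]], "b": layer["b"][:]}
              for layer in source_dnn[:-1]]
    n_last_hidden = len(source_dnn[-1]["W"][0])  # input width of old output layer
    return hidden + [init_layer(n_last_hidden, n_target_states, rng)]

# Illustrative sizes only: 39-dim features, two hidden layers of 64 units,
# 500 source-language states, 120 Tibetan tied states.
source = build_dnn([39, 64, 64, 500], seed=1)
tibetan = init_from_source(source, n_target_states=120)
```

After this step the Tibetan network would be fine-tuned on the limited Tibetan data, so the hidden layers start from representations learned on the larger corpus rather than from random values.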
Key words: Tibetan; Continuous Speech Recognition; Data Driven; Deep Neural Networks (DNN)
Received: 21 October 2013     
CLC Number (ZTFLH): TP18
Cite this article:   
YUAN Sheng-Long, GUO Wu, DAI Li-Rong. Speech Recognition Based on Deep Neural Networks on Tibetan Corpus[J]. Pattern Recognition and Artificial Intelligence, 2015, 28(3): 209-213.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201503003      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2015/V28/I3/209
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence