Confusable Chinese Speech Recognition Based on HMM/SVM Two-Level Architecture<sup>*</sup>


Abstract The recognition rate for confusable speech remains low in state-of-the-art HMM-based Chinese speech recognition systems. The inherent defects of HMM are analyzed, and a two-level recognition architecture combining HMM and SVM is proposed. A confidence estimation module is adopted to improve the performance and efficiency of the system. Information obtained from Viterbi decoding is used to construct new classes of features for the SVM, which addresses the inability of a conventional SVM to process variable-length sequences directly. Related issues, including confidence estimation, classification feature extraction, and SVM recognizer construction, are addressed. Experimental results on confusable Chinese speech show that, compared with the hybrid HMM/SVM-based system, the proposed method substantially improves the recognition rate with little impact on running speed.
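The two ideas in the abstract, turning a variable-length utterance into a fixed-length SVM input by way of the Viterbi state alignment, and invoking the second level only when first-pass confidence is low, can be illustrated with a minimal Python sketch. The abstract does not specify the actual features or the confidence measure, so everything here is an assumption made for illustration: the function names `segment_means` and `two_level_decision` are hypothetical, the feature is a simple per-state mean of the frames, and confidence is taken as the log-likelihood gap between the top two HMM hypotheses.

```python
from typing import Callable, List, Sequence, Tuple


def segment_means(frames: Sequence[Sequence[float]],
                  alignment: Sequence[int],
                  n_states: int) -> List[float]:
    """Map a variable-length frame sequence to a fixed-length vector.

    `alignment[t]` is the HMM state assigned to frame t (for example by
    Viterbi decoding). Frames are averaged per state and the state means
    are concatenated, so the output length is n_states * frame_dim no
    matter how many frames the utterance has. This is an assumed feature,
    not the one used in the paper.
    """
    dim = len(frames[0])
    sums = [[0.0] * dim for _ in range(n_states)]
    counts = [0] * n_states
    for frame, state in zip(frames, alignment):
        counts[state] += 1
        for d, value in enumerate(frame):
            sums[state][d] += value
    features: List[float] = []
    for state in range(n_states):
        n = max(counts[state], 1)  # a state with no frames contributes zeros
        features.extend(v / n for v in sums[state])
    return features


def two_level_decision(nbest: Sequence[Tuple[str, float]],
                       threshold: float,
                       second_pass: Callable[[], str]) -> str:
    """Accept the first-pass result unless its confidence is low.

    `nbest` holds (label, log_likelihood) pairs sorted best first.
    Confidence is the gap between the top two scores (an assumption);
    `second_pass` stands in for the SVM recognizer and is called only
    for low-confidence utterances, which keeps the extra cost small.
    """
    (best_label, best_score), (_, runner_up_score) = nbest[0], nbest[1]
    confidence = best_score - runner_up_score
    if confidence >= threshold:
        return best_label
    return second_pass()


short = segment_means([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0, 0, 1], 2)
longer = segment_means([[1.0, 2.0]] * 7, [0, 0, 0, 1, 1, 1, 1], 2)
```

Both `short` and `longer` have the same length even though the utterances differ in duration, which is the property a conventional SVM needs. Gating the second pass on confidence reflects the abstract's point that the SVM level should add little to the running time.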

Key words: Speech Recognition; Confusable Speech; Hidden Markov Model (HMM); Support Vector Machine (SVM)

Received: 06 April 2005

ZTFLH: TP391.4; TP181

Authors: WANG HuanLiang, HAN JiQing, LI HaiFeng, ZHENG TieRan

Cite this article:

WANG HuanLiang,HAN JiQing,LI HaiFeng等. Confusable Chinese Speech Recognition Based on HMM/SVM TwoLevel Architecture[J]. , 2006, 19(5): 578-584.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/ OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2006/V19/I5/578
