A Constrained Line Search Optimization Method for Discriminative Training of HMMs
LIU Cong, HU Yu, DAI Li-Rong, WANG Ren-Hua
iFlytek Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027
Abstract In this paper, an optimization algorithm called constrained line search (CLS) is proposed for discriminative training (DT) of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The CLS method is used to optimize the objective function of discriminative training based on the maximum mutual information (MMI) criterion. In this method, discriminative training of HMMs is first treated as a constrained optimization problem, in which the Kullback-Leibler divergence (KLD) between models is explicitly applied as a constraint during optimization. Based on the idea of line search, it is shown that the update of the HMM parameters reduces to a simple quadratic formula when the KLD between the HMMs of two successive iterations is constrained. The CLS method can be applied to optimize all model parameters of Gaussian mixture CDHMMs, including means, covariances, and mixture weights. The approach is evaluated on two benchmark speech recognition tasks, one English and one Chinese: TIDIGITS and the Mandarin 863 database. Experimental results show that the CLS optimization method consistently outperforms the conventional extended Baum-Welch (EBW) method in both recognition performance and convergence behavior.
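To illustrate why a KLD constraint leads to a quadratic expression, the following minimal sketch considers only the mean vector of a single diagonal-covariance Gaussian whose variances are held fixed. In that special case the KLD between the old and new Gaussian is 0.5 (mu - mu0)^T Sigma^{-1} (mu - mu0), so along a search direction d it is quadratic in the step size. The function names, the choice of the gradient as search direction, and the bound written as rho**2 are assumptions made for illustration and are not taken from the paper.

```python
import math


def kld_same_var(mu_new, mu_old, var):
    """KLD between two diagonal Gaussians that share the variances `var`."""
    return 0.5 * sum((a - b) ** 2 / v for a, b, v in zip(mu_new, mu_old, var))


def constrained_mean_step(mu_old, var, direction, rho):
    """Move mu_old along `direction` until the KLD reaches rho**2.

    With mu = mu_old + eps * d the KLD equals 0.5 * eps**2 * d^T Sigma^{-1} d,
    so the largest admissible step has a closed form (hypothetical helper).
    """
    quad = sum(d * d / v for d, v in zip(direction, var))  # d^T Sigma^{-1} d
    if quad == 0.0:
        return list(mu_old)  # zero direction: nothing to update
    eps = math.sqrt(2.0 * rho ** 2 / quad)
    return [m + eps * d for m, d in zip(mu_old, direction)]


# Toy values, chosen only for illustration.
mu_old = [0.0, 0.0]
var = [1.0, 4.0]
grad = [1.0, 2.0]  # assumed gradient of the objective w.r.t. the mean
mu_new = constrained_mean_step(mu_old, var, grad, rho=0.1)
kld = kld_same_var(mu_new, mu_old, var)
```

In this toy setting the updated mean lies exactly on the constraint boundary, so `kld` equals `rho**2` up to rounding. Covariances and mixture weights would need their own KLD expressions, which this sketch does not cover.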
Received: 25 June 2009