A Constrained Line Search Optimization Method for Discriminative Training of HMMs

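Abstract: An optimization method called "constrained line search" is proposed and applied to the discriminative training of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The method can be used to optimize the discriminative training objective function based on the maximum mutual information (MMI) criterion. Discriminative training of hidden Markov models (HMMs) is first treated as a constrained optimization problem, with the KL divergence between models serving as a constraint during optimization. Then, following the idea of line search, it is shown that by constraining the KL divergence between the models before and after an update, the HMM parameters can be expressed in a simple quadratic form. The method can be used to optimize any parameter of a Gaussian mixture CDHMM, including means, covariance matrices, and Gaussian weights. It is evaluated on two standard speech recognition tasks, one English and one Chinese: the English TIDIGITS database and the Chinese 863 database. Experimental results show that the method gives consistent improvements over the conventional extended Baum-Welch method in both recognition performance and convergence behavior.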

Keywords: automatic speech recognition, discriminative training, constrained line search (CLS)

Abstract: In this paper, an optimization algorithm called constrained line search (CLS) is proposed for discriminative training (DT) of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The CLS method is used to optimize the objective function of discriminative training based on the maximum mutual information (MMI) criterion. In this method, discriminative training of HMMs is first treated as a constrained optimization problem, and the Kullback-Leibler divergence (KLD) between models is explicitly applied as a constraint during optimization. Based on the idea of line search, it is shown that, by constraining the KLD between the HMMs of two successive iterations, the HMM parameters can be expressed in a simple quadratic form. The proposed CLS method can be applied to optimize all model parameters in Gaussian mixture CDHMMs, including means, covariances, and mixture weights. The CLS approach is evaluated on two benchmark speech recognition tasks, one English and one Chinese: the TIDIGITS database and the Mandarin 863 database. Experimental results show that the CLS optimization method consistently outperforms the conventional extended Baum-Welch (EBW) method in both recognition performance and convergence behavior.

Key words: Automatic Speech Recognition, Discriminative Training, Constrained Line Search (CLS)
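
The quadratic form mentioned in the abstract can be illustrated for the simplest case, a single Gaussian mean with its covariance held fixed. There the KLD between the old and updated Gaussian is 0.5 (mu - mu0)^T Sigma^{-1} (mu - mu0), so a bound on the KLD fixes the largest step along any search direction in closed form. The sketch below is not the authors' implementation: the function names, the search direction `direction`, and the bound `rho` are assumptions made for illustration, and the choice of direction from the MMI objective is left out.

```python
import numpy as np


def kld_shared_cov(mu_new, mu_old, cov):
    """KL divergence between two Gaussians that share the covariance `cov`."""
    diff = mu_new - mu_old
    return 0.5 * diff @ np.linalg.solve(cov, diff)


def constrained_mean_step(mu_old, cov, direction, rho):
    """Move `mu_old` along `direction` until the KLD reaches rho**2.

    With mu = mu_old + eps * direction, the KLD equals
    0.5 * eps**2 * direction^T cov^{-1} direction, which is quadratic in eps,
    so the largest admissible step has a closed form.
    """
    quad = direction @ np.linalg.solve(cov, direction)
    if quad == 0.0:
        # Zero direction: nothing to update.
        return mu_old.copy()
    eps = rho * np.sqrt(2.0 / quad)
    return mu_old + eps * direction


# Hypothetical values, chosen only to exercise the functions.
mu_old = np.array([0.0, 0.0])
cov = np.diag([1.0, 4.0])
direction = np.array([1.0, 2.0])  # assumed search direction
rho = 0.1

mu_new = constrained_mean_step(mu_old, cov, direction, rho)
print(mu_new, kld_shared_cov(mu_new, mu_old, cov))
```

In this sketch the updated mean lands exactly on the constraint boundary, so `rho` alone controls how far the model may move in one iteration.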

Received: 2009-06-25

CLC number: TN912.34

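About the authors: LIU Cong, male, born in 1984, Ph.D. candidate. His main research interest is speech recognition. E-mail: ustc.congliu@gmail.com. HU Yu, male, born in 1978, Ph.D. His main research interests include speech synthesis and speech recognition. DAI Li-Rong, male, born in 1962, professor. His main research interests include speech recognition, speech synthesis, audio and video retrieval, and real-time DSP technology. WANG Ren-Hua, male, born in 1943, professor. His main research interests include speech coding, speech synthesis, speech recognition, and multimedia communication.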

Cite this article:

LIU Cong, HU Yu, DAI Li-Rong, WANG Ren-Hua. A Constrained Line Search Optimization Method for Discriminative Training of HMMs[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2010, 23(4): 450-455.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2010/V23/I4/450