Factor Analysis for Language Identification Based on Phoneme Recognition


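Abstract: In a language identification system based on phoneme recognition, the phoneme recognition output for a given utterance is affected by nuisance factors such as the speaker and the channel. To address this, a feature vector is constructed for each utterance from its phonotactic (phoneme co-occurrence) statistics. In this vector space, factor analysis is used to build a mathematical model of the noise subspace, which is then removed during both language model training and recognition. On the NIST LRE 2007 test task, the method effectively improves performance over the baseline phoneme-recognition-based language identification systems. In the 30 s test condition, the equal error rates of the phoneme-recognition-based language model system and the phoneme-recognition-based support vector machine system are reduced by 14.4% and 12.9% relative, respectively.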


Keywords: automatic language identification, factor analysis, phoneme recognizer

Abstract: In a language identification system based on phoneme recognition, the key issue is whether the tokens or token sequences reflect language-related information. However, for a given utterance, the token sequence output by the phone recognizer contains noise introduced by the channel, the speaker, and background clutter. To address this problem, each utterance is represented as an n-gram vector. In this vector space, factor analysis is applied to model the noise subspace, which is then removed in the final modeling process. Experimental results on NIST LRE 2007 show that the proposed method outperforms the existing phone-recognition-based language identification systems. In the 30 s evaluation task, the equal error rate (EER) is reduced by about 14.4% relative to the baseline phone recognition followed by language modeling (PRLM) system, and by about 12.9% relative to the baseline phone recognition followed by support vector machine (PRSVM) system.
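The pipeline outlined in the abstract (an n-gram vector per utterance, a model of the noise subspace, removal of that subspace) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes bigrams as the n-gram order, and it swaps the EM-trained factor analysis for a simpler stand-in: the top principal directions of within-language variation, projected out of each vector. The function names `bigram_vector`, `nuisance_subspace`, and `remove_nuisance` and the toy phone sequences are hypothetical.

```python
import numpy as np
from itertools import product


def bigram_vector(tokens, phones):
    """Map a phone token sequence to a vector of bigram relative frequencies."""
    index = {pair: i for i, pair in enumerate(product(phones, repeat=2))}
    counts = np.zeros(len(index))
    for a, b in zip(tokens, tokens[1:]):
        counts[index[(a, b)]] += 1.0
    total = counts.sum()
    return counts / total if total > 0 else counts


def nuisance_subspace(vectors, labels, rank):
    """Orthonormal basis for the main directions of within-language variation.

    Assumption: a PCA/SVD stand-in for the factor-analysis noise subspace.
    """
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    # Subtract each language's mean so only speaker/channel-like variation remains.
    centered = np.vstack([
        vectors[labels == lang] - vectors[labels == lang].mean(axis=0)
        for lang in np.unique(labels)
    ])
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T  # shape: (dimension, rank)


def remove_nuisance(x, basis):
    """Project x onto the orthogonal complement of the nuisance subspace."""
    return x - basis @ (basis.T @ x)


# Toy data (hypothetical): two "languages", two utterances each.
phones = ["a", "b", "c"]
utterances = [
    ("lang1", "a b a b c a b".split()),
    ("lang1", "a b b a b c a".split()),
    ("lang2", "c c a c b c c".split()),
    ("lang2", "c a c c b c a".split()),
]
labels = [lang for lang, _ in utterances]
X = np.array([bigram_vector(toks, phones) for _, toks in utterances])
U = nuisance_subspace(X, labels, rank=1)
X_clean = np.array([remove_nuisance(x, U) for x in X])
```

In a full system the compensated vectors `X_clean` would feed the language model or SVM back end. The subspace rank is a tuning parameter that this sketch fixes arbitrarily at 1.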

Key words: Automatic Language Identification, Factor Analysis, Phone Recognizer

Received: 2010-07-26

Chinese Library Classification (ZTFLH): TN912.34

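About the authors: ZHONG Hai-Bing, male, born in 1986, master's student. His main research interests are language identification and speech signal processing. E-mail: zhbing@mail.ustc.edu.cn. SONG Yan, male, born in 1972, Ph.D., lecturer. His main research interests are audio and video content analysis and retrieval. DAI Li-Rong, male, born in 1962, professor and doctoral supervisor. His main research interests are digital signal processing and pattern recognition.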

Cite this article:

ZHONG Hai-Bing, SONG Yan, DAI Li-Rong. Factor Analysis for Language Identification Based on Phoneme Recognition [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2012, 25(1): 105-110.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2012/V25/I1/105
