Serial Factor Analysis in Speaker Recognition<sup>*</sup>


Abstract: A serial method for training the loading matrices is proposed for factor-analysis-based speaker recognition. During loading matrix training, the speaker factor (eigenvoice) matrix, the diagonal (residual) matrix, and the channel space matrix are estimated serially. During speaker enrollment, these three loading matrices are concatenated and each speaker's factors are obtained by joint estimation. This strategy effectively solves the saturation problem in factor analysis. On the NIST SRE 2006 core test corpus, the proposed system achieves an equal error rate of 3.65%.
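The enrollment step described in the abstract, concatenating the three loading matrices and estimating all factors jointly, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no formulas, so the sketch assumes the standard joint factor analysis setup (standard-normal priors on the factors, diagonal UBM covariances, Baum-Welch statistics as input) and uses the usual posterior-mean solution. All function and variable names are hypothetical.

```python
import numpy as np

def joint_factor_posterior_mean(V, U, d, sigma, n, f):
    """Jointly estimate speaker, channel, and residual factors (sketch).

    Assumed model: supervector offset = V @ y + U @ x + d * z,
    with y, x, z drawn from standard-normal priors.

    V     : (CF, Rv) eigenvoice (speaker) loading matrix
    U     : (CF, Ru) channel loading matrix
    d     : (CF,)    diagonal of the residual loading matrix
    sigma : (CF,)    diagonal UBM covariances, stacked as a supervector
    n     : (CF,)    zero-order statistics, repeated for each feature dimension
    f     : (CF,)    first-order statistics, centered on the UBM means
    """
    # Concatenate the three loading matrices into one matrix W = [V U D].
    W = np.hstack([V, U, np.diag(d)])
    # W^T Sigma^{-1}: broadcasting scales each supervector dimension by 1/sigma.
    Wt_sinv = W.T / sigma
    # Posterior precision: L = I + W^T Sigma^{-1} N W.
    L = np.eye(W.shape[1]) + (Wt_sinv * n) @ W
    # Posterior mean of all factors at once: L^{-1} W^T Sigma^{-1} f.
    a = np.linalg.solve(L, Wt_sinv @ f)
    rv, ru = V.shape[1], U.shape[1]
    # Split the stacked factor vector back into y (speaker), x (channel), z (residual).
    return a[:rv], a[rv:rv + ru], a[rv + ru:]
```

With no data (all statistics zero) the estimate falls back to the prior mean, so every factor is zero. With large occupation counts the reconstructed offset `V @ y + U @ x + d * z` approaches the observed offset `f / n`.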

Key words: Speaker Recognition, Joint Factor Analysis, Eigenvoice, Equal Error Rate (EER)
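The equal error rate (EER) reported for the system is the operating point at which the false-acceptance rate equals the false-rejection rate. The sketch below shows one simple way to estimate it from trial scores. It is an illustration only: the function name and the threshold sweep over observed scores are assumptions, and official NIST scoring tools may interpolate differently.

```python
import numpy as np

def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, impostor]))
    # False rejection: a target trial scores below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance: an impostor trial scores at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest and average them.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2
```

For example, `equal_error_rate([1, 3], [0, 2])` evaluates to 0.5, since at a threshold of 2 one of the two target trials is rejected and one of the two impostor trials is accepted.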

Received: 2008-05-26

CLC number: TN912.34

Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2006AA010104)

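About the authors: GUO Wu, male, born in 1973, lecturer. His main research interest is speaker recognition. E-mail: guowu@ustc.edu.cn. DAI Li-Rong, male, born in 1962, professor. His main research interest is speech signal processing. WANG Ren-Hua, male, born in 1943, professor. His main research interest is speech synthesis.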

Cite this article:

GUO Wu, DAI Li-Rong, WANG Ren-Hua. Serial Factor Analysis in Speaker Recognition[J]. Pattern Recognition and Artificial Intelligence, 2009, 22(4): 514-518.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2009/V22/I4/514
