Space Transformation Based on Signal Subspace in Joint Factor Analysis
LI Jin, GUO Wu, DAI Li-Rong
National Engineering Laboratory for Speech and Language Information Processing, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027
Joint factor analysis (JFA) is the mainstream algorithm in text-independent speaker verification systems because it models the speaker and channel spaces explicitly. However, limitations of the estimation procedure cause unavoidable overlap between the speaker space and the channel space obtained by JFA. To resolve this problem, a space transformation based on the signal subspace is proposed. Compared with the JFA algorithm without the space transformation, an equal error rate (EER) reduction of 9.2% is obtained on the telephone portion of the core condition trials of NIST SRE 2008.
LI Jin, GUO Wu, DAI Li-Rong. Space Transformation Based on Signal Subspace in Joint Factor Analysis [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2013, 26(8): 705-710.
"[1] Reynolds D A, Quatieri T F, Dunn R B. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing, 2000, 10(1/2/3): 19-41
[2] Kenny P, Dumouchel P. Experiments in Speaker Verification Using Factor Analysis Likelihood Ratios // Proc of the ODYSSEY: Speaker and Language Recognition Workshop. Toledo, Spain, 2004: 219-226
[3] Kenny P, Boulianne G, Dumouchel P. Eigenvoice Modeling with Sparse Training Data. IEEE Trans on Speech and Audio Processing, 2005, 13(3): 345-354
[4] Kenny P, Boulianne G, Quellet P, et al. Speaker and Session Va-riability in GMM-Based Speaker Verification. IEEE Trans on Audio, Speech and Language Processing, 2007, 15(4): 1448-1460
[5] Dehak N, Dumouchel P, Kenny P. Modeling Prosodic Features with Joint Factor Analysis for Speaker Verification. IEEE Trans on Audio, Speech and Language Processing, 2007, 15(7): 2095-2103
[6] Guo Wu, Li Yijie, Dai Lirong, et al. Factor Analysis and Space Assembling in Speaker Recognition. Acta Automatica Sinica, 2009, 35(9): 1193-1198 (in Chinese)
(郭 武,李轶杰,戴礼荣,等.说话人识别中的因子分析以及空间拼接.自动化学报, 2009, 35(9): 1193-1198)
[7] Campbell W M, Sturim D E, Reynolds D A, et al. SVM Based Speaker Verification Using a GMM Supervector Kernel and NAP Variability Compensation[EB/OL]. [2012-09-10]. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.208.4140rep=rep1type=pdf
[8] He Liang, Shi Yongzhe, Liu Jia. Eigenchannel Space Combination Method of Joint Factor Analysis. Acta Automatica Sinica, 2011, 37(7): 849-856 (in Chinese)
(何 亮,史永哲,刘 加.联合因子分析中的本征信道空间拼接方法.自动化学报, 2011, 37(7): 849-856)
[9] Kenny P, Dehak N, Gupta V, et al. A New Training Regimen for Factor Analysis of Speaker Variability [EB/OL]. [ 2012-09-10]. http://www.crim.ca/perso/patrick.kenny/Kenny_ICASSP08.pdf
[10] Yao Tianren, Sun Hong. Modern Digital Signal Processing. Wuhan, China: Huazhong University of Science and Technology Press, 1999 (in Chinese)
(姚天任,孙 洪.现代数字信号处理.武汉:华中科技大学出版社,1999)
[11] Auckenthaler R, Carey M, Thomas H L. Score Normalization for Text-Independent Speaker Verification System. Digital Signal Processing, 2000, 10(1/2/3): 42-54"