Abstract: In text-independent speaker verification systems, mismatch and variability of the channel and environment between training and testing constitute a session variability problem that can greatly degrade speaker recognition performance. To deal with this problem more efficiently, a modified PCA method called session variation principal component analysis (SVPCA) is proposed, which can be integrated with within-class covariance normalization (WCCN). On the NIST 2006 verification task, the proposed method is compared with our previous baseline, a generalized linear discriminant sequence support vector machine (GLDS-SVM) system. The experimental results show a relative reduction of up to 24.2% in equal error rate (EER). Moreover, the proposed method has advantages in computational and memory costs compared with state-of-the-art systems.
LONG Yan-Hua, GUO Wu, DAI Li-Rong. A PCA Method Based on Speaker Session Variability[J]. Pattern Recognition and Artificial Intelligence, 2009, 22(2): 270-274.