Speaker Verification Based on Gaussian Probability Distribution and SVM
GUO Wu, DAI Li-Rong, WANG Ren-Hua
iFly Speech Laboratory, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027
Abstract For text-independent speaker verification, the probability distribution against the universal background model (PD-UBM) is calculated, and the score of each UBM Gaussian mixture component is used as the input feature of a support vector machine (SVM) in both training and testing. With a linear kernel function, the proposed PD-UBM algorithm performs as well as or better than the generalized linear discriminant sequence (GLDS) kernel system. Furthermore, combining the scores of the Gaussian mixture model (GMM-UBM), GLDS, and PD-UBM systems yields a significant improvement. On the NIST 2006 speaker recognition evaluation (SRE) 1conv4w-1conv4w corpus, the fusion system achieved a 25% relative improvement in equal error rate (EER) over the GMM-UBM system.
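The abstract does not give the exact formula for the PD-UBM feature, so the following is only a minimal sketch of one plausible reading: the feature vector is assumed to be the average posterior probability of each diagonal-covariance UBM component over the frames of an utterance, and a linear SVM then scores it as a dot product plus a bias. The function names `pd_ubm_features` and `linear_svm_score`, the toy UBM parameters, and the SVM weights are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def pd_ubm_features(frames, weights, means, variances):
    """Average posterior of each UBM component over an utterance.

    ASSUMPTION: this is one possible reading of "probability distribution
    against the UBM"; the paper may define the per-mixture score differently.

    frames:    (T, D) acoustic feature vectors of one utterance
    weights:   (M,)   mixture weights, summing to 1
    means:     (M, D) component means
    variances: (M, D) diagonal covariances
    Returns an (M,) vector whose entries sum to 1.
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalisation term
    )                                                        # (T, M)
    log_joint = np.log(weights) + log_gauss
    # Subtract the per-frame maximum before exponentiating for stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    post /= post.sum(axis=1, keepdims=True)                  # per-frame posteriors
    return post.mean(axis=0)                                 # utterance-level feature


def linear_svm_score(feature, w, b):
    """Decision value of a linear-kernel SVM: w . x + b."""
    return float(np.dot(w, feature) + b)


# Toy two-component UBM in two dimensions (hypothetical values).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))

# Synthetic "utterance": 200 frames drawn around the first component.
rng = np.random.default_rng(0)
frames = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

feature = pd_ubm_features(frames, weights, means, variances)

# Hypothetical SVM parameters; a real system would learn these from
# target-speaker and impostor utterances.
score = linear_svm_score(feature, w=np.array([1.0, -1.0]), b=0.0)
```

In this sketch the feature dimension equals the number of UBM components, so a linear kernel keeps scoring cheap. How the paper normalises the per-mixture scores, and how the GMM-UBM, GLDS, and PD-UBM scores are fused, is not stated in the abstract and is left out here.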
Received: 04 May 2007