A Vocal Password System Based on Multi-Dimension Feature Classifier in Score Domain
PAN Yi-Qian1, WEI Si2, DAI Li-Rong1, LIU Qing-Feng1,2
1. iFlytek Speech Laboratory, University of Science and Technology of China, Hefei 230027
2. Anhui USTC iFLYTEK Co., Ltd., Hefei 230088
Abstract The average likelihood ratio measure used for verification in GMM-UBM vocal password systems gives the scores of all types of data the same weight, which degrades system performance. Because different data types differ in their discriminative capacity, a score-domain method is proposed that classifies the test data with the UBM, combines the likelihood ratio scores of the classes into a new multi-dimension feature, and then performs speaker verification with an SVM. The proposed strategy converts the traditional likelihood ratio test into a two-class classification problem in the multi-dimension feature space. In co-channel experiments on four telephone corpora, the equal error rate of the proposed system is 41.25%, 33.33%, 37.49% and 26.03% lower in relative terms, respectively, than that of the text-dependent GMM-UBM system. Cross-channel experiments also show a performance improvement.
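
The scoring idea in the abstract can be illustrated with a minimal Python sketch. It assumes that per-frame log-likelihoods under the speaker model and the UBM are already available, and that each frame already carries a class label (for example, derived from its best-scoring UBM component; the abstract does not state the classification rule). The function names, the `fill` default for empty classes, and the toy numbers are hypothetical and are not taken from the paper.

```python
def average_llr(spk_ll, ubm_ll):
    """Baseline score: mean log-likelihood ratio, every frame weighted equally."""
    return sum(s - u for s, u in zip(spk_ll, ubm_ll)) / len(spk_ll)


def per_class_llr(spk_ll, ubm_ll, frame_class, n_classes, fill=0.0):
    """Multi-dimension feature: one mean log-likelihood ratio per frame class.

    frame_class[i] is the (assumed) class index of frame i, in [0, n_classes).
    Classes with no frames in the utterance get the placeholder value `fill`.
    """
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for s, u, c in zip(spk_ll, ubm_ll, frame_class):
        sums[c] += s - u
        counts[c] += 1
    return [sums[c] / counts[c] if counts[c] else fill for c in range(n_classes)]


# Toy utterance with four frames and three possible classes (made-up values).
spk_ll = [-1.0, -2.0, -3.0, -4.0]   # log p(frame | speaker model)
ubm_ll = [-2.0, -2.0, -5.0, -4.0]   # log p(frame | UBM)
frame_class = [0, 1, 0, 1]          # assumed class label of each frame

baseline_score = average_llr(spk_ll, ubm_ll)
feature_vector = per_class_llr(spk_ll, ubm_ll, frame_class, n_classes=3)
```

In the baseline, `baseline_score` would be compared with a threshold. In the scheme the abstract describes, a vector such as `feature_vector` would instead be passed to a two-class SVM (target versus impostor), so that classes with more discriminative capacity can receive more weight.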
Received: 13 June 2011