Abstract: To improve the recognition accuracy of a text-independent speaker recognition system, a total least squares-nuisance attribute projection (TLS-NAP) algorithm is proposed. The algorithm accounts for perturbation of the projection matrix: the hidden variables are estimated by the total least squares algorithm, which minimizes the negative effect of that perturbation. The nuisance attribute projection space built from these variables yields better performance. Experimental results on the NIST SRE 08 corpus demonstrate the effectiveness of the proposed method.
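The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a supervector m modeled as a speaker part plus U x, where the columns of U span the nuisance (channel) subspace and x holds the hidden variables. The function names `nap_project`, `tls_solve`, and `tls_nap_compensate` are hypothetical. Classical NAP removes U Uᵀ m, which treats U as exact; the total least squares variant below instead solves U x ≈ m while allowing errors in both U and m, using the standard SVD construction on the augmented matrix [U | m].

```python
import numpy as np

def nap_project(U, m):
    """Classical NAP: remove the part of m lying in span(U).

    Assumes the columns of U are orthonormal.
    """
    return m - U @ (U.T @ m)

def tls_solve(A, b):
    """Total least squares solution of A x ≈ b (single right-hand side).

    Uses the right singular vector of [A | b] that belongs to the
    smallest singular value; errors are allowed in both A and b.
    """
    n = A.shape[1]
    _, _, Vt = np.linalg.svd(np.column_stack([A, b]), full_matrices=False)
    v = Vt[-1]
    if abs(v[n]) < 1e-12:
        raise np.linalg.LinAlgError("TLS solution does not exist")
    return -v[:n] / v[n]

def tls_nap_compensate(U, m):
    """Illustrative TLS-based compensation (an assumption, not the paper's
    exact method): estimate hidden variables by TLS, then subtract U x."""
    x = tls_solve(U, m)
    return m - U @ x
```

With an exact, orthonormal U and a noise-free m, the least squares and total least squares estimates coincide; the two differ only when U itself is perturbed, which is the case the abstract targets.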