A Vocal Password System Based on Multi-Dimension Feature Classifier in Score Domain

Abstract: The average likelihood ratio measure used for verification in GMM-UBM vocal password systems gives the scores of all data types the same weight, which degrades system performance. Because data types differ in their discriminative capacity, a score-domain method is proposed: the test data are classified by the UBM, the likelihood ratio scores of the classes are combined into a new multi-dimension feature, and speaker verification is then performed by an SVM. This strategy converts the traditional likelihood ratio test into a two-class classification problem in the multi-dimension feature space. In co-channel experiments on four telephone corpora, the equal error rate of the proposed system is lower than that of the text-dependent GMM-UBM system by 41.25%, 33.33%, 37.49% and 26.03% in relative terms, respectively. Cross-channel experiments also show a performance improvement.
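The score-domain feature described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the GMMs use diagonal covariances, a frame's class is the index of its top-scoring UBM component, and a class that receives no frames gets a score of 0.0. The function names and the toy models are hypothetical. The resulting vector is what a two-class SVM would take as input; the SVM itself is not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_log_likelihoods(x, gmm):
    """Weighted per-component log-likelihoods: log w_k + log N(x | mu_k, var_k)."""
    return [
        math.log(w) + log_gauss_diag(x, mu, var)
        for w, mu, var in zip(gmm["weights"], gmm["means"], gmm["vars"])
    ]


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def score_domain_feature(frames, speaker_gmm, ubm):
    """Return one average log-likelihood ratio per UBM class.

    Assumption: a frame's class is its top-scoring UBM component.
    Assumption: a class with no frames is scored 0.0.
    """
    n_classes = len(ubm["weights"])
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for x in frames:
        ubm_ll = component_log_likelihoods(x, ubm)
        cls = max(range(n_classes), key=lambda k: ubm_ll[k])
        speaker_ll = component_log_likelihoods(x, speaker_gmm)
        # Frame-level log-likelihood ratio: speaker model versus UBM.
        llr = log_sum_exp(speaker_ll) - log_sum_exp(ubm_ll)
        sums[cls] += llr
        counts[cls] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Toy one-dimensional models (hypothetical values, for illustration only).
ubm = {"weights": [0.5, 0.5], "means": [[0.0], [4.0]], "vars": [[1.0], [1.0]]}
speaker = {"weights": [0.5, 0.5], "means": [[0.5], [4.0]], "vars": [[1.0], [1.0]]}
frames = [[0.4], [0.6], [4.1]]

feature = score_domain_feature(frames, speaker, ubm)
print(feature)
```

In this toy case the speaker model differs from the UBM only in the first component, so the first dimension of the feature carries most of the evidence while the second stays near zero. A single averaged likelihood ratio would blend the two, which is the equal-weighting effect the abstract attributes to the baseline measure.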

Key words: Vocal Password; Gaussian Mixture Model-Universal Background Model (GMM-UBM); Average Likelihood Ratio; Two-Class Classifier

Received: 13 June 2011

ZTFLH: TN912.34

Authors: PAN Yi-Qian, WEI Si, DAI Li-Rong, LIU Qing-Feng

Cite this article:

PAN Yi-Qian, WEI Si, DAI Li-Rong, et al. A Vocal Password System Based on Multi-Dimension Feature Classifier in Score Domain[J]. , 2012, 25(5): 755-761.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/ OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2012/V25/I5/755
