Abstract: In current language identification systems, the commonly used feature parameters neither make full use of auditory characteristics nor remain robust in complex environments. To address this, an auditory-based robust feature extraction algorithm is proposed. The sub-band energies of the auditory features are computed with a Gammachirp filter bank instead of the commonly used triangular filter bank. A compensation filter for each sub-band output is then derived by data-driven analysis, through a constrained optimization process that jointly minimizes the environmental distortion and the distortion introduced by the filter itself. Experimental results show that, in noisy environments, the proposed feature outperforms the widely used Mel-frequency cepstral coefficients (MFCC).
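The sub-band energy step described in the abstract can be illustrated with a minimal sketch. It assumes the standard Gammachirp impulse response, g(t) = t^(n-1) exp(-2π b ERB(f_c) t) cos(2π f_c t + c ln t), with illustrative parameter values (n = 4, b = 1.019, c = -2) and hypothetical function names and centre frequencies; none of these are taken from the paper, and the data-driven compensation filter is not covered here.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth (Hz) at frequency f_hz (Glasberg-Moore form)."""
    return 24.7 + 0.108 * f_hz

def gammachirp_ir(fc, fs, n=4, b=1.019, c=-2.0, duration=0.025):
    """Unit-energy Gammachirp impulse response centred near fc (assumed parameters)."""
    # Start at t = 1/fs so that ln(t) in the chirp term is finite.
    t = np.arange(1, int(duration * fs) + 1) / fs
    envelope = t ** (n - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
    g = envelope * np.cos(2 * np.pi * fc * t + c * np.log(t))
    return g / np.sqrt(np.sum(g ** 2))

def subband_energies(frame, fs, centre_freqs):
    """Energy of one signal frame in each Gammachirp sub-band."""
    energies = []
    for fc in centre_freqs:
        y = np.convolve(frame, gammachirp_ir(fc, fs), mode="same")
        energies.append(np.sum(y ** 2))
    return np.array(energies)

# Usage: sub-band energies of a 1 kHz test tone sampled at 8 kHz.
fs = 8000
t = np.arange(800) / fs
tone = np.sin(2 * np.pi * 1000 * t)
centres = [250, 500, 1000, 2000, 3000]  # hypothetical centre frequencies
print(subband_energies(tone, fs, centres))
```

In a complete front end these energies would replace the triangular filter bank outputs of an MFCC-style pipeline, before any compression and cepstral transform; that arrangement is an assumption based on the abstract's comparison with MFCC.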