Abstract: To improve the precision of spoken term detection systems, a term-specific thresholding method based on an improved score distribution is presented. At the decision stage of the system, a separate threshold is set for each query according to its posterior scores. The distribution of all posterior scores retrieved for a query term is modeled by an exponential mixture model, whose parameters are estimated in an unsupervised manner by the expectation maximization (EM) algorithm. The threshold value is then calculated by the Bayes minimum risk rule. Since the EM algorithm is sensitive to initial values, K-means clustering is used for initialization instead of random initialization: the posterior scores are first divided into two classes, the prior distributions are calculated, and the initial values of the model parameters are estimated by the maximum likelihood method. The experimental results show that the proposed thresholding method performs better than the other methods compared.
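The pipeline summarized in the abstract (K-means initialization, EM fitting, Bayes minimum risk threshold) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact form of the exponential mixture, so the sketch assumes two components with densities lambda_k * exp(-lambda_k * s), where component 0 (larger rate, lower mean score) stands for incorrect hits and component 1 for correct hits. The function names, the cost parameters cost_fa and cost_miss, and the min/max centroid start for K-means are all hypothetical choices made for illustration.

```python
import math

EPS = 1e-12  # guards against division by zero

def kmeans_two_class(scores, iters=50):
    """1-D K-means with two centroids, started at the min and max score."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        low = [s for s in scores if s <= mid]
        high = [s for s in scores if s > mid]
        if not low or not high:
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    mid = (lo + hi) / 2.0
    return [s for s in scores if s <= mid], [s for s in scores if s > mid]

def init_params(scores):
    """Class fractions as priors; ML rate of an exponential is 1 / mean."""
    low, high = kmeans_two_class(scores)
    if not low or not high:
        raise ValueError("need at least two distinct scores")
    priors = [len(low) / len(scores), len(high) / len(scores)]
    rates = [len(low) / max(sum(low), EPS), len(high) / max(sum(high), EPS)]
    return priors, rates

def em_fit(scores, priors, rates, iters=200, tol=1e-8):
    """EM for the assumed two-component exponential mixture."""
    priors, rates = list(priors), list(rates)
    prev = -math.inf
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each score.
        resp, loglik = [], 0.0
        for s in scores:
            p = [priors[k] * rates[k] * math.exp(-rates[k] * s) for k in (0, 1)]
            total = max(p[0] + p[1], 1e-300)
            loglik += math.log(total)
            resp.append((p[0] / total, p[1] / total))
        # M-step: weighted ML updates of priors and rates.
        for k in (0, 1):
            w = sum(r[k] for r in resp)
            ws = sum(r[k] * s for r, s in zip(resp, scores))
            priors[k] = w / len(scores)
            rates[k] = w / max(ws, EPS)
        if loglik - prev < tol:
            break
        prev = loglik
    return priors, rates

def bayes_threshold(priors, rates, cost_fa=1.0, cost_miss=1.0):
    """Score at which cost_miss * pi1 * f1(s) equals cost_fa * pi0 * f0(s)."""
    diff = rates[0] - rates[1]
    if diff <= 0:
        raise ValueError("component 0 must have the larger rate")
    ratio = (priors[0] * rates[0] * cost_fa) / (priors[1] * rates[1] * cost_miss)
    return math.log(ratio) / diff

def term_threshold(scores, cost_fa=1.0, cost_miss=1.0):
    """Full pipeline for the scores of one query term."""
    priors, rates = em_fit(scores, *init_params(scores))
    return bayes_threshold(priors, rates, cost_fa, cost_miss), priors, rates
```

Under these assumptions the threshold has a closed form, because the log ratio of two exponential densities is linear in the score. A hit for the query term would be accepted when its score exceeds the value returned by term_threshold, and raising cost_fa relative to cost_miss moves the threshold upward.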
LU Li-Hua, ZHANG Lian-Hai. A Term Specific Thresholding Method Based on Improved Score Distribution. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2015, 28(5): 437-442.