K-L Divergence Based Model Clustering Method for Fast Speaker Identification


Abstract: As the number of enrolled speakers and the volume of audio data to be recognized grow, conventional speaker identification methods cannot meet the real-time demands of Internet applications. A speaker model clustering method based on K-L divergence is proposed to construct a hierarchical identification system, which improves identification efficiency. In addition, a confidence measure based on class-level identification information is investigated so that out-of-set speakers can be rejected as early as possible. Experimental results show that the proposed method increases identification speed by a factor of 3.2 on average, while the closed-set identification error rate increases by only about 0.9% on average compared with the conventional method. Using the class-level confidence measure further speeds up open-set identification and yields a 5.1% relative reduction in the error rate on out-of-set speakers, while the error rate on in-set speakers remains unchanged.

Key words: K-L Divergence, Model Clustering, Confidence Measure, Speaker Identification, Internet Environment
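
The two-stage idea summarized in the abstract can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the paper: each speaker is modeled by a single diagonal-covariance Gaussian (so the K-L divergence has a closed form), the cluster representatives are picked by hand, and every name used (`kl_diag_gauss`, `assign_to_clusters`, `identify`, `spk_a` and so on) is hypothetical. The abstract does not specify the paper's actual speaker models, clustering procedure, or confidence measure, so none of them is reproduced here.

```python
import math

def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def symmetric_kl(a, b):
    """Symmetrized divergence between two (mean, var) models."""
    return kl_diag_gauss(*a, *b) + kl_diag_gauss(*b, *a)

def log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under one model."""
    mean, var = model
    ll = 0.0
    for x in frames:
        for xi, m, v in zip(x, mean, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return ll / len(frames)

def assign_to_clusters(models, centers):
    """Assign each speaker to the representative with the smallest divergence."""
    clusters = {c: [] for c in centers}
    for name, model in models.items():
        nearest = min(centers, key=lambda c: symmetric_kl(model, models[c]))
        clusters[nearest].append(name)
    return clusters

def identify(frames, models, clusters):
    """Stage 1: pick the best cluster. Stage 2: score only its members."""
    best = max(clusters, key=lambda c: log_likelihood(frames, models[c]))
    return max(clusters[best], key=lambda s: log_likelihood(frames, models[s]))

# Toy data (assumed for illustration): speaker name -> (mean, variance).
models = {
    "spk_a": ([0.0, 0.0], [1.0, 1.0]),
    "spk_b": ([0.5, 0.2], [1.0, 1.0]),
    "spk_c": ([5.0, 5.0], [1.0, 1.0]),
    "spk_d": ([5.5, 4.8], [1.0, 1.0]),
}
clusters = assign_to_clusters(models, ["spk_a", "spk_c"])
frames = [[5.4, 4.9], [5.6, 4.7]]
print(identify(frames, models, clusters))  # spk_d
```

In this toy setup the test frames are scored against two representatives and then two cluster members, rather than against every enrolled speaker. With only four speakers that saves nothing, but the number of models scored grows with the number of clusters plus the size of one cluster, which is where a speed-up would come from on a large enrolled set.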

Received: 2009-02-09

CLC number: TN912.3; TP391.4

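Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2007CB311100) and the Key Project of the National High Technology Research and Development Program of China (863 Program) (No. 2006AA010103)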

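About the authors: WANG Huan-Liang, male, born in 1974, Ph.D., lecturer. His main research interest is speech recognition. E-mail: huanliangwang@126.com. HAN Ji-Qing, male, born in 1964, professor and doctoral supervisor. His main research interests include speech signal processing and pattern recognition. ZHENG Gui-Bin, male, born in 1973, Ph.D., associate professor. His main research interest is audio retrieval.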

Cite this article:

WANG Huan-Liang, HAN Ji-Qing, ZHENG Gui-Bin. K-L Divergence Based Model Clustering Method for Fast Speaker Identification [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2010, 23(6): 856-861.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2010/V23/I6/856
