A Speaker Recognition Algorithm Based on Speech Component Unit


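Abstract: With linear prediction coefficients as features, the iterative algorithm of the Gaussian mixture model is used to refine the initial k-means clustering of the training samples, yielding a representation of speech component units. Based on pattern matching of these speech component units, a text-independent speaker verification method, called the averaging method, and a text-independent speaker identification method are proposed. Experimental results show that the proposed methods perform well even on short utterances.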
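The training stage summarized above (k-means clustering refined by Gaussian mixture model iterations) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the farthest-point seeding of k-means, the diagonal covariances, and the variance floor are all assumptions, and `frames` stands in for LPC feature vectors whose extraction is not shown.

```python
import math

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(frames, k, iters=20):
    """Plain k-means; returns one cluster label per frame."""
    # Farthest-point seeding (an assumption; the abstract does not say how seeds are chosen).
    centers = [list(frames[0])]
    while len(centers) < k:
        centers.append(list(max(frames, key=lambda x: min(sq_dist(x, c) for c in centers))))
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assign every frame to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(x, centers[j])) for x in frames]
        # Move each center to the mean of its members; empty clusters stay put.
        for j in range(k):
            members = [x for x, l in zip(frames, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def m_step(frames, resp, k, floor=1e-3):
    """Re-estimate (weight, mean, diagonal variance) per component."""
    n, dim = len(frames), len(frames[0])
    model = []
    for j in range(k):
        nj = sum(r[j] for r in resp) + 1e-10  # guard against empty components
        mean = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj for d in range(dim)]
        var = [max(sum(r[j] * (x[d] - mean[d]) ** 2 for r, x in zip(resp, frames)) / nj, floor)
               for d in range(dim)]
        model.append((nj / n, mean, var))
    return model

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (a - m) ** 2 / v)
               for a, m, v in zip(x, mean, var))

def e_step(frames, model):
    """Return per-frame responsibilities and the total log-likelihood."""
    resp, total = [], 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in model]
        top = max(logs)
        norm = top + math.log(sum(math.exp(l - top) for l in logs))
        resp.append([math.exp(l - norm) for l in logs])
        total += norm
    return resp, total

def fit_units(frames, k, em_iters=30):
    """k-means initialization followed by EM refinement of the mixture."""
    labels = kmeans(frames, k)
    # Hard k-means labels act as one-hot responsibilities for the first M-step.
    resp = [[1.0 if l == j else 0.0 for j in range(k)] for l in labels]
    model = m_step(frames, resp, k)
    history = []
    for _ in range(em_iters):
        resp, ll = e_step(frames, model)
        history.append(ll)
        model = m_step(frames, resp, k)
    return model, history

# Toy 2-D data: two tight grids of points standing in for two groups of frames.
frames = [(cx + 0.1 * i, cy + 0.1 * j)
          for cx, cy in ((0.0, 0.0), (5.0, 5.0))
          for i in range(-2, 3) for j in range(-2, 3)]
model, history = fit_units(frames, 2, em_iters=5)
```

Each entry of `model` is a `(weight, mean, variance)` triple; in this sketch one such component plays the role of one speech component unit, which is an interpretation rather than something the abstract states.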


Keywords: speech component unit, text-independent speaker verification, text-independent speaker identification

Abstract: Linear prediction coefficients (LPC) are selected as features to construct the feature space. First, the training samples of each individual in the system are clustered in this feature space with the k-means algorithm to form initial clusters. The initial clustering is then optimized with the iterative algorithm of the Gaussian mixture model (GMM) to obtain the speech component unit representation of that individual. On the basis of the resulting speech component units, a text-independent speaker verification method, called the averaging method, and a text-independent speaker identification method are presented. Experimental results show that the proposed algorithm achieves satisfactory results even on short utterances.
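The decision stage can be illustrated with a generic frame-averaged score. The abstract does not give the formula of the averaging method, so the sketch below is an assumption: it averages per-frame log-likelihoods under a diagonal-covariance mixture, accepts a claimed identity when the average clears a threshold (verification), and picks the best-scoring enrolled speaker (identification). The speaker names, model parameters, and threshold are hypothetical toy values.

```python
import math

def frame_log_likelihood(x, model):
    """Log-likelihood of one frame under a mixture of (weight, mean, variance) components."""
    logs = [math.log(w) + sum(-0.5 * (math.log(2 * math.pi * v) + (a - m) ** 2 / v)
                              for a, m, v in zip(x, mean, var))
            for w, mean, var in model]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def average_score(frames, model):
    """Mean per-frame log-likelihood, so scores are comparable across utterance lengths."""
    return sum(frame_log_likelihood(x, model) for x in frames) / len(frames)

def verify(frames, model, threshold):
    """Verification: accept the claimed speaker if the averaged score reaches the threshold."""
    return average_score(frames, model) >= threshold

def identify(frames, models):
    """Identification: return the enrolled speaker whose model scores highest."""
    return max(models, key=lambda name: average_score(frames, models[name]))

# Hypothetical enrolled models and a short three-frame test utterance.
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(1.0, [6.0, 6.0], [1.0, 1.0])],
}
utterance = [(0.1, -0.2), (1.9, 2.2), (0.4, 0.3)]
best = identify(utterance, models)
accepted = verify(utterance, models["speaker_a"], threshold=-5.0)
```

Averaging over frames rather than summing keeps the threshold meaningful for short utterances, which is the setting the abstract emphasizes.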

Key words: Speech Component Unit, Text-Independent Speaker Verification, Text-Independent Speaker Identification

Received: 2007-03-13

CLC number: TP242.6

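About the authors: HUANG Chang-Cun, male, born in 1982, master's degree. His main research interest is speaker recognition. E-mail: cchuang@mail.ustc.edu.cn. WANG Zeng-Fu, male, born in 1960, professor and doctoral supervisor. His main research interests include stereo vision, biometric recognition, affective computing, and intelligent robots. E-mail: zfwang@ustc.edu.cn.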

Cite this article:

HUANG Chang-Cun, WANG Zeng-Fu. A Speaker Recognition Algorithm Based on Speech Component Unit[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2008, 21(6): 856-866.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2008/V21/I6/856
