Speech Emotion Recognition Based on Covariance Descriptor and Riemannian Manifold<sup>*</sup>

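Abstract: A speech emotion recognition method based on covariance descriptors and Riemannian manifolds is proposed. Covariance matrices are computed from the extracted acoustic features of speech and used to represent the emotional information of each utterance. Because the space formed by nonsingular covariance matrices is high-dimensional, an affine-invariant metric is introduced so that this space satisfies the requirements of a Riemannian manifold. An algorithmic framework on the Riemannian manifold is then built using differential geometry. Experiments show that the method achieves good recognition performance in speech emotion recognition and is particularly effective at improving recognition accuracy in noisy environments.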
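The two ingredients named in the abstract, a covariance descriptor per utterance and an affine-invariant distance between descriptors, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the small ridge term `eps` that keeps the matrix nonsingular, and the choice of per-frame features are all hypothetical, and the distance is the standard affine-invariant geodesic distance for symmetric positive-definite matrices.

```python
import numpy as np


def covariance_descriptor(frames, eps=1e-6):
    """Covariance matrix of per-frame acoustic features.

    frames: array of shape (n_frames, n_features), one feature vector per frame.
    eps: small ridge added to the diagonal so the result stays nonsingular
         (an assumption for this sketch, not taken from the paper).
    """
    frames = np.asarray(frames, dtype=float)
    cov = np.atleast_2d(np.cov(frames, rowvar=False))
    return cov + eps * np.eye(cov.shape[0])


def affine_invariant_distance(a, b):
    """Geodesic distance between SPD matrices under the affine-invariant metric.

    Computes sqrt(sum_i ln^2(lambda_i)), where lambda_i are the eigenvalues
    of A^{-1/2} B A^{-1/2}.
    """
    # A^{-1/2} from the eigendecomposition A = V diag(w) V^T.
    w, v = np.linalg.eigh(a)
    a_inv_sqrt = (v / np.sqrt(w)) @ v.T
    m = a_inv_sqrt @ b @ a_inv_sqrt
    m = (m + m.T) / 2.0  # remove round-off asymmetry before eigvalsh
    lam = np.linalg.eigvalsh(m)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


# Usage with synthetic "feature frames" standing in for real acoustic features.
rng = np.random.default_rng(0)
utt_a = rng.normal(size=(200, 4))
utt_b = rng.normal(size=(150, 4)) * 2.0
dist = affine_invariant_distance(
    covariance_descriptor(utt_a), covariance_descriptor(utt_b)
)
```

A descriptor built this way has a fixed size regardless of utterance length, which is one practical reason to compare utterances through their covariance matrices rather than frame by frame.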

Keywords: speech emotion recognition, covariance descriptor, Riemannian manifold, noisy environment, support vector machine (SVM)

Abstract: An algorithm for speech emotion recognition based on the covariance descriptor and the Riemannian manifold is proposed. Covariance matrices are computed from the extracted acoustic features and serve as emotion descriptors of sentences. Considering the high dimensionality of the space formed by nonsingular covariance matrices, an affine-invariant metric is adopted so that the space meets the requirements of a Riemannian manifold. Using differential geometry, speech emotion recognition is then performed on the manifold. The experimental results show a significant improvement in recognition accuracy, especially in noisy environments.
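For orientation, the affine-invariant metric usually meant for nonsingular (symmetric positive-definite) covariance matrices can be written as below. This is the standard textbook formulation, given here on the assumption that the paper uses it; the symbols $P$, $Q$ (covariance matrices), $X$, $Y$ (symmetric tangent vectors at $P$), and $G$ (any invertible matrix) are introduced for illustration and do not appear in the abstract.

```latex
\langle X, Y \rangle_P = \operatorname{tr}\!\left(P^{-1} X P^{-1} Y\right),
\qquad
d(P, Q) = \left\lVert \log\!\left(P^{-1/2}\, Q\, P^{-1/2}\right) \right\rVert_F ,
\qquad
d\!\left(G P G^{\top},\, G Q G^{\top}\right) = d(P, Q).
```

The last identity is the invariance property: applying the same invertible linear transform to the underlying feature vectors leaves distances between descriptors unchanged.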

Key words: Speech Emotion Recognition, Covariance Descriptor, Riemannian Manifold, Noisy Environment, Support Vector Machine (SVM)

Received: 2008-10-27

CLC number: TP391

Funding: Supported by the National Natural Science Foundation of China (No. 60873124) and the National Key Technology R&D Program of China (No. 2008BAH26B02)

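About the authors: LIU Jia, female, born in 1981, Ph.D. candidate. Her main research interest is speech emotion recognition. E-mail: liujia@zju.edu.cn. CHEN Chun, male, born in 1955, professor and Ph.D. supervisor. His main research interests are graphics and image processing, speech analysis, and embedded systems. YE Cheng-Xi, male, born in 1985, master's student. His main research interests are pattern recognition and image processing. LI Na, female, born in 1978, postdoctoral researcher. Her main research interest is pattern recognition. BU Jia-Jun, male, born in 1973, professor and Ph.D. supervisor. His main research interests are embedded systems and speech, graphics, and image processing.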

Cite this article:

LIU Jia, CHEN Chun, YE Cheng-Xi, LI Na, BU Jia-Jun. Speech Emotion Recognition Based on Covariance Descriptor and Riemannian Manifold[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2009, 22(5): 673-677.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2009/V22/I5/673
