Automatic News Audio Classification Method Based on Selective Ensemble SVMs<sup>*</sup>


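Abstract: As an important cue for video retrieval, audio detection and classification has attracted wide attention and become an active research direction. Building on a prior model of news video and its structure, this paper proposes a classifier design method based on selective ensemble SVMs (SENSVM), which divides news audio into four types: silence, music, speech, and speech with background music. Simulation experiments on 8514 s of real news audio data show that the proposed selective-ensemble-SVM news audio classification algorithm reaches an average accuracy of 98.2%, far higher than both a plain SVM-based method and the traditional threshold-based method.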


Keywords: automatic audio classification, selective ensemble, support vector machine, decision rules

Abstract: As a significant clue for video indexing and retrieval, audio detection and classification has attracted much attention and become a hot topic. On the basis of a prior model of news video structure, a selective ensemble of support vector machines (SENSVM) is proposed to detect and classify news audio into four types: silence, music, speech, and speech with music background. Experiments on real news audio clips with a total length of 8514 s show that the average accuracy of the proposed audio classification method reaches 98.2%, which is much better than that of the available SVM-based method or the traditional threshold-based method.
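The abstract does not spell out how ensemble members are trained or selected, so the following is only a minimal sketch of the general selective-ensemble idea under stated assumptions: each member is a one-vs-rest linear SVM trained by Pegasos-style stochastic subgradient descent on a bootstrap resample, members are kept when their validation accuracy is at or above the ensemble mean, and the kept members vote. The 2-D feature centres, the selection rule, and all function names are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

LABELS = ["silence", "music", "speech", "speech+music"]
# Hypothetical 2-D feature centres (stand-ins for real audio features).
CENTRES = {"silence": (0.0, 0.0), "music": (1.0, 0.0),
           "speech": (0.0, 1.0), "speech+music": (1.0, 1.0)}


def make_data(n_per_class, rng):
    """Draw synthetic (features, label) pairs around the toy class centres."""
    data = []
    for label, (cx, cy) in CENTRES.items():
        for _ in range(n_per_class):
            data.append(([rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)], label))
    rng.shuffle(data)
    return data


def score(w, x):
    """Linear decision value w.x + b, with the bias stored as the last weight."""
    return sum(wi * xi for wi, xi in zip(w, x + [1.0]))


def train_binary_svm(data, positive, rng, lam=0.01, epochs=30):
    """Linear hinge-loss SVM trained with Pegasos-style subgradient steps."""
    w = [0.0, 0.0, 0.0]
    t = 0
    for _ in range(epochs):
        for x, label in rng.sample(data, len(data)):
            t += 1
            eta = 1.0 / (lam * t)
            y = 1.0 if label == positive else -1.0
            violated = y * score(w, x) < 1.0
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularisation shrink
            if violated:  # hinge-loss subgradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x + [1.0])]
    return w


def train_multiclass_svm(data, rng):
    """One-vs-rest wrapper: one binary SVM per audio class."""
    return {label: train_binary_svm(data, label, rng) for label in LABELS}


def predict(model, x):
    """Pick the class whose one-vs-rest SVM gives the largest decision value."""
    return max(LABELS, key=lambda label: score(model[label], x))


def n_correct(model, data):
    """Number of samples in data that the model labels correctly."""
    return sum(predict(model, x) == label for x, label in data)


def selective_ensemble(train, valid, rng, n_members=7):
    """Train members on bootstrap resamples; keep those at or above the mean
    validation accuracy (an assumed selection rule, not the paper's)."""
    members = []
    for _ in range(n_members):
        boot = [rng.choice(train) for _ in train]
        members.append(train_multiclass_svm(boot, rng))
    hits = [n_correct(m, valid) for m in members]
    # Integer comparison avoids floating-point issues when all members tie.
    return [m for m, h in zip(members, hits) if h * len(hits) >= sum(hits)]


def ensemble_predict(selected, x):
    """Majority vote over the selected members."""
    votes = Counter(predict(m, x) for m in selected)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
train, valid, test = make_data(40, rng), make_data(20, rng), make_data(20, rng)
selected = selective_ensemble(train, valid, rng)
test_acc = sum(ensemble_predict(selected, x) == y for x, y in test) / len(test)
print(f"kept {len(selected)} of 7 members, test accuracy = {test_acc:.3f}")
```

The toy clusters are well separated, so this run says nothing about the 98.2% figure reported for real news audio. It only illustrates the train, select, and vote stages that a selective ensemble typically involves.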

Key words: Automatic Audio Classification, Selective Ensemble, Support Vector Machine, Decision Rules

Received: 2005-04-11

CLC number (ZTFLH): TP391

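Funding: Supported by the Program for New Century Excellent Talents in University (No. NCET040948), the National Natural Science Foundation of China (No. 60202004), and the Key Project of the Ministry of Education (No. 104173).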

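About the authors: HAN Bing, female, born in 1978, Ph.D. candidate. Her main research interests include video retrieval and pattern recognition. E-mail: hanbing@lab202.xidian.edu.cn. GAO XinBo, male, born in 1972, professor and Ph.D. supervisor. His main research interests include video analysis and retrieval, pattern recognition, fuzzy information processing, and medical image information processing. JI HongBing, male, born in 1963, professor and Ph.D. supervisor. His main research interests include radar target recognition and modern signal processing theory and methods.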

Cite this article:

HAN Bing (韩冰), GAO XinBo (高新波), JI HongBing (姬红兵). Automatic News Audio Classification Method Based on Selective Ensemble SVMs [J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2006, 19(5): 634-639.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2006/V19/I5/634
