1. Institute of Semiconductor and Information Technology, Tongji University, Shanghai 200092; 2. College of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou 310035; 3. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083
Abstract: Robust speech detection is essential for speech processing systems. This paper proposes a speech detection method based on principal component analysis (PCA). In this method, speech and nonspeech subspaces are constructed separately using PCA, with the output of fast PCA serving as the basis of each new subspace. Speech and nonspeech are then distinguished by analyzing the distribution of the data in these subspaces. Experiments with this method give good results. Constructing several nonspeech subspaces for different types of nonspeech gives better performance than constructing a single one.
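The subspace idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the fast PCA variant, or the decision rule, so the sketch assumes plain SVD-based PCA on synthetic feature vectors and a hypothetical rule that assigns a frame to the subspace with the smallest reconstruction residual. The names `fit_subspace`, `residual`, `is_speech`, and `synthetic_class` are illustrative only.

```python
import numpy as np


def fit_subspace(frames, n_components):
    """Fit a PCA subspace to row-vector frames; return (mean, basis)."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    # Assumption: ordinary SVD stands in for the paper's fast PCA.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def residual(frame, subspace):
    """Distance between a frame and its projection onto the subspace."""
    mean, basis = subspace
    centered = frame - mean
    projected = basis.T @ (basis @ centered)
    return float(np.linalg.norm(centered - projected))


def is_speech(frame, speech_subspace, nonspeech_subspaces):
    """Assumed rule: speech if the speech subspace fits the frame best."""
    speech_err = residual(frame, speech_subspace)
    nonspeech_err = min(residual(frame, s) for s in nonspeech_subspaces)
    return speech_err < nonspeech_err


def synthetic_class(rng, active_dims, n=200, dim=8):
    """Synthetic stand-in data: large variance only along active_dims."""
    x = 0.01 * rng.standard_normal((n, dim))
    x[:, active_dims] += rng.standard_normal((n, len(active_dims)))
    return x


rng = np.random.default_rng(0)
speech = fit_subspace(synthetic_class(rng, [0, 1]), n_components=2)
# One subspace per nonspeech type, as the abstract recommends.
noise_a = fit_subspace(synthetic_class(rng, [2, 3]), n_components=2)
noise_b = fit_subspace(synthetic_class(rng, [4, 5]), n_components=2)

speech_like = np.array([1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
noise_like = np.array([0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0])
speech_detected = is_speech(speech_like, speech, [noise_a, noise_b])
noise_detected = is_speech(noise_like, speech, [noise_a, noise_b])
```

Keeping a separate subspace for each nonspeech type lets every low-dimensional basis stay tight around one kind of signal, which is one plausible reading of why several nonspeech subspaces would outperform a single pooled one.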