A Novel Voice Conversion Method Based on Codebook Mapping with Phoneme-Tied Weighting
WANG ZiXiang, DAI LiRong, WANG YuPing, WANG RenHua
Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027
Abstract This paper introduces a framework for voice conversion systems and reviews the conventional codebook mapping method. Because the conventional method calculates its weighting coefficients over entire codebooks, it tends to over-smooth the converted speech spectrum, which greatly degrades the quality of the converted speech. To address this problem, a novel voice conversion method based on codebook mapping with phoneme-tied weighting is presented, together with a new decision-tree-based prosodic conversion method. Experiments show that the proposed methods effectively convert speaker individuality while maintaining high speech quality, using only a small amount of training data.
Received: 07 January 2005