Volume 29 Issue 7
Jan.  2011
Jian Zhi-hua, Yang Zhen. An Algorithm for Voice Conversion Based on Mixtures of Linear Transformation[J]. Journal of Electronics & Information Technology, 2007, 29(7): 1700-1702. doi: 10.3724/SP.J.1146.2006.00787

An Algorithm for Voice Conversion Based on Mixtures of Linear Transformation

doi: 10.3724/SP.J.1146.2006.00787
  • Received Date: 2006-06-06
  • Revision Received Date: 2006-10-30
  • Publish Date: 2007-07-19
  • This paper proposes a voice conversion algorithm based on mixtures of linear transformations that avoids the parallel training corpus required by conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to estimate the parameters of the transfer function. The chirp Z-transform is then used to sharpen the spectral envelope, which is over-smoothed by the linear weighted averaging. The proposed voice conversion system is evaluated with both objective and subjective measures. The experimental results show that the approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that require a parallel corpus.
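The "linear weighted averaging" mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common form of mixture-based conversion, in which each source frame is mapped by several linear transforms and the results are averaged with the posterior probabilities of a diagonal-covariance Gaussian mixture. The abstract does not give the paper's exact formulation, so the function name `convert_frame` and all parameter names and shapes below are hypothetical, and the EM training of the parameters is not shown.

```python
import numpy as np


def convert_frame(x, weights, means, variances, A, b):
    """Convert one feature frame with a posterior-weighted mix of linear maps.

    Assumed shapes (illustrative, not taken from the paper):
      x: (d,)  weights: (M,)  means, variances: (M, d)
      A: (M, d, d)  b: (M, d)
    Returns the converted frame (d,) and the component posteriors (M,).
    """
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Log-likelihood of x under each diagonal-covariance Gaussian component.
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances, axis=1
    )

    # Posterior p(m | x), computed in the log domain for numerical stability.
    log_post = np.log(np.asarray(weights, dtype=float)) + log_lik
    log_post -= np.max(log_post)
    post = np.exp(log_post)
    post /= post.sum()

    # Apply every linear transform, then average with the posterior weights.
    transformed = np.asarray(A, dtype=float) @ x + np.asarray(b, dtype=float)
    return post @ transformed, post


# Usage: two components with well-separated means.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
A = np.stack([2.0 * np.eye(2), np.eye(2)])
b = np.array([[1.0, 1.0], [0.0, 0.0]])
y, post = convert_frame(np.zeros(2), weights, means, variances, A, b)
```

Because the output is a weighted average over several transforms, frames that fall between components tend to produce flattened spectral peaks, which is the over-smoothing that the abstract says the chirp Z-transform step is meant to counteract.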


