An Improved Method for Detecting Aperiodicity, Periodicity, and Pitch in Speech Signals

Du Shuo; Du Li-min

doi:10.3724/SP.J.1146.2007.01314

Publication history
- Received: 2007-08-14
- Revised: 2007-12-26
- Published: 2008-04-19

Modified Detection of Aperiodicity, Periodicity and Pitch in Speech

Du Shuo,
Du Li-min

Abstract

The APP method accurately detects aperiodicity, periodicity, and pitch in speech signals. It is a recently proposed, advanced detection method that is valuable for both basic speech research and applied speech technology. Its greatest strength is that it simultaneously estimates the pitch period and the energy proportions of the periodic and aperiodic components of a speech signal. Its greatest weakness is its heavy computational cost: it runs at about 110 times real time, which is the main obstacle to practical use. Building on an in-depth analysis of the APP method, this paper removes unjustified redundant processing in both the architecture and the implementation, and proposes a modified APP method, called MAPP. MAPP retains the merits of APP, puts its processing mechanism on a more rational footing, and improves the accuracy and robustness of pitch detection. It also raises computational efficiency by about one order of magnitude: on a Pentium computer with a 1.70 GHz CPU clock and 512 MB of RAM, the running time drops to 12.3 times real time.
- Keywords: Speech signal processing; Pitch detection; Periodic energy; Aperiodic energy; Average magnitude difference function (AMDF)
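The abstract names the average magnitude difference function (AMDF) as a keyword and reports real-time factors of 110 and 12.3. The sketch below is only an illustration of those two ideas, not the MAPP algorithm itself, whose details are not given on this page. The function names, the 60 to 400 Hz search range, and the relative threshold of 0.1 are all assumptions chosen for the example; the "no pitch found" result is a crude stand-in, not the paper's aperiodicity measure.

```python
import math


def amdf(frame, lag, n_terms):
    """Mean absolute difference between the frame and a copy shifted by `lag`."""
    return sum(abs(frame[n] - frame[n + lag]) for n in range(n_terms)) / n_terms


def estimate_pitch_amdf(frame, fs, f_min=60.0, f_max=400.0, rel_threshold=0.1):
    """Return a pitch estimate in Hz, or None if no clear AMDF valley is found.

    Assumed (hypothetical) parameters: search range f_min..f_max and a
    valley threshold relative to the largest AMDF value in that range.
    """
    min_lag = int(fs / f_max)
    max_lag = int(fs / f_min)
    n_terms = len(frame) - max_lag  # same number of terms for every lag
    if n_terms <= 0:
        raise ValueError("frame too short for the requested f_min")

    lags = list(range(min_lag, max_lag + 1))
    d = [amdf(frame, lag, n_terms) for lag in lags]
    peak = max(d)
    if peak == 0.0:
        return None  # silent or constant frame: no periodicity to measure

    threshold = rel_threshold * peak
    for i, value in enumerate(d):
        if value <= threshold:
            # Take the first deep valley (avoids picking a multiple of the
            # period), then slide down to its local minimum.
            while i + 1 < len(d) and d[i + 1] < d[i]:
                i += 1
            return fs / lags[i]
    return None  # no deep valley: treat the frame as not clearly periodic


def speedup(rtf_before, rtf_after):
    """Ratio of two real-time factors (processing time / signal duration)."""
    return rtf_before / rtf_after


if __name__ == "__main__":
    fs = 8000
    # Synthetic 50 ms test frame: a 200 Hz sine sampled at 8 kHz.
    frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
    print("estimated pitch (Hz):", estimate_pitch_amdf(frame, fs))
    print("speedup from 110x to 12.3x real time:", round(speedup(110.0, 12.3), 2))
```

Taking the first deep valley rather than the global minimum is a design choice for this sketch: for a periodic signal the AMDF is near zero at every multiple of the period, so a global minimum search can land on a doubled or tripled lag. The speed-up ratio 110 / 12.3 is roughly 8.9, which matches the abstract's "about one order of magnitude".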
