Voiced/Unvoiced Classification and Pitch Estimation Based on Amplitude Compression Filter

XU Jingyun, ZHAO Xiaoqun, WANG Qiao, WANG Digang

Citation: XU Jingyun, ZHAO Xiaoqun, WANG Qiao, WANG Digang. Voiced/Unvoiced Classification and Pitch Estimation Based on Amplitude Compression Filter[J]. Journal of Electronics & Information Technology, 2016, 38(3): 586-593. doi: 10.11999/JEIT150778

doi: 10.11999/JEIT150778

Funds: The National Natural Science Foundation of China (61271248), The Natural Science Foundation of Huzhou City (2015YZ04)

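  • Abstract: Conventional algorithms are prone to voiced/unvoiced misclassification and pitch estimation errors in real environments (different noise types and signal-to-noise ratios). To address this, the paper proposes a voiced/unvoiced classification and pitch estimation method based on the Pitch Estimation Filter with Amplitude Compression (PEFAC). First, PEFAC attenuates the low-frequency noise in the speech and extracts the pitch harmonics. Then, an impulse-sequence weighting algorithm based on the symmetric average magnitude sum function (SIM) determines the number of harmonics. Finally, the pitch is estimated by dynamic programming, and voiced and unvoiced speech are classified with a Gaussian mixture model built on a 3-element feature vector. Simulation results show that, in real environments, the proposed method effectively suppresses voiced/unvoiced misclassification and pitch estimation errors and outperforms conventional methods.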
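The dynamic-programming step named in the abstract can be illustrated with a generic Viterbi-style pitch tracker. This is only a minimal sketch under stated assumptions: the abstract does not give the cost function, so the function name `track_pitch`, the `(f0_hz, score)` candidate format, and the octave-distance `jump_penalty` are all hypothetical choices made for illustration, not the authors' formulation.

```python
import math


def track_pitch(candidates, jump_penalty=1.0):
    """Pick one pitch candidate per frame by dynamic programming (Viterbi).

    candidates: list of frames; each frame is a list of (f0_hz, score) pairs,
                where a higher score means a more plausible candidate.
    jump_penalty: hypothetical weight on the pitch jump between consecutive
                  frames, measured in octaves as |log2(f0_t / f0_{t-1})|.
    Returns the chosen f0 value for every frame.
    """
    if not candidates:
        return []
    # best[t][j]: best cumulative score of any path ending at candidate j of frame t
    best = [[score for _, score in candidates[0]]]
    # back[t][j]: index of the predecessor candidate in frame t-1 on that path
    back = [[-1] * len(candidates[0])]
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for f0, score in candidates[t]:
            # Score of reaching this candidate from each candidate of the previous frame.
            options = [
                best[t - 1][i] - jump_penalty * abs(math.log2(f0 / prev_f0))
                for i, (prev_f0, _) in enumerate(candidates[t - 1])
            ]
            i_best = max(range(len(options)), key=options.__getitem__)
            row.append(options[i_best] + score)
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    # Backtrack from the best-scoring candidate of the last frame.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j][0])
        j = back[t][j]
    path.reverse()
    return path


# Made-up candidates: in the middle frame the locally best score belongs to
# an octave error (204 Hz), which the smoothness penalty lets the tracker avoid.
frames = [
    [(100, 1.0), (200, 0.5)],
    [(102, 0.8), (204, 0.9)],
    [(101, 1.0), (202, 0.6)],
]
print(track_pitch(frames))
```

Penalising jumps in octaves rather than in hertz is one common design choice, because doubling and halving errors then cost the same regardless of the speaker's pitch range.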
Publication history
  • Received:  2015-06-29
  • Revised:  2015-12-02
  • Published:  2016-03-19
