A New Voice and Noise Activity Detection Algorithm and Its Application to Dual Microphone Noise Suppression System for Handset

ZHANG Luofei, ZHANG Ming, LI Chen

Citation: ZHANG Luofei, ZHANG Ming, LI Chen. A New Voice and Noise Activity Detection Algorithm and Its Application to Dual Microphone Noise Suppression System for Handset[J]. Journal of Electronics & Information Technology, 2016, 38(8): 2020-2026. doi: 10.11999/JEIT151302

doi: 10.11999/JEIT151302
Funds: Program of Natural Science Research of Jiangsu Higher Education Institutions of China, Program of Science and Technology of Jiangsu (BE2014139)

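  • Abstract: Existing dual-channel Voice Activity Detection (VAD) algorithms rely on fixed thresholds, which makes it difficult to detect speech and noise accurately across different noise environments; when applied to a handset noise suppression system, they cause speech distortion or insufficient noise removal. To address this, this paper proposes a neural-network-based VAD algorithm that takes sub-band energy differences and normalized cross-channel correlation as features and uses a neural network to classify speech and noise. Building on this, the neural-network VAD is combined with a VAD based on the cross-channel signal power ratio to form a new voice and noise activity detection algorithm suited to handset noise suppression systems. The algorithm detects speech and noise separately, and noise suppression is carried out on that basis, which reduces the performance degradation that VAD misjudgments cause in the suppression system. Experimental results show that the proposed method outperforms existing noise suppression algorithms in suppressing background noise and reducing speech distortion, and that it also suppresses directional speech interference well.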
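The abstract names the two features and the power-ratio VAD but gives no formulas, so the following is only a minimal sketch of how such quantities might be computed per frame. Everything specific here is an assumption for illustration, not the authors' method: the function names, the number of bands, the zero-lag correlation, the thresholds, and the three-way combination rule are all hypothetical. The neural network itself is not modeled; it is represented by a speech probability supplied by the caller.

```python
import numpy as np


def subband_energy_difference(primary, reference, n_bands=4, eps=1e-12):
    """Per-band energy difference in dB between two microphone frames.

    The band count and the equal-width band split are assumptions.
    """
    spec_p = np.abs(np.fft.rfft(primary)) ** 2
    spec_r = np.abs(np.fft.rfft(reference)) ** 2
    energy_p = np.array([band.sum() for band in np.array_split(spec_p, n_bands)])
    energy_r = np.array([band.sum() for band in np.array_split(spec_r, n_bands)])
    return 10.0 * np.log10((energy_p + eps) / (energy_r + eps))


def normalized_cross_correlation(primary, reference, eps=1e-12):
    """Zero-lag cross-channel correlation, normalized to roughly [-1, 1]."""
    numerator = np.dot(primary, reference)
    denominator = np.sqrt(np.dot(primary, primary) * np.dot(reference, reference))
    return numerator / (denominator + eps)


def power_ratio(primary, reference, eps=1e-12):
    """Ratio of frame power at the primary mic to that at the reference mic."""
    return (np.dot(primary, primary) + eps) / (np.dot(reference, reference) + eps)


def combine_decisions(nn_speech_prob, ratio,
                      prob_hi=0.7, prob_lo=0.3, ratio_hi=2.0, ratio_lo=1.2):
    """Hypothetical rule: label a frame only when both detectors agree.

    All four thresholds are placeholder values, not taken from the paper.
    """
    if nn_speech_prob >= prob_hi and ratio >= ratio_hi:
        return "speech"
    if nn_speech_prob <= prob_lo and ratio <= ratio_lo:
        return "noise"
    return "uncertain"


# Usage: a synthetic frame where the reference mic sees the same signal
# at half the amplitude, as might happen for near-field speech.
rng = np.random.default_rng(0)
primary_frame = rng.standard_normal(256)
reference_frame = 0.5 * primary_frame

features = np.append(
    subband_energy_difference(primary_frame, reference_frame),
    normalized_cross_correlation(primary_frame, reference_frame),
)
label = combine_decisions(nn_speech_prob=0.9,
                          ratio=power_ratio(primary_frame, reference_frame))
```

Keeping an explicit "uncertain" outcome is one way to reflect the abstract's point that speech and noise are detected separately: a frame that neither detector is sure about is not forced into either class, so downstream noise estimation would not be updated on doubtful frames.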
Publication history
  • Received: 2015-11-23
  • Revised: 2016-04-12
  • Published: 2016-08-19
