An Adaptive Consistent Iterative Hard Thresholding Algorithm for Audio Declipping

Xia ZOU, Penglong WU, Meng SUN, Xingyu ZHANG

Citation: Xia ZOU, Penglong WU, Meng SUN, Xingyu ZHANG. An Adaptive Consistent Iterative Hard Thresholding Algorithm for Audio Declipping[J]. Journal of Electronics & Information Technology, 2019, 41(4): 925-931. doi: 10.11999/JEIT180543

doi: 10.11999/JEIT180543
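Funding: National Natural Science Foundation of China (61402519); Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (BK20180080)
Details
    Author biographies:

    Xia ZOU: male, born in 1979, associate professor. Research interests include speech signal processing.

    Penglong WU: male, born in 1995, master's student. Research interests include intelligent information processing.

    Meng SUN: male, born in 1984, lecturer. Research interests include machine learning and speech processing.

    Xingyu ZHANG: male, born in 1994, master's student. Research interests include information security processing.

    Corresponding author:

    Penglong WU 17551050128@163.com

  • CLC number: TN912.3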

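  • Abstract:

    The Consistent Iterative Hard Thresholding (CIHT) algorithm performs well at restoring audio distorted by clipping. However, its restoration performance degrades when the clipping is severe. This paper therefore proposes an improved algorithm based on an adaptive threshold. The algorithm automatically estimates the clipping degree of the audio signal and uses that estimate to adaptively adjust the clipping-degree factor in the algorithm. Compared with the recently proposed CIHT algorithm and the Consistent Dictionary Learning (CDL) algorithm, the proposed algorithm reconstructs audio signals better, especially when the clipping distortion is severe. Its computational complexity is close to that of CIHT, and it runs faster than CDL, which favors real-time implementation.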
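    To make the terms in the abstract concrete, the following is a minimal sketch of hard clipping, the signal-to-distortion ratio (SDR) used to index the clipped inputs, and one simple way to gauge the clipping degree. It is not the estimator from the paper, whose formula is not given on this page: the function names, the max-amplitude heuristic for the clipping level, and the test sine wave are all assumptions made for illustration.

```python
import numpy as np


def hard_clip(x, theta):
    """Hard-clip x to the range [-theta, theta]."""
    return np.clip(x, -theta, theta)


def sdr_db(reference, estimate):
    """SDR in dB: 10*log10(||reference||^2 / ||reference - estimate||^2)."""
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(distortion ** 2))


def estimate_clipping(y, tol=1e-9):
    """Hypothetical estimator (an assumption, not the paper's method).

    Treats the largest absolute sample value as the clipping level and
    returns that level plus the fraction of samples sitting at it.
    """
    theta_hat = np.max(np.abs(y))
    clipped = np.abs(y) >= theta_hat - tol
    return theta_hat, clipped.mean()


# Illustrative input: a 5 Hz sine sampled at 1 kHz, clipped at half amplitude.
t = np.arange(1000) / 1000.0
x = np.sin(2 * np.pi * 5 * t)
y = hard_clip(x, 0.5)

theta_hat, frac = estimate_clipping(y)
print(f"estimated clipping level: {theta_hat:.2f}")
print(f"fraction of clipped samples: {frac:.2f}")
print(f"SDR of clipped input: {sdr_db(x, y):.2f} dB")
```

    A lower clipping level clips more samples and gives a lower input SDR, which is the severe-clipping regime the abstract says the adaptive algorithm targets.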

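  • Figure 1  Schematic of signal clipping distortion

    Figure 2  Scatter plot of the test data

    Figure 3  Fitted curve of the test data

    Figure 4  Schematic of the distortion

    Figure 5  Function curve

    Figure 6  Comparison of SDR improvement after restoration of clipped speech signals

    Figure 7  Comparison of SDR improvement after restoration of clipped music signals

    Figure 8  Waveform comparison after restoration of clipped speech signals

    Figure 9  Waveform comparison after restoration of clipped music signals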

    Table 1  PESQ scores of clipped speech signal 1 before and after restoration

    Input SDR (dB)   Clipped speech   CIHT restored   CDL restored   ACIHT restored
    2                1.8838           2.0877          2.1201         2.2400
    4                2.2041           2.4411          2.4860         2.6039
    6                2.3451           2.6239          2.6247         2.8551
    8                2.5084           2.7974          2.7608         3.1258
    10               2.6576           3.0501          3.0715         3.2847
    12               2.7951           3.2538          3.3020         3.6657
    14               2.9858           3.4915          3.5057         3.8716
    16               3.1098           3.6881          3.6223         4.1016
    18               3.2984           3.8203          3.7167         4.2174
    20               3.4128           4.1934          4.2224         4.2440

    Table 2  PESQ scores of clipped speech signal 2 before and after restoration

    Input SDR (dB)   Clipped speech   CIHT restored   CDL restored   ACIHT restored
    2                1.7080           1.8980          1.9802         2.2026
    4                1.9977           2.3065          2.3451         2.5566
    6                2.2115           2.6041          2.6537         2.7818
    8                2.3900           2.8904          2.9176         3.0617
    10               2.5946           3.1397          3.2116         3.3242
    12               2.7625           3.4662          3.4546         3.4784
    14               2.9359           3.7844          3.8324         3.9217
    16               3.2481           3.9820          4.0386         4.0463
    18               3.3362           4.2150          4.1375         4.3005
    20               3.4004           4.3186          4.2845         4.3500
Publication history
  • Received:  2018-06-04
  • Revised:  2018-12-04
  • Published online:  2018-12-13
  • Issue date:  2019-04-01
