An Adaptive Consistent Iterative Hard Thresholding Algorithm for Audio Declipping
-
Abstract: The Consistent Iterative Hard Thresholding (CIHT) algorithm performs well in restoring clipped audio, but its restoration quality degrades when the clipping is severe. This paper therefore proposes an improved algorithm based on an adaptive threshold. The algorithm automatically estimates the clipping degree of the audio signal and, according to this estimate, adaptively adjusts the clipping-degree factor used in the algorithm. Compared with the recently proposed CIHT algorithm and the Consistent Dictionary Learning (CDL) algorithm, the proposed algorithm reconstructs the audio signal more accurately, especially when the clipping distortion is severe. Its computational complexity is close to that of CIHT, and it runs faster than CDL, which makes it well suited to real-time implementation.
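To make the idea in the abstract concrete, the following is a minimal Python sketch of consistent-IHT-style declipping on a single frame, with a crude adaptive sparsity factor derived from an estimated clipping degree. It is an illustration only: the transform (DCT), the sparsity schedule, the alternating-projection structure, and all function names are assumptions, not the exact update rule of the proposed ACIHT algorithm.

```python
# Hedged sketch of consistent-IHT-style declipping on one frame.
# Not the paper's ACIHT method; the adaptive factor and schedule are assumed.
import numpy as np
from scipy.fft import dct, idct

def estimate_clipping_degree(y, tol=1e-6):
    """Fraction of samples saturated at the clipping level (a rough proxy)."""
    theta = np.max(np.abs(y))
    return np.mean(np.abs(y) >= theta - tol)

def declip_frame(y, n_iter=200):
    """Alternate between a sparse DCT model and the clipping-consistency constraints."""
    theta = np.max(np.abs(y))               # clipping level, assumed visible in y
    reliable = np.abs(y) < theta - 1e-6     # samples we trust
    clip_pos = y >= theta - 1e-6            # samples clipped from above
    clip_neg = y <= -(theta - 1e-6)         # samples clipped from below

    degree = estimate_clipping_degree(y)                      # adaptive "clipping-degree" factor
    k = max(1, int(np.ceil(0.1 * (1.0 - degree) * len(y))))   # assumed initial sparsity

    x = y.astype(float).copy()
    for it in range(n_iter):
        c = dct(x, norm='ortho')
        # hard thresholding: keep only the k largest-magnitude DCT coefficients
        small = np.argsort(np.abs(c))[:len(c) - k]
        c[small] = 0.0
        x = idct(c, norm='ortho')
        # consistency projection onto the clipping constraints
        x[reliable] = y[reliable]                       # match reliable samples exactly
        x[clip_pos] = np.maximum(x[clip_pos], theta)    # stay at or above +theta
        x[clip_neg] = np.minimum(x[clip_neg], -theta)   # stay at or below -theta
        if (it + 1) % 10 == 0:                          # slowly relax the sparsity constraint
            k = min(len(y), k + 1)
    return x
```

In this sketch, a larger estimated clipping degree starts the iteration with a smaller number of retained coefficients, which is one plausible way to read "adaptively adjusting the clipping-degree factor"; the paper's actual adjustment rule may differ.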
-
Table 1  PESQ score comparison of clipped speech signal 1 before and after restoration

Input speech SDR (dB) | Clipped speech | After CIHT restoration | After CDL restoration | After ACIHT restoration
2  | 1.8838 | 2.0877 | 2.1201 | 2.2400
4  | 2.2041 | 2.4411 | 2.4860 | 2.6039
6  | 2.3451 | 2.6239 | 2.6247 | 2.8551
8  | 2.5084 | 2.7974 | 2.7608 | 3.1258
10 | 2.6576 | 3.0501 | 3.0715 | 3.2847
12 | 2.7951 | 3.2538 | 3.3020 | 3.6657
14 | 2.9858 | 3.4915 | 3.5057 | 3.8716
16 | 3.1098 | 3.6881 | 3.6223 | 4.1016
18 | 3.2984 | 3.8203 | 3.7167 | 4.2174
20 | 3.4128 | 4.1934 | 4.2224 | 4.2440

Table 2  PESQ score comparison of clipped speech signal 2 before and after restoration

Input speech SDR (dB) | Clipped speech | After CIHT restoration | After CDL restoration | After ACIHT restoration
2  | 1.7080 | 1.8980 | 1.9802 | 2.2026
4  | 1.9977 | 2.3065 | 2.3451 | 2.5566
6  | 2.2115 | 2.6041 | 2.6537 | 2.7818
8  | 2.3900 | 2.8904 | 2.9176 | 3.0617
10 | 2.5946 | 3.1397 | 3.2116 | 3.3242
12 | 2.7625 | 3.4662 | 3.4546 | 3.4784
14 | 2.9359 | 3.7844 | 3.8324 | 3.9217
16 | 3.2481 | 3.9820 | 4.0386 | 4.0463
18 | 3.3362 | 4.2150 | 4.1375 | 4.3005
20 | 3.4004 | 4.3186 | 4.2845 | 4.3500
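For context on the first column of the tables, the input SDR of a clipped signal is the ratio, in dB, between the energy of the clean signal and the energy of the clipping error. The snippet below only illustrates the metric on synthetic data; the clipping levels, the stand-in signal, and the helper names are hypothetical and not taken from the paper's experimental setup.

```python
# Hedged helper: hard-clip a signal and measure the resulting "input SDR",
# the quantity listed in the first column of Tables 1 and 2.
import numpy as np

def hard_clip(x, theta):
    """Symmetric hard clipping at level theta."""
    return np.clip(x, -theta, theta)

def sdr_db(x, x_hat):
    """Signal-to-distortion ratio in dB between clean x and degraded x_hat."""
    return 10.0 * np.log10(np.sum(x**2) / np.sum((x - x_hat)**2))

# Example: sweep the clipping level and report the SDR of the clipped signal.
rng = np.random.default_rng(0)
x = rng.standard_normal(16000)          # stand-in for one second of speech
for theta in (0.5, 1.0, 1.5, 2.0):
    y = hard_clip(x, theta)
    print(f"theta={theta:.1f}  input SDR = {sdr_db(x, y):.2f} dB")
```

-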
JANSSEN A, VELDHUIS R, and VRIES L. Adaptive interpolation of discrete-time signals that can be modeled as autoregressive processes[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1986, 34(2): 317–330. doi: 10.1109/TASSP.1986.1164824.
ABEL J S and SMITH J O. Restoring a clipped signal[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Toronto, Canada, 1991, 3: 1745–1748.
GODSILL S J, WOLFE P J, and FONG W N W. Statistical model-based approaches to audio restoration and analysis[J]. Journal of New Music Research, 2001, 30(4): 323–338. doi: 10.1076/jnmr.30.4.323.7489.
ADLER A, EMIYA V, and JAFARI M G. Audio inpainting[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(3): 922–932. doi: 10.1109/TASL.2011.2168211.
ADLER A, EMIYA V, and JAFARI M G. A constrained matching pursuit approach to audio declipping[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011: 329–332.
DEFRAENE B, MANSOUR N, and DE HERTOGH S. Declipping of audio signals using perceptual compressed sensing[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(12): 2627–2637. doi: 10.1109/TASL.2013.2281570.
FOUCART S and NEEDHAM T. Sparse recovery from saturated measurements[J]. Information and Inference: A Journal of the IMA, 2017, 6(2): 196–212. doi: 10.1093/imaiai/iaw020.
OZEROV A, BILEN C, and PEREZ P. Multichannel audio declipping[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Shanghai, China, 2016: 659–663.
SIEDENBURG K, KOWALSKI M, and DORFLER M. Audio declipping with social sparsity[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Florence, Italy, 2014: 1577–1581.
KITIC S, JACQUES L, and MADHU N. Consistent iterative hard thresholding for signal declipping[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Vancouver, Canada, 2013: 5939–5943.
RENCKER L, BACH F, WANG Wenwu, et al. Consistent dictionary learning for signal declipping[C]. International Conference on Latent Variable Analysis and Signal Separation, Guildford, UK, 2018: 446–455.
LECUE G and FOUCART S. An IHT algorithm for sparse recovery from sub-exponential measurements[J]. IEEE Signal Processing Letters, 2017, 24(3): 1280–1283. doi: 10.1109/LSP.2017.2721500.
HINES A, SKOGLUND J, and KOKARAM A. Robustness of speech quality metrics to background noise and network degradations: Comparing ViSQOL, PESQ and POLQA[C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, Vancouver, Canada, 2013: 3697–3701.
HE Xiaoyue. Speech Quality Evaluation of VoIP Based on EPESQ[D]. [Master dissertation], Central South University, 2008.