Volume 39 Issue 2
Feb.  2017
ZHOU Weili, HE Qianhua, WANG Yalou, PANG Wenfeng. Adapted Stopping Residue Error Based Sparse Representation for Speech Denoising[J]. Journal of Electronics & Information Technology, 2017, 39(2): 309-315. doi: 10.11999/JEIT160369

Adapted Stopping Residue Error Based Sparse Representation for Speech Denoising

doi: 10.11999/JEIT160369
Funds:

The National Natural Science Foundation of China (61571192), The Science and Technology Foundation of Guangdong Province (2015A010103003)

  • Received Date: 2016-04-18
  • Revision Received Date: 2016-08-25
  • Publish Date: 2017-02-19
  • A sparse-representation speech denoising method based on an adaptive stopping residue error is proposed. First, an overcomplete dictionary of the clean speech power spectrum is learned with the K-Singular Value Decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error is set adaptively from the estimated cross terms and the noise spectrum, which is adjusted by a weighting factor, and Orthogonal Matching Pursuit (OMP) is applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech is resynthesized via the inverse Fourier transform from the reconstructed speech spectrum and the noisy speech phase. Experimental results show that the proposed method outperforms standard spectral subtraction, the sparse-representation-based speech denoising algorithm, and the AutoRegressive Hidden Markov Model (AR-HMM) based speech denoising method on both subjective and objective measures.
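
The sparse representation stage summarized in the abstract can be illustrated with a minimal sketch of OMP that stops once the residual norm falls below a threshold. This is not the authors' implementation: the abstract does not give the formula for the adaptive stopping residue error, so the threshold here is a hypothetical stand-in (a weighting factor times the noise norm), the dictionary is random rather than learned by K-SVD, and the names `omp_residual_stop`, `weight`, and `eps` are illustrative only.

```python
import numpy as np


def omp_residual_stop(D, y, eps, max_atoms=None):
    """Greedy OMP: keep adding atoms until ||y - D x||_2 <= eps."""
    n_features, n_atoms = D.shape
    if max_atoms is None:
        max_atoms = n_features
    residual = y.astype(float).copy()
    support = []            # indices of the atoms selected so far
    coef = np.zeros(0)      # least-squares coefficients on the support
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0   # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(n_atoms)
    x[np.asarray(support, dtype=int)] = coef
    return x


# Toy example: random unit-norm dictionary, 3-sparse signal, additive noise.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(128)
x_true[[5, 40, 99]] = [2.0, -1.5, 1.0]
noise = 0.01 * rng.standard_normal(64)
y = D @ x_true + noise

weight = 1.5                            # hypothetical weighting factor
eps = weight * np.linalg.norm(noise)    # assumed stand-in for the adaptive threshold
x_hat = omp_residual_stop(D, y, eps)
```

A threshold tied to the noise level lets the pursuit stop before it starts fitting noise with extra atoms. In a real system the noise norm would be estimated from the noisy speech rather than known, as it is in this toy example.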