Speech Bandwidth Extension Based on Restricted Boltzmann Machines

WANG Yingxue, ZHAO Shenghui, YU Yingying, KUANG Jingming

Citation: WANG Yingxue, ZHAO Shenghui, YU Yingying, KUANG Jingming. Speech Bandwidth Extension Based on Restricted Boltzmann Machines[J]. Journal of Electronics & Information Technology, 2016, 38(7): 1717-1723. doi: 10.11999/JEIT151034

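  • Abstract: Speech bandwidth extension is a technique that improves speech quality by reconstructing the high-frequency band of speech from the correlation between its low-frequency and high-frequency bands. The Gaussian mixture model (GMM) method is widely used for speech bandwidth extension. However, because it assumes that the low and high bands follow Gaussian distributions and captures only the linear relationship between them, the synthesized high-band speech is distorted. This paper therefore proposes a method based on restricted Boltzmann machines (RBMs). Two Gaussian-Bernoulli RBMs extract the high-order statistical characteristics contained in the low and high bands, and a feed-forward neural network maps the low-band high-order statistical parameters to their high-band counterparts. By extracting these high-order statistics, the method can probe the actual relationship between the high and low bands more deeply, model the distribution of the spectral envelope more accurately, and synthesize higher-quality speech. Objective and subjective test results show that the method outperforms the conventional GMM method.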
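The pipeline described in the abstract (two Gaussian-Bernoulli RBMs plus a feed-forward mapping between their hidden layers) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The unit-variance visible units, CD-1 training, single sigmoid mapping layer, layer sizes, learning rates, synthetic data, and all names (`GaussianBernoulliRBM`, `SigmoidLayer`, `train_toy`, `extend_bandwidth`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine_sigmoid(x, W, b):
    # sigmoid(b_j + sum_i x_i * W[i][j]) for every output unit j.
    return [sigmoid(b[j] + sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(len(b))]


class GaussianBernoulliRBM:
    """Gaussian visible units (unit variance assumed), Bernoulli hidden units."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return affine_sigmoid(v, self.W, self.b)

    def visible_mean(self, h):
        return [self.a[i] + sum(self.W[i][j] * h[j] for j in range(len(h)))
                for i in range(len(self.a))]

    def cd1_step(self, v0, lr=0.005):
        # One contrastive-divergence (CD-1) update on a single frame.
        h0 = self.hidden_probs(v0)
        h_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_mean(h_sample)  # mean-field reconstruction
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b[j] += lr * (h0[j] - h1[j])


class SigmoidLayer:
    """Hypothetical one-layer feed-forward mapper between hidden spaces."""

    def __init__(self, n_in, n_out, rng):
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_out)]
                  for _ in range(n_in)]
        self.b = [0.0] * n_out

    def forward(self, x):
        return affine_sigmoid(x, self.W, self.b)

    def sgd_step(self, x, target, lr=0.1):
        # Squared-error gradient step through the sigmoid output.
        y = self.forward(x)
        delta = [(y[j] - target[j]) * y[j] * (1.0 - y[j]) for j in range(len(y))]
        for i in range(len(x)):
            for j in range(len(y)):
                self.W[i][j] -= lr * x[i] * delta[j]
        for j in range(len(y)):
            self.b[j] -= lr * delta[j]


def train_toy(seed=0, n_frames=50, epochs=10):
    # Synthetic stand-ins for low-band and high-band envelope features.
    rng = random.Random(seed)
    low = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n_frames)]
    high = [[math.tanh(v[k] + v[k + 1]) for k in range(3)] for v in low]
    rbm_low = GaussianBernoulliRBM(4, 5, rng)
    rbm_high = GaussianBernoulliRBM(3, 5, rng)
    mapper = SigmoidLayer(5, 5, rng)
    for _ in range(epochs):  # stage 1: train the two RBMs separately
        for lo, hi in zip(low, high):
            rbm_low.cd1_step(lo)
            rbm_high.cd1_step(hi)
    for _ in range(epochs):  # stage 2: learn the hidden-to-hidden mapping
        for lo, hi in zip(low, high):
            mapper.sgd_step(rbm_low.hidden_probs(lo), rbm_high.hidden_probs(hi))
    return rbm_low, mapper, rbm_high


def extend_bandwidth(low_feat, rbm_low, mapper, rbm_high):
    # Low-band features -> low hidden -> mapped high hidden -> high-band features.
    h_low = rbm_low.hidden_probs(low_feat)
    h_high = mapper.forward(h_low)
    return rbm_high.visible_mean(h_high)
```

Calling `extend_bandwidth([0.1, -0.3, 0.5, 0.2], *train_toy())` returns three estimated high-band feature values for one toy frame. A real system would replace the synthetic vectors with spectral-envelope features extracted from speech and would likely use a deeper mapping network.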
Metrics
  • Article views: 1481
  • HTML full-text views: 117
  • PDF downloads: 723
  • Citations: 0
Publication history
  • Received: 2015-09-14
  • Revised: 2016-03-03
  • Published: 2016-07-19
