Teager Energy Operator and Empirical Mode Decomposition Based Voice Activity Detection Method

SHEN Xizhong; ZHENG Xiaoxiu

doi:10.11999/JEIT171014

Volume 40 Issue 7

Jul. 2018


SHEN Xizhong, ZHENG Xiaoxiu. Teager Energy Operator and Empirical Mode Decomposition Based Voice Activity Detection Method[J]. Journal of Electronics & Information Technology, 2018, 40(7): 1612-1618. doi: 10.11999/JEIT171014


doi: 10.11999/JEIT171014 cstr: 32379.14.JEIT171014


Funds:

Foundation of the Science and Technology Commission of Shanghai Municipality (15ZR1440700)

Received Date: 2017-10-30
Rev Recd Date: 2018-04-11
Publish Date: 2018-07-19

Abstract


The Teager Energy Operator (TEO) is a nonlinear operator, proposed in recent years, that is well suited to tracking time-varying signals. This paper combines the TEO with Empirical Mode Decomposition (EMD) to build a new Voice Activity Detection (VAD) method that locates the start and end points of speech. Selection conditions are constructed to pick the valid Intrinsic Mode Functions (IMFs) produced by EMD, which allows the method to handle noisy speech. Each IMF also approximates a single mode, which satisfies the TEO's requirement of a single frequency component. Finally, the Hilbert transform is added to address the mode mixing inherent in EMD. With these steps, the proposed method can identify unvoiced sounds in noise, and it performs better than both the direct TEO method and the double-threshold method. Experiments confirm the validity of the proposed method.
Keywords: Voice Activity Detection (VAD); Teager Energy Operator (TEO); Empirical Mode Decomposition (EMD); Intrinsic Mode Function (IMF); Hilbert transform
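
The abstract notes that the TEO expects a single frequency component. The sketch below illustrates why, using the widely used discrete form of the operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. It is not the authors' implementation: the function name `teager_energy` and the tone parameters (amplitude, 200 Hz, 8 kHz sampling rate) are assumptions chosen only for illustration. For a pure tone A*cos(omega*n), the operator returns the constant (A*sin(omega))^2, so its output depends cleanly on amplitude and frequency only when one component is present.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack one of the two neighbours the operator needs.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative single-frequency input (assumed values, not from the paper).
amplitude = 0.5
omega = 2 * math.pi * 200 / 8000  # 200 Hz tone at 8 kHz, in radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(64)]

psi = teager_energy(tone)

# Closed-form TEO output for a pure tone: (A * sin(omega))^2 at every sample.
expected = (amplitude * math.sin(omega)) ** 2
max_error = max(abs(value - expected) for value in psi)
```

Here `max_error` stays at floating-point rounding level, which confirms the constant output for a single tone. A sum of two tones would add cross terms to the output, which is one way to see why the method applies the operator to individual IMFs and not to the raw speech signal.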


