Volume 37 Issue 6
Jun.  2015
Wang Ming-he, Zhang Er-hua, Tang Zhen-min, Xu Hao. Voice Activity Detection Based on Fisher Linear Discriminant Analysis[J]. Journal of Electronics & Information Technology, 2015, 37(6): 1343-1349. doi: 10.11999/JEIT141122

Voice Activity Detection Based on Fisher Linear Discriminant Analysis

doi: 10.11999/JEIT141122
  • Received Date: 2014-08-29
  • Revised Date: 2014-12-19
  • Publish Date: 2015-06-19
  • Traditional Voice Activity Detection (VAD) approaches cannot effectively detect consonants, particularly unvoiced consonants in noise. To address this problem, this paper proposes F-MFCC, a VAD approach that applies Fisher linear discriminant analysis to Mel Frequency Cepstrum Coefficients (MFCC) and treats the separation of consonants from background noise as a two-class problem. The Fisher criterion is used to solve for the optimal projection vector, which minimizes the within-class scatter and maximizes the between-class scatter, thereby enhancing the separability between consonants and background noise. Extensive experiments are conducted to evaluate the performance of F-MFCC. The results demonstrate that, under different SNR and noise conditions, the proposed approach achieves higher VAD accuracy.
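The two-class Fisher projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic Gaussian stand-ins for MFCC frames, the feature dimension of 12 is an assumption, and the names `fisher_projection` and `fisher_ratio` are hypothetical helpers introduced here for illustration only.

```python
import numpy as np


def fisher_projection(class_a, class_b, ridge=1e-6):
    """Unit-norm Fisher projection vector for two classes.

    Rows of class_a and class_b are feature vectors (for example MFCC frames).
    The small ridge term is an assumption added to keep the solve well conditioned.
    """
    mean_a = class_a.mean(axis=0)
    mean_b = class_b.mean(axis=0)
    centered_a = class_a - mean_a
    centered_b = class_b - mean_b
    # Within-class scatter: sum of the two per-class scatter matrices.
    within = centered_a.T @ centered_a + centered_b.T @ centered_b
    within += ridge * np.eye(within.shape[0])
    # The Fisher criterion is maximized by w proportional to S_w^{-1} (m_a - m_b).
    w = np.linalg.solve(within, mean_a - mean_b)
    return w / np.linalg.norm(w)


def fisher_ratio(w, class_a, class_b):
    """Between-class over within-class scatter of the 1-D projections onto w."""
    proj_a = class_a @ w
    proj_b = class_b @ w
    between = (proj_a.mean() - proj_b.mean()) ** 2
    within = proj_a.var() * len(proj_a) + proj_b.var() * len(proj_b)
    return between / within


# Synthetic stand-ins for consonant and background-noise frames (assumed data).
rng = np.random.default_rng(0)
dim = 12
consonant = rng.normal(loc=0.5, scale=1.0, size=(200, dim))
noise = rng.normal(loc=0.0, scale=1.0, size=(200, dim))

w = fisher_projection(consonant, noise)

# One simple decision rule (an assumption): threshold at the midpoint of the
# projected class means and label frames above it as consonant.
threshold = 0.5 * ((consonant @ w).mean() + (noise @ w).mean())
is_consonant = (consonant @ w) > threshold
```

Projecting each frame onto `w` reduces the decision to a single scalar comparison, which is why the direction that maximizes the scatter ratio is a natural choice for a two-class detector.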
