SREEKUMAR K T, GEORGE K K, ARUNRAJ K, et al. Spectral matching based voice activity detector for improved speaker recognition[C]. 2014 International Conference on Power Signals Control and Computations (EPSCICON), Thrissur, 2014: 1-4. doi: 10.1109/EPSCICON.2014.6887507.
DUTA C L, GHEORGHE L, and TAPUS N. Real time implementation of MELP speech compression algorithm using Blackfin processors[C]. 2015 9th International Symposium on Image and Signal Processing and Analysis (ISPA), Zagreb, 2015: 250-255. doi: 10.1109/ISPA.2015.7306067.
YOO I C, LIM H, and YOOK D. Formant-based robust voice activity detection[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(12): 2238-2245. doi: 10.1109/TASLP.2015.2476762.
SOHN J, KIM N S, and SUNG W. A statistical model-based voice activity detection[J]. IEEE Signal Processing Letters, 1999, 6(1): 1-3. doi: 10.1109/97.736233.
CHO Y D, AL-NAIMI K, and KONDOZ A. Improved voice activity detection based on a smoothed statistical likelihood ratio[C]. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Salt Lake City, 2001: 737-740. doi: 10.1109/ICASSP.2001.941020.
RAMIREZ J, SEGURA J, BENITEZ C, et al. Statistical voice activity detection using a multiple observation likelihood ratio test[J]. IEEE Signal Processing Letters, 2005, 12(10): 689-692. doi: 10.1109/LSP.2005.855551.
RAMIREZ J, SEGURA J C, GORRIZ J M, et al. Improved voice activity detection using contextual multiple hypothesis testing for robust speech recognition[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(8): 2177-2189. doi: 10.1109/TASL.2007.903937.
KANG S I, JO Q H, and CHANG J H. Discriminative weight training for a statistical model-based voice activity detection[J]. IEEE Signal Processing Letters, 2008, 15: 170-173. doi: 10.1109/LSP.2007.913595.
SUH Y and KIM H. Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection[J]. IEEE Signal Processing Letters, 2012, 19(8): 507-510. doi: 10.1109/LSP.2012.2204978.
FERRONI G, BONFIGLI R, PRINCIPI E, et al. A deep neural network approach for voice activity detection in multi-room domestic scenarios[C]. 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, 2015: 1-8. doi: 10.1109/IJCNN.2015.7280510.
HWANG I and CHANG J H. Voice activity detection based on statistical model employing deep neural network[C]. 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 2014: 582-585. doi: 10.1109/IIH-MSP.2014.150.
TAN Yingwei, LIU Wenju, WEI J, et al. Hybrid SVM/HMM architectures for statistical model-based voice activity detection[C]. 2014 International Joint Conference on Neural Networks (IJCNN), Beijing, 2014: 2875-2878. doi: 10.1109/IJCNN.2014.6889403.
HE Weijun, HE Qianhua, and LIU Yang. Sub-band reserved likelihood ratio-based robust voice activity detection[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2015, 43(11): 78-82. doi: 10.13245/j.hust.151115. (in Chinese)
PEARLMAN W A and GRAY R M. Source coding of the discrete Fourier transform[J]. IEEE Transactions on Information Theory, 1978, 24(6): 683-692. doi: 10.1109/TIT.1978.1055950.
GERKMANN T and HENDRIKS R C. Unbiased MMSE-based noise power estimation with low complexity and low tracking delay[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(4): 1383-1393. doi: 10.1109/TASL.2011.2180896.
EPHRAIM Y and MALAH D. Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator[J]. IEEE Transactions on Acoustics, Speech and Signal Processing, 1984, 32(6): 1109-1121. doi: 10.1109/TASSP.1984.1164453.
ZHAO Li. Speech Signal Processing[M]. Second edition, Beijing: China Machine Press, 2009: 38-39. (in Chinese)
MOUSAZADEH S and COHEN I. Voice activity detection in presence of transient noise using spectral clustering[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(6): 1261-1271. doi: 10.1109/TASL.2013.2248717.
PETSATODIS T, BOUKIS C, and TALANTZIS F. Convex combination of multiple statistical models with application to VAD[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(8): 2314-2327. doi: 10.1109/TASL.2011.2131131.