Sound Event Recognition Based on Optimized Orthogonal Matching Pursuit

LI Ying; CHEN Qiuju

doi:10.11999/JEIT160120

Volume 39 Issue 1

Jan. 2017

LI Ying, CHEN Qiuju. Sound Event Recognition Based on Optimized Orthogonal Matching Pursuit[J]. Journal of Electronics & Information Technology, 2017, 39(1): 183-190. doi: 10.11999/JEIT160120

doi: 10.11999/JEIT160120 cstr: 32379.14.JEIT160120

LI Ying¹, CHEN Qiuju¹

Funds:

The National Natural Science Foundation of China (61075022)

Received Date: 2016-01-26
Rev Recd Date: 2016-12-06
Publish Date: 2017-01-19

Abstract

A sound event recognition method based on optimized Orthogonal Matching Pursuit (OMP) is proposed to reduce the influence of varying environments on sound event recognition. First, OMP is used for sparse decomposition and reconstruction of the sound signal, which suppresses noise while preserving the main body of the signal; Particle Swarm Optimization (PSO) is adopted to accelerate the search for the best atom during sparse decomposition. Then, an optimized composite feature, consisting of Mel-Frequency Cepstral Coefficients (MFCCs), a time-frequency OMP feature, and a pitch feature, is extracted from the reconstructed signal. Finally, a Random Forests (RF) classifier is employed to recognize 40 classes of sound events in different environments and at different Signal-to-Noise Ratios (SNRs). The experimental results show that the proposed method can effectively recognize sound events in various environments.
Key words:
- Sound event recognition
- Orthogonal Matching Pursuit (OMP)
- Sparse decomposition
- Particle Swarm Optimization (PSO)
- Random Forests (RF)
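
The sparse decomposition and reconstruction step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gabor-style atom form, the coarse parameter grid, and the names `gabor_atom`, `build_dictionary`, and `omp_reconstruct` are all hypothetical, and the best atom is found here by exhaustive search over a tiny dictionary, where the paper uses PSO to speed up that search.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def gabor_atom(n, scale, shift, freq):
    """Unit-norm real atom: Gaussian window times a cosine (assumed form)."""
    raw = [
        math.exp(-math.pi * ((t - shift) / scale) ** 2)
        * math.cos(2 * math.pi * freq * (t - shift))
        for t in range(n)
    ]
    norm = math.sqrt(dot(raw, raw))
    return [x / norm for x in raw]


def build_dictionary(n):
    """Small hypothetical dictionary over a (scale, shift, frequency) grid."""
    return [
        gabor_atom(n, scale, shift, freq)
        for scale in (8, 16)
        for shift in (16, 32, 48)
        for freq in (0.05, 0.1, 0.2)
    ]


def omp_reconstruct(signal, dictionary, n_atoms):
    """Greedy OMP; returns (reconstruction, indices of the chosen atoms)."""
    residual = list(signal)
    basis = []   # orthonormal basis spanning the atoms chosen so far
    chosen = []
    for _ in range(n_atoms):
        # Exhaustive search for the atom most correlated with the residual
        # (the step the paper accelerates with PSO).
        best = max(
            (i for i in range(len(dictionary)) if i not in chosen),
            key=lambda i: abs(dot(residual, dictionary[i])),
        )
        # Gram-Schmidt: orthogonalize the new atom against the current basis.
        q = list(dictionary[best])
        for b in basis:
            c = dot(q, b)
            q = [x - c * y for x, y in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm < 1e-10:  # atom already lies in the span; nothing to gain
            break
        q = [x / norm for x in q]
        basis.append(q)
        chosen.append(best)
        # Remove the component along q, so the residual stays orthogonal
        # to every chosen atom (the "orthogonal" part of OMP).
        c = dot(residual, q)
        residual = [r - c * x for r, x in zip(residual, q)]
    reconstruction = [s - r for s, r in zip(signal, residual)]
    return reconstruction, chosen


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy usage: a two-atom signal plus Gaussian noise (all values are assumptions).
random.seed(0)
n = 64
atoms = build_dictionary(n)
clean = [3.0 * a + 2.0 * b for a, b in zip(atoms[13], atoms[2])]
noisy = [x + random.gauss(0.0, 0.1) for x in clean]
denoised, chosen = omp_reconstruct(noisy, atoms, n_atoms=2)

print("chosen atoms:", chosen)
print("MSE noisy vs clean:   ", mse(noisy, clean))
print("MSE denoised vs clean:", mse(denoised, clean))
```

Keeping only a few atoms is what makes the reconstruction act as a denoiser: energy that does not correlate with any selected atom stays in the residual and is discarded. In a realistic setting the dictionary would be far larger, which is where a heuristic search such as PSO becomes attractive compared with the exhaustive scan used above.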
