Sound Event Recognition Based on Optimized Orthogonal Matching Pursuit

LI Ying; CHEN Qiuju

doi:10.11999/JEIT160120

cstr: 32379.14.JEIT160120

Funds:

The National Natural Science Foundation of China (61075022)

Publication history
- Received: 2016-01-26
- Revised: 2016-12-06
- Published: 2017-01-19

Abstract: To reduce the influence of various environmental sounds on sound event recognition, this paper proposes a sound event recognition method based on optimized Orthogonal Matching Pursuit (OMP). First, OMP is used to sparsely decompose and reconstruct the sound signal, which retains the main body of the signal and reduces the influence of noise. Within this step, Particle Swarm Optimization (PSO) is used to speed up the search for the best atom, giving a fast OMP sparse decomposition. Next, Mel-Frequency Cepstral Coefficients (MFCCs) are extracted from the reconstructed signal and combined with the OMP time-frequency feature and the PITCH (fundamental frequency) feature to form the optimized-OMP composite feature. Finally, using this composite feature, a Random Forests (RF) classifier recognizes 40 classes of sound events in different environments and at different Signal-to-Noise Ratios (SNRs). The experimental results show that the optimized-OMP composite feature combined with RF can effectively recognize sound events in various environments.
- Sound event recognition
- Orthogonal Matching Pursuit (OMP)
- Sparse decomposition
- Particle Swarm Optimization (PSO)
- Random Forests (RF)
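
The sparse decomposition and reconstruction step described in the abstract can be illustrated with a minimal sketch of plain OMP. This is not the authors' implementation. The function name `omp_reconstruct`, the toy dictionary, and the fixed atom count `n_atoms` are assumptions made for illustration; the atoms are assumed to have unit norm; and the best-atom search here is exhaustive, whereas the paper accelerates that search with PSO.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def omp_reconstruct(signal, atoms, n_atoms):
    """Greedy OMP sketch: return (indices of chosen atoms, reconstructed signal).

    Assumes every atom has unit norm. The chosen atoms are orthonormalised
    with Gram-Schmidt, so after each step the residual is orthogonal to all
    atoms selected so far (the defining property of OMP).
    """
    residual = list(signal)
    basis = []   # orthonormalised copies of the chosen atoms
    chosen = []  # indices into `atoms`
    for _ in range(n_atoms):
        candidates = [i for i in range(len(atoms)) if i not in chosen]
        if not candidates:
            break
        # Exhaustive search for the atom most correlated with the residual.
        # (Illustration only: the paper uses PSO to speed up this search.)
        best = max(candidates, key=lambda i: abs(dot(residual, atoms[i])))
        # Orthogonalise the new atom against the atoms already chosen.
        q = list(atoms[best])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm < 1e-12:
            break  # the new atom adds nothing beyond the current basis
        q = [qi / norm for qi in q]
        basis.append(q)
        chosen.append(best)
        # Remove the component along the new orthonormal direction.
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    # The reconstruction is the part of the signal explained by the atoms.
    reconstructed = [s - r for s, r in zip(signal, residual)]
    return chosen, reconstructed


# Toy usage: a 4-sample "signal" and the identity matrix as dictionary.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
chosen, denoised = omp_reconstruct([3.0, 0.0, -2.0, 0.1], identity, n_atoms=2)
print(chosen)  # [0, 2]: the two dominant components are kept
```

Keeping only a few atoms is what discards low-energy content such as the 0.1 component in the toy signal; in the paper's setting, the reconstructed signal is the input from which the MFCC feature is extracted.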
