Spatial-temporal Stream Anomaly Detection Based on Bayesian Fusion
-
Abstract: Anomaly detection that applies a convolutional auto-encoder network directly to video ignores temporal information. To address this problem, an anomaly detection model based on Bayesian fusion of spatial and temporal streams is proposed. The spatial stream model uses a convolutional auto-encoder network to reconstruct single video frames, and the temporal stream model uses a convolutional Long Short-Term Memory (LSTM) encoder-decoder network to reconstruct short-term optical flow sequences. The reconstruction error of each frame is then computed separately under the spatial and temporal stream models, and an adaptive threshold is designed to binarize the reconstruction error maps. Finally, the reconstruction errors of the two streams are fused according to the Bayesian rule to obtain a fused reconstruction error map, on which the abnormal-behavior decision is based. Experimental results show that the proposed algorithm outperforms existing anomaly detection algorithms on the UCSD and Avenue datasets.
-
Key words:
- Anomaly detection
- Bayesian fusion
- Spatial-temporal stream
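The abstract names three steps (per-stream reconstruction error, adaptive binarization, Bayesian fusion) without giving formulas, so the following is only a minimal sketch under stated assumptions rather than the authors' implementation. It assumes a per-pixel squared error scaled to [0, 1], a hypothetical threshold of mean plus k standard deviations, and a histogram-based Bayesian integration of the kind used in saliency detection, where one error map serves as the prior and the other supplies the likelihoods. All function names and parameters (`error_map`, `binarize`, `posterior`, `bayesian_fusion`, `k`, `n_bins`) are illustrative.

```python
import numpy as np


def error_map(frame, recon):
    """Per-pixel squared reconstruction error, min-max scaled to [0, 1]."""
    err = (frame.astype(np.float64) - recon.astype(np.float64)) ** 2
    span = err.max() - err.min()
    return (err - err.min()) / span if span > 0 else np.zeros_like(err)


def binarize(err, k=1.0):
    """Adaptive threshold (assumed form): mean + k * std of this error map."""
    return err > err.mean() + k * err.std()


def posterior(prior, other, n_bins=16, eps=1e-6):
    """Per-pixel posterior of 'anomalous', using `prior` as the prior map.

    The binarized prior splits pixels into anomalous / normal regions;
    histograms of `other` inside each region act as likelihood estimates.
    """
    mask = binarize(prior)
    # Quantize the other stream's error values into histogram bins.
    bins = np.minimum((other * n_bins).astype(int), n_bins - 1)
    fg_hist = np.bincount(bins[mask], minlength=n_bins) + eps
    bg_hist = np.bincount(bins[~mask], minlength=n_bins) + eps
    fg_like = (fg_hist / fg_hist.sum())[bins]  # p(other value | anomalous)
    bg_like = (bg_hist / bg_hist.sum())[bins]  # p(other value | normal)
    num = prior * fg_like
    return num / (num + (1.0 - prior) * bg_like + eps)


def bayesian_fusion(spatial_err, temporal_err):
    """Symmetric fusion: each stream serves once as prior, once as evidence.

    Averaging the two posteriors (an assumption) keeps the result in [0, 1].
    """
    return 0.5 * (posterior(spatial_err, temporal_err)
                  + posterior(temporal_err, spatial_err))
```

Given a frame with its spatial reconstruction and an optical-flow field with its temporal reconstruction, `bayesian_fusion(error_map(frame, recon_s), error_map(flow, recon_t))` would return a fused map of the same shape. How that map is turned into a frame-level decision is not specified in the abstract and is left open here.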