Citation: Tianliang LIU, Qingwei QIAO, Junwei WAN, Xiubin DAI, Jiebo LUO. Human Action Recognition via Spatio-temporal Dual Network Flow and Visual Attention Fusion[J]. Journal of Electronics & Information Technology, 2018, 40(10): 2395-2401. doi: 10.11999/JEIT171116
IKIZLER-CINBIS N and SCLAROFF S. Object, scene and actions: Combining multiple features for human action recognition[C]. European Conference on Computer Vision, Heraklion, Crete, Greece, 2010, 6311: 494–507.
WANG Heng, KLASER A, and SCHMID C. Action recognition by dense trajectories[C]. IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2011: 3169–3176.
ZHANG Liang, LU Mengmeng, and JIANG Hua. An improved scheme of visual words description and action recognition using local enhanced distribution information[J]. Journal of Electronics & Information Technology, 2016, 38(3): 549–556. doi: 10.11999/JEIT150410
SHARMA S, KIROS R, and SALAKHUTDINOV R. Action recognition using visual attention[C]. International Conference on Neural Information Processing Systems Time Series Workshop, Montreal, Canada, 2015: 1–11.
SCHMIDHUBER J. Deep learning in neural networks: An overview[J]. Neural Networks, 2015, 61: 85–117. doi: 10.1016/j.neunet.2014.09.003
RENSINK R A. The dynamic representation of scenes[J]. Visual Cognition, 2000, 7(1/3): 17–42.
XU Kelvin, BA Jimmy, KIROS R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]. Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015, 14: 77–81.
BAHDANAU D, CHO K, and BENGIO Y. Neural machine translation by jointly learning to align and translate[C]. International Conference on Learning Representations, San Diego, USA, 2015: 1–15.
MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention[C]. Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 2204–2212.
BA Jimmy Lei, GROSSE R, SALAKHUTDINOV R, et al. Learning wake-sleep recurrent attention models[C]. International Conference on Neural Information Processing Systems, Montreal, Canada, 2015: 2593–2601.
MEINHARDT-LLOPIS E, SÁNCHEZ PÉREZ J, and KONDERMANN D. Horn-Schunck optical flow with a multi-scale strategy[J]. Image Processing on Line, 2013, 20: 151–172. doi: 10.5201/ipol.2013.20
RUSSAKOVSKY O, DENG Jia, SU Hao, et al. ImageNet: Large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211–252.
SZEGEDY Christian, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
KARPATHY A, JOHNSON J, and LI Feifei. Visualizing and understanding recurrent networks[C]. International Conference on Learning Representations Workshop, San Juan, Puerto Rico, 2016: 1–11.
GOLDBERGER J, GORDON S, and GREENSPAN H. An efficient image similarity measure based on approximations of KL-divergence between two Gaussian mixtures[C]. IEEE International Conference on Computer Vision, Nice, France, 2003: 487–493.
SRIVASTAVA N, HINTON G E, KRIZHEVSKY A, et al. Dropout: A simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15: 1929–1958.
KINGMA D P and BA J. Adam: A method for stochastic optimization[C]. International Conference on Learning Representations, San Diego, USA, 2015: 1–15.