A Perspective-independent Method for Behavior Recognition in Depth Video via Temporal-spatial Correlating

Peiliang WU, Xiao YANG, Bingyi MAO, Lingfu KONG, Zengguang HOU

吴培良, 杨霄, 毛秉毅, 孔令富, 侯增广. 一种视角无关的时空关联深度视频行为识别方法[J]. 电子与信息学报, 2019, 41(4): 904-910. doi: 10.11999/JEIT180477
Citation: Peiliang WU, Xiao YANG, Bingyi MAO, Lingfu KONG, Zengguang HOU. A Perspective-independent Method for Behavior Recognition in Depth Video via Temporal-spatial Correlating[J]. Journal of Electronics & Information Technology, 2019, 41(4): 904-910. doi: 10.11999/JEIT180477

doi: 10.11999/JEIT180477
Funds: The National Natural Science Foundation of China (61305113), The Natural Science Foundation of Hebei Province (F2016203358), China Postdoctoral Science Foundation (2018M631620), The Doctoral Fund of Yanshan University (BL18007)
    About the authors:

    Peiliang WU: Male, born in 1981, associate professor. His research interests include behavior recognition and learning for home service robots, and affordance cognition

    Xiao YANG: Male, born in 1993, master's student. His research interest is behavior recognition for home service robots

    Bingyi MAO: Male, born in 1964, associate research fellow. His research interest is home service robots

    Lingfu KONG: Male, born in 1957, professor. His research interests include intelligent robot systems and intelligent information processing

    Zengguang HOU: Male, born in 1969, research fellow. His research interests include robotics and intelligent systems, rehabilitation robots, and minimally invasive interventional surgical robots

    Corresponding author:

    Bingyi MAO, ysdxmby@163.com

  • CLC number: TP242.6+2

  • Abstract:

    Current behavior recognition methods suffer from low accuracy when the viewpoint changes. This paper proposes a perspective-independent temporal-spatial correlating method for behavior recognition in depth video. First, the fully connected layer of a deep convolutional neural network maps human poses observed from different viewpoints into a view-independent high-dimensional space, constructing a Human Pose Model (HPM) of depth behavior video in the spatial domain. Second, considering the temporal-spatial correlation between video frames, a rank pooling (RP) function is applied segment-wise to the activation time series of each neuron, encoding the temporal subsequences of the video. Then, the Fourier Temporal Pyramid (FTP) algorithm is applied to each pooled time series, and the results are concatenated to produce the final temporal-spatial feature representation. Finally, recognition experiments are conducted on different datasets against a range of methods. The results show that the proposed method (HPM+RP+FTP) improves recognition accuracy on depth video across viewpoints, exceeding the best existing method by 18% on the UWA3DII dataset. The method also generalizes well, achieving 82.5% accuracy on the MSR Daily Activity3D dataset.
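    To make the two temporal encoders named in the abstract concrete, the sketch below is a minimal illustration, not the authors' implementation: an approximate rank pooling in the spirit of Fernando et al. [14], where a linear SVR regressing the frame index yields the pooled vector, and a Fourier temporal pyramid that keeps the low-frequency FFT magnitudes of each temporal segment. The segmentation scheme, regularizer C, pyramid depth, and coefficient count are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVR

def rank_pool(X, C=1.0):
    """Approximate rank pooling of a frame-feature sequence X of shape (T, D),
    e.g., per-frame fc7 activations of the HPM. Frames are smoothed by a
    time-varying mean and a linear SVR is fit to regress the frame index;
    its weight vector encodes the temporal evolution of the sequence."""
    T = len(X)
    V = np.cumsum(X, axis=0) / np.arange(1, T + 1)[:, None]  # time-varying mean
    svr = LinearSVR(C=C, fit_intercept=False).fit(V, np.arange(1, T + 1))
    return svr.coef_  # pooled representation, shape (D,)

def ftp(series, levels=3, n_coeffs=4):
    """Fourier temporal pyramid of a 1-D time series: split the series into
    1, 2, 4, ... segments per pyramid level, FFT each segment, and keep the
    magnitudes of the lowest-frequency coefficients."""
    out = []
    for level in range(levels):
        for seg in np.array_split(np.asarray(series), 2 ** level):
            mags = np.abs(np.fft.rfft(seg))[:n_coeffs]
            out.append(np.pad(mags, (0, n_coeffs - len(mags))))  # pad short segments
    return np.concatenate(out)

# Illustrative use: X stands in for T frames of D-dim HPM activations.
# Rank-pool fixed-length temporal segments, then FTP-encode each neuron's
# pooled series and concatenate into one temporal-spatial descriptor.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 128))                   # placeholder for fc7 features
segments = np.array_split(X, 6)                      # 6 temporal subsequences
pooled = np.stack([rank_pool(s) for s in segments])  # (6, 128) pooled series
feature = np.concatenate([ftp(pooled[:, d]) for d in range(pooled.shape[1])])
```

    Concatenating the FTP codes of every neuron's pooled series gives a fixed-length descriptor that can be fed to any standard classifier.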

  • Figure 1  Overall model framework

    Figure 2  Structure of the CNN model used in this paper

    Figure 3  Comparison of the recognition accuracy of two methods on specific actions

    Figure 4  Confusion matrix of the 16 actions in the MSR Daily Activity3D dataset

    Table 1  Action recognition accuracy (%) on the UWA3D Multiview ActivityII dataset

    Training views   V1&V2       V1&V3       V1&V4       V2&V3       V2&V4       V3&V4       Avg.
    Test view        V3    V4    V2    V4    V2    V3    V1    V4    V1    V3    V1    V2
    Ref. [6]         45.0  40.4  35.1  36.9  34.7  36.0  49.5  29.3  57.1  35.4  49.0  29.3  39.8
    Ref. [7]         49.4  42.8  34.6  39.7  38.1  44.8  53.3  33.5  53.6  41.2  56.7  32.6  43.4
    Ref. [18]        52.7  51.8  59.0  57.5  42.8  44.2  58.1  38.4  63.2  43.8  66.3  48.0  52.2
    Ref. [17]        60.1  61.3  57.1  65.1  61.6  66.8  70.6  59.5  73.2  59.3  72.5  54.5  63.5
    HPM(fc7)+RP      80.2  74.9  69.9  76.4  49.2  63.8  71.4  59.9  80.7  76.9  84.4  68.4  71.3
    HPM(fc7)+FTP     80.6  80.5  75.2  82.0  65.4  72.0  77.3  67.0  83.6  81.0  83.6  74.1  76.9
    HPM(fc6)+RP+FTP  83.9  81.3  74.8  82.0  66.2  72.8  78.8  70.0  83.3  79.1  85.9  75.9  77.8
    HPM(fc7)+RP+FTP  85.8  81.6  76.3  80.5  61.7  76.5  78.1  71.5  82.9  81.7  85.9  76.3  78.3
    Note: V1, V2, V3, and V4 denote the front, left, right, and top views, respectively.

    Table 2  Accuracy (%) of several methods on the MSR Daily Activity3D dataset

    Method           Accuracy
    Ref. [19]        79.1
    Ref. [20]        54.0
    Ref. [21]        68.0
    Ref. [22]        73.8
    HPM(fc7)+RP      60.0
    HPM(fc7)+FTP     79.9
    HPM(fc6)+RP+FTP  81.3
    HPM(fc7)+RP+FTP  82.5
  • ZHOU Yang, NI Bingbing, HONG Richang, et al. Interaction part mining: A mid-level approach for fine-grained action recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 3323–3331. doi: 10.1109/CVPR.2015.7298953.
    WANG Jiang, NIE Xiaohan, XIA Yin, et al. Cross-view action modeling, learning, and recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 2649–2656. doi: 10.1109/CVPR.2014.339.
    LIU Peng and YIN Lijun. Spontaneous thermal facial expression analysis based on trajectory-pooled fisher vector descriptor[C]. IEEE International Conference on Multimedia and Expo, Hong Kong, China, 2017: 835–840. doi: 10.1109/ICME.2017.8019315.
    YANG Xiaodong and TIAN Yingli. Super normal vector for activity recognition using depth sequences[C]. IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 804–811. doi: 10.1109/CVPR.2014.108.
    ZHANG Baochang, YANG Yun, CHEN Chen, et al. Action recognition using 3D histograms of texture and a multi-class boosting classifier[J]. IEEE Transactions on Image Processing, 2017, 26(10): 4648–4660. doi: 10.1109/TIP.2017.2718189.
    YIN Xiaochuan and CHEN Qijun. Deep metric learning autoencoder for nonlinear temporal alignment of human motion[C]. IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016: 2160–2166. doi: 10.1109/ICRA.2016.7487366.
    SHAHROUDY A, LIU Jun, NG T, et al. NTU RGB+D: A large scale dataset for 3D human activity analysis[C]. IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1010–1019. doi: 10.1109/CVPR.2016.115.
    KARPATHY A, TODERICI G, SHETTY S, et al. Large-scale video classification with convolutional neural networks[C]. IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 1725–1732. doi: 10.1109/CVPR.2014.223.
    HAIDER F, CAMPBELL N, and LUZ S. Active speaker detection in human machine multiparty dialogue using visual prosody information[C]. IEEE Global Conference on Signal and Information Processing, Washington, D.C., USA, 2016: 1207–1211. doi: 10.1109/GlobalSIP.2016.7906033.
    SIMONYAN K and ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[J]. Advances in Neural Information Processing Systems, 2014, 27: 568–576.
    TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]. IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4489–4497. doi: 10.1109/ICCV.2015.510.
    DONAHUE J, HENDRICKS L A, ROHRBACH M, et al. Long-term recurrent convolutional networks for visual recognition and description[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 677–691. doi: 10.1109/TPAMI.2016.2599174.
    GUPTA S, GIRSHICK R, ARBELÁEZ P, et al. Learning rich features from RGB-D images for object detection and segmentation[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 345–360. doi: 10.1007/978-3-319-10584-0_23.
    FERNANDO B, GAVVES E, ORAMAS J, et al. Rank pooling for action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 773–787. doi: 10.1109/TPAMI.2016.2558148.
    WANG Jiang, LIU Zicheng, WU Ying, et al. Learning actionlet ensemble for 3D human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(5): 914–927. doi: 10.1109/TPAMI.2013.198.
    RAHMANI H and MIAN A. 3D action recognition from novel viewpoints[C]. IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1506–1515. doi: 10.1109/CVPR.2016.167.
    RAHMANI H and MIAN A. Learning a non-linear knowledge transfer model for cross-view action recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 2458–2466. doi: 10.1109/CVPR.2015.7298860.
    RAHMANI H, MAHMOOD A, HUYNH D Q, et al. HOPC: Histogram of oriented principal components of 3D pointclouds for action recognition[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 742–757. doi: 10.1007/978-3-319-10605-2_48.
    JALAL A, KAMAL S, and KIM D. A depth video sensor-based life-logging human activity recognition system for elderly care in smart indoor environments[J]. Sensors, 2014, 14(7): 11735–11759. doi: 10.3390/s140711735.
    MÜLLER M and RÖDER T. Motion templates for automatic classification and retrieval of motion capture data[C]. ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Vienna, Austria, 2006: 137–146. doi: 10.1145/1218064.1218083.
    WANG Jiang, LIU Zicheng, WU Ying, et al. Mining actionlet ensemble for action recognition with depth cameras[C]. IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 1290–1297.
    CAVAZZA J, ZUNINO A, BIAGIO M S, et al. Kernelized covariance for action recognition[C]. International Conference on Pattern Recognition, Cancun, Mexico, 2016: 408–413. doi: 10.1109/ICPR.2016.7899668.
Publication history
  • Received: 2018-05-21
  • Revised: 2018-12-04
  • Published online: 2018-12-14
  • Issue date: 2019-04-01
