Robust Visual Tracking Algorithm Based on Siamese Network with Dual Templates

Zhiqiang HOU, Lilin CHEN, Wangsheng YU, Sugang MA, Jiulun FAN

Citation: Zhiqiang HOU, Lilin CHEN, Wangsheng YU, Sugang MA, Jiulun FAN. Robust Visual Tracking Algorithm Based on Siamese Network with Dual Templates[J]. Journal of Electronics & Information Technology, 2019, 41(9): 2247-2255. doi: 10.11999/JEIT181018

doi: 10.11999/JEIT181018
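Funds: The National Natural Science Foundation of China (61473309, 61703423)
Details
    Author biographies:

    Zhiqiang HOU: male, born in 1973, professor and doctoral supervisor; research interests: image processing and computer vision

    Lilin CHEN: female, born in 1989, master's student; research interests: computer vision, object tracking, and deep learning

    Wangsheng YU: male, born in 1985, Ph.D.; research interests: computer vision, image processing, and pattern recognition

    Sugang MA: male, born in 1982, doctoral student; research interests: computer vision and machine learning

    Jiulun FAN: male, born in 1964, professor and doctoral supervisor; research interests: pattern recognition and image processing

    Corresponding author:

    Lilin CHEN 454525999@qq.com

  • CLC number: TP391.4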

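  • Abstract: In recent years, Siamese networks have attracted great attention in visual tracking because of their good tracking accuracy and high tracking speed. However, most Siamese networks do not consider model updating, which leads to tracking errors. To address this shortcoming, this paper proposes a visual tracking algorithm based on a Siamese network with dual templates. First, the initial frame, whose response values in the response map are stable, is retained as the base template R, while an improved APCEs model-update strategy is used to determine the dynamic template T. Then, the matching results of the candidate target region against the two templates are analyzed jointly, and the resulting response maps are fused to obtain a more accurate tracking result. Experimental results on the OTB2013 and OTB2015 datasets show that, compared with 5 current mainstream tracking algorithms, the proposed algorithm has clear advantages in tracking precision and success rate. It tracks well under scale variation, in-plane rotation, out-of-plane rotation, occlusion, and illumination variation, and it reaches a tracking speed of 46 frames/s.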
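As a rough illustration of the dual-template idea in the abstract, the sketch below correlates a search-region feature map with a base template R and a dynamic template T, fuses the two response maps, and takes the peak as the estimated position. The fusion rule used in the paper is not given on this page, so the convex combination with weight `w` is an assumption, as are all function names and the toy single-channel feature maps (real SiameseFC features are multi-channel CNN outputs).

```python
def cross_correlate(search, template):
    """Valid-mode 2-D cross-correlation of a search map with a template."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    out = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            row.append(sum(search[i + a][j + b] * template[a][b]
                           for a in range(th) for b in range(tw)))
        out.append(row)
    return out


def fuse(resp_r, resp_t, w=0.5):
    """Assumed fusion rule (not the paper's): w * R-response + (1 - w) * T-response."""
    return [[w * r + (1.0 - w) * t for r, t in zip(row_r, row_t)]
            for row_r, row_t in zip(resp_r, resp_t)]


def argmax2d(resp):
    """Return (row, col) of the first highest value in a 2-D response map."""
    best_pos, best_val = (0, 0), resp[0][0]
    for i, row in enumerate(resp):
        for j, v in enumerate(row):
            if v > best_val:
                best_pos, best_val = (i, j), v
    return best_pos


# Toy data: the target sits in the middle of a 4x4 search map.
search = [[0, 0, 0, 0],
          [0, 1, 2, 0],
          [0, 3, 4, 0],
          [0, 0, 0, 0]]
template_r = [[1, 2], [3, 4]]   # base template R (initial appearance)
template_t = [[1, 1], [3, 5]]   # dynamic template T (slightly changed appearance)

resp_r = cross_correlate(search, template_r)
resp_t = cross_correlate(search, template_t)
fused = fuse(resp_r, resp_t, w=0.5)
peak = argmax2d(fused)
print(peak)
```

In this sketch, setting `w = 1.0` ignores the dynamic template entirely, which corresponds to a Siamese tracker that never updates its model.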
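  • Figure 1  SiameseFC network framework

    Figure 2  Dual-template tracking based on the Siamese network

    Figure 3  Templates and search region

    Figure 4  Partial tracking results of the proposed algorithm and 5 other algorithms

    Figure 5  Success rate and precision on OTB2013 and OTB2015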

    Table 1  Effect of the value of $\lambda$ on precision and success rate (OTB2015)

    $\lambda$       0.50   0.60   0.70   0.80   0.850  0.90   1.00   1.10
    Success rate    0.447  0.513  0.587  0.603  0.614  0.605  0.585  0.591
    Precision       0.642  0.697  0.742  0.779  0.793  0.761  0.761  0.774
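Table 1 can be read as a one-dimensional parameter sweep over $\lambda$. The short sketch below copies the values from the table and picks the $\lambda$ with the highest score for each metric; the variable and function names are illustrative only.

```python
# Values copied from Table 1 (OTB2015).
lambdas = [0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 1.00, 1.10]
success = [0.447, 0.513, 0.587, 0.603, 0.614, 0.605, 0.585, 0.591]
precision = [0.642, 0.697, 0.742, 0.779, 0.793, 0.761, 0.761, 0.774]


def best_lambda(scores):
    """Return the lambda whose score is the largest in the sweep."""
    idx = max(range(len(scores)), key=scores.__getitem__)
    return lambdas[idx]


print(best_lambda(success), best_lambda(precision))
```

Both metrics peak at the same setting, $\lambda = 0.85$, and fall off on either side of it.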

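    Table 2  Dual-template tracking algorithm based on the Siamese network

     Input: image sequence $I_1, I_2, \cdots, I_n$; initial target position ${P_0} = ({x_0},{y_0})$; initial target size ${s_0} = ({w_0},{h_0})$.
     Output: estimated target position ${P_{\rm{e}}} = ({x_{\rm{e}}},{y_{\rm{e}}})$; estimated target size ${s_{\rm{e}}} = ({w_{\rm{e}}},{h_{\rm{e}}})$.
     for t = 1, 2, ···, n do:
     Step 1  Track the target
     (1) Crop the region of interest (ROI) in frame t, centered at the previous frame's position $P_{t-1}$, and enlarge it to form the search region;
     (2) Extract the features of the base template R, the dynamic template T, and the search region;
     (3) Use Eq. (4) to compute the similarity between the two template features and the search-region features, which gives the result response map; the point with the highest response in this map is the estimated target position.
     Step 2  Update the model
     (1) Use Eq. (5) to compute the tracking confidence ${\rm{APCEs}}$;
     (2) Compute the averages ${\rm{m}}F_{\max}$ and ${\rm{mAPCEs}}$ of $F_{\max}$ and ${\rm{APCEs}}$;
     (3) If $F_{\max} > \lambda \, {\rm{m}}F_{\max}$ and ${\rm{APCEs}} > \lambda \, {\rm{mAPCEs}}$, update the dynamic template T.
     until the end of the image sequence.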
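Step 2 of the algorithm can be sketched as follows. The paper's Eq. (5) for APCEs is not reproduced on this page, so the code substitutes the standard average peak-to-correlation energy, $|F_{\max}-F_{\min}|^2 / \mathrm{mean}((F-F_{\min})^2)$, as a stand-in. The class name, the choice to include the current frame in the running means, and the toy response maps are likewise assumptions. The default `lam=0.85` is the value of $\lambda$ with the best scores in Table 1.

```python
def apce(resp):
    """Standard APCE of a 2-D response map (stand-in for the paper's APCEs)."""
    values = [v for row in resp for v in row]
    f_max, f_min = max(values), min(values)
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    return (f_max - f_min) ** 2 / energy if energy > 0 else 0.0


class TemplateUpdater:
    """Gates updates of the dynamic template T with two running means."""

    def __init__(self, lam=0.85):
        self.lam = lam
        self.f_max_hist = []   # peak response F_max of every frame so far
        self.conf_hist = []    # confidence score of every frame so far

    def should_update(self, resp):
        f_max = max(v for row in resp for v in row)
        conf = apce(resp)
        # Assumption: the means include the current frame.
        self.f_max_hist.append(f_max)
        self.conf_hist.append(conf)
        m_f_max = sum(self.f_max_hist) / len(self.f_max_hist)
        m_conf = sum(self.conf_hist) / len(self.conf_hist)
        # Update T only when both criteria from step 2 (3) hold.
        return f_max > self.lam * m_f_max and conf > self.lam * m_conf


# Toy response maps: a clean single peak and a weak, multi-peak map.
sharp = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]
noisy = [[0, 3, 0], [3, 4, 3], [0, 3, 0]]

updater = TemplateUpdater(lam=0.85)
decisions = [updater.should_update(r) for r in (sharp, noisy, sharp)]
print(decisions)
```

In this toy run the weak, multi-peak frame fails both tests, so the dynamic template would be left unchanged for that frame and refreshed again once a clean peak returns.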

    Table 3  Tracking success rates of the algorithms under different attributes

    Algorithm     SV(64)   OPR(63)  IPR(51)  OCC(49)  DEF(44)  FM(39)   IV(38)   BC(31)   MB(29)   OV(14)   LR(9)
    Proposed      0.577    0.596    0.595    0.613    0.573    0.607    0.605    0.577    0.633    0.538    0.460
    SiameseFC     0.553    0.549    0.579    0.564    0.510    0.569    0.550    0.572    0.525    0.467    0.584
    SiameseFC_3S  0.552    0.558    0.557    0.567    0.506    0.568    0.568    0.523    0.550    0.506    0.618
    SRDCF         0.561    0.550    0.544    0.569    0.544    0.597    0.613    0.583    0.595    0.460    0.514
    Staple        0.525    0.535    0.552    0.561    0.554    0.537    0.598    0.574    0.546    0.481    0.459
    MEEM          0.470    0.526    0.529    0.495    0.489    0.542    0.517    0.519    0.557    0.488    0.382
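To make the per-attribute comparison in Table 3 easier to read, the sketch below copies the success rates from the table and reports which tracker scores highest on each attribute. The dictionary layout and function name are illustrative, and "Proposed" stands for the algorithm of this paper.

```python
# Success rates copied from Table 3; column order follows the table header.
attributes = ["SV", "OPR", "IPR", "OCC", "DEF", "FM", "IV", "BC", "MB", "OV", "LR"]
success = {
    "Proposed":     [0.577, 0.596, 0.595, 0.613, 0.573, 0.607, 0.605, 0.577, 0.633, 0.538, 0.460],
    "SiameseFC":    [0.553, 0.549, 0.579, 0.564, 0.510, 0.569, 0.550, 0.572, 0.525, 0.467, 0.584],
    "SiameseFC_3S": [0.552, 0.558, 0.557, 0.567, 0.506, 0.568, 0.568, 0.523, 0.550, 0.506, 0.618],
    "SRDCF":        [0.561, 0.550, 0.544, 0.569, 0.544, 0.597, 0.613, 0.583, 0.595, 0.460, 0.514],
    "Staple":       [0.525, 0.535, 0.552, 0.561, 0.554, 0.537, 0.598, 0.574, 0.546, 0.481, 0.459],
    "MEEM":         [0.470, 0.526, 0.529, 0.495, 0.489, 0.542, 0.517, 0.519, 0.557, 0.488, 0.382],
}


def winners(table):
    """Map each attribute to the tracker with the highest score on it."""
    result = {}
    for k, attr in enumerate(attributes):
        result[attr] = max(table, key=lambda name: table[name][k])
    return result


best = winners(success)
proposed_wins = [a for a, name in best.items() if name == "Proposed"]
print(len(proposed_wins), "of", len(attributes))
print({a: n for a, n in best.items() if n != "Proposed"})
```

On these numbers the proposed algorithm has the highest success rate on 8 of the 11 attributes; SRDCF leads on IV and BC, and SiameseFC_3S leads on LR.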

    Table 4  Tracking precision of the algorithms under different attributes

    Algorithm     SV(64)   OPR(63)  IPR(51)  OCC(49)  DEF(44)  FM(39)   IV(38)   BC(31)   MB(29)   OV(14)   LR(9)
    Proposed      0.781    0.796    0.815    0.811    0.804    0.816    0.801    0.770    0.749    0.717    0.878
    SiameseFC     0.732    0.744    0.780    0.720    0.690    0.735    0.711    0.748    0.654    0.615    0.805
    SiameseFC_3S  0.735    0.757    0.742    0.722    0.690    0.743    0.736    0.690    0.705    0.669    0.900
    SRDCF         0.745    0.571    0.745    0.735    0.734    0.769    0.792    0.775    0.767    0.597    0.765
    Staple        0.727    0.738    0.770    0.726    0.748    0.697    0.792    0.766    0.708    0.661    0.695
    MEEM          0.736    0.795    0.794    0.741    0.754    0.752    0.740    0.746    0.731    0.685    0.808

    Table 5  Tracking speed of the proposed algorithm and the 5 compared algorithms

                Proposed  SiameseFC  SiameseFC_3S  SRDCF  Staple  MEEM
    Code        M+C       M+C        M+C           M+C    M+C     M+C
    Platform    GPU       GPU        GPU           GPU    CPU     CPU
    FPS         46(Y)     58(Y)      86(Y)         5(N)   80(Y)   10(N)
Publication history
  • Received:  2018-11-06
  • Revised:  2019-05-29
  • Available online:  2019-06-12
  • Published:  2019-09-10
