Siamese Object Tracking Based on Key Feature Information Perception and Online Adaptive Masking

HE Zhiwei, NIE Jiahao, DU Chenjie, GAO Mingyu, DONG Zhekang

Citation: HE Zhiwei, NIE Jiahao, DU Chenjie, GAO Mingyu, DONG Zhekang. Siamese Object Tracking Based on Key Feature Information Perception and Online Adaptive Masking[J]. Journal of Electronics & Information Technology, 2022, 44(5): 1714-1722. doi: 10.11999/JEIT210296

doi: 10.11999/JEIT210296
Funds: The National Natural Science Foundation of China (61571394), The Key R&D Program of Zhejiang Province (2020C03098)
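    About the authors:

    HE Zhiwei: male, born in 1979, professor and doctoral supervisor; research interests: computer vision, automotive electronics, and battery management technology

    NIE Jiahao: male, born in 1998, master's student; research interests: computer vision and object tracking

    DU Chenjie: male, born in 1991, doctoral student; research interests: computer vision and object tracking

    GAO Mingyu: male, born in 1963, professor and doctoral supervisor; research interests: automotive electronics and embedded system applications

    DONG Zhekang: male, born in 1989, associate professor and master's supervisor; research interests: deep learning and neuromorphic systems

    Corresponding author:

    DU Chenjie ducj@hdu.edu.cn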

  • CLC number: TN911.73; TP391.41

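  • Abstract: In recent years, applying Siamese networks to visual object tracking has greatly improved tracker performance, balancing accuracy against real-time speed. However, the accuracy of Siamese trackers remains considerably limited. To address this problem, this paper builds on the channel attention mechanism and proposes a key feature information perception module that strengthens the discriminative ability of the network model and makes the network focus on changes in the target's convolutional features. On this basis, the paper further proposes an online adaptive masking strategy that adaptively masks subsequent frames according to the output state of the cross-correlation layer learned online, so as to highlight the foreground target. Experiments on the OTB100 and GOT-10k datasets show that the proposed tracker significantly improves accuracy over the baseline without compromising real-time performance (on OTB100, AUC rises from 0.584 to 0.639 at 111 fps; Table 6), and that it tracks robustly in complex scenes such as occlusion, scale variation, and background clutter.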
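The abstract states only that the key feature information perception module is built on channel attention; its actual structure (Fig. 2) is not reproduced in this text. As a rough illustration of what channel attention does, the following is a minimal sketch of the generic squeeze-and-excitation form, not the authors' module. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical; in a real network the weights would be learned.

```python
import math


def channel_attention(features, w1, w2):
    """Reweight the channels of a feature map (generic squeeze-and-excitation form).

    features: list of C channels, each an H x W list of rows.
    w1: (C/r) x C bottleneck weights, w2: C x (C/r) expansion weights.
    Both weight matrices are hypothetical placeholders for learned parameters.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Excite: bottleneck layer with ReLU, then a sigmoid gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: multiply every value of channel c by its gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates
```

Channels whose gate is close to 1 pass through almost unchanged, while channels with a small gate are suppressed; this is the sense in which a channel attention block lets a network emphasize the feature channels most useful for discrimination.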
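  • Fig. 1  Block diagram of the proposed tracker

    Fig. 2  Structure of the key feature information perception module

    Fig. 3  Feature visualization of the cross-correlation layer output

    Fig. 4  Schematic of online adaptive masking

    Fig. 5  Tracking performance comparison of 10 algorithms on the OTB100 dataset

    Fig. 6  Tracking performance comparison of 10 algorithms on the GOT-10k dataset

    Fig. 7  Comparison of tracking examples for 5 algorithms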

    Table 1  AUC (%) of the trained models on OTB100

    SiamFC  Model1  Model2  Model3  Model4  Model5
    58.4    58.7    60.9    60.1    57.4    60.3

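    Table 2  Comparison of the two key information perception structures

    Algorithm                Parameters (MB)  Computation (Byte)  Accuracy (%)  fps
    SiamFC                   3.30             0.69742             58.4          131
    SiamFC (structure 1)     3.32             0.69748             62.4          114
    SiamFC (structure 2)     3.36             0.69759             62.8          101
    SiamFC-DW                2.45             0.80825             62.7          154
    SiamFC-DW (structure 1)  2.49             0.80841             64.3          147
    SiamFC-DW (structure 2)  2.55             0.80872             65.2          132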

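    Table 3  Online adaptive masking

     Input: search image f of a subsequent frame (255×255×3)
     Output: masked image $ {f^ * } $
     (1) Initialize $ {\sigma _{{x_k}}} $ and $ {\sigma _{{y_k}}} $ for the first i search images by Eq. (5), where k=1,2,···,i;
     (2) Compute the average peak-to-correlation energy (APCE) of the first i frames from the response map;
     (3) Obtain $ {\rm APCE}_1, {\rm APCE}_2, \cdots, {\rm APCE}_i $ from Eq. (8);
     (4) Compute $ {\varepsilon _{\rm div}} $ from the mean APCE of the i frames preceding the current frame and $ {\rm APCE}_{i + 1} $;
     (5) if ($ {\varepsilon _{\rm div}} \ge {\tau _1} $)
     (6)   /* weaken the Gaussian mask */
           $ {\sigma _{{x_{i + 1}}}} $=$ {\sigma _{{x_i}}}(1 + \mu ) $, $ {\sigma _{{y_{i + 1}}}} $=$ {\sigma _{{y_i}}}(1 + \mu ) $;
     (7) else if ($ {\varepsilon _{\rm div}} \le {\tau _2} $)
     (8)   /* strengthen the Gaussian mask */
           $ {\sigma _{{x_{i + 1}}}} $=$ {\sigma _{{x_i}}}(1 - \mu ) $, $ {\sigma _{{y_{i + 1}}}} $=$ {\sigma _{{y_i}}}(1 - \mu ) $;
     (9) else
     (10)  /* keep the Gaussian mask unchanged */
           $ {\sigma _{{x_{i + 1}}}} $=$ {\sigma _{{x_i}}} $, $ {\sigma _{{y_{i + 1}}}} $=$ {\sigma _{{y_i}}} $;
     (11) Compute $ f_{i + 1}^ * $=$ {f_{i + 1}} \cdot G({x_{i + 1}},{y_{i + 1}}) $ to obtain the Gaussian-masked search image of frame i+1;
     (12) Repeat steps (1) to (11) to obtain the Gaussian-masked search image $ {f^ * } $ for every subsequent frame.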
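The update rule in Table 3 can be sketched in a few lines of Python. Eq. (5) and Eq. (8) are not reproduced in this text, so several pieces below are assumptions rather than the authors' definitions: `apce` uses the standard APCE formula (squared peak-to-minimum range divided by the mean squared deviation from the minimum), `eps_div` is taken to be the ratio of the new APCE to the historical mean, and `gaussian_mask` assumes G is a 2-D Gaussian centred on the search image. All function names and the threshold values in the usage note are hypothetical.

```python
import math


def apce(response):
    """Average peak-to-correlation energy of a 2-D response map (list of rows)."""
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    # A perfectly flat map has no peak; return 0.0 instead of dividing by zero.
    return (f_max - f_min) ** 2 / energy if energy > 0 else 0.0


def update_sigma(sigma_x, sigma_y, apce_history, apce_next, tau1, tau2, mu):
    """Steps (4)-(10): choose the mask widths for frame i+1."""
    # Assumption: eps_div is the new APCE divided by the mean of the previous ones.
    eps_div = apce_next / (sum(apce_history) / len(apce_history))
    if eps_div >= tau1:
        # Weaken the mask: a wider Gaussian attenuates less of the image.
        return sigma_x * (1 + mu), sigma_y * (1 + mu)
    if eps_div <= tau2:
        # Strengthen the mask: a narrower Gaussian attenuates more of the image.
        return sigma_x * (1 - mu), sigma_y * (1 - mu)
    # Otherwise keep the widths of frame i.
    return sigma_x, sigma_y


def gaussian_mask(height, width, sigma_x, sigma_y):
    """Assumed form of G: a 2-D Gaussian with peak value 1 at the image centre."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    return [
        [
            math.exp(-((x - cx) ** 2 / (2 * sigma_x ** 2) + (y - cy) ** 2 / (2 * sigma_y ** 2)))
            for x in range(width)
        ]
        for y in range(height)
    ]


def apply_mask(frame, mask):
    """Step (11): element-wise product of a single-channel frame and the mask."""
    return [[p * m for p, m in zip(frow, mrow)] for frow, mrow in zip(frame, mask)]
```

For example, with hypothetical settings `tau1=1.2`, `tau2=0.8`, and `mu=0.5`, a call such as `update_sigma(10.0, 20.0, [8.0, 12.0], 15.0, 1.2, 0.8, 0.5)` falls in the first branch and widens both standard deviations by 50%. For a 255×255×3 search image, `apply_mask` would be applied to each colour channel in turn.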

    Table 4  Quantitative AUC comparison of 10 algorithms across different scenes of the OTB dataset

    Scene   Videos  KCF    Staple  SRDCF  DeepSRDCF  CFNet  SINT   SiamFC-DW  SiamRPN  SiamFC  Ours
    BC      31      0.476  0.560   0.583  0.627      0.561  0.584  0.574      0.591    0.527   0.611
    IV      37      0.510  0.592   0.613  0.621      0.541  0.649  0.632      0.649    0.572   0.643
    SV      63      0.433  0.521   0.561  0.605      0.546  0.610  0.613      0.615    0.556   0.614
    OCC     48      0.460  0.543   0.559  0.601      0.527  0.595  0.601      0.585    0.549   0.611
    DEF     43      0.389  0.551   0.544  0.566      0.526  0.572  0.560      0.617    0.512   0.597
    MB      29      0.427  0.541   0.594  0.625      0.540  0.634  0.654      0.622    0.554   0.642
    FM      39      0.417  0.540   0.597  0.628      0.554  0.616  0.630      0.599    0.571   0.621
    IPR     51      0.460  0.548   0.544  0.589      0.567  0.616  0.606      0.627    0.559   0.628
    OPR     63      0.435  0.533   0.550  0.607      0.553  0.621  0.612      0.625    0.561   0.628
    OV      14      0.352  0.475   0.460  0.553      0.454  0.572  0.590      0.542    0.475   0.587
    LR      9       0.348  0.394   0.514  0.561      0.614  0.622  0.596      0.639    0.618   0.636
    OTB100  100     0.471  0.578   0.598  0.635      0.587  0.625  0.627      0.629    0.584   0.639
    Note: in the original table, bold marks the best result in each row and italics the second best.

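    Table 5  Performance comparison of 10 algorithms on the GOT-10k dataset

    Algorithm  AO     SR_0.5  SR_0.75
    KCF        0.203  0.177   0.076
    Staple     0.246  0.239   0.089
    SRDCF      0.236  0.227   0.094
    DeepSRDCF  0.286  0.275   0.096
    CFNet      0.293  0.265   0.087
    SINT       0.347  0.375   0.124
    SiamFC-DW  0.384  0.401   0.118
    SiamRPN    0.367  0.424   0.102
    SiamFC     0.326  0.353   0.098
    Ours       0.411  0.492   0.175
    Note: in the original table, bold marks the best result in each column and italics the second best.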
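For readers unfamiliar with the column headers in Table 5: on GOT-10k, AO is the average overlap (mean intersection over union between predicted and ground-truth boxes), and SR at threshold t is the fraction of frames whose overlap exceeds t. The sketch below illustrates these definitions on a flat list of per-frame overlaps; the function names are hypothetical, and the official benchmark toolkit may aggregate across sequences differently.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(overlaps):
    """AO: mean of the per-frame overlaps."""
    return sum(overlaps) / len(overlaps)


def success_rate(overlaps, threshold):
    """SR: fraction of frames whose overlap exceeds the threshold."""
    return sum(o > threshold for o in overlaps) / len(overlaps)
```

With per-frame overlaps computed by `iou`, the three columns of Table 5 correspond to `average_overlap(overlaps)`, `success_rate(overlaps, 0.5)`, and `success_rate(overlaps, 0.75)`.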

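    Table 6  Ablation results of the proposed algorithm on OTB100

    Algorithm  Innovation 1  Innovation 2  AUC    Precision  Speed (fps)
    SiamFC     No            No            0.584  0.772      131
    SiamFCv1   Yes           No            0.624  0.831      114
    SiamFCv2   No            Yes           0.606  0.818      127
    Ours       Yes           Yes           0.639  0.861      111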
Publication history
  • Received:  2021-04-13
  • Revised:  2021-11-02
  • Accepted:  2021-11-02
  • Published online:  2021-12-22
  • Issue date:  2022-05-25
