Siamese Object Tracking Based on Key Feature Information Perception and Online Adaptive Masking

HE Zhiwei; NIE Jiahao; DU Chenjie; GAO Mingyu; DONG Zhekang

doi:10.11999/JEIT210296

Volume 44 Issue 5

May 2022


HE Zhiwei, NIE Jiahao, DU Chenjie, GAO Mingyu, DONG Zhekang. Siamese Object Tracking Based on Key Feature Information Perception and Online Adaptive Masking[J]. Journal of Electronics & Information Technology, 2022, 44(5): 1714-1722. doi: 10.11999/JEIT210296


doi: 10.11999/JEIT210296 cstr: 32379.14.JEIT210296

HE Zhiwei^{1, 2},
NIE Jiahao^{1, 2},
DU Chenjie^{1, 2},
GAO Mingyu^{1, 2},
DONG Zhekang^{1, 3}

1. School of Electronics Information, Hangzhou Dianzi University, Hangzhou 310018, China
2. Zhejiang Provincial Key Laboratory of Equipment Electronics, Hangzhou 310018, China
3. Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong 999077, China

Funds: The National Natural Science Foundation of China (61571394), The Key R&D Program of Zhejiang Province (2020C03098)

Received Date: 2021-04-13
Accepted Date: 2021-11-02
Rev Recd Date: 2021-11-02

Available Online: 2021-12-22

Publish Date: 2022-05-25

Abstract

Applying Siamese networks to visual object tracking has recently improved tracker performance considerably, balancing accuracy and speed. However, the accuracy of Siamese network trackers remains limited. To address this problem, a key feature information perception module based on a channel attention mechanism is proposed to enhance the discriminative ability of the network model, making the network focus on changes in the convolutional features of the target. On this basis, an online adaptive masking strategy is proposed, which adaptively masks subsequent frames according to the output state of the cross-correlation layer learned online, so as to highlight the foreground target. Experiments on the OTB100 and GOT-10k datasets show that, without affecting real-time performance, the proposed tracker achieves a significant improvement in accuracy over the baseline and tracks robustly in complex scenes such as occlusion, scale change, and background clutter.
- Object tracking
- Siamese network
- Key feature information perception
- Adaptive mask
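
The channel attention idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation style design (global average pooling, a small bottleneck, sigmoid gating), which the abstract does not confirm as the authors' exact module. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical, and the weights are hand-picked here rather than learned.

```python
import math


def sigmoid(x):
    """Logistic function used as the channel gate."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight feature channels (illustrative sketch, not the paper's module).

    features: list of C channels, each an H x W list of lists of floats.
    w1: hidden x C weight matrix (hypothetical; learned in practice).
    w2: C x hidden weight matrix (hypothetical; learned in practice).
    Returns the rescaled features and the per-channel weights.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid-gated output layer.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Rescale: multiply every value in a channel by that channel's weight.
    scaled = [
        [[v * a for v in row] for row in ch] for ch, a in zip(features, weights)
    ]
    return scaled, weights


# Two 2x2 channels: the first responds strongly, the second not at all.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, -1.0]]      # assumed bottleneck with one hidden unit
w2 = [[2.0], [-2.0]]    # assumed output layer, one row per channel
scaled, weights = channel_attention(features, w1, w2)
print(weights)
```

With these hand-picked weights the first channel receives a gate above 0.5 and the second a gate below 0.5, which is the kind of per-channel emphasis a channel attention module is meant to learn from data.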
