基于空间可靠性约束的鲁棒视觉跟踪算法

蒲磊; 冯新喜; 侯志强; 余旺盛

doi:10.11999/JEIT180780

基于空间可靠性约束的鲁棒视觉跟踪算法

doi: 10.11999/JEIT180780

1.
空军工程大学研究生院西安 710077
2.
空军工程大学信息与导航学院西安 710077
3.
西安邮电大学计算机学院西安 710121

基金项目: 国家自然科学基金(61571458, 61473309, 41601436)

详细信息

作者简介:
蒲磊：男，1991年生，博士生，研究方向为计算机视觉、目标跟踪

冯新喜：男，1964年生，教授，研究方向为信息融合、模式识别

侯志强：男，1973年生，教授，研究方向为图像处理、计算机视觉

余旺盛：男，1985年生，讲师，研究方向为图像处理、模式识别

通讯作者:
蒲磊　warmstoner@163.com

中图分类号: TP391.4
计量
- 文章访问数: 2338
- HTML全文浏览量: 836
- PDF下载量: 70
- 被引次数: 0
出版历程
- 收稿日期: 2018-08-07
- 修回日期: 2019-01-21
- 网络出版日期: 2019-02-15
- 刊出日期: 2019-07-01

Robust Visual Tracking Based on Spatial Reliability Constraint

1.
Graduate College, Air Force Engineering University, Xi’an 710077, China
2.
Institute of Information and Navigation, Air Force Engineering University, Xi’an 710077, China
3.
School of Computer Science and Technology, Xian University of Posts and Telecommunications, Xi’an 710121, China

Funds: The National Natural Science Foundation of China (61571458, 61473309, 41601436)

摘要

摘要: 针对复杂背景下目标容易发生漂移的问题，该文提出一种基于空间可靠性约束的目标跟踪算法。首先通过预训练卷积神经网络(CNN)模型提取目标的多层深度特征，并在各层上分别训练相关滤波器，然后对得到的响应图进行加权融合。接着通过高层特征图提取目标的可靠性区域信息，得到一个二值注意力矩阵，最后将得到的二值矩阵用于约束融合后响应图的搜索范围，范围内的最大响应值即为目标的中心位置。为了处理长时遮挡问题，该文提出一种基于首帧模板信息的随机选择更新策略。实验结果表明，该算法在应对相似背景干扰、遮挡、超出视野等多种场景均有良好的性能表现。
- 视觉跟踪 /
- 空间可靠性约束 /
- 深度特征 /
- 相关滤波 /
- 模型更新
Abstract: Because of the problem that the target is prone to drift in complex background, a robust tracking algorithm based on spatial reliability constraint is proposed. Firstly, the pre-trained Convolutional Neural Network (CNN) model is used to extract the multi-layer deep features of the target, and the correlation filters are respectively trained on each layer to perform weighted fusion of the obtained response maps. Then, the reliability region information of the target is extracted through the high-level feature map, a binary matrix is obtained. Finally, the obtained binary matrix is used to constrain the search area of the response map, and the maximum response value in the area is the target position. In addition, in order to deal with the long-term occlusion problem, a random selection model update strategy with the first frame template information is proposed. The experimental results show that the proposed algorithm has good performance in dealing with similar background interference, occlusion, and other scenes.
- Visual tracking /
- Spatial reliability constraint /
- Deep features /
- Correlation filter /
- Model update

HTML全文

图 1 卷积深度特征可视化

下载: 全尺寸图片幻灯片

图 2 算法流程图

下载: 全尺寸图片幻灯片

图 3 OTB100测试结果的精度曲线和成功率曲线

下载: 全尺寸图片幻灯片

图 4 TempleColor128测试结果的精度曲线和成功率曲线

下载: 全尺寸图片幻灯片

表 1 基于空间可靠性约束的鲁棒视觉跟踪算法

输入：图像序列I₁, I₂, ···, I_n，目标初始位置p₀=(x₀, y₀)，目标初始尺度s₀=(w₀, h₀)。
输出：每帧图像的跟踪结果p_t=(x_t, y_t), s_t=(w_t, h_t)。
对于t=1, 2, ···, n, do：
(1) 定位目标中心位置
(a) 利用前一帧目标位置p_t_–1确定第t帧ROI区域，并提取其分层卷积特征；
(b) 对于每一层的卷积特征，利用式(4)和式(5)计算其相关响应图；
(c) 利用式(6)对多个相关响应图进行融合，得到最终的相关响应图；　　　(d)通过式(7)和式(8)提取空间可靠性区域图并将用于约束响应图搜索范围；
(e) 利用式(9)确定第t 帧中目标的中心位置p_t。
(2) 确定目标最佳尺度
(a) 利用p_t和前一帧目标尺度s_t_–1进行多尺度采样，得到采样图像集I_s={$ I_{s_1},\ I_{s_2},\ ·\!·\!·,\ I_{s_m}$}；
(b) 采用文献[14]中的尺度估计方法确定第t帧中目标的最佳尺度s_t。
(3) 模型更新
(a) 通过得到响应图计算最大响应值；
(b) 依据响应值大小和式(10)—式(12)对滤波器进行更新。
结束

下载: 导出CSV

表 2 不同属性下算法的跟踪精度对比结果

算法	SV(60)	OCC(45)	IV(34)	BC(27)	DEF(42)	MB(29)	FM(37)	IPR(46)	OPR(57)	OV(13)	LR(8)
本文算法	0.827	0.799	0.855	0.872	0.801	0.813	0.800	0.879	0.844	0.756	0.870
HDT	0.811	0.753	0.803	0.855	0.817	0.764	0.800	0.851	0.804	0.663	0.749
HCF	0.800	0.748	0.805	0.857	0.788	0.772	0.788	0.863	0.807	0.680	0.778

下载: 导出CSV

表 3 不同属性下算法的跟踪成功率对比结果

算法	SV(60)	OCC(45)	IV(34)	BC(27)	DEF(42)	MB(29)	FM(37)	IPR(46)	OPR(57)	OV(13)	LR(8)
本文算法	0.580	0.594	0.635	0.627	0.570	0.624	0.609	0.605	0.597	0.556	0.510
HDT	0.491	0.528	0.540	0.593	0.546	0.545	0.549	0.557	0.533	0.541	0.376
HCF	0.490	0.526	0.547	0.602	0.532	0.557	0.550	0.599	0.534	0.542	0.383

下载: 导出CSV

表 4 算法各部分对跟踪性能影响对比实验

	SRCT	SRCT-S	SRCT-R	SRCT-S-R
成功率	0.624	0.618	0.610	0.603
跟踪精度	0.864	0.856	0.841	0.838

下载: 导出CSV

参考文献(30)

SMEULDERS A W M, CHU D M, CUCCHIARA R, et al. Visual tracking: An experimental survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1442–1468. doi: 10.1109/TPAMI.2013.230

WANG Naiyan, SHI Jianping, YEUNG D Y, et al. Understanding and diagnosing visual tracking systems[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3101–3109. doi: 1109/ICCV.2015.355.

RAWAT W and WANG Zenghui. Deep convolutional neural networks for image classification: A comprehensive review[J]. Neural Computation, 2017, 29(9): 2352–2449. doi: 10.1162/neco_a_00990

GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 580–587.

SHELHAMER E, LONG J, and DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640–651. doi: 10.1109/TPAMI.2016.2572683

WANG Naiyan and YEUNG D Y. Learning a deep compact image representation for visual tracking[C]. Proceedings of the 26th International Conference on Neural Information Processing Systems, South Lake Tahoe, Nevada, USA, 2013: 809–817.

HONG S, YOU T, KWAK S, et al. Online tracking by learning discriminative saliency map with convolutional neural network[C]. Proceedings of the 32nd International Conference on International Conference on Machine Learning, Lille, France, 2015: 597–606.

NAM H and HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4293–4302.

李寰宇, 毕笃彦, 杨源, 等. 基于深度特征表达与学习的视觉跟踪算法研究[J]. 电子与信息学报, 2015, 37(9): 2033–2039. doi: 10.11999/JEIT150031

LI Huanyu, BI Duyan, YANG Yuan, et al. Research on visual tracking algorithm based on deep feature expression and learning[J]. Journal of Electronics &Information Technology, 2015, 37(9): 2033–2039. doi: 10.11999/JEIT150031

侯志强, 戴铂, 胡丹, 等. 基于感知深度神经网络的视觉跟踪[J]. 电子与信息学报, 2016, 38(7): 1616–1623. doi: 10.11999/JEIT151449

HOU Zhiqiang, DAI Bo, HU Dan, et al. Robust visual tracking via perceptive deep neural network[J]. Journal of Electronics &Information Technology, 2016, 38(7): 1616–1623. doi: 10.11999/JEIT151449

HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]. Proceedings of the 12th European Conference on Computer Vision, Florence, Italy, 2012: 702–715. doi: 10.1007/978-3-642-33765-9_50.

DANELLJAN M, KHAN F S, FELSBERG M, et al. Adaptive color attributes for real-time visual tracking[C]. Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 1090–1097. doi: 10.1109/CVPR.2014.143.

HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583–596. doi: 10.1109/tpami.2014.2345390

DANELLJAN M, HÄGER G, KHAN F S, et al. Accurate scale estimation for robust visual tracking[C]. Proceedings of British Machine Vision Conference, Nottingham, UK, 2014: 65.1–65.11. doi: 10.5244/C.28.65.

DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4310–4318. doi: 10.1109/ICCV.2015.490.

DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: Learning continuous convolution operators for visual tracking[C]. Proceedings of the 14th European Conference, Amsterdam, the Netherlands, 2016: 472–488. doi: 10.1007/978-3-319-46454-1_29.

RUSSAKOVSKY O, DENG Jia, SU Hao, et al. Imagenet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211–252. doi: 10.1007/s11263-015-0816-y

KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105. doi: 10.1145/3065386.

SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. International Conference on Learning Representations, San Diego,USA,2015.

HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.

VEDALDI A and LENC K. Matconvnet: Convolutional neural networks for matlab[C]. Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 2015: 689–692. doi: 10.1145/2733373.2807412.

WU Yi, LIM J, and YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834–1848. doi: 10.1109/TPAMI.2014.2388226

DANELLJAN M, HÄGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision Workshop, Santiago, Chile, 2015: 58–66. doi: 10.1109/ICCVW.2015.84.

QI Yuankai, ZHANG Shengping, QIN Lei, et al. Hedged deep tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4303–4311. doi: 10.1109/CVPR.2016.466.

MA Chao, HUANG Jiabin, YANG Xiaokang, et al. Hierarchical convolutional features for visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3074–3082. doi: 10.1109/ICCV.2015.352.

ZHANG Jianming, MA Shugao, and SCLAROFF S. MEEM: Robust tracking via multiple experts using entropy minimization[C]. Proceedings of the 13th European Conference, Zurich, Switzerland, 2014: 188–203.

LIANG Pengpeng, BLASCH E, and LING Haibin. Encoding color information for visual tracking: Algorithms and benchmark[J]. IEEE Transactions on Image Processing, 2015, 24(12): 5630–5644. doi: 10.1109/TIP.2015.2482905

TAO Ran, GAVVES E, and SMEULDERS A W M. Siamese instance search for tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1420–1429. doi: 10.1109/CVPR.2016.158.

BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]. European Conference on Computer Vision, Amsterdam, the Netherlands, 2016: 850–865.

侯志强, 张浪, 余旺盛, 等. 基于快速傅里叶变换的局部分块视觉跟踪算法[J]. 电子与信息学报, 2015, 37(10): 2397–2404. doi: 10.11999/JEIT150183

HOU Zhiqiang, ZHANG Lang, YU Wangsheng, et al. Local patch tracking algorithm based on fast fourier transform[J]. Journal of Electronics &Information Technology, 2015, 37(10): 2397–2404. doi: 10.11999/JEIT150183

施引文献

资源附件(0)

访问统计