Spatial and Channel Attention Mechanism Method for Object Tracking

LIU Jiamin; XIE Wenjie; HUANG Hong; TANG Yiming

doi:10.11999/JEIT200687

cstr: 32379.14.JEIT200687

Key Laboratory of Optoelectronic Technique System of the Ministry of Education, Chongqing University, Chongqing 400044, China

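Funds: The National Natural Science Foundation of China (41371338), the Chongqing Basic and Frontier Research Program (cstc2018jcyjAX0093), the Chongqing Returned Overseas Students' Entrepreneurship and Innovation Support Program (cx2019144), and the Chongqing Graduate Research and Innovation Project (CYB19039, CYB18048)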

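Author biographies:
LIU Jiamin: male, born in 1973, associate professor. Research interests: image processing and pattern recognition.

XIE Wenjie: male, born in 1995, master's student. Research interests: image processing and video tracking.

HUANG Hong: male, born in 1980, professor. Research interests: manifold learning, pattern recognition, and intelligent processing of remote sensing imagery.

TANG Yiming: male, born in 1993, Ph.D. student. Research interests: pattern recognition, image processing, deep learning, and visual tracking.

Corresponding author:
LIU Jiamin, liujm@cqu.edu.cn

CLC number: TN911.73; TP391.4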
Publication history
- Received: 2020-08-05
- Revised: 2021-03-20
- Published online: 2021-04-16
- Issue date: 2021-09-16

Abstract: Object tracking is an important research field in computer vision. However, most tracking algorithms cannot effectively learn features suited to the tracking scene, which limits their performance. To address this problem, this paper proposes an object tracking algorithm based on CNN Spatial and Channel Attention Mechanisms (CNNSCAM). The method consists of an offline-trained appearance model and an adaptively updated classifier layer. During offline training, spatial and channel attention modules are introduced to recalibrate the original features: spatial and channel weights are obtained separately, normalized, and then applied as weights to the corresponding original features, thereby selecting the key features. During online tracking, the network parameters of the fully connected layers and the classifier layer are trained first, together with the bounding-box regression. Samples are then collected according to preset thresholds, and in each iteration the negative samples with the highest classifier scores are selected to fine-tune the network parameters. Experimental results on the OTB2015 dataset show that, compared with other mainstream tracking algorithms, the proposed method achieves better tracking accuracy, with an overlap success rate of 67.6% and a distance-error success rate (precision) of 91.2%.

Keywords:
- Object tracking
- Deep learning
- Spatial attention
- Channel attention
- Online learning
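
The two mechanisms named in the abstract, feature recalibration by attention weights and fine-tuning on the highest-scoring negatives, can be illustrated with a minimal Python sketch. This is not the authors' network. It assumes parameter-free average pooling, a sigmoid as the weight normalization, and channel attention applied before spatial attention. The function names (`channel_attention`, `spatial_attention`, `select_hard_negatives`) and all numbers are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Squash a real value into (0, 1); used here as the weight normalization."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a weight derived from its global average.

    feat is a nested list indexed as feat[channel][row][col].
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))  # one weight per channel
        out.append([[v * weight for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each position by a weight derived from its cross-channel mean."""
    n_ch, n_rows, n_cols = len(feat), len(feat[0]), len(feat[0][0])
    weights = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(n_cols)]
        for i in range(n_rows)
    ]  # one weight per (row, col) position
    return [
        [[feat[c][i][j] * weights[i][j] for j in range(n_cols)]
         for i in range(n_rows)]
        for c in range(n_ch)
    ]


def select_hard_negatives(scores, k):
    """Return indices of the k negatives with the highest classifier scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy feature map: 2 channels on a 2x2 grid (made-up numbers).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
recalibrated = spatial_attention(channel_attention(features))

# Made-up target-class scores for five negative samples.
hard = select_hard_negatives([0.10, 0.80, 0.30, 0.95, 0.05], k=2)
print(hard)  # [3, 1]
```

In a trained network the weights would come from learned layers rather than from raw averages. The sketch only shows where the normalized weights enter: they multiply the original features element-wise, so the output keeps the input's shape.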

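Figure 1. Algorithm model

Figure 2. Spatial attention mechanism

Figure 3. Channel attention mechanism

Figure 4. Precision and overlap success rate on the OTB2015 dataset with CAM and SAM embedded in the network

Figure 5. Overall precision and success rate of the algorithms on the OTB2015 dataset

Figure 6. Selected tracking results on several sequences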

Table 1. Overlap success rate of each algorithm under the 11 tracking scenarios of the OTB2015 dataset

Tracker	IV	OPR	SV	OCC	MD	FM	IPR	OV	DEF	BC	LR
CNNSCAM	0.680	0.657	0.663	0.644	0.671	0.658	0.660	0.651	0.631	0.675	0.622
DaSiamRPN	0.662	0.644	0.641	0.617	0.625	0.621	0.652	0.537	0.652	0.642	0.588
TADT	0.681	0.646	0.655	0.643	0.671	0.657	0.621	0.625	0.607	0.622	0.634
MCPF	0.629	0.619	0.604	0.620	0.599	0.597	0.620	0.553	0.569	0.601	0.581
CNN-SVM	0.537	0.548	0.489	0.514	0.578	0.546	0.548	0.488	0.547	0.548	0.403
BACF	0.547	0.506	0.532	0.475	0.541	0.511	0.497	0.483	0.499	0.552	0.502

Table 2. Distance-error success rate of each algorithm under the 11 tracking scenarios of the OTB2015 dataset

Tracker	IV	OPR	SV	OCC	MD	FM	IPR	OV	DEF	BC	LR
CNNSCAM	0.905	0.901	0.910	0.862	0.862	0.869	0.910	0.864	0.880	0.927	0.889
DaSiamRPN	0.878	0.878	0.858	0.818	0.820	0.819	0.889	0.720	0.887	0.856	0.814
TADT	0.865	0.872	0.863	0.842	0.833	0.834	0.832	0.816	0.822	0.805	0.881
MCPF	0.882	0.816	0.862	0.862	0.840	0.845	0.888	0.764	0.815	0.823	0.911
CNN-SVM	0.792	0.798	0.785	0.727	0.751	0.747	0.813	0.650	0.791	0.776	0.811
BACF	0.665	0.650	0.673	0.590	0.649	0.627	0.645	0.613	0.655	0.700	0.665
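
The two metrics reported in these tables can be illustrated with a short Python sketch. It assumes the usual tracking-benchmark conventions: boxes are given as (x, y, width, height), the distance-error success rate (precision) is the fraction of frames whose center error is at most 20 pixels, and the overlap success rate is computed from intersection over union (IoU), either at a single threshold or averaged over thresholds in [0, 1]. The exact thresholds and averaging used for the reported numbers are not stated on this page, so the defaults, function names, and boxes below are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    return math.hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                      (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def precision(pred, truth, threshold=20.0):
    """Fraction of frames whose center error is within the pixel threshold."""
    hits = sum(center_error(p, t) <= threshold for p, t in zip(pred, truth))
    return hits / len(truth)


def success_rate(pred, truth, threshold=0.5):
    """Fraction of frames whose overlap (IoU) exceeds the threshold."""
    hits = sum(iou(p, t) > threshold for p, t in zip(pred, truth))
    return hits / len(truth)


def success_auc(pred, truth, steps=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success_rate(pred, truth, t) for t in thresholds) / steps


# Made-up boxes for three frames: exact match, small shift, complete miss.
truth = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
pred = [(0, 0, 10, 10), (5, 0, 10, 10), (100, 100, 10, 10)]
print(precision(pred, truth))     # 2 of 3 frames fall within 20 px
print(success_rate(pred, truth))  # 1 of 3 frames has IoU above 0.5
```

The shifted box in the second frame shows why the two metrics can disagree: its center is only 5 pixels off, so it counts toward precision, but its IoU of 1/3 is too low to count as a success at the 0.5 threshold.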

Table 3. Distance-error success rate (Prec) on the OTB2015 dataset for different values of A with v fixed at 1.00

A	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8	0.9	1.00
Prec	0.686	0.770	0.834	0.850	0.877	0.912	0.876	0.886	0.875	0.858

Table 4. Overlap success rate (Suc) on the OTB2015 dataset for different values of v with A fixed at 0.6

v	1.00	1.01	1.02	1.03	1.04	1.05	1.06	1.07	1.08	1.09	1.10
Suc	0.599	0.621	0.641	0.651	0.661	0.676	0.671	0.666	0.657	0.643	0.622
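
Read together, the two parameter tables describe a two-stage sweep: A is varied with v held at 1.00, then v is varied with A held at 0.6. The Python sketch below only reproduces that selection from the reported rows. The roles of A and v are not defined on this page, and the helper name `best_setting` is hypothetical.

```python
# Scores copied from the Prec row (v fixed at 1.00) and the Suc row (A fixed at 0.6).
prec_by_a = {
    0.1: 0.686, 0.2: 0.770, 0.3: 0.834, 0.4: 0.850, 0.5: 0.877,
    0.6: 0.912, 0.7: 0.876, 0.8: 0.886, 0.9: 0.875, 1.0: 0.858,
}
suc_by_v = {
    1.00: 0.599, 1.01: 0.621, 1.02: 0.641, 1.03: 0.651, 1.04: 0.661,
    1.05: 0.676, 1.06: 0.671, 1.07: 0.666, 1.08: 0.657, 1.09: 0.643,
    1.10: 0.622,
}


def best_setting(results):
    """Return the parameter value whose reported score is highest."""
    return max(results, key=results.get)


best_a = best_setting(prec_by_a)  # stage 1: sweep A with v held fixed
best_v = best_setting(suc_by_v)   # stage 2: sweep v with A held fixed
print(best_a, best_v)  # 0.6 1.05
```

A sweep of this kind tunes one parameter at a time, so it finds the best value along each axis but does not test every (A, v) pair.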
