Pedestrian Tracking Algorithm Based on Convolutional Block Attention Module and Anchor-free Detection Network

ZHANG Hongying (张红颖), HE Pengyi (贺鹏艺)

Citation: ZHANG Hongying, HE Pengyi. Pedestrian Tracking Algorithm Based on Convolutional Block Attention Module and Anchor-free Detection Network[J]. Journal of Electronics & Information Technology, 2022, 44(9): 3299-3307. doi: 10.11999/JEIT210634

doi: 10.11999/JEIT210634
Funds: The National Key R&D Program of China (2018YFB1601200), Tianjin Graduate Scientific Research Innovation Project (2020YJSZXS14), The Special Plan for Sichuan Youth Scientific and Technological Innovation Research Team (2019JDTD0001)
Details
    About the authors:

    ZHANG Hongying: female, Ph.D., professor, master's supervisor; research interests: image engineering and computer vision

    HE Pengyi: male, master's student; research interests: image processing and computer vision

    Corresponding author:

    ZHANG Hongying carole_zhang0716@163.com

  • CLC number: TN911.73; TP391.4
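  • Abstract: To address identity switches and interrupted trajectories under severe occlusion in multi-object tracking, this paper proposes a pedestrian tracking algorithm based on the Convolutional Block Attention Module (CBAM) and an anchor-free detection network. First, building on the high-resolution feature extraction network HrnetV2, an attention mechanism is introduced into the stem stage to extract more expressive features and thereby strengthen the training of the re-identification branch. Second, to raise the running speed, the detection and re-identification branches share feature weights and run in parallel, and the number of convolution channels in the head network is reduced to lower the parameter and computation cost. Finally, the network is fully trained with suitable parameters and the algorithm is evaluated on several test sets. Experimental results show that, compared with FairMOT, the proposed algorithm improves accuracy on the 2DMOT15, MOT17, and MOT20 datasets by 1.1%, 1.1%, and 0.2%, respectively, and improves speed by 0.82, 0.88, and 0.41 fps, respectively. Compared with several other mainstream algorithms, it has the fewest identity switches. The proposed algorithm is better suited to heavily occluded scenes, and its real-time performance is also improved.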
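  • Figure 1  Visualization of FairMOT results on MOT16-03 and the corresponding center-point heatmaps

    Figure 2  Framework structure

    Figure 3  Structure of the proposed network

    Figure 4  CBAM insertion strategies

    Figure 5  Feature visualization

    Figure 6  Head structure

    Figure 7  Tracking results on ADL-Rundle-8

    Figure 8  Tracking results on PETS09-S2L1

    Figure 9  Tracking results on ETH-Pedcross2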

    Table 1  Selected weight parameters of the proposed network

    Layer        Weights
    conv1        64×3×3×3
    ca           ca.fc1(4×64×1×1)  ca.fc2(64×4×1×1)
    sa           1×2×7×7
    conv2        64×64×3×3
    Layer1       [(64×64×1×1), (64×64×3×3), (64×256×1×1)]
                 [(256×64×1×1), (64×64×3×3), (64×256×1×1)]×3
    ca1          ca1.fc1(16×256×1×1)  ca1.fc2(256×16×1×1)
    sa1          1×2×7×7
    ······
    last layer   64×270×3×3, bias=64
    hm           hm.0(64×64×3×3, bias=64)  hm.2(1×64×1×1, bias=1)
    wh           wh.0(64×64×3×3, bias=64)  wh.2(2×64×1×1, bias=2)
    id           id.0(64×64×3×3, bias=64)  id.2(128×64×1×1, bias=128)
    reg          reg.0(64×64×3×3, bias=64)  reg.2(2×64×1×1, bias=2)
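
    The ca and sa entries in Table 1 have the shapes one would expect from a CBAM block: a two-layer shared bottleneck (fc1, fc2) for channel attention and a single 7×7 kernel over two pooled maps for spatial attention. The following is a minimal pure-Python sketch of that computation as described in the CBAM paper [6], not the authors' implementation. The function names, the nested-list tensor layout (C × H × W), and the toy weights are all assumptions made for illustration.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(w, v):
    # Dense layer without bias: one output per weight row.
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def channel_attention(x, fc1, fc2):
    """x: C x H x W nested lists; fc1: (C/r) x C; fc2: C x (C/r)."""
    def mlp(v):
        hidden = [max(0.0, h) for h in matvec(fc1, v)]  # ReLU
        return matvec(fc2, hidden)
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    # The same MLP is applied to both pooled vectors; results are summed.
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(x, kernel):
    """kernel: 2 x k x k weights over the [channel-mean, channel-max] maps."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    maps = [
        [[sum(x[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)],
        [[max(x[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)],
    ]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding
                            acc += kernel[m][di][dj] * maps[m][ii][jj]
            out[i][j] = sigmoid(acc)
    return out

def cbam(x, fc1, fc2, kernel):
    # Channel attention first, then spatial attention on the rescaled features.
    ca = channel_attention(x, fc1, fc2)
    x = [[[ca[ch] * v for v in row] for row in x[ch]] for ch in range(len(x))]
    sa = spatial_attention(x, kernel)
    h, w = len(x[0]), len(x[0][0])
    return [[[sa[i][j] * x[ch][i][j] for j in range(w)] for i in range(h)]
            for ch in range(len(x))]

# Toy input (hypothetical): 4 channels, 3 x 3 spatial size, reduction ratio 2.
x_demo = [[[float(c + i + j) for j in range(3)] for i in range(3)] for c in range(4)]
zero_fc1 = [[0.0] * 4 for _ in range(2)]
zero_fc2 = [[0.0] * 2 for _ in range(4)]
zero_kernel = [[[0.0] * 7 for _ in range(7)] for _ in range(2)]
y_demo = cbam(x_demo, zero_fc1, zero_fc2, zero_kernel)
```

    With all-zero weights both attention maps equal sigmoid(0) = 0.5, so the output is the input scaled by 0.25, which is a convenient sanity check. In Table 1 the first block uses 64 channels with a 4-unit bottleneck, which corresponds to a reduction ratio of 16 in this sketch's terms.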

    Table 2  Detection performance under different CBAM insertion strategies (%)

    Backbone                     IDF1   IDP    IDR
    Hrnetv2-w18                  74.6   81.1   69.1
    HrnetV2-w18(stem)+CBAM(a)    75.3   88.6   64.0
    HrnetV2-w18(stem)+CBAM(b)    73.8   77.1   70.8
    HrnetV2-w18(stem)+CBAM(c)    76.6   78.8   74.4
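
    For readers unfamiliar with the three columns of Table 2: in the usual identity-metric definitions, IDF1 is the harmonic mean of identification precision (IDP) and identification recall (IDR). The short sketch below applies that standard relation to the baseline row. The function name is hypothetical, and because the table values are rounded, recomputed figures need not match the reported IDF1 column exactly.

```python
def idf1_from_idp_idr(idp, idr):
    """Harmonic mean of identification precision and recall (same units as inputs)."""
    if idp + idr == 0:
        return 0.0
    return 2.0 * idp * idr / (idp + idr)

# Baseline row of Table 2 (Hrnetv2-w18): IDP = 81.1, IDR = 69.1.
baseline_idf1 = idf1_from_idp_idr(81.1, 69.1)
print(f"{baseline_idf1:.1f}")  # prints 74.6
```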

    Table 3  Comparison of computation and parameter counts for different networks

    Network        Total flops (GMac)   Total params (MB)   Head flops (GMac)   Head params (MB)
    HrnetV2-w18    70.44                10.20               25.884              0.625
    Proposed       51.09                9.74                6.475               0.156
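
    The head parameter figures in Table 3 can be cross-checked against the hm, wh, id, and reg shapes in Table 1. The sketch below counts weights and biases for four branches, each a 3×3 convolution followed by a 1×1 convolution. The helper names are hypothetical, and the 256-channel head width used for the HrnetV2-w18 baseline is an assumption (it is not stated on this page); only the 64-channel configuration comes from Table 1.

```python
def conv_params(out_ch, in_ch, kernel, bias=True):
    """Parameter count of one convolution: weights plus optional bias."""
    return out_ch * in_ch * kernel * kernel + (out_ch if bias else 0)

def head_params(in_ch, mid_ch, branch_outputs):
    """Each branch: 3x3 conv (in_ch -> mid_ch), then 1x1 conv (mid_ch -> out)."""
    total = 0
    for out_ch in branch_outputs.values():
        total += conv_params(mid_ch, in_ch, 3)
        total += conv_params(out_ch, mid_ch, 1)
    return total

# Output channels per branch, read from Table 1 (hm.2, wh.2, id.2, reg.2).
BRANCH_OUTPUTS = {"hm": 1, "wh": 2, "id": 128, "reg": 2}

proposed_head = head_params(in_ch=64, mid_ch=64, branch_outputs=BRANCH_OUTPUTS)
# ASSUMPTION: the baseline head uses 256 intermediate channels.
baseline_head = head_params(in_ch=64, mid_ch=256, branch_outputs=BRANCH_OUTPUTS)

print(f"proposed head: {proposed_head / 1e6:.3f} M parameters")  # 0.156
print(f"baseline head: {baseline_head / 1e6:.3f} M parameters")  # 0.625
```

    Under these assumptions the counts come to 156,357 and 625,029 parameters, which agree with the 0.156 and 0.625 entries in the head-parameter column when read as millions of parameters.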

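    Table 4  Evaluation metrics and their meanings

    Metric   Meaning
    FP↓      Rate at which samples are wrongly taken as positives, i.e., the false-detection rate
    FN↓      Rate at which samples are wrongly taken as negatives, i.e., the miss rate
    IDS↓     Number of ID switches, i.e., how many times a target's identity changes
    MOTA↑    Tracking accuracy, computed from FP, FN, IDS, and related measures
    MOTP↑    Localization precision: overlap between detection responses and ground-truth pedestrian boxes
    FPS↑     Tracking speed: frames processed per second, used to measure real-time performance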
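
    Table 4 states that MOTA is derived from FP, FN, and IDS. In the commonly used CLEAR MOT definition this is MOTA = 1 − (FN + FP + IDS) / GT, where all terms are counts and GT is the number of ground-truth boxes. The sketch below applies that formula to the MOT17 rows of Table 5. The function name is hypothetical, and the ground-truth count is an assumption (a commonly quoted size of the MOT17 training annotations), not a figure given on this page.

```python
def mota(fn, fp, ids, num_gt):
    """CLEAR MOT accuracy: 1 - (FN + FP + IDS) / GT, with all inputs as counts."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fn + fp + ids) / num_gt

# ASSUMPTION: ground-truth box count for the evaluated MOT17 training data.
ASSUMED_MOT17_TRAIN_GT = 336_891

# IDS / FN / FP values as read from the MOT17 rows of Table 5.
fairmot_mota = mota(fn=55092, fp=26442, ids=2238, num_gt=ASSUMED_MOT17_TRAIN_GT)
proposed_mota = mota(fn=69141, fp=9996, ids=879, num_gt=ASSUMED_MOT17_TRAIN_GT)

print(f"FairMOT  MOTA = {100 * fairmot_mota:.1f}")
print(f"Proposed MOTA = {100 * proposed_mota:.1f}")
```

    Under the assumed ground-truth count, the two values come out at roughly 75.1 and 76.2, in line with the MOTA column reported for MOT17. The example also shows the trade-off visible in that table: the proposed method has more misses (FN) but far fewer false positives and ID switches.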

    Table 5  Test results of the proposed algorithm and FairMOT

    Dataset    Algorithm   MOTA↑   MOTP↑   IDS↓   FN↓       FP↓     fps↑
    2DMOT15    FairMOT     71.7    78.6    136    6100      1849    18.31
               Proposed    72.8    78.6    119    4619      3018    19.13
    MOT20      FairMOT     12.8    77.8    4422   1098261   62434   14.69
               Proposed    13.0    77.2    4331   1105907   53288   15.10
    MOT17      FairMOT     75.1    81.1    2238   55092     26442   16.23
               Proposed    76.2    84.5    879    69141     9996    17.11

    Table 6  Comparison of test results between the proposed algorithm and several other models and algorithms

    Dataset       Tracker                 MOTA↑   MOTP↑   IDS↓   FN↓     FP↓     Time elapsed (s)
    MOT17_train   Tube_TK[15]             79.5    88.4    3570   56850   8601    5316.88
                  CSTrack[16]             75.9    81.4    1962   58947   20178   1009.53
                  TransCenter[17]         70.1    84.8    2017   94979   3802    15948.00
                  Fair(DLA-34)[5]         76.3    80.6    1620   42924   6366    971.25
                  Fair(HrnetV2-w18)[5]    75.1    81.1    2238   55092   26442   982.79
                  Proposed                76.2    84.5    879    69141   9996    932.24
  • [1] CAO Ziqiang, SAI Bin, and LU Xin. Review of pedestrian tracking: Algorithms and applications[J]. Acta Physica Sinica, 2020, 69(8): 084203. doi: 10.7498/aps.69.20191721
    [2] LAW H and DENG Jia. CornerNet: Detecting objects as paired keypoints[C]. The 15th European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 734–750.
    [3] ZHOU Xingyi, WANG Dequan, and KRÄHENBÜHL P. Objects as points[J]. arXiv preprint arXiv: 1904.07850, 2019.
    [4] WANG Zhongdao, ZHENG Liang, LIU Yixuan, et al. Towards real-time multi-object tracking[J]. arXiv preprint arXiv: 1909.12605, 2020.
    [5] ZHANG Yifu, WANG Chunyu, WANG Xinggang, et al. A simple baseline for multi-object tracking[J]. arXiv preprint arXiv: 2004.01888, 2020.
    [6] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. The 15th European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 3–19.
    [7] SUN Ke, ZHAO Yang, JIANG Borui, et al. High-resolution representations for labeling pixels and regions[J]. arXiv preprint arXiv: 1904.04514, 2019.
    [8] LI Zeming, PENG Chao, YU Gang, et al. Light-head R-CNN: In defense of two-stage object detector[J]. arXiv preprint arXiv: 1711.07264, 2017.
    [9] XIAO Tong, LI Shuang, WANG Bochao, et al. Joint detection and identification feature learning for person search[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 3415–3424.
    [10] ZHENG Liang, ZHANG Hengheng, SUN Shaoyan, et al. Person re-identification in the wild[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1367–1376.
    [11] MILAN A, LEAL-TAIXÉ L, REID I, et al. MOT16: A benchmark for multi-object tracking[J]. arXiv preprint arXiv: 1603.00831, 2016.
    [12] LEAL-TAIXÉ L, MILAN A, REID I, et al. MOTChallenge 2015: Towards a benchmark for multi-target tracking[J]. arXiv preprint arXiv: 1504.01942, 2015.
    [13] WOJKE N, BEWLEY A, and PAULUS D. Simple online and realtime tracking with a deep association metric[C]. 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017: 3645–3649.
    [14] DENDORFER P, REZATOFIGHI H, MILAN A, et al. MOT20: A benchmark for multi object tracking in crowded scenes[J]. arXiv preprint arXiv: 2003.09003, 2020.
    [15] PANG Bo, LI Yizhuo, ZHANG Yifan, et al. TubeTK: Adopting tubes to track multi-object in a one-step training model[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020.
    [16] LIANG Chao, ZHANG Zhipeng, LU Yi, et al. Rethinking the competition between detection and ReID in multi-object tracking[J]. arXiv preprint arXiv: 2010.12138, 2020.
    [17] XU Yihong, BAN Yutong, DELORME G, et al. TransCenter: Transformers with dense queries for multiple-object tracking[J]. arXiv preprint arXiv: 2103.15145, 2021.
Publication history
  • Received:  2021-06-28
  • Revised:  2021-09-14
  • Published online:  2021-09-28
  • Issue date:  2022-09-19
