Crowded Pedestrian Detection Method Combining Anchor Free and Anchor Base Algorithm

XIE Minghong, KANG Bin, LI Huafeng, ZHANG Yafei

Citation: XIE Minghong, KANG Bin, LI Huafeng, ZHANG Yafei. Crowded Pedestrian Detection Method Combining Anchor Free and Anchor Base Algorithm[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT220444

doi: 10.11999/JEIT220444
Details
    About the authors:

    XIE Minghong: Male, Ph.D., senior engineer; research interests include person re-identification and image fusion

    KANG Bin: Male, M.S. candidate; research interests include image processing and object detection

    LI Huafeng: Male, Ph.D., professor; research interests include computer vision and image processing

    ZHANG Yafei: Female, Ph.D., associate professor; research interests include image processing and pattern recognition

    Corresponding author: ZHANG Yafei, zyfeimail@163.com

  • CLC number: TN911.73; TP391.41

  • Abstract: Thanks to its relatively high accuracy, the Anchor base algorithm has become a research focus for pedestrian detection in crowded scenes. However, it requires hand-designed anchor boxes, which limits its generality, and applying a single Non-Maximum Suppression (NMS) threshold to crowd regions of different densities causes a certain degree of missed and false detections. To address these issues, this paper proposes a dual-head detection algorithm that combines an Anchor free detector with an Anchor base detector. Specifically, the Anchor free detector first performs coarse detection on the image; the coarse detections are automatically clustered into anchor boxes, which are fed back to the Region Proposal Network (RPN) module to replace the hand-designed anchors of the RPN stage. Meanwhile, statistics over the coarse detections provide the crowd density of each region. A pedestrian head-full body mutually supervised detection framework is designed, in which the head detections and the full-body detections supervise each other, effectively reducing the number of suppressed and missed target instances. A new NMS algorithm is also proposed that adaptively selects a suitable suppression threshold for crowd regions of different densities, minimizing the false detections introduced by NMS. The proposed detector is evaluated on the CrowdHuman and CityPersons datasets and achieves performance comparable to state-of-the-art pedestrian detection methods.
  • Figure 1  Overall architecture of the proposed model

    Figure 2  The pedestrian head-full body mutually supervised detection framework

    Figure 3  Comparison of detection results between the proposed method and Faster R-CNN in crowded scenes
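The first stage described in the abstract clusters the coarse anchor-free detections into anchor shapes for the RPN. A minimal sketch of this idea, assuming a plain k-means over box widths and heights — the paper's exact clustering procedure may differ, and all names here are illustrative:

```python
import numpy as np

def cluster_anchors(coarse_boxes, k=3, iters=50, seed=0):
    """Cluster the (width, height) of coarse detections into k anchor
    shapes with plain k-means; the resulting shapes would replace the
    hand-designed anchors fed to the RPN."""
    wh = coarse_boxes[:, 2:4] - coarse_boxes[:, 0:2]  # (x2,y2) - (x1,y1)
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # assign each box to its nearest center in width-height space
        dists = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = wh[labels == j].mean(axis=0)
    return centers  # k anchor (w, h) pairs

# Six coarse full-body detections of three rough sizes
coarse = np.array([[0, 0, 20, 50], [5, 5, 25, 60], [0, 0, 40, 100],
                   [10, 10, 48, 108], [2, 2, 12, 28], [0, 0, 11, 30]], float)
anchors = cluster_anchors(coarse, k=3)
```

Clustering in width-height space keeps the anchors matched to the aspect ratios actually present in the scene, which is the point of replacing hand-designed anchors.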

    Algorithm 1  The Stripping-NMS algorithm
     Input:
       prediction scores S = {s1, s2, ···, sn}; full-body predicted boxes Bf = {bf1, bf2, ···, bfm};
       head predicted boxes Bh = {bh1, bh2, ···, bhn}; fixed boxes Ba = {ba1, ba2, ···, bai};
       density regions A = {A0, A1, A2, A3, A4}; NMS thresholds ND = [0.5, 0.6, 0.65, 0.7, 0.8];
       B = {b1, b2, ···, bi}; bm denotes the box with the maximum score, M the set of
       maximum-score predicted boxes, and R the final set of predicted boxes.
     begin
       R ← Ba
       while B ≠ ∅ do
         m ← argmax S
         M ← bm
         R ← R ∪ M;  B ← B − M
         for bi in B do
           if IoU(bai, bi) ≥ 0.9 then
             B ← B − bi;  S ← S − si
           end
         end
         for Ai, NDi in A, ND do
           if bi ∈ Ai then
             if IoU(bm, bfi/bhi) ≥ NDi then
               B ← B − bfi(bhi);  si ← si(1 − IoU(M, bi))
             else
               B ← B − bfi(bhi);  S ← S − si
             end
           end
         end
       end
       return R, S
     end
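The density-adaptive thresholding at the core of Algorithm 1 can be sketched as follows. This is a simplified rendering, not a line-for-line implementation: the fixed-box handling, the head/full-body pairing, and the score-decay branch are omitted, and the per-box density labels are assumed to be given.

```python
import numpy as np

def iou(a, b):
    # Intersection-over-union of two boxes in (x1, y1, x2, y2) form
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def density_adaptive_nms(boxes, scores, densities,
                         thresholds=(0.5, 0.6, 0.65, 0.7, 0.8)):
    """Greedy NMS whose suppression threshold depends on the local
    crowd-density level of each box (0 = sparsest .. 4 = densest)."""
    order = list(np.argsort(scores)[::-1])  # indices by descending score
    keep, suppressed = [], set()
    for i in order:
        if i in suppressed:
            continue
        keep.append(int(i))
        nd = thresholds[densities[i]]  # denser region -> higher threshold
        for j in order:
            if j != i and j not in suppressed and iou(boxes[i], boxes[j]) >= nd:
                suppressed.add(j)
    return keep

boxes = np.array([[0, 0, 10, 30], [3, 0, 13, 30], [50, 0, 60, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
keep_sparse = density_adaptive_nms(boxes, scores, [0, 0, 0])  # [0, 2]
keep_dense = density_adaptive_nms(boxes, scores, [4, 4, 4])   # [0, 1, 2]
```

The two overlapping boxes (IoU ≈ 0.54) are merged in the sparse setting but both survive in the dense setting, which is exactly the behavior that reduces missed detections in crowds.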

    Table 1  CityPersons and CrowdHuman training sets

    Dataset       Images   Persons   Persons per image   Valid pedestrians
    CityPersons    2975     19238          6.47               19238
    CrowdHuman    15000    339565         22.64              339565

    Table 2  Pedestrian occlusion levels of the CityPersons and CrowdHuman datasets[16]

    Dataset       IoU>0.5   IoU>0.6   IoU>0.7   IoU>0.8   IoU>0.9
    CityPersons     0.32      0.17      0.08      0.02      0.00
    CrowdHuman      2.40      1.01      0.33      0.07      0.01
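The occlusion statistic in Table 2 counts, per image, the ground-truth box pairs whose pairwise IoU exceeds a threshold. A sketch of how such a statistic could be computed (function and variable names are ours):

```python
def iou(a, b):
    # Intersection-over-union of two boxes in (x1, y1, x2, y2) form
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def overlap_pairs_per_image(images_boxes, thr):
    """Average number, per image, of ground-truth box pairs whose
    pairwise IoU exceeds thr."""
    total = 0
    for boxes in images_boxes:
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if iou(boxes[i], boxes[j]) > thr:
                    total += 1
    return total / max(len(images_boxes), 1)

# Two images: one with a fully overlapping pair, one with a single person
imgs = [[[0, 0, 10, 10], [0, 0, 10, 10], [20, 20, 30, 30]],
        [[0, 0, 10, 10]]]
avg = overlap_pairs_per_image(imgs, 0.5)  # 1 overlapping pair / 2 images = 0.5
```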

    Table 3  Ablation results on the CityPersons dataset (%)

    Method     Cross network   Mutual-supervision detector   Stripping-NMS   Reasonable(MR–2)   Heavy(MR–2)
    baseline                                                                      11.74            45.24
    Ours            √                                                             11.48            43.64
    Ours            √                      √                                      10.76            42.44
    Ours            √                      √                       √              10.48            40.81

    Table 4  Ablation results on the CrowdHuman dataset (%)

    Method     New first stage   Mutual-supervision detector   Stripping-NMS    MR–2     AP     Recall
    baseline                                                                   42.73   85.88   80.74
    Ours              √                                                        42.38   88.63   83.04
    Ours              √                    √                                   41.88   89.44   83.41
    Ours              √                    √                        √          41.04   91.25   84.26

    Table 5  Performance comparison of different methods on the CityPersons dataset (%)

    Method             Backbone    Input scale   Reasonable(MR–2)   Heavy(MR–2)
    OR-CNN[20]         VGG-16          –             12.80             55.70
    MGAN[21]           VGG-16          –             11.50             51.70
    Adaptive-NMS[9]    VGG-16          –             12.90             56.40
    R2NMS[10]          VGG-16          –             11.10             53.30
    EGCL[22]           VGG-16          –             11.50             50.00
    RepLoss[6]         ResNet-50       –             13.20             56.90
    CrowdDet[23]       ResNet-50       –             12.10             40.00
    Ref. [24]          ResNet-50       –             11.60             47.30
    RepLoss[6]         ResNet-50      1.3×           11.60             55.30
    CrowdDet[23]       ResNet-50      1.3×           10.70             38.00
    NOH-NMS[25]        ResNet-50      1.3×           10.80               –
    baseline           ResNet-50      1.3×           11.74             45.24
    Ours               ResNet-50      1.3×           10.48             40.81

    Table 6  Performance comparison of different methods on the CrowdHuman dataset (%)

    Method              Backbone     MR–2     AP     Recall
    Faster R-CNN[11]    VGG-16      51.21   85.09   77.24
    Adaptive-NMS[9]     VGG-16      49.73   84.71   91.27
    JointDet[7]         ResNet-50   46.50     –       –
    R2NMS[10]           ResNet-50   43.35   89.29   93.33
    CrowdDet[23]        ResNet-50   41.40   90.70   83.70
    DETR[26]            ResNet-50   45.57   89.54   94.00
    NOH-NMS[25]         ResNet-50   43.90   89.00   92.90
    Ref. [8]            ResNet-50   50.16   87.31     –
    V2F-Net[27]         ResNet-50   42.28   91.03   84.20
    baseline            ResNet-50   42.73   85.88   80.74
    Ours                ResNet-50   41.04   91.25   84.26
    Gain                            –1.69   +5.37   +3.52

    Table 7  Performance when combined with DetNet and Cascade R-CNN (%)

    Method                   Backbone     MR–2     AP     Recall
    Ours                     DetNet-59   39.94   91.23   93.05
    Cascade R-CNN + Ours     DetNet-59   38.02   91.75   93.14

    Table 8  Generalization performance of the proposed method (%)

    Method     Training set                   Test set      Reasonable(MR–2)   Heavy(MR–2)
    Ref. [8]   CrowdHuman                     CityPersons       10.1              50.2
    Ours       CrowdHuman(h&f)                CityPersons         –               40.23
    Ours       CrowdHuman+CityPersons(v&f)    CityPersons        8.84             39.27
  • [1] YE Mang, SHEN Jianbing, LIN Gaojie, et al. Deep learning for person Re-identification: A survey and outlook[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 2872–2893. doi: 10.1109/TPAMI.2021.3054775
    [2] MARVASTI-ZADEH S M, CHENG Li, GHANEI-YAKHDAN H, et al. Deep learning for visual tracking: A comprehensive survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(5): 3943–3968. doi: 10.1109/TITS.2020.3046478
    [3] BEN Xianye, XU Sen, and WANG Kejun. Review on pedestrian gait feature expression and recognition[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(1): 71–81. doi: 10.3969/j.issn.1003-6059.2012.01.010
    [4] ZOU Yiqun, XIAO Zhihong, TANG Xiafei, et al. Anchor-free scale adaptive pedestrian detection algorithm[J]. Control and Decision, 2021, 36(2): 295–302. doi: 10.13195/j.kzyjc.2020.0124
    [5] ZHOU Chunluan and YUAN Junsong. Bi-box regression for pedestrian detection and occlusion estimation[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 135–151.
    [6] WANG Xinlong, XIAO Tete, JIANG Yuning, et al. Repulsion loss: Detecting pedestrians in a crowd[C]. The 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7774–7783.
    [7] CHI Cheng, ZHANG Shifeng, XING Junliang, et al. Relational learning for joint head and human detection[C]. The Thirty-Fourth AAAI Conference on Artificial Intelligence, New York, USA, 2020: 10647–10654.
    [8] CHEN Yong, XIE Wenyang, LIU Huanlin, et al. Multi-feature fusion pedestrian detection combining head and overall information[J]. Journal of Electronics & Information Technology, 2022, 44(4): 1453–1460. doi: 10.11999/JEIT210268
    [9] LIU Songtao, HUANG Di, and WANG Yunhong. Adaptive NMS: Refining pedestrian detection in a crowd[C]. The 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 6452–6461.
    [10] HUANG Xin, GE Zheng, JIE Zequn, et al. NMS by representative region: Towards crowded pedestrian detection by proposal pairing[C]. The 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 10747–10756.
    [11] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. doi: 10.1109/TPAMI.2016.2577031
    [12] ZHOU Xingyi, WANG Dequan, and KRÄHENBÜHL P. Objects as points[EB/OL]. https://arxiv.org/abs/1904.07850, 2019.
    [13] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318–327. doi: 10.1109/TPAMI.2018.2858826
    [14] ZHENG Zhaohui, WANG Ping, REN Dongwei, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574–8586. doi: 10.1109/TCYB.2021.3095305
    [15] BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS: Improving object detection with one line of code[C]. The 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 5562–5570.
    [16] SHAO Shuai, ZHAO Zijian, LI Boxun, et al. CrowdHuman: A benchmark for detecting human in a crowd[EB/OL]. https://arxiv.org/abs/1805.00123, 2018.
    [17] ZHANG Shanshan, BENENSON R, and SCHIELE B. CityPersons: A diverse dataset for pedestrian detection[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4457–4465.
    [18] SHAO Xiaotao, WANG Qing, YANG Wei, et al. Multi-scale feature pyramid network: A heavily occluded pedestrian detection network based on ResNet[J]. Sensors, 2021, 21(5): 1820. doi: 10.3390/s21051820
    [19] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. The 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944.
    [20] ZHANG Shifeng, WEN Longyin, BIAN Xiaobian, et al. Occlusion-aware R-CNN: Detecting pedestrians in a crowd[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 657–674.
    [21] PANG Yanwei, XIE Jin, KHAN M H, et al. Mask-guided attention network for occluded pedestrian detection[C]. The 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 4966–4974.
    [22] LIN Zebin, PEI Wenjie, CHEN Fanglin, et al. Pedestrian detection by exemplar-guided contrastive learning[J]. IEEE Transactions on Image Processing, 2022: 1. doi: 10.1109/TIP.2022.3189803
    [23] CHU Xuangeng, ZHENG Anlin, ZHANG Xiangyu, et al. Detection in crowded scenes: One proposal, multiple predictions[C]. The 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 12211–12220.
    [24] CHEN Yong, LIU Xi, and LIU Huanlin. Occluded pedestrian detection based on joint attention mechanism of channel-wise and spatial information[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1486–1493. doi: 10.11999/JEIT190606
    [25] ZHOU Penghao, ZHOU Chong, PENG Pai, et al. NOH-NMS: Improving pedestrian detection by nearby objects hallucination[C]. The 28th ACM International Conference on Multimedia, Seattle, USA, 2020: 1967–1975.
    [26] LIN M, LI Chuming, BU Xingyuan, et al. DETR for crowd pedestrian detection[EB/OL]. https://arxiv.org/abs/2012.06785, 2020.
    [27] SHANG Mingyang, XIANG Dawei, WANG Zhicheng, et al. V2F-Net: Explicit decomposition of occluded pedestrian detection[EB/OL]. https://arxiv.org/abs/2104.03106, 2021.
    [28] LI Zeming, PENG Chao, YU Gang, et al. DetNet: A backbone network for object detection[EB/OL]. https://arxiv.org/abs/1804.06215, 2018.
    [29] CAI Zhaowei and VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6154–6162.
Figures(3) / Tables(9)
Publication history
  • Received: 2022-04-14
  • Accepted: 2022-09-06
  • Revised: 2022-08-31
  • Published online: 2022-09-08
