Crowded Pedestrian Detection Method Combining Anchor Free and Anchor Base Algorithm

XIE Minghong, KANG Bin, LI Huafeng, ZHANG Yafei

Citation: XIE Minghong, KANG Bin, LI Huafeng, ZHANG Yafei. Crowded Pedestrian Detection Method Combining Anchor Free and Anchor Base Algorithm[J]. Journal of Electronics & Information Technology, 2023, 45(5): 1833-1841. doi: 10.11999/JEIT220444


doi: 10.11999/JEIT220444
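Details
    Author information:

    XIE Minghong: male, Ph.D., senior engineer; research interests include person re-identification and image fusion

    KANG Bin: male, master's student; research interests include image processing and object detection

    LI Huafeng: male, Ph.D., professor; research interests include computer vision and image processing

    ZHANG Yafei: female, Ph.D., associate professor; research interests include image processing and pattern recognition

    Corresponding author:

    ZHANG Yafei  zyfeimail@163.com

  • CLC number: TN911.73; TP391.41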

Crowded Pedestrian Detection Method Combining Anchor Free and Anchor Base Algorithm

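  • Abstract: Owing to their relatively high accuracy, Anchor base algorithms have become a research focus for pedestrian detection in crowded scenes. However, these algorithms require manually designed anchor boxes, which limits their generality. Moreover, applying a single Non-Maximum Suppression (NMS) screening threshold to crowd regions of different densities leads to a certain degree of missed and false detections. To address these problems, this paper proposes a dual-head detection algorithm that combines an Anchor free detector with an Anchor base detector. Specifically, the Anchor free detector first performs a coarse detection on the image; the coarse detection results are automatically clustered to generate anchor boxes, which are fed back to the Region Proposal Network (RPN) module in place of the manually designed anchors of the RPN stage. Statistics of the coarse detection results also yield the crowd density of different regions. A pedestrian head and full-body mutual-supervision detection framework is designed, in which head detections and full-body detections supervise each other, effectively reducing the number of suppressed and missed target instances. A new NMS algorithm is also proposed, which adaptively selects a suitable screening threshold for crowd regions of different densities, thereby minimizing the false detections caused by NMS. Experiments on the CrowdHuman and CityPersons datasets show that the proposed detector achieves performance comparable to state-of-the-art pedestrian detection methods.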
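The anchor-generation step described in the abstract, in which coarse detections are clustered into anchor boxes for the RPN, can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used; the k-means over box widths and heights with a 1 − IoU distance shown here (the approach popularised by YOLOv2) is an assumption, and `wh_iou` and `cluster_anchors` are hypothetical names.

```python
def wh_iou(a, b):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def cluster_anchors(sizes, k, iters=50):
    """Cluster positive (width, height) pairs into k anchor shapes.

    Assumed technique: k-means where the nearest centre is the one with the
    highest IoU (equivalently, the smallest 1 - IoU). Requires k <= len(sizes).
    """
    ordered = sorted(sizes, key=lambda s: s[0] * s[1])
    # Deterministic initialisation: k samples spread evenly over the area-sorted list.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in sizes:
            best = max(range(k), key=lambda j: wh_iou(s, centers[j]))
            groups[best].append(s)
        # New centre = mean width and height of its group; empty groups keep the old centre.
        new_centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical coarse-detection box sizes: three small and three large pedestrians.
coarse_sizes = [(10, 20), (12, 24), (11, 22), (40, 80), (42, 84), (38, 76)]
anchors = cluster_anchors(coarse_sizes, k=2)
```

Because the anchors are derived from the boxes actually detected in the image, their shapes follow the data rather than a hand-picked set of scales and aspect ratios.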
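  • Figure 1  Overall structure of the proposed model

    Figure 2  Pedestrian head and full-body mutual-supervision detection framework

    Figure 3  Comparison of detection results between the proposed method and Faster R-CNN in crowded scenes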

    Algorithm 1  Stripping-NMS
      Input:
       prediction scores S = {s_1, s_2, ···, s_n}; full-body predicted boxes B_f = {b_f1, b_f2, ···, b_fm}; head predicted boxes B_h = {b_h1, b_h2, ···, b_hn};
       fixed boxes B_a = {b_a1, b_a2, ···, b_ai}; density regions A = {A_0, A_1, A_2, A_3, A_4}; NMS thresholds N_D = [0.5; 0.6; 0.65; 0.7; 0.8];
       B = {b_1, b_2, ···, b_i}; b_m denotes the box with the maximum score, M the set of maximum-score predicted boxes, and R the final set of predicted boxes.
      begin:
        R ← B_a
        while B ≠ empty do
          m ← argmax S
          M ← b_m
          R ← F ∪ M; B ← B − M
          for b_i in B do
            if IoU(b_ai, b_i) ≥ 0.9 then
              B ← B − b_i; S ← S − s_i
            end
          end
          for A_i, N_Di in A, N_D do
            if A_i ⊃ b_i then
              if IoU(b_m, b_fi/hi) ≥ N_Di then
                B ← B − b_fi (b_hi); s_i ← s_i (1 − IoU(M, b_i))
              else
                B ← B − b_fi (b_hi); S ← S − s_i
              end
            end
          end
        end
        return R, S
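
The central idea of Algorithm 1, taking the NMS threshold from the density level of the region that contains each box and decaying the scores of overlapping boxes, can be sketched in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: it omits the fixed boxes B_a and the head/full-body pairing, and it assumes each box already carries a density level from 0 to 4 that indexes N_D. The function names and the `score_min` cut-off are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def density_adaptive_nms(boxes, scores, density_levels,
                         thresholds=(0.5, 0.6, 0.65, 0.7, 0.8), score_min=0.05):
    """Return (kept indices, their scores) in order of selection.

    density_levels[i] is the assumed density level (0..4) of the region that
    contains box i; it selects the IoU threshold applied to that box.
    """
    scores = list(scores)              # work on a copy; the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        m = max(remaining, key=lambda i: scores[i])
        if scores[m] < score_min:      # everything left has been decayed too far
            break
        kept.append(m)
        remaining.remove(m)
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            if overlap >= thresholds[density_levels[i]]:
                # Linear score decay, as in the update s_i <- s_i (1 - IoU).
                scores[i] *= 1.0 - overlap
    return kept, [scores[i] for i in kept]


boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (100, 100, 110, 110)]
scores = [0.9, 0.8, 0.7]
sparse_kept, _ = density_adaptive_nms(boxes, scores, [0, 0, 0], score_min=0.3)
dense_kept, _ = density_adaptive_nms(boxes, scores, [4, 4, 4], score_min=0.3)
```

In this example the first two boxes overlap with an IoU of 2/3. At density level 0 (threshold 0.5) the second box is decayed below `score_min` and dropped; at level 4 (threshold 0.8) the same overlap is tolerated and the box survives, which is the behaviour a crowded region needs.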

    Table 1  CityPersons and CrowdHuman training sets

    Dataset      Images  Persons  Persons per image  Valid pedestrians
    CityPersons    2975    19238               6.47              19238
    CrowdHuman    15000   339565              22.64             339565

    Table 2  Degree of pedestrian occlusion in the CityPersons and CrowdHuman datasets [16]

    Dataset      IoU>0.5  IoU>0.6  IoU>0.7  IoU>0.8  IoU>0.9
    CityPersons     0.32     0.17     0.08     0.02     0.00
    CrowdHuman      2.40     1.01     0.33     0.07     0.01

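    Table 3  Ablation results on the CityPersons dataset (%)

    Method    Cross network  Mutual-supervision detector  Stripping-NMS  Reasonable (MR^-2)  Heavy (MR^-2)
    baseline                                                             11.74               45.24
    Ours                                                                 11.48               43.64
    Ours                                                                 10.76               42.44
    Ours                                                                 10.48               40.81

    Note: the source preserves only the marks "√√" for the module columns; their row and column placement is not recoverable.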

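    Table 4  Ablation results on the CrowdHuman dataset (%)

    Method    New first stage  Mutual-supervision detector  Stripping-NMS  MR^-2  AP     Recall
    baseline                                                               42.73  85.88  80.74
    Ours                                                                   42.38  88.63  83.04
    Ours                                                                   41.88  89.44  83.41
    Ours                                                                   41.04  91.25  84.26

    Note: the source preserves only the marks "√√" for the module columns; their row and column placement is not recoverable.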

    Table 5  Performance comparison of different methods on the CityPersons dataset (%)

    Method            Backbone   Input scale  Reasonable (MR^-2)  Heavy (MR^-2)
    OR-CNN [20]       VGG-16     1×           12.80               55.70
    MGAN [21]         VGG-16     1×           11.50               51.70
    Adaptive-NMS [9]  VGG-16     1×           12.90               56.40
    R²NMS [10]        VGG-16     1×           11.10               53.30
    EGCL [22]         VGG-16     1×           11.50               50.00
    RepLoss [6]       ResNet-50  1×           13.20               56.90
    CrowdDet [23]     ResNet-50  1×           12.10               40.00
    Ref. [24]         ResNet-50  –            11.60               47.30
    RepLoss [6]       ResNet-50  1.3×         11.60               55.30
    CrowdDet [23]     ResNet-50  1.3×         10.70               38.00
    NOH-NMS [25]      ResNet-50  1.3×         10.80               –
    baseline          ResNet-50  1.3×         11.74               45.24
    Ours              ResNet-50  1.3×         10.48               40.81

    Table 6  Performance comparison of different methods on the CrowdHuman dataset (%)

    Method             Backbone   MR^-2  AP     Recall
    Faster R-CNN [11]  VGG-16     51.21  85.09  77.24
    Adaptive-NMS [9]   VGG-16     49.73  84.71  91.27
    JointDet [7]       ResNet-50  46.50  –      –
    R²NMS [10]         ResNet-50  43.35  89.29  93.33
    CrowdDet [23]      ResNet-50  41.40  90.70  83.70
    DETR [26]          ResNet-50  45.57  89.54  94.00
    NOH-NMS [25]       ResNet-50  43.90  89.00  92.90
    Ref. [8]           ResNet-50  50.16  87.31  FPN
    V2F-Net [27]       ResNet-50  42.28  91.03  84.20
    baseline           ResNet-50  42.73  85.88  80.74
    Ours               ResNet-50  41.04  91.25  84.26
    Gain                          –1.69  +5.37  +3.52
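
The last row of Table 6 (the gain) is the difference between the proposed method and the baseline. Since MR^-2 is a miss rate, the negative value there is an improvement, while AP and Recall improve when they rise. The arithmetic can be checked with a few lines of Python; the values are copied from the table and the variable names are illustrative.

```python
# Baseline and proposed-method results on CrowdHuman, copied from Table 6 (%).
baseline = {"MR^-2": 42.73, "AP": 85.88, "Recall": 80.74}
ours = {"MR^-2": 41.04, "AP": 91.25, "Recall": 84.26}

# Gain = ours - baseline, rounded to the two decimals used in the table.
gain = {name: round(ours[name] - baseline[name], 2) for name in baseline}

for name, value in gain.items():
    print(f"{name}: {value:+.2f}")
```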

    Table 7  Performance when combined with DetNet and Cascade R-CNN (%)

    Method                Backbone   MR^-2  AP     Recall
    Ours                  DetNet-59  39.94  91.23  93.05
    Cascade R-CNN + ours  DetNet-59  38.02  91.75  93.14

    Table 8  Generalization performance of the proposed method (%)

    Method    Training                      Testing      Reasonable (MR^-2)  Heavy (MR^-2)
    Ref. [8]  CrowdHuman                    CityPersons  10.10               50.20
    Ours      CrowdHuman (h&f)              CityPersons                      40.23
    Ours      CrowdHuman+CityPersons (v&f)  CityPersons  8.84                39.27
  • [1] YE Mang, SHEN Jianbing, LIN Gaojie, et al. Deep learning for person Re-identification: A survey and outlook[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 2872–2893. doi: 10.1109/TPAMI.2021.3054775
    [2] MARVASTI-ZADEH S M, CHENG Li, GHANEI-YAKHDAN H, et al. Deep learning for visual tracking: A comprehensive survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(5): 3943–3968. doi: 10.1109/TITS.2020.3046478
    [3] BEN Xianye, XU Sen, and WANG Kejun. Review on pedestrian gait feature expression and recognition[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(1): 71–81. doi: 10.3969/j.issn.1003-6059.2012.01.010
    [4] ZOU Yiqun, XIAO Zhihong, TANG Xiafei, et al. Anchor-free scale adaptive pedestrian detection algorithm[J]. Control and Decision, 2021, 36(2): 295–302. doi: 10.13195/j.kzyjc.2020.0124
    [5] ZHOU Chunluan and YUAN Junsong. Bi-box regression for pedestrian detection and occlusion estimation[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 135–151.
    [6] WANG Xinlong, XIAO Tete, JIANG Yuning, et al. Repulsion loss: Detecting pedestrians in a crowd[C]. The 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7774–7783.
    [7] CHI Cheng, ZHANG Shifeng, XING Junliang, et al. Relational learning for joint head and human detection[C]. The Thirty-Fourth AAAI Conference on Artificial Intelligence, New York, USA, 2020: 10647–10654.
    [8] CHEN Yong, XIE Wenyang, LIU Huanlin, et al. Multi-feature fusion pedestrian detection combining head and overall information[J]. Journal of Electronics & Information Technology, 2022, 44(4): 1453–1460. doi: 10.11999/JEIT210268
    [9] LIU Songtao, HUANG Di, and WANG Yunhong. Adaptive NMS: Refining pedestrian detection in a crowd[C]. The 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 6452–6461.
    [10] HUANG Xin, GE Zheng, JIE Zequn, et al. NMS by representative region: Towards crowded pedestrian detection by proposal pairing[C]. The 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 10747–10756.
    [11] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. doi: 10.1109/TPAMI.2016.2577031
    [12] ZHOU Xingyi, WANG Dequan, and KRÄHENBÜHL P. Objects as points[EB/OL]. https://arxiv.org/abs/1904.07850, 2019.
    [13] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318–327. doi: 10.1109/TPAMI.2018.2858826
    [14] ZHENG Zhaohui, WANG Ping, REN Dongwei, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574–8586. doi: 10.1109/TCYB.2021.3095305
    [15] BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS--improving object detection with one line of code[C]. The 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 5562–5570.
    [16] SHAO Shuai, ZHAO Zijian, LI Boxun, et al. CrowdHuman: A benchmark for detecting human in a crowd[EB/OL]. https://arxiv.org/abs/1805.00123, 2018.
    [17] ZHANG Shanshan, BENENSON R, and SCHIELE B. CityPersons: A diverse dataset for pedestrian detection[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4457–4465.
    [18] SHAO Xiaotao, WANG Qing, YANG Wei, et al. Multi-scale feature pyramid network: A heavily occluded pedestrian detection network based on ResNet[J]. Sensors, 2021, 21(5): 1820. doi: 10.3390/s21051820
    [19] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. The 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944.
    [20] ZHANG Shifeng, WEN Longyin, BIAN Xiaobian, et al. Occlusion-aware R-CNN: Detecting pedestrians in a crowd[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 657–674.
    [21] PANG Yanwei, XIE Jin, KHAN M H, et al. Mask-guided attention network for occluded pedestrian detection[C]. The 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 4966–4974.
    [22] LIN Zebin, PEI Wenjie, CHEN Fanglin, et al. Pedestrian detection by exemplar-guided contrastive learning[J]. IEEE Transactions on Image Processing, 2023, 32: 2003–2016. doi: 10.1109/TIP.2022.3189803
    [23] CHU Xuangeng, ZHENG Anlin, ZHANG Xiangyu, et al. Detection in crowded scenes: One proposal, multiple predictions[C]. The 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 12211–12220.
    [24] CHEN Yong, LIU Xi, and LIU Huanlin. Occluded pedestrian detection based on joint attention mechanism of channel-wise and spatial information[J]. Journal of Electronics & Information Technology, 2020, 42(6): 1486–1493. doi: 10.11999/JEIT190606
    [25] ZHOU Penghao, ZHOU Chong, PENG Pai, et al. NOH-NMS: Improving pedestrian detection by nearby objects hallucination[C]. The 28th ACM International Conference on Multimedia, Seattle, USA, 2020: 1967–1975.
    [26] LIN M, LI Chuming, BU Xingyuan, et al. DETR for crowd pedestrian detection[EB/OL]. https://arxiv.org/abs/2012.06785, 2020.
    [27] SHANG Mingyang, XIANG Dawei, WANG Zhicheng, et al. V2F-Net: Explicit decomposition of occluded pedestrian detection[EB/OL]. https://arxiv.org/abs/2104.03106, 2021.
    [28] LI Zeming, PENG Chao, YU Gang, et al. DetNet: A backbone network for object detection[EB/OL]. https://arxiv.org/abs/1804.06215, 2018.
    [29] CAI Zhaowei and VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6154–6162.
Publication history
  • Received:  2022-04-14
  • Revised:  2022-08-31
  • Accepted:  2022-09-06
  • Published online:  2022-09-08
  • Issue date:  2023-05-10
