高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于特征增强模块的小尺度行人检测

陈勇 金曼莉 刘焕淋 汪波 黄美永

陈勇, 金曼莉, 刘焕淋, 汪波, 黄美永. 基于特征增强模块的小尺度行人检测[J]. 电子与信息学报, 2023, 45(4): 1445-1453. doi: 10.11999/JEIT220122
引用本文: 陈勇, 金曼莉, 刘焕淋, 汪波, 黄美永. 基于特征增强模块的小尺度行人检测[J]. 电子与信息学报, 2023, 45(4): 1445-1453. doi: 10.11999/JEIT220122
CHEN Yong, JIN Manli, LIU Huanlin, WANG Bo, HUANG Meiyong. Small-scale Pedestrian Detection Based on Feature Enhancement Strategy[J]. Journal of Electronics & Information Technology, 2023, 45(4): 1445-1453. doi: 10.11999/JEIT220122
Citation: CHEN Yong, JIN Manli, LIU Huanlin, WANG Bo, HUANG Meiyong. Small-scale Pedestrian Detection Based on Feature Enhancement Strategy[J]. Journal of Electronics & Information Technology, 2023, 45(4): 1445-1453. doi: 10.11999/JEIT220122

基于特征增强模块的小尺度行人检测

doi: 10.11999/JEIT220122 cstr: 32379.14.JEIT220122
基金项目: 国家自然科学基金 (51977021),重庆市技术创新与应用发展专项(cstc2019jscx-mbdX0004),重庆市教委科学技术研究项目(KJQN201900614)
详细信息
    作者简介:

    陈勇:男,博士,教授,主要研究方向为图像处理与模式识别

    金曼莉:女,硕士生,研究方向为图像处理

    刘焕淋:女,博士生导师,教授,主要研究方向为信号处理

    汪波:男,硕士生,研究方向为图像处理

    黄美永:女,硕士生,研究方向为图像处理

    通讯作者:

    陈勇 chenyong@cqupt.edu.cn

  • 中图分类号: TN911.73; TP391.41

Small-scale Pedestrian Detection Based on Feature Enhancement Strategy

Funds: The National Natural Science Foundation of China (51977021), Chongqing Key Technology Innovation Project (cstc2019jscx-mbdX0004), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201900614)
  • 摘要: 行人检测中,小尺度行人时常被漏检、误检。为了提升小尺度行人的检测准确率并且降低其漏检率,该文提出一个特征增强模块。首先,考虑到小尺度行人随着网络加深特征逐渐减少的问题,特征融合策略突破特征金字塔层级结构的约束,融合深层、浅层特征图,保留了大量小尺度行人特征。然后,考虑到小尺度行人特征容易与背景信息发生混淆的问题,通过自注意力模块联合通道注意力模块建模特征图空间、通道关联性,利用小尺度行人上下文信息和通道信息,增强了小尺度行人特征并且抑制了背景信息。最后,基于特征增强模块构建了一个小尺度行人检测器。所提方法在CrowdHuman数据集中小尺度行人的检测准确率为19.8%,检测速度为22帧/s,在CityPersons数据集中小尺度行人的误检率为13.1%。结果表明该方法对于小尺度行人的检测效果优于其他对比算法且实现了较快的检测速度。
  • 图  1  模型总体结构

    图  2  CSPDarknet网络部分组件

    图  3  特征增强模块

    图  4  自注意力模块

    图  5  通道注意力模块

    图  6  对比实验效果图

    表  1  COCO数据集中目标尺度划分标准

    区域目标尺度
    ${\rm{area}} < {32^2}$个像素点small
    ${32^2} < {\rm{area}} < {96^2}$个像素点middle
    ${96^2} < {\rm{area}}$个像素点large
    下载: 导出CSV

    表  2  CityPersons[13]数据集中部分子集划分标准

    子集行人高度遮挡程度
    Bare>50 PXs0.1≤occlusion
    Reasonable>50 PXsocclusion<0.35
    Partial>50 PXs0.1<occlusion≤0.35
    Heavy>50 PXs0.35<occlusion≤0.8
    下载: 导出CSV

    表  3  模块验证实验结果(%)

    方法AP ↑AP50 ↑AP75 ↑Small AP ↑Middle AP ↑Large AP ↑
    Baseline45.271.747.518.044.762.5
    Baseline+特征融合策略45.271.747.820.145.660.5
    Baseline+特征融合策略+自注意力模块45.772.348.419.544.961.8
    Baseline+通道注意力模块44.571.346.919.145.059.9
    本文模型46.972.749.819.846.563.8
    下载: 导出CSV

    表  4  对比实验结果(%)

    方法AP↑AP50↑AP75↑Small AP ↑Middle AP↑Large AP ↑
    RetinaNet[17]31.760.229.79.630.547.3
    Yolov4[16]33.466.232.416.438.242.1
    CenterNet[18]28.457.025.611.132.636.7
    Baseline45.271.747.518.044.762.5
    本文模型46.972.749.819.846.563.8
    下载: 导出CSV

    表  5  运行时间实验结果

    方法fps(帧/s)↑
    Yolov4[16]21.3
    Faster R-CNN[19]9.8
    CenterNet[18]25.6
    本文模型22.1
    下载: 导出CSV

    表  6  泛化性实验结果(%)

    方法Bare MR↓Reasonable MR↓Partial MR↓Heavy MR↓Small MR↓Medium MR↓Large MR↓
    RepLoss[20]7.613.216.856.9
    TLL[21]10.015.517.253.6
    ALFNet[22]8.412.011.451.919.05.76.6
    CAFL[23]7.611.412.150.4
    LBST[24]12.8
    OR-CNN[25]6.712.815.355.7
    CSP[26]7.311.010.449.316.03.76.5
    MFGP[27]8.010.910.949.9
    文献[28]7.910.610.250.214.33.57.0
    本文模型7.210.611.350.713.13.77.5
    下载: 导出CSV
  • [1] 张功国, 吴建, 易亿, 等. 基于集成卷积神经网络的交通标志识别[J]. 重庆邮电大学学报:自然科学版, 2019, 31(4): 571–577. doi: 10.3979/j.issn.1673-825X.2019.04.019

    ZHANG Gongguo, WU Jian, YI Yi, et al. Traffic sign recognition based on ensemble convolutional neural network[J]. Journal of Chongqing University of Posts and Telecommunications:Natural Science Edition, 2019, 31(4): 571–577. doi: 10.3979/j.issn.1673-825X.2019.04.019
    [2] 高新波, 路文, 查林, 等. 超高清视频画质提升技术及其芯片化方案[J]. 重庆邮电大学学报:自然科学版, 2020, 32(5): 681–697. doi: 10.3979/j.issn.1673-825X.2020.05.001

    GAO Xinbo, LU Wen, ZHA Lin, et al. Quality elevation technique for UHD video and its VLSI solution[J]. Journal of Chongqing University of Posts and Telecommunications:Natural Science Edition, 2020, 32(5): 681–697. doi: 10.3979/j.issn.1673-825X.2020.05.001
    [3] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37.
    [4] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944.
    [5] LU Chengye, WU Sheng, JIANG Chunxiao, et al. Weak harmonic signal detection method in chaotic interference based on extended Kalman filter[J]. Digital Communications and Networks, 2019, 5(1): 51–55. doi: 10.1016/j.dcan.2018.10.004
    [6] LUO Xiong, LI Jianyuan, WANG Weiping, et al. Towards improving detection performance for malware with a correntropy-based deep learning method[J]. Digital Communications and Networks, 2021, 7(4): 570–579. doi: 10.1016/j.dcan.2021.02.003
    [7] LI Jianan, LIANG Xiaodan, WEI Yunchao, et al. Perceptual generative adversarial networks for small object detection[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1951–1959.
    [8] CAI Zhaowei and VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]. Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6154–6162.
    [9] HU Han, GU Jiayuan, ZHANG Zheng, et al. Relation networks for object detection[C]. Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3588–3597.
    [10] KRISHNA H and JAWAHAR C V. Improving small object detection[C]. Proceedings of the 4th IAPR Asian Conference on Pattern Recognition, Nanjing, China, 2017: 340–345.
    [11] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNeT: A new backbone that can enhance learning capability of CNN[C]. Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, United States, 2020: 1571–1580.
    [12] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904–1916. doi: 10.1109/tpami.2015.2389824
    [13] GE Zheng, LIU Songtao, WANG Feng, et al. YOLOX: Exceeding YOLO series in 2021[EB/OL]. https://arxiv.org/abs/2107.08430, 2021.
    [14] SHAO Shuai, ZHAO Zijian, LI Boxun, et al. CrowdHuman: A benchmark for detecting human in a crowd[EB/OL]. https://arxiv.org/abs/1805.00123, 2018.
    [15] ZHANG Shanshan, BENENSON R, and SCHIELE B. CityPersons: A diverse dataset for pedestrian detection[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4457–4465.
    [16] BOCHKOVSKIY A, WANG C Y, and LIAO H Y M. YOLOv4: Optimal speed and accuracy of object detection[EB/OL]. https://arxiv.org/abs/2004.10934, 2020.
    [17] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318–327. doi: 10.1109/TPAMI.2018.2858826
    [18] ZHOU Xingyi, WANG Dequan, and KRÄHENBÜHL P. Objects as points[EB/OL]. https://arxiv.org/abs/1904.07850, 2019.
    [19] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]. Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 2015: 91–99.
    [20] WANG Xinlong, XIAO Tete, JIANG Yuning, et al. Repulsion loss: Detecting pedestrians in a crowd[C]. Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7774–7783.
    [21] SONG Tao, SUN Leiyu, XIE Di, et al. Small-scale pedestrian detection based on topological line localization and temporal feature aggregation[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 554–569.
    [22] LIU Wei, LIAO Shengcai, HU Weidong, et al. Learning efficient single-stage pedestrian detectors by asymptotic localization fitting[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 643–659.
    [23] FEI Chi, LIU Bin, CHEN Zhu, et al. Learning pixel-level and instance-level context-aware features for pedestrian detection in crowds[J]. IEEE Access, 2019, 7: 94944–94953. doi: 10.1109/ACCESS.2019.2928879
    [24] CAO Jiale, PANG Yanwei, HAN Jungong, et al. Taking a look at small-scale pedestrians and occluded pedestrians[J]. IEEE Transactions on Image Processing, 2019, 29: 3143–3152. doi: 10.1109/TIP.2019.2957927
    [25] ZHANG Shifeng, WEN Longyin, BIAN Xiao, et al. Occlusion-aware R-CNN: Detecting pedestrians in a crowd[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 657–674.
    [26] LIU Wei, LIAO Shengcai, REN Weiqiang, et al. High-level semantic feature detection: A new perspective for pedestrian detection[C]. Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 5182–5191.
    [27] ZHANG Yihan. Multi-scale object detection model with anchor free approach and center of gravity prediction[C]. Proceedings of 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 2020: 38–45.
    [28] 陈勇, 谢文阳, 刘焕淋, 等. 结合头部和整体信息的多特征融合行人检测[J]. 电子与信息学报, 2022, 44(4): 1453–1460. doi: 10.11999/JEIT210268

    CHEN Yong, XIE Wenyang, LIU Huanlin, et al. Multi-feature fusion pedestrian detection combining head and overall information[J]. Journal of Electronics &Information Technology, 2022, 44(4): 1453–1460. doi: 10.11999/JEIT210268
  • 加载中
图(6) / 表(6)
计量
  • 文章访问数:  721
  • HTML全文浏览量:  550
  • PDF下载量:  148
  • 被引次数: 0
出版历程
  • 收稿日期:  2022-04-29
  • 修回日期:  2022-05-28
  • 网络出版日期:  2022-06-22
  • 刊出日期:  2023-04-10

目录

    /

    返回文章
    返回