A Real-time Detection Method for Multi-scale Pedestrians in Complex Environment

Weina ZHOU, Lihua SUN, Zhijing XU

Citation: Weina ZHOU, Lihua SUN, Zhijing XU. A Real-time Detection Method for Multi-scale Pedestrians in Complex Environment[J]. Journal of Electronics & Information Technology, 2021, 43(7): 2063-2070. doi: 10.11999/JEIT200436

doi: 10.11999/JEIT200436
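Funds: The National Natural Science Foundation of China (61404083, 52071200), China Postdoctoral Science Foundation (2015M581527), Open Research Project of the State Key Laboratory of ASIC & System (2021KF010)
Details
    About the authors:

    Weina ZHOU: female, born in 1982, associate professor. Research interests: image processing, circuits and embedded systems, and artificial intelligence.

    Lihua SUN: female, born in 1995, master's student. Research interests: pattern recognition and image processing.

    Zhijing XU: male, born in 1972, associate professor. Research interests: maritime intelligent transportation systems, and information acquisition and intelligent processing.

    Corresponding author:

    Weina ZHOU, wnzhou@shmtu.edu.cn

  • CLC number: TN911.73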

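  • Abstract: Pedestrian detection is a classic topic in computer vision and image processing, with broad applications in fields such as intelligent driving and video surveillance. However, under complex conditions such as rain, haze, occlusion, illumination changes, and large differences in target scale, common pedestrian detection methods based on visible-light or infrared images still fall short in both detection accuracy and detection speed. Pedestrian features differ considerably between visible-light and infrared detection systems, yet each modality has its own advantages in different environments. Exploiting this property and combining it with multi-scale feature extraction, this paper proposes a real-time multi-scale pedestrian detection method suited to diverse complex environments: the Fusion Pedestrian Detection Network (FPDNet). The network consists of three main parts, namely a feature-extraction backbone network, multi-scale detection, and information decision fusion, and it can adaptively extract multi-scale pedestrians against visible-light or infrared backgrounds. Experimental results show that the network adapts well to a variety of complex visual environments and meets the requirements of practical applications in both detection accuracy and detection speed.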
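  • Figure 1  Top-level block diagram of FPDNet

    Figure 2  Internal structure of the multi-scale detection network

    Figure 3  Basic unit of the backbone network

    Figure 4  Structure of the SPP layer

    Figure 5  Multi-scale detection module

    Figure 6  Object detection flow based on decision fusion

    Figure 7  Four pedestrian detection experiment images

    Figure 8  Comparison of fusion detection results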
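
The decision-fusion step named in the abstract and in Figure 6 can be illustrated with a minimal sketch. The fusion rule used by FPDNet is not given on this page, so the code below shows one common late-fusion scheme as an assumption: detections from a visible-light branch and an infrared branch are pooled, and greedy non-maximum suppression keeps the highest-scoring box among overlapping ones. The names `Detection`, `iou`, `fuse_detections`, and the 0.5 threshold are hypothetical.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score): a box in pixel coordinates plus a confidence score.
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(visible: List[Detection],
                    infrared: List[Detection],
                    iou_threshold: float = 0.5) -> List[Detection]:
    """Assumed late fusion: pool both branches, then apply greedy NMS."""
    # Visit boxes from most to least confident, regardless of modality.
    pooled = sorted(visible + infrared, key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


visible_boxes = [(10, 10, 50, 100, 0.6)]
infrared_boxes = [(12, 11, 52, 102, 0.9), (200, 50, 230, 120, 0.7)]
fused = fuse_detections(visible_boxes, infrared_boxes)
```

In this example the visible-light box overlaps the stronger infrared box and is suppressed, while the pedestrian seen only by the infrared branch is retained. A score-weighted average of overlapping boxes would be an equally plausible rule.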

    Table 1  Structure of the backbone network

    Repeats  Type  Filters  Kernel size/stride  Output feature map size
             Conv  64       7×7/2               208×208
             Max            2×2/2               104×104

    3        Conv  64       3×3/1
             Conv  64       3×3/1
             Res                                104×104

             Conv  128      3×3/2
             Conv  128      3×3/1
             Res                                52×52

    3        Conv  128      3×3/1
             Conv  128      3×3/1
             Res                                52×52

             Conv  256      3×3/2
             Conv  256      3×3/1
             Res                                26×26

    4        Conv  256      3×3/1
             Conv  256      3×3/1
             Res                                26×26

             Conv  512      3×3/2
             Conv  512      3×3/1
             Res                                13×13

    2        Conv  512      3×3/1
             Conv  512      3×3/1
             Res                                13×13
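
The output sizes in the backbone table follow from the standard convolution output-size formula. The sketch below reproduces them under two assumptions that the page does not state: a 416×416 input image, and padding chosen so that every stride-2 layer exactly halves the feature map. The function names are hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-size formula for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def backbone_feature_sizes(input_size: int = 416) -> list:
    """Spatial size after each resolution-changing layer of the backbone table.

    input_size=416 and the padding values are assumptions, not values taken
    from the source.
    """
    # (layer name, kernel, stride, padding) for the layers that downsample.
    downsampling_layers = [
        ("Conv 7x7/2, 64 filters", 7, 2, 3),
        ("Max 2x2/2", 2, 2, 0),
        ("Conv 3x3/2, 128 filters", 3, 2, 1),
        ("Conv 3x3/2, 256 filters", 3, 2, 1),
        ("Conv 3x3/2, 512 filters", 3, 2, 1),
    ]
    sizes = []
    size = input_size
    for name, kernel, stride, padding in downsampling_layers:
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


for layer_name, side in backbone_feature_sizes():
    print(f"{layer_name}: {side}x{side}")
```

The stride-1 layers and residual connections are omitted because, with padding 1 on a 3×3 kernel, they leave the spatial size unchanged.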

    Table 2  Widths and heights of the candidate boxes

    Detection layer size (pixels)  (width, height)  (width, height)  (width, height)
    13×13                          (41, 103)        (53, 138)        (77, 205)
    26×26                          (30, 74)         (30, 94)         (35, 84)
    104×104                        (20, 30)         (20, 51)         (27, 61)
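
To illustrate how candidate boxes of this kind are typically used, the sketch below matches a pedestrian box to the candidate box with the most similar shape, using the width-height IoU familiar from YOLO-style anchor matching. Whether FPDNet assigns boxes this way is not stated on this page, so the matching rule is an assumption, and `ANCHORS`, `shape_iou`, and `best_anchor` are hypothetical names. The numeric values are copied from the table above.

```python
# Candidate-box (width, height) pairs in pixels, keyed by detection-layer grid size.
ANCHORS = {
    13: [(41, 103), (53, 138), (77, 205)],
    26: [(30, 74), (30, 94), (35, 84)],
    104: [(20, 30), (20, 51), (27, 61)],
}


def shape_iou(box_wh, anchor_wh) -> float:
    """IoU of two boxes that share the same centre, so only shape matters."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def best_anchor(box_wh):
    """Return (grid_size, anchor_wh, iou) for the best-matching candidate box."""
    best = None
    for grid, anchors in ANCHORS.items():
        for anchor in anchors:
            score = shape_iou(box_wh, anchor)
            if best is None or score > best[2]:
                best = (grid, anchor, score)
    return best


small = best_anchor((25, 60))   # a distant, small pedestrian
large = best_anchor((80, 200))  # a nearby, large pedestrian
```

Under this rule the small box is assigned to the fine 104×104 layer and the large box to the coarse 13×13 layer, which is the usual division of labour in multi-scale detection.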

    Table 3  Comparison of network models

    Model           mAP (%)  FPS
    ACF+T+THOG      71.49    32
    HalFus+TSDCNN   88.24    2.5
    TSDCNN+Ada      89.03    1.3
    SSD             88.01    42
    YOLOv3          91.35    45
    YOLOv3-tiny     80.57    155
    FPDNet          91.29    68
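
The accuracy and speed trade-off in the comparison table is easier to read as per-frame latency and relative speed. The short sketch below does that arithmetic on the reported figures. The helper names `latency_ms` and `compare` are hypothetical, and the numbers are copied from the table rather than measured.

```python
# (mAP in percent, frames per second), copied from the comparison table.
RESULTS = {
    "ACF+T+THOG": (71.49, 32),
    "HalFus+TSDCNN": (88.24, 2.5),
    "TSDCNN+Ada": (89.03, 1.3),
    "SSD": (88.01, 42),
    "YOLOv3": (91.35, 45),
    "YOLOv3-tiny": (80.57, 155),
    "FPDNet": (91.29, 68),
}


def latency_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def compare(model: str, baseline: str):
    """Return (mAP difference in points, speed ratio) of model versus baseline."""
    model_map, model_fps = RESULTS[model]
    base_map, base_fps = RESULTS[baseline]
    return model_map - base_map, model_fps / base_fps


map_gap, speedup = compare("FPDNet", "YOLOv3")
print(f"FPDNet latency: {latency_ms(RESULTS['FPDNet'][1]):.1f} ms per frame")
print(f"vs YOLOv3: {map_gap:+.2f} mAP points, {speedup:.2f}x the frame rate")
```

On these figures FPDNet trails YOLOv3 by 0.06 mAP points while running at roughly 1.5 times its frame rate, and it leads YOLOv3-tiny by more than 10 mAP points at under half its frame rate.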
Publication history
  • Received: 2020-06-01
  • Revised: 2020-12-01
  • Published online: 2021-03-31
  • Issue date: 2021-07-10
