A Real-time Detection Method for Multi-scale Pedestrians in Complex Environment

Weina ZHOU; Lihua SUN; Zhijing XU

doi:10.11999/JEIT200436

Volume 43 Issue 7

Jul. 2021

Weina ZHOU, Lihua SUN, Zhijing XU. A Real-time Detection Method for Multi-scale Pedestrians in Complex Environment[J]. Journal of Electronics & Information Technology, 2021, 43(7): 2063-2070. doi: 10.11999/JEIT200436

doi: 10.11999/JEIT200436 cstr: 32379.14.JEIT200436

Shanghai Maritime University, Shanghai 201306, China

Funds: The National Natural Science Foundation of China (61404083, 52071200), China Postdoctoral Science Foundation (2015M581527), The State Key Laboratory of ASIC & System (2021KF010)

Received Date: 2020-06-01
Rev Recd Date: 2020-12-01

Available Online: 2021-03-31

Publish Date: 2021-07-10

Abstract

As a classic subject in computer vision and image processing, pedestrian detection has a wide range of applications in intelligent driving and video surveillance. However, most pedestrian detection methods based on visible or infrared images do not achieve satisfactory detection accuracy or speed in complex environments and situations such as rain, smog, occlusion, and variations in illumination and target scale. The analysis in this paper finds that pedestrians usually exhibit quite different characteristics in visible and infrared images, and that each modality has its own advantages in different environments. Therefore, by combining fusion and multi-scale techniques, a real-time multi-scale pedestrian detection algorithm suited to complex environments, named FPDNet (Fusion Pedestrian Detection Network), is proposed. The detection framework consists of three main modules: a feature extraction backbone network, a multi-scale detection network, and a decision-level fusion network. The proposed method can adaptively extract multi-scale pedestrian characteristics from visible or infrared backgrounds. Experimental results show that the detection network adapts well to complex visual environments and can meet the accuracy and speed demands of practical applications.
Key words:
- Pedestrian detection
- Complex environment
- Adaptive extracting
- Multi-scale
- Decision-level fusion
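
The abstract names a decision-level fusion network but does not state the fusion rule that FPDNet uses. As a generic illustration of what fusing at the decision level can mean, the sketch below pools the boxes produced separately by a visible-image detector and an infrared-image detector, then applies greedy non-maximum suppression across the pooled set. The box format, the function names `iou` and `fuse_detections`, and the 0.5 threshold are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical detection format: (x1, y1, x2, y2, score).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(visible: List[Box], infrared: List[Box],
                    iou_thr: float = 0.5) -> List[Box]:
    """Pool detections from both modalities and suppress duplicates.

    Assumed rule (not the paper's): keep the highest-scoring box and drop
    any lower-scoring box whose IoU with a kept box reaches iou_thr.
    """
    pooled = sorted(visible + infrared, key=lambda box: box[4], reverse=True)
    kept: List[Box] = []
    for box in pooled:
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    vis = [(10.0, 10.0, 50.0, 100.0, 0.9)]
    ir = [(12.0, 12.0, 52.0, 102.0, 0.8), (200.0, 50.0, 240.0, 140.0, 0.7)]
    for box in fuse_detections(vis, ir):
        print(box)
```

With this rule, a pedestrian seen by both detectors is reported once with the higher of the two scores, while a pedestrian visible in only one modality (for example, only in infrared at night) is still retained. A learned fusion network, as the abstract suggests FPDNet uses, would replace this fixed rule.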
