Metro Pedestrian Detection Algorithm Based on Multi-scale Weighted Feature Fusion Network

DONG Xiaowei (董小伟); HAN Yue (韩悦); ZHANG Zheng (张正); QU Hongbin (曲洪斌); GAO Guofei (高国飞); CHEN Mingdian (陈明钿); LI Bo (李博)

doi:10.11999/JEIT200450

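Author information

About the authors:
DONG Xiaowei (董小伟): female, born in 1978, Ph.D.; research interest: high-speed signal processing

HAN Yue (韩悦): female, born in 1996, master's student; research interests: artificial intelligence and image processing

ZHANG Zheng (张正): male, born in 1983, associate research fellow; research interests: artificial intelligence and image processing

QU Hongbin (曲洪斌): male, born in 1976, engineer; research interest: information technology applications

GAO Guofei (高国飞): male, born in 1983, senior engineer; research interest: urban rail transit safety

CHEN Mingdian (陈明钿): male, born in 1991, engineer; research interest: urban rail transit safety

LI Bo (李博): male, born in 1995, master's student; research interests: artificial intelligence and image processing

Corresponding author:
HAN Yue (韩悦), hanyue_428@163.com

CLC number: TN911.73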
Publication history
- Received: 2020-06-02
- Revised: 2020-10-18
- Available online: 2020-10-21
- Published: 2021-07-10

1. School of Information Science and Technology, North China University of Technology, Beijing 100144, China
2. International Business Department, China Petroleum Pipeline Engineering Co., Ltd., Beijing 065000, China
3. National Engineering Laboratory for Green and Safe Construction Technology of Urban Rail Transit, Beijing Urban Construction Design and Development Group Co., Ltd., Beijing 100037, China

Funds: Beijing Natural Science Foundation (4192002); Scientific Research Start-up Foundation of North China University of Technology

Abstract: With the rapid growth in metro ridership, accurate real-time monitoring of passenger flow in metro stations is of great significance for passenger safety. To address the complex scenes and small pedestrian targets typical of metro stations, this paper proposes a Multi-scale Weighted Feature (MWF) fusion network for accurate real-time monitoring of metro passenger flow. In the data preprocessing stage, an oversampling target enhancement algorithm is proposed that stitches together images with an insufficient proportion of small targets, increasing how often small targets are seen during training. Next, feature extraction layers based on the VGG16 network are added to the Single Shot MultiBox Detector (SSD) network; feature layers of different scales are weighted and fused in several ways, and the best-performing fusion scheme is selected. Finally, combined with the small-target oversampling enhancement algorithm, the multi-scale weighted feature fusion model is obtained. Experiments show that, compared with the SSD network, the method improves detection accuracy by 5.82 percentage points (mAP of 89.72% versus 83.90%) while maintaining real-time performance.

Keywords: Target detection; Small target; Deep network; Weighted feature fusion

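Figure 1 SSD network detection pipeline

Figure 2 MWFSSD detection flowchart

Figure 3 Examples from the metro pedestrian sample set

Figure 4 Target size distributions of the COCO dataset and the metro pedestrian dataset

Figure 5 Small-target oversampling enhancement algorithm

Figure 6 Comparison of SSD and MWFSSD detection results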
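
The abstract describes the small-target oversampling enhancement of Figure 5 only as stitching together images that contain too few small targets. The Python sketch below is one possible reading of that idea, not the authors' implementation: four equally sized images are halved and tiled into a 2x2 mosaic so that their targets shrink. The names `small_ratio` and `stitch_2x2`, the 32x32 pixel area threshold for "small", and the 2x2 layout are all assumptions made for illustration.

```python
import numpy as np

SMALL_AREA = 32 * 32  # assumed threshold (pixels) below which a box counts as small


def small_ratio(boxes):
    """Fraction of boxes (x1, y1, x2, y2) whose area is below SMALL_AREA."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    if len(boxes) == 0:
        return 0.0
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return float(np.mean(areas < SMALL_AREA))


def stitch_2x2(images, boxes_per_image):
    """Halve four equally sized images and tile them into one mosaic.

    Each box is scaled by 0.5 and shifted into the tile of its image.
    Returns the mosaic and an (N, 4) array of transformed boxes.
    """
    h, w = images[0].shape[:2]
    th, tw = h // 2, w // 2
    mosaic = np.zeros_like(images[0])
    out_boxes = []
    for idx, (img, boxes) in enumerate(zip(images, boxes_per_image)):
        row, col = divmod(idx, 2)
        tile = img[::2, ::2][:th, :tw]  # naive 2x downsampling by striding
        mosaic[row * th:(row + 1) * th, col * tw:(col + 1) * tw] = tile
        b = np.asarray(boxes, dtype=float).reshape(-1, 4) * 0.5
        b[:, [0, 2]] += col * tw  # shift x coordinates into the tile
        b[:, [1, 3]] += row * th  # shift y coordinates into the tile
        out_boxes.append(b)
    return mosaic, np.concatenate(out_boxes, axis=0)


# Four dummy 300x300 images, each with one 60x60 pedestrian box.
images = [np.full((300, 300, 3), i, dtype=np.uint8) for i in range(4)]
boxes = [[(100, 100, 160, 160)] for _ in range(4)]
mosaic, new_boxes = stitch_2x2(images, boxes)
```

In this toy case each 60x60 box becomes 30x30 after stitching, so `small_ratio` rises from 0.0 for an input image to 1.0 for the mosaic. A real pipeline would use proper interpolation instead of striding and would decide which images to stitch from the measured small-target proportion.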

Table 1 Comparison of detection accuracy for different fused feature layers

Method	Pre-trained	Fused layers	mAP (%)
SSD300	×	None	64.89
SSD300	√	None	83.90
MWFSSD300	×	Conv4+Conv7+Conv8	82.79
MWFSSD300	√	Conv4+Conv7+Conv8	85.12
MWFSSD300	√	Conv3+Conv7	83.67
MWFSSD300	√	Conv7+Conv8	82.10
MWFSSD300	√	Conv3+Conv4+Conv7+Conv8	86.48
MWFSSD300	√	Conv4+Conv7+Conv8	85.22

Table 2 Detection results for different weight allocations

Weight allocation scheme	w₁	w₂	w₃	w₄	mAP (%)
0	0	0	0	0	86.48
1	0.4	0.3	0.2	0.1	86.39
2	0.5	0.3	0.1	0.1	87.92
3	0.2	0.2	0.3	0.3	86.44
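
To make the role of the weights w₁ to w₄ in Table 2 concrete, the following Python sketch fuses four feature maps by resizing them to a common resolution and taking a weighted element-wise sum. This is an illustration under stated assumptions rather than the paper's exact operator: the page does not say how the layers are aligned or which layer each weight belongs to, so the nearest-neighbour resize, the shared channel count, the 38x38 target size, and the mapping of w₁ to w₄ onto Conv3, Conv4, Conv7 and Conv8 are all hypothetical, as are the function names.

```python
import numpy as np


def resize_nearest(feat, size):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, size, size)."""
    _, h, w = feat.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def weighted_fuse(features, weights, size):
    """Weighted sum of feature maps after resizing them to a common size.

    All maps are assumed to share the same channel count already
    (for example after a hypothetical 1x1 convolution).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    fused = np.zeros((features[0].shape[0], size, size), dtype=float)
    for feat, w in zip(features, weights):
        fused += w * resize_nearest(feat, size)
    return fused


# Dummy maps at SSD300-like resolutions, filled with constants 1..4.
channels = 8
features = [
    np.full((channels, s, s), value, dtype=float)
    for value, s in zip((1.0, 2.0, 3.0, 4.0), (75, 38, 19, 10))
]
weights = (0.5, 0.3, 0.1, 0.1)  # scheme 2 in Table 2
fused = weighted_fuse(features, weights, size=38)
```

With constant inputs, every element of `fused` equals 0.5·1 + 0.3·2 + 0.1·3 + 0.1·4 = 1.8, which makes the effect of each weight easy to check by hand.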

Table 3 Comparison of MWFSSD with mainstream detection methods

Network	mAP (%)	Speed (frame/s)
MWFSSD	89.72	32
SSD	83.90	43
Faster R-CNN	88.86	20
YOLOv3	86.50	38
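
The accuracy and speed trade-off in Table 3 can be checked with a few lines of Python. The snippet below copies the table values and reports each network's mAP gain and relative speed against a chosen baseline. The 25 frame/s real-time threshold is an assumption introduced here for illustration; the page itself does not define one.

```python
# Values copied from Table 3: network -> (mAP in %, speed in frame/s).
TABLE3 = {
    "MWFSSD": (89.72, 32),
    "SSD": (83.90, 43),
    "Faster R-CNN": (88.86, 20),
    "YOLOv3": (86.50, 38),
}
REALTIME_FPS = 25  # assumed real-time threshold, not stated in the source


def compare_to(baseline, table=TABLE3):
    """Return mAP gain (percentage points) and speed ratio versus a baseline."""
    base_map, base_fps = table[baseline]
    return {
        name: {
            "map_gain": round(m - base_map, 2),
            "speed_ratio": round(fps / base_fps, 2),
            "realtime": fps >= REALTIME_FPS,
        }
        for name, (m, fps) in table.items()
        if name != baseline
    }


gains = compare_to("SSD")
for name, row in gains.items():
    print(name, row)
```

Against SSD, MWFSSD gains 5.82 percentage points of mAP while running at roughly three quarters of SSD's frame rate, which matches the improvement quoted in the abstract.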
