Crowd Counting Method Based on Multi-Scale Enhanced Network

Tao XU; Yinong DUAN; Jiahao DU; Caihua LIU

doi:10.11999/JEIT200331

Volume 43 Issue 6

Jun. 2021

Tao XU, Yinong DUAN, Jiahao DU, Caihua LIU. Crowd Counting Method Based on Multi-Scale Enhanced Network[J]. Journal of Electronics & Information Technology, 2021, 43(6): 1764-1771. doi: 10.11999/JEIT200331

cstr: 32379.14.JEIT200331

1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
2. Information Technology Base of Civil Aviation Administration of China, Civil Aviation University of China, Tianjin 300300, China

Funds: The Natural Science Foundation of Tianjin (18JCYBJC85100), The Fundamental Research Funds for the Central Universities from the Civil Aviation University of China (3122018C024), The Scientific Research Startup Project of the Civil Aviation University of China (2017QD16X)

Received Date: 2020-04-28
Rev Recd Date: 2020-10-12

Available Online: 2020-10-16

Publish Date: 2021-06-18

Abstract

Crowd counting methods lose accuracy for two reasons: the commonly used Euclidean loss ignores the local correlation of images, and models have limited ability to cope with multi-scale information. To address these problems, a crowd counting method based on a Multi-Scale Enhanced Network (MSEN) is proposed. First, an embedded Generative Adversarial Network (GAN) module with a multi-branch generator and a regional discriminator generates initial crowd density maps and optimizes their local correlation. Then, a scale enhancement module connected after the embedded GAN module extracts further local features at different scales from different regions, which strengthens the generalization ability of the model. Experimental results on three challenging public datasets show that the proposed method improves the accuracy and robustness of the prediction.
- Crowd counting
- Image local correlation
- Multi-scale feature
- Embedded GAN module
- Scale-enhancement module
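
The abstract's remark that the Euclidean loss ignores local correlation can be illustrated with a small sketch. The snippet below is not the authors' implementation; the toy 4x4 density maps, the function names, and the pixel permutation are all assumptions made for illustration. It shows that a pixel-wise squared L2 loss is unchanged when the same permutation scrambles the pixels of both maps, so the loss carries no information about which pixels are neighbors.

```python
import math
import random


def euclidean_loss(pred, target):
    """Sum of squared pixel-wise differences between two density maps."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def crowd_count(density_map):
    """The estimated count is the sum (discrete integral) of the density map."""
    return sum(sum(row) for row in density_map)


def permute_pixels(density_map, order):
    """Reorder pixels with a fixed permutation, destroying the spatial layout."""
    width = len(density_map[0])
    flat = [value for row in density_map for value in row]
    shuffled = [flat[i] for i in order]
    return [shuffled[r:r + width] for r in range(0, len(shuffled), width)]


# Hypothetical ground truth: two heads, each spread over a 2x2 blob summing to 1.
target = [
    [0.25, 0.25, 0.0, 0.0],
    [0.25, 0.25, 0.0, 0.0],
    [0.0, 0.0, 0.25, 0.25],
    [0.0, 0.0, 0.25, 0.25],
]
# Hypothetical prediction with small local errors.
pred = [
    [0.2, 0.3, 0.0, 0.0],
    [0.3, 0.2, 0.1, 0.0],
    [0.0, 0.0, 0.2, 0.2],
    [0.0, 0.0, 0.3, 0.2],
]

# One fixed permutation of the 16 pixel positions, applied to both maps.
order = list(range(16))
random.Random(0).shuffle(order)

loss_original = euclidean_loss(pred, target)
loss_permuted = euclidean_loss(
    permute_pixels(pred, order), permute_pixels(target, order)
)

# The two losses agree up to floating-point rounding, although the permuted
# maps no longer contain any recognizable blobs.
print(math.isclose(loss_original, loss_permuted))
print(crowd_count(target))
```

A discriminator that scores image regions rather than single pixels, such as the regional discriminator named in the abstract, does depend on spatial layout, which is one plausible reading of why the method pairs an adversarial term with the density regression.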
