A New Paradigm for Intelligent Traffic Perception: A Traffic Sign Detection Architecture for the Metaverse
-
Abstract: Traffic sign detection plays an important role in the safe and stable operation of intelligent transportation systems and intelligent driving. Imbalanced data distributions and monotonous scenes degrade model performance, yet building a complete real-world traffic scene dataset incurs high time and labor costs. To address this, a new metaverse-oriented traffic sign detection paradigm is proposed to alleviate the dependence of existing methods on real data. First, scene mapping and model mapping are established between the metaverse and the physical world, enabling the detection algorithm to run efficiently across the virtual and real worlds. As a virtualized digital world, the metaverse supports custom scene construction grounded in the physical world and provides massive, diverse virtual scene data for the model. At the same time, knowledge distillation is combined with the mean teacher model to establish the model mapping, which copes with the data discrepancy between the metaverse and the physical world. Second, to further improve the adaptability of the metaverse-trained model to real driving environments, a heuristic attention mechanism is designed that improves the generalization ability of the detector by locating and learning features. The proposed architecture is verified experimentally on the CURE-TSD, KITTI, and Virtual KITTI (VKITTI) datasets. The results show that the proposed metaverse-oriented traffic sign detector achieves excellent detection in the physical world without relying on large amounts of real scene data, reaching a detection accuracy of 89.7%, higher than other recent detection methods.
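The model mapping described above combines knowledge distillation with a mean teacher model. Below is a minimal PyTorch-style sketch of that general scheme, not the paper's exact detector losses: the classification-style logits, the function names, and the hyperparameters `alpha` and `T` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, alpha=0.999):
    # Mean teacher: teacher weights track an exponential moving
    # average (EMA) of the student weights after every optimizer step.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)

def mapping_loss(student, teacher, virt_x, virt_y, real_x, T=2.0):
    # Supervised loss on labeled virtual (metaverse) scenes.
    sup = F.cross_entropy(student(virt_x), virt_y)
    # Distillation/consistency loss on unlabeled real scenes: the student
    # matches the softened EMA-teacher predictions, which is one way to
    # bridge the virtual-to-real data discrepancy.
    with torch.no_grad():
        soft_t = F.softmax(teacher(real_x) / T, dim=-1)
    kd = F.kl_div(F.log_softmax(student(real_x) / T, dim=-1),
                  soft_t, reduction="batchmean") * T * T
    return sup + kd
```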
-
Keywords:
- Metaverse
- Intelligent transportation systems
- Traffic sign detection
- Scene mapping
- Model mapping
-
Table 1 Main contributions of this paper
| References | Core bottleneck | Contribution of this paper |
|---|---|---|
| [2–8] | Existing deep-learning-based traffic sign detection algorithms rely on large, diverse training datasets, and field-testing them is costly and cannot guarantee safety. | This paper is the first to propose a new paradigm that performs traffic sign detection across the metaverse and the physical world. A scene mapping mechanism is established to construct traffic scenes in the metaverse from scene information of the physical world, and a model mapping mechanism is introduced to enhance the model's recognition of traffic signs in the physical world through virtual-world representations. |
| [9–15] | A model trained and tested in the metaverse must generalize well when applied to the physical world, so that its performance holds equally in the virtual and real worlds. | An object detector based on heuristic attention is designed. Inspired by visual neuroscience and CAM, the mechanism combines an energy function for 3-D attention weights with object-localization guidance, improving the detector's feature extraction and generalization ability. |
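The heuristic attention in Table 1 combines an energy function for 3-D attention weights with CAM-based localization guidance [18]. The paper's exact formulation is not reproduced here; the sketch below shows a SimAM-style parameter-free energy term as one plausible reading of the 3-D-weight component (the function name and `lam` are assumptions).

```python
import torch

def energy_attention(x, lam=1e-4):
    # x: feature map of shape (B, C, H, W).
    # SimAM-style energy: neurons that deviate from their channel mean
    # get lower energy and hence larger weights, yielding full 3-D
    # (C x H x W) attention without extra learnable parameters.
    b, c, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
    v = d.sum(dim=(2, 3), keepdim=True) / n
    e_inv = d / (4.0 * (v + lam)) + 0.5      # inverse energy per neuron
    return x * torch.sigmoid(e_inv)          # reweight the features
```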
Table 2 Comparative experiments on the CURE-TSD dataset

| Method | Precision | Recall | mAP | AP50 | APS | APM | APL |
|---|---|---|---|---|---|---|---|
| Ref. [15] | 0.892 | 0.842 | 0.489 | 0.869 | 0.561 | 0.806 | 0.879 |
| Ref. [3] | 0.904 | 0.834 | 0.492 | 0.883 | 0.557 | 0.813 | 0.900 |
| Ref. [36] | 0.896 | 0.827 | 0.473 | 0.878 | 0.558 | 0.790 | 0.876 |
| Ref. [10] | 0.885 | 0.833 | 0.481 | 0.866 | 0.545 | 0.784 | 0.861 |
| Proposed (without cross-domain structure) | 0.924 | 0.835 | 0.514 | 0.883 | 0.563 | 0.801 | 0.889 |
| Proposed + cross-domain training | 0.897 | 0.808 | 0.480 | 0.848 | 0.548 | 0.772 | 0.863 |
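The mAP, AP50, APS, APM, and APL columns in Tables 2 and 3 match the COCO-style evaluation protocol. Assuming that protocol, such numbers can be reproduced with pycocotools; the annotation and detection file names below are placeholders.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations.json")            # ground-truth boxes
coco_dt = coco_gt.loadRes("detections.json")  # detector outputs
ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()
# ev.stats[0] -> mAP (AP@[0.50:0.95]), ev.stats[1] -> AP50,
# ev.stats[3:6] -> APS, APM, APL (small / medium / large objects)
```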
Table 3 Comparative experiments under different training data configurations

| Training data | Method | Precision | Recall | mAP | AP50 | APS | APM | APL |
|---|---|---|---|---|---|---|---|---|
| 20k real scene data from CURE-TSD | Ref. [15] | 0.880 | 0.831 | 0.468 | 0.860 | 0.552 | 0.793 | 0.867 |
| | Ref. [3] | 0.886 | 0.819 | 0.479 | 0.874 | 0.544 | 0.801 | 0.889 |
| | Ref. [36] | 0.876 | 0.801 | 0.463 | 0.863 | 0.539 | 0.782 | 0.864 |
| | Ref. [10] | 0.871 | 0.815 | 0.468 | 0.858 | 0.528 | 0.772 | 0.849 |
| | Proposed + cross-domain training | 0.904 | 0.822 | 0.504 | 0.870 | 0.555 | 0.794 | 0.876 |
| 10k virtual scene data from Meta-TSD and CURE-TSD + 5k real scene data from CURE-TSD | Ref. [15] | 0.853 | 0.801 | 0.429 | 0.810 | 0.519 | 0.736 | 0.851 |
| | Ref. [3] | 0.871 | 0.802 | 0.437 | 0.821 | 0.525 | 0.741 | 0.861 |
| | Ref. [36] | 0.863 | 0.792 | 0.422 | 0.827 | 0.513 | 0.749 | 0.848 |
| | Ref. [10] | 0.821 | 0.804 | 0.445 | 0.801 | 0.502 | 0.729 | 0.837 |
| | Proposed + cross-domain training | 0.892 | 0.804 | 0.458 | 0.826 | 0.537 | 0.757 | 0.862 |
Table 4 Test results of the proposed method on the KITTI and VKITTI datasets

| Dataset | Precision | Average confidence per sign class* |
|---|---|---|
| KITTI | 0.757 | 0.747, 0.821, 0.793, 0.781, 0.755, 0.735 |
| VKITTI | 0.781 | 0.774, 0.843, 0.825, 0.817, 0.776, 0.768 |

*Note: each value is the average confidence of detecting the corresponding traffic sign class in the dataset; the sign icons used as per-class column headers in the original layout could not be recovered.
Table 5 Ablation experiments on the CURE-TSD dataset

| Heuristic attention | Cross-domain detection structure | Precision | Recall | GFLOPs |
|---|---|---|---|---|
| | | 0.887 | 0.796 | 15.6 |
| √ | | 0.924 | 0.835 | 16.9 |
| | √ | 0.853 | 0.764 | 31.2 |
| √ | √ | 0.897 | 0.808 | 33.8 |

-
[1] KUSUMA A T and SUPANGKAT S H. Metaverse fundamental technologies for smart city: A literature review[C]. 2022 International Conference on ICT for Smart Society (ICISS), Bandung, Indonesia, 2022: 1–7. doi: 10.1109/ICISS55894.2022.9915079.
[2] TEMEL D, CHEN M H, and ALREGIB G. Traffic sign detection under challenging conditions: A deeper look into performance variations and spectral characteristics[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(9): 3663–3673. doi: 10.1109/TITS.2019.2931429.
[3] LIU Yuanyuan, PENG Jiyao, XUE Jinghao, et al. TSingNet: Scale-aware and context-rich feature learning for traffic sign detection and recognition in the wild[J]. Neurocomputing, 2021, 447: 10–22. doi: 10.1016/j.neucom.2021.03.049.
[4] LARSSON F and FELSBERG M. Using Fourier descriptors and spatial models for traffic sign recognition[C]. The 17th Scandinavian Conference on Image Analysis, Ystad, Sweden, 2011: 238–249. doi: 10.1007/978-3-642-21227-7_23.
[5] DONG Zhekang, QIAN Zhikai, ZHOU Guangdong, et al. Memory circuit design, implementation and analysis based on memristor full-function pavlov associative[J]. Journal of Electronics & Information Technology, 2022, 44(6): 2080–2092. doi: 10.11999/JEIT210376.
[6] HORN D and HOUBEN S. Fully automated traffic sign substitution in real-world images for large-scale data augmentation[C]. 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, USA, 2020: 465–471. doi: 10.1109/IV47402.2020.9304547.
[7] YANG Yuxiang, CAO Qi, GAO Mingyu, et al. Multi-stage multi-scale color guided depth image completion for road scenes[J]. Journal of Electronics & Information Technology, 2022, 44(11): 3951–3959. doi: 10.11999/JEIT210967.
[8] MIN Weidong, LIU Ruikang, HE Daojing, et al. Traffic sign recognition based on semantic scene understanding and structural traffic sign location[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(9): 15794–15807. doi: 10.1109/TITS.2022.3145467.
[9] DONG Zhekang, DU Chenjie, LIN Huipin, et al. Multi-channel memristive pulse coupled neural network based multi-frame images super-resolution reconstruction algorithm[J]. Journal of Electronics & Information Technology, 2020, 42(4): 835–843. doi: 10.11999/JEIT190868.
[10] LI Zhishan, CHEN Mingmu, HE Yifan, et al. An efficient framework for detection and recognition of numerical traffic signs[C]. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 2022: 2235–2239. doi: 10.1109/ICASSP43922.2022.9747406.
[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. The 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 6000–6010.
[12] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]. The 16th European Conference on Computer Vision, Glasgow, UK, 2020: 213–229. doi: 10.1007/978-3-030-58452-8_13.
[13] HAN Kai, WANG Yunhe, CHEN Hanting, et al. A survey on vision transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 87–110. doi: 10.1109/TPAMI.2022.3152247.
[14] WEI Hongyang, ZHANG Qianqian, QIAN Yurong, et al. MTSDet: Multi-scale traffic sign detection with attention and path aggregation[J]. Applied Intelligence, 2023, 53(1): 238–250. doi: 10.1007/s10489-022-03459-7.
[15] WANG Junfan, CHEN Yi, DONG Zhekang, et al. Improved YOLOv5 network for real-time multi-scale traffic sign detection[J]. Neural Computing and Applications, 2023, 35(10): 7853–7865. doi: 10.1007/s00521-022-08077-5.
[16] KIM J Y and OH J M. Opportunities and challenges of metaverse for automotive and mobility industries[C]. The 13th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 2022: 113–117. doi: 10.1109/ICTC55196.2022.9952976.
[17] ZHANG Hui, LUO Guiyang, LI Yidong, et al. Parallel vision for intelligent transportation systems in metaverse: Challenges, solutions, and potential applications[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023, 53(6): 3400–3413. doi: 10.1109/TSMC.2022.3228314.
[18] JIANG Pengtao, ZHANG Changbin, HOU Qibin, et al. LayerCAM: Exploring hierarchical class activation maps for localization[J]. IEEE Transactions on Image Processing, 2021, 30: 5875–5888. doi: 10.1109/TIP.2021.3089943.
[19] THORPE S, FIZE D, and MARLOT C. Speed of processing in the human visual system[J]. Nature, 1996, 381(6582): 520–522. doi: 10.1038/381520a0.
[20] GAIDON A, WANG Qiao, CABON Y, et al. Virtual worlds as proxy for multi-object tracking analysis[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016: 4340–4349. doi: 10.1109/CVPR.2016.470.
[21] GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: The KITTI dataset[J]. The International Journal of Robotics Research, 2013, 32(11): 1231–1237. doi: 10.1177/0278364913491297.
[22] SHREINER D. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Versions 3.0 and 3.1[M]. Addison-Wesley Professional, 2009.
[23] TORII A, HAVLENA M, and PAJDLA T. From Google Street View to 3D city models[C]. The IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), Kyoto, Japan, 2009: 2188–2195. doi: 10.1109/ICCVW.2009.5457551.
[24] NJOKU J N, NWAKANMA C I, AMAIZU G C, et al. Prospects and challenges of Metaverse application in data-driven intelligent transportation systems[J]. IET Intelligent Transport Systems, 2023, 17(1): 1–21. doi: 10.1049/itr2.12252.
[25] PAMUCAR D, DEVECI M, GOKASAR I, et al. A metaverse assessment model for sustainable transportation using ordinal priority approach and Aczel-Alsina norms[J]. Technological Forecasting and Social Change, 2022, 182: 121778. doi: 10.1016/j.techfore.2022.121778.
[26] SONG Jie, CHEN Ying, YE Jingwen, et al. Spot-adaptive knowledge distillation[J]. IEEE Transactions on Image Processing, 2022, 31: 3359–3370. doi: 10.1109/TIP.2022.3170728.
[27] LIU Yuyuan, TIAN Yu, CHEN Yuanhong, et al. Perturbed and strict mean teachers for semi-supervised semantic segmentation[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, 2022: 4248–4257. doi: 10.1109/CVPR52688.2022.00422.
[28] ZHANG Runfeng, YAO Wei, SHI Zhongtuo, et al. Semi-supervised learning framework of dominant instability mode identification via fusion of virtual adversarial training and mean teacher model[J]. Proceedings of the CSEE, 2022, 42(20): 7497–7508. doi: 10.13334/j.0258-8013.pcsee.211673.
[29] DING Xiaohan, ZHANG Xiangyu, MA Ningning, et al. RepVGG: Making VGG-style ConvNets great again[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 2021: 13728–13737. doi: 10.1109/CVPR46437.2021.01352.
[30] WANG Jibin and ZHANG Shuo. An improved deep learning approach based on exponential moving average algorithm for atrial fibrillation signals identification[J]. Neurocomputing, 2022, 513: 127–136. doi: 10.1016/j.neucom.2022.09.079.
[31] CHAUDHARI S, MITHAL V, POLATKAN G, et al. An attentive survey of attention models[J]. ACM Transactions on Intelligent Systems and Technology, 2021, 12(5): 53. doi: 10.1145/3465055.
[32] RUEDA M R, POZUELOS J P, CÓMBITA L M, et al. Cognitive neuroscience of attention from brain mechanisms to individual differences in efficiency[J]. AIMS Neuroscience, 2015, 2(4): 183–202. doi: 10.3934/Neuroscience.2015.4.183.
[33] ROSSI L F, HARRIS K D, and CARANDINI M. Spatial connectivity matches direction selectivity in visual cortex[J]. Nature, 2020, 588(7839): 648–652. doi: 10.1038/s41586-020-2894-4.
[34] LUO Zhengding, LI Junting, and ZHU Yuesheng. A deep feature fusion network based on multiple attention mechanisms for joint iris-periocular biometric recognition[J]. IEEE Signal Processing Letters, 2021, 28: 1060–1064. doi: 10.1109/LSP.2021.3079850.
[35] WANG Junfan, CHEN Yi, JI Xiaoyue, et al. Vehicle-mounted adaptive traffic sign detector for small-sized signs in multiple working conditions[J]. IEEE Transactions on Intelligent Transportation Systems, doi: 10.1109/TITS.2023.3309644.
[36] GU Yang and SI Bingfeng. A novel lightweight real-time traffic sign detection integration framework based on YOLOv4[J]. Entropy, 2022, 24(4): 487. doi: 10.3390/e24040487.