
Vehicle-Infrastructure Cooperative 3D Object Detection Based on Adaptive Feature Selection

LIANG Yan, YANG Huilin, SHAO Kai

Citation: LIANG Yan, YANG Huilin, SHAO Kai. Vehicle-Infrastructure Cooperative 3D Object Detection Based on Adaptive Feature Selection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250601


doi: 10.11999/JEIT250601 cstr: 32379.14.JEIT250601
Funds: The Natural Science Foundation of Chongqing (CSTB2025NSCQ-GPX1253)
Author information
    Biographies:

    LIANG Yan: Female, senior engineer, master's supervisor. Research interests: computer vision and AI for the Internet of Things

    YANG Huilin: Male, master's student. Research interests: cooperative perception and 3D object detection

    SHAO Kai: Male, associate professor, master's supervisor. Research interests: intelligent perception and information systems, intelligent signal and information processing

    Corresponding author:

    LIANG Yan, liangyan@cqupt.edu.cn

  • CLC classification: TP391

Vehicle-Infrastructure Cooperative 3D Object Detection Based on Adaptive Feature Selection

  • Abstract: Vehicle-infrastructure cooperative 3D object detection must resolve two problems: the limited communication bandwidth between the vehicle side and the roadside, and the limited capacity to aggregate the exchanged information. Based on spatial filtering theory, this paper proposes AFS-VIC3D, a vehicle-infrastructure cooperative 3D object detection scheme with adaptive feature selection. First, to address the bandwidth constraint, an adaptive feature selection scheme with two modules is designed on the roadside: (1) the Graph-Structured Feature Enhancement Module (GSFEM) uses a Graph Neural Network (GNN) to interactively update node and edge weights, strengthening feature responses in foreground object regions while suppressing background responses, thereby improving the discriminability of object-region features; (2) the Adaptive Communication Mask Generation Module (ACMGM) dynamically analyzes the feature-importance distribution and adaptively selects the most informative features to build a sparse 2D communication mask for efficient feature transmission. Second, to improve information aggregation, a Multi-Scale Feature Aggregation (MSFA) module is designed on the vehicle side; through a coordinated spatial-channel aggregation mechanism, it fuses vehicle-side and roadside features at the scale, spatial, and channel levels, improving detection accuracy and robustness. AFS-VIC3D is validated on the public DAIR-V2X and V2XSet datasets, using Average Precision (AP) at Intersection-over-Union (IoU) thresholds of 0.3/0.5/0.7. On DAIR-V2X, it achieves 83.65%/79.34%/64.45% AP with a communication volume of $ {2^{20.15}} $ bytes; on V2XSet, it achieves 94.14%/93.08%/86.69% AP with $ {2^{20.16}} $ bytes. The results show that AFS-VIC3D adaptively selects and transmits the features most critical to detection, reducing bandwidth consumption while improving 3D detection performance, and achieves an effective trade-off between detection performance and communication bandwidth.
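The adaptive selection step in ACMGM can be pictured with a small, self-contained sketch: score each spatial cell of a roadside BEV feature map, keep only the top-scoring fraction, and build a sparse binary mask of those cells. This is an illustrative approximation under assumed details — the per-cell energy score, the fixed `keep_ratio`, and the name `build_comm_mask` are hypothetical, not the paper's actual design.

```python
def build_comm_mask(feature_map, keep_ratio=0.25):
    """Toy ACMGM-style mask: feature_map is an H x W x C nested list;
    returns an H x W 0/1 mask keeping the highest-scoring cells."""
    H, W = len(feature_map), len(feature_map[0])
    # Illustrative importance score: sum of absolute activations per cell.
    scores = [(sum(abs(v) for v in feature_map[i][j]), i, j)
              for i in range(H) for j in range(W)]
    scores.sort(reverse=True)          # highest-energy cells first
    k = max(1, int(keep_ratio * H * W))
    mask = [[0] * W for _ in range(H)]
    for _, i, j in scores[:k]:
        mask[i][j] = 1
    return mask

# A 2x2 BEV grid with 2 channels per cell; keep_ratio=0.25 keeps 1 cell.
fm = [[[0.1, 0.2], [2.0, 1.5]],
      [[0.0, 0.1], [0.3, 0.2]]]
mask = build_comm_mask(fm, keep_ratio=0.25)  # keeps the highest-energy cell
```

Only the masked cells (plus the mask itself) need to cross the vehicle-infrastructure link, so the payload shrinks to roughly `keep_ratio` of the dense feature map — the mechanism behind the reduced Comm values reported below.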
  • Fig. 1  Overall framework of AFS-VIC3D

    Fig. 2  Graph-Structured Feature Enhancement Module (GSFEM)

    Fig. 3  Complexity-Density Adaptive Sparse Feature Assessment Network (C-DASFAN)

    Fig. 4  Multi-Scale Feature Aggregation module (MSFA)

    Fig. 5  Cooperative perception performance of different methods under varying communication volumes on DAIR-V2X

    Fig. 6  Visualization results on the DAIR-V2X dataset

    Fig. 7  Visualization of classic models in different scenarios on the DAIR-V2X dataset

    Fig. 8  Visualization of sampled roadside feature maps

    Table 1  Comparison of methods on the DAIR-V2X and V2XSet datasets

    | Method | Stage | DAIR-V2X AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ | V2XSet AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ |
    |---|---|---|---|---|---|---|---|---|---|
    | No Collaboration | N | 64.06 | 61.05 | 51.83 | 0 | 79.63 | 77.33 | 51.34 | 0 |
    | Early Fusion | E | 79.61 | 74.45 | 58.82 | 27.48 | 90.20 | 84.60 | 54.82 | 28.01 |
    | Late Fusion | L | 78.55 | 69.72 | 44.04 | 8.39 | 87.62 | 80.87 | 50.39 | 8.57 |
    | F-Cooper[7] |  | 79.38 | 72.00 | 54.40 | 24.62 | 86.61 | 79.00 | 52.68 | 24.87 |
    | V2VNet[8] |  | 79.70 | 72.19 | 54.04 | 24.62 | 90.70 | 85.43 | 61.40 | 24.87 |
    | V2X-ViT[9] |  | 79.60 | 71.61 | 53.26 | 24.62 | 91.65 | 87.36 | 66.65 | 24.87 |
    | When2com[10] |  | 79.08 | 71.84 | 54.43 | 24.62 | 84.16 | 79.55 | 56.12 | 24.87 |
    | Where2comm[11] |  | 80.41 | 74.93 | 57.03 | 23.06 | 92.04 | 88.87 | 68.82 | 23.08 |
    | How2comm[12] |  | 80.98 | 76.05 | 59.45 | 22.85 | 90.10 | 90.28 | 73.87 | 22.86 |
    | SCOPE[15] |  | 81.88 | 78.52 | 63.29 | 24.88 | 91.64 | 91.27 | 82.19 | 24.88 |
    | CoAlign[16] |  | 81.60 | 77.45 | 63.85 | 24.25 | 92.79 | 91.69 | 83.12 | 24.25 |
    | Fusion2comm[13] |  | – | 71.24 | 56.72 | 21.04 | – | – | – | – |
    | SparseComm[14] |  | – | – | – | – | – | 91.82 | 77.60 | 20.35 |
    | Ours (AFS-VIC3D) |  | **83.65** | **79.34** | **64.45** | **20.15** | **94.14** | **93.08** | **86.69** | **20.16** |
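The Comm column in Table 1 is the base-2 logarithm of the transmitted payload in bytes (the abstract's $ {2^{20.15}} $ bytes corresponds to Comm = 20.15), so small differences in the exponent translate into large payload differences. A quick back-of-the-envelope check:

```python
# Comm = 20.15 means the payload is 2^20.15 bytes per transmission.
payload_bytes = 2 ** 20.15
payload_mib = payload_bytes / 2 ** 20        # convert to MiB (2^20 bytes)
# 2^20.15 bytes ≈ 1.11 MiB

# Compare against Where2comm's Comm = 23.06 from Table 1:
baseline_bytes = 2 ** 23.06
reduction = baseline_bytes / payload_bytes   # = 2^(23.06 - 20.15) = 2^2.91
# roughly a 7.5x smaller payload at higher accuracy
```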

    Table 2  Ablation results on the DAIR-V2X and V2XSet datasets

    | GSFEM | ACMGM | MSFA | DAIR-V2X AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ | V2XSet AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ |
    |---|---|---|---|---|---|---|---|---|---|---|
    | × | × | × | 80.41 | 74.93 | 57.03 | 23.06 | 92.04 | 88.87 | 68.82 | 23.08 |
    | ✓ | × | × | 80.95 | 76.01 | 59.20 | 24.16 | 93.00 | 89.88 | 72.45 | 24.18 |
    | × | ✓ | × | 81.92 | 77.15 | 62.86 | 21.15 | 93.30 | 90.04 | 71.72 | 21.16 |
    | × | × | ✓ | 81.96 | 77.27 | 61.34 | 23.06 | 92.80 | 91.10 | 80.22 | 23.08 |
    | ✓ | ✓ | × | 83.20 | 78.55 | 63.32 | 20.15 | 93.28 | 92.21 | 84.85 | 20.16 |
    | ✓ | × | ✓ | 81.70 | 77.27 | 62.75 | 24.16 | 93.24 | 90.03 | 72.03 | 24.16 |
    | × | ✓ | ✓ | 82.17 | 77.48 | 62.00 | 21.15 | 94.02 | 92.79 | 84.56 | 21.16 |
    | ✓ | ✓ | ✓ | 83.65 | 79.34 | 64.45 | 20.15 | 94.14 | 93.08 | 86.69 | 20.16 |

    Table 3  Impact of different feature-selection strategies on communication volume and detection performance

    | Method | Feature selection | DAIR-V2X AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ | V2XSet AP@0.3↑ | AP@0.5↑ | AP@0.7↑ | Comm↓ |
    |---|---|---|---|---|---|---|---|---|---|
    | Baseline | SF | 80.41 | 74.93 | 57.03 | 23.06 | 92.04 | 88.87 | 68.82 | 23.08 |
    | Method (1) | FC | 79.32 | 71.38 | 54.59 | 24.06 | 88.76 | 85.46 | 67.46 | 24.50 |
    | Method (2) | FC | 78.59 | 70.63 | 56.46 | 22.70 | 87.25 | 84.35 | 66.82 | 22.80 |
    | Method (3) | SF | 80.56 | 76.46 | 57.45 | 22.60 | 91.56 | 88.06 | 69.63 | 22.62 |
    | Method (4) | SF | 81.05 | 76.16 | 58.49 | 21.60 | 92.51 | 88.74 | 69.13 | 21.80 |
    | Method (5) | SF | 80.86 | 77.02 | 59.46 | 20.70 | 93.01 | 88.76 | 70.52 | 20.80 |
    | Method (6) | SF | 81.92 | 77.15 | 62.86 | 20.15 | 93.30 | 90.04 | 71.72 | 20.16 |
  • [1] SHAO Shilin, ZHOU Yang, LI Zhenglin, et al. Frustum PointVoxel-RCNN: A high-performance framework for accurate 3D object detection in point clouds and images[C]. Proceedings of the 2024 4th International Conference on Computer, Control and Robotics (ICCCR), Shanghai, China, 2024: 56–60. doi: 10.1109/ICCCR61138.2024.10585339.
    [2] SHAO Kai, WU Guang, LIANG Yan, et al. Local feature encode-decoding based 3D target detection of autonomous driving[J]. Systems Engineering and Electronics, 2025, 47(10): 3168–3178. doi: 10.12305/j.issn.1001-506X.2025.10.05.
    [3] ZHANG Yezheng, FAN Zhijie, HOU Jiawei, et al. Incentivizing point cloud-based accurate cooperative perception for connected vehicles[J]. IEEE Transactions on Vehicular Technology, 2025, 74(4): 5637–5648. doi: 10.1109/TVT.2024.3519626.
    [4] HU Senkang, FANG Zhengru, DENG Yiqin, et al. Collaborative perception for connected and autonomous driving: Challenges, possible solutions and opportunities[J]. IEEE Wireless Communications, 2025, 32(5): 228–234. doi: 10.1109/MWC.002.2400348.
    [5] LI Jing, NIU Yong, WU Hao, et al. Effective joint scheduling and power allocation for URLLC-oriented V2I communications[J]. IEEE Transactions on Vehicular Technology, 2024, 73(8): 11694–11705. doi: 10.1109/TVT.2024.3381924.
    [6] LIU Gang, HU Jiewen, MA Zheng, et al. Joint optimization of communication latency and platoon control based on uplink RSMA for future V2X networks[J]. IEEE Transactions on Vehicular Technology, 2025, 74(9): 13458–13470. doi: 10.1109/TVT.2025.3560709.
    [7] CHEN Qi, MA Xu, TANG Sihai, et al. F-cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds[C]. Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, USA, 2019: 88–100. doi: 10.1145/3318216.3363300.
    [8] WANG T H, MANIVASAGAM S, LIANG Ming, et al. V2VNet: Vehicle-to-vehicle communication for joint perception and prediction[C]. Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK, 2020: 605–621. doi: 10.1007/978-3-030-58536-5_36.
    [9] XU Runsheng, XIANG Hao, TU Zhengzhong, et al. V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer[C]. Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 107–124. doi: 10.1007/978-3-031-19842-7_7.
    [10] LIU Y C, TIAN Junjiao, GLASER N, et al. When2com: Multi-agent perception via communication graph grouping[C]. Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 4105–4114. doi: 10.1109/CVPR42600.2020.00416.
    [11] HU Yue, FANG Shaoheng, LEI Zixing, et al. Where2comm: Communication-efficient collaborative perception via spatial confidence maps[C]. Proceedings of the 36th International Conference on Neural Information Processing System, New Orleans, USA, 2022: 352. doi: 10.5555/3600270.3600622.
    [12] YANG Dingkang, YANG Kun, WANG Yuzheng, et al. How2comm: Communication-efficient and collaboration-pragmatic multi-agent perception[C]. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, USA, 2023: 1093.
    [13] CHU Huazhen, LIU Haizhuang, ZHUO Junbao, et al. Occlusion-guided multi-modal fusion for vehicle-infrastructure cooperative 3D object detection[J]. Pattern Recognition, 2025, 157: 110939. doi: 10.1016/j.patcog.2024.110939.
    [14] LIU Haizhuang, CHU Huazhen, ZHUO Junbao, et al. SparseComm: An efficient sparse communication framework for vehicle-infrastructure cooperative 3D detection[J]. Pattern Recognition, 2025, 158: 110961. doi: 10.1016/j.patcog.2024.110961.
    [15] YANG Kun, YANG Dingkang, ZHANG Jingyu, et al. Spatio-temporal domain awareness for multi-agent collaborative perception[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2023: 23383–23392. doi: 10.1109/ICCV51070.2023.02137.
    [16] LU Yifan, LI Quanhao, LIU Baoan, et al. Robust collaborative 3D object detection in presence of pose errors[C]. Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023: 4812–4818. doi: 10.1109/ICRA48891.2023.10160546.
    [17] LANG A H, VORA S, CAESAR H, et al. PointPillars: Fast encoders for object detection from point clouds[C]. Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 12689–12697. doi: 10.1109/CVPR.2019.01298.
    [18] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
    [19] XUE Yuanliang, JIN Guodong, SHEN Tao, et al. SmallTrack: Wavelet pooling and graph enhanced classification for UAV small object tracking[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5618815. doi: 10.1109/TGRS.2023.3305728.
    [20] ZHANG Jingyu, YANG Kun, WANG Yilei, et al. ERMVP: Communication-efficient and collaboration-robust multi-vehicle perception in challenging environments[C]. Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2024: 12575–12584. doi: 10.1109/CVPR52733.2024.01195.
    [21] TAO Xinmin, LI Junxuan, GUO Xinyue, et al. Density clustering hypersphere-based self-adaptively oversampling algorithm for imbalanced datasets[J]. Journal of Electronics & Information Technology, 2025, 47(7): 2347–2360. doi: 10.11999/JEIT241037.
    [22] LIU Haisong, TENG Yao, LU Tao, et al. SparseBEV: High-performance sparse 3D object detection from multi-camera videos[C]. Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2023: 18534–18544. doi: 10.1109/ICCV51070.2023.01703.
    [23] LIU Mushui, DAN Jun, LU Ziqian, et al. CM-UNet: Hybrid CNN-mamba UNet for remote sensing image semantic segmentation[J]. arXiv preprint arXiv:2405.10530, 2024. doi: 10.48550/arXiv.2405.10530.
    [24] YU Haibao, LUO Yizhen, SHU Mao, et al. DAIR-V2X: A large-scale dataset for vehicle-infrastructure cooperative 3D object detection[C]. Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, 2022: 21329–21338. doi: 10.1109/CVPR52688.2022.02067.
Publication history
  • Revised: 2025-11-17
  • Accepted: 2025-11-17
  • Published online: 2025-11-26
