Multilevel Semantic Maps Based on Visual Simultaneous Localization and Mapping in Dynamic Scenarios

MEI Tiancan, QIN Yusheng, YANG Hong, GAO Zhi, LI Haoran

Citation: MEI Tiancan, QIN Yusheng, YANG Hong, GAO Zhi, LI Haoran. Multilevel Semantic Maps Based on Visual Simultaneous Localization and Mapping in Dynamic Scenarios[J]. Journal of Electronics & Information Technology, 2023, 45(5): 1737-1746. doi: 10.11999/JEIT220153


doi: 10.11999/JEIT220153
Funds: The Natural Science Foundation of Hubei Province (2021CFA088)

    Author biographies:

    MEI Tiancan: Male, Associate Professor. His research interests include computer vision, pattern recognition, and machine learning

    QIN Yusheng: Male, Master's student. His research interests include computer vision and visual SLAM

    YANG Hong: Male, Senior Engineer. His research interests include aerial remote sensing and data application technology

    GAO Zhi: Male, Professor. His research interests include artificial intelligence, computer vision, intelligent unmanned systems, and remote sensing

    LI Haoran: Male, Master's student. His research interests include machine vision and artificial intelligence

    Corresponding author:

    MEI Tiancan, mtc@whu.edu.cn

  • CLC number: TP242.6; TN919.82

  • Abstract: To improve the environmental adaptability and semantic understanding of visual Simultaneous Localization And Mapping (SLAM), this paper proposes a visual SLAM scheme that builds multilevel semantic maps in dynamic scenes. First, the spatial relationship between passively moved objects and dynamic targets is exploited, and an object detection network is combined with optical-flow constraints to identify truly dynamic targets, whose feature points are then removed. Second, a fast supervoxel-based point cloud segmentation scheme is proposed to refine the 3D map built from the static regions, yielding an object-level semantic point cloud map. The resulting semantic map in turn provides higher-precision training samples that are further used to improve the object detection network. Experiments on the TUM and ICL-NUIM datasets show that the proposed method achieves far better localization accuracy than mainstream visual SLAM schemes for dynamic scenes, demonstrating good stability and robustness in highly dynamic environments. In terms of mapping accuracy and quality, comparisons of the reconstructed maps of different types against existing methods verify the effectiveness and applicability of the proposed multilevel semantic mapping method in both static and highly dynamic scenes.
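The dynamic-point rejection described above can be illustrated with a short sketch. The following Python code is a minimal illustration and not the authors' implementation: the function name, box format, and epipolar threshold are assumptions. It uses OpenCV's pyramidal Lucas-Kanade optical flow and a RANSAC-estimated fundamental matrix as the static-background motion model, and rejects a feature point only when it lies inside a detected object box and violates the epipolar constraint.

```python
# Minimal sketch (not the paper's implementation) of a two-cue dynamic-point check:
# a point is flagged only if it falls inside a detected (potentially dynamic) object
# box AND violates the epipolar constraint implied by the tracked optical flow.
import cv2
import numpy as np

def flag_dynamic_points(prev_gray, cur_gray, prev_pts, det_boxes, epi_thresh=1.0):
    """prev_pts: (N, 1, 2) float32 corners; det_boxes: list of [x1, y1, x2, y2]."""
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, prev_pts, None)
    ok = status.ravel() == 1
    p0 = prev_pts[ok].reshape(-1, 2)
    p1 = cur_pts[ok].reshape(-1, 2)

    # The RANSAC fundamental matrix approximates the static-background motion.
    F, _ = cv2.findFundamentalMat(p0, p1, cv2.FM_RANSAC, 1.0, 0.99)

    # Distance of each tracked point to its epipolar line; a large distance
    # breaks the static-scene assumption.
    lines = cv2.computeCorrespondEpilines(p0.reshape(-1, 1, 2), 1, F).reshape(-1, 3)
    dist = np.abs(lines[:, 0] * p1[:, 0] + lines[:, 1] * p1[:, 1] + lines[:, 2]) / \
           np.hypot(lines[:, 0], lines[:, 1])

    def in_box(pt):
        return any(x1 <= pt[0] <= x2 and y1 <= pt[1] <= y2
                   for x1, y1, x2, y2 in det_boxes)

    # Dynamic only when both cues agree: inside a detection box and off its epipolar line.
    return [in_box(p) and d > epi_thresh for p, d in zip(p1, dist)]
```

Requiring both cues mirrors the distinction the abstract draws between passively moved objects and truly dynamic targets: feature points on a detected but currently static object remain available for tracking and mapping.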
  • Figure 1. Overall system framework

    Figure 2. Schematic of keyframe processing

    Figure 3. Results of Map 1 and Map 2

    Figure 4. Generation process of Map 3

    Figure 5. Schematic of generating object detection results from Map 2

    Figure 6. Comparison of estimated trajectories in highly dynamic scenes

    Figure 7. Quantitative analysis of background reconstruction quality

    Figure 8. Comparison of background reconstruction results

    Figure 9. Multilevel maps generated on the fr3-W-xyz sequence

    Figure 10. Ground truth of the generated complex samples

    Table 1. Comparison of Absolute Trajectory Error (ATE) and Relative Pose Error (RPE) for different methods

    ATE (m)
    Sequence     | ORB-SLAM2 | BaMVO  | SPW    | LC-CRF | DynaSLAM | DS-SLAM | Ours (detection only) | Ours (optical flow only) | Ours (final)
    fr3-W-static | 0.3516    | 0.0082 | 0.0235 | 0.0111 | 0.0072   | 0.0081  | 0.0094                | 0.0112                   | 0.0069
    fr3-W-xyz    | 0.4563    | 0.2233 | 0.0551 | 0.0158 | 0.0164   | 0.0247  | 0.0152                | 0.0231                   | 0.0132
    fr3-W-half   | 0.4673    | 0.1940 | 0.0563 | 0.0318 | 0.0240   | 0.0303  | 0.0276                | 0.0259                   | 0.0228
    fr3-W-rpy    | 0.7946    | 0.1477 | 0.1593 | 0.0516 | 0.0367   | 0.0442  | 0.1325                | 0.0789                   | 0.0488

    RPE (m/s)
    fr3-W-static | 0.0153    | 0.0068 | 0.0084 | 0.0062 | 0.0091   | 0.0066  | 0.0112                | 0.0094                   | 0.0059
    fr3-W-xyz    | 0.0261    | 0.0248 | 0.0086 | 0.0094 | 0.0217   | 0.0233  | 0.0238                | 0.0189                   | 0.0116
    fr3-W-half   | 0.0411    | 0.0244 | 0.0375 | 0.0158 | 0.0148   | 0.0297  | 0.0223                | 0.0386                   | 0.0145
    fr3-W-rpy    | 0.0647    | 0.0155 | 0.0198 | 0.0423 | 0.0314   | 0.0503  | 0.0352                | 0.0279                   | 0.0248
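For readers reproducing Table 1: ATE is the standard metric of the TUM RGB-D benchmark [23]; the estimated trajectory is rigidly aligned to the ground truth and the RMSE of the translational residuals is reported, while RPE measures drift over fixed time intervals. Below is a minimal NumPy sketch of the ATE RMSE computation, assuming the two trajectories are already time-associated; it is an illustration, not the evaluation code used in the paper.

```python
# Illustrative sketch (not the paper's evaluation code): ATE RMSE as used on the
# TUM RGB-D benchmark. Assumes estimated and ground-truth camera positions are
# already time-associated; alignment uses Horn/Kabsch closed-form rigid registration.
import numpy as np

def ate_rmse(est_xyz, gt_xyz):
    """est_xyz, gt_xyz: (N, 3) arrays of associated camera positions."""
    mu_e, mu_g = est_xyz.mean(axis=0), gt_xyz.mean(axis=0)
    E, G = est_xyz - mu_e, gt_xyz - mu_g

    # Best-fit rotation mapping the estimated frame onto the ground-truth frame.
    U, _, Vt = np.linalg.svd(E.T @ G)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # guard against reflection
    R = (U @ S @ Vt).T
    t = mu_g - R @ mu_e

    aligned = est_xyz @ R.T + t
    err = np.linalg.norm(aligned - gt_xyz, axis=1)
    return np.sqrt(np.mean(err ** 2))
```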

    Table 2. Precision/recall (%) per class of YOLOv3 trained on different data

    Training data      | Monitor   | Chair     | Potted plant | Bottle    | Mouse
    VOC                | 61.7/63.8 | 56.5/60.7 | 35.6/51.7    | –         | –
    VOC + our samples  | 71.2/76.1 | 68.6/73.2 | 54.1/62.9    | –         | –
    COCO               | 65.3/66.4 | 49.9/52.3 | 27.7/36.8    | 23.6/34.5 | 69.9/71.1
    COCO + our samples | 72.4/75.8 | 61.2/63.9 | 53.1/66.2    | 51.5/60.8 | 77.4/78.9
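The entries in Table 2 are per-class precision/recall pairs. As an illustration only (the greedy matching rule and the 0.5 IoU threshold are assumptions, not the authors' exact protocol), such numbers can be computed by matching each detection of a class to an unmatched ground-truth box:

```python
# Illustrative sketch: per-class precision/recall at a fixed IoU threshold.
# Boxes are [x1, y1, x2, y2]; detections are assumed already filtered by confidence.
def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def precision_recall(detections, ground_truth, iou_thr=0.5):
    """detections / ground_truth: lists of boxes of one class over the test set."""
    matched, tp = set(), 0
    for det in detections:
        # Greedily match the detection to the best still-unmatched ground-truth box.
        cand = [(iou(det, gt), j) for j, gt in enumerate(ground_truth) if j not in matched]
        if cand:
            best_iou, best_j = max(cand)
            if best_iou >= iou_thr:
                tp += 1
                matched.add(best_j)
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    return tp / (tp + fp + 1e-9), tp / (tp + fn + 1e-9)
```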
    [1] ROSINOL A, ABATE M, CHANG Yun, et al. Kimera: An open-source library for real-time metric-semantic localization and mapping[C]. 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 2020: 1689–1696.
    [2] QIN Tong, LI Peiliang, and SHEN Shaojie. VINS-mono: A robust and versatile monocular visual-inertial state estimator[J]. IEEE Transactions on Robotics, 2018, 34(4): 1004–1020. doi: 10.1109/TRO.2018.2853729
    [3] CAMPOS C, ELVIRA R, RODRÍGUEZ J J G, et al. ORB-SLAM3: An accurate open-source library for visual, visual–inertial, and multimap SLAM[J]. IEEE Transactions on Robotics, 2021, 37(6): 1874–1890. doi: 10.1109/TRO.2021.3075644
    [4] MUR-ARTAL R and TARDÓS J D. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255–1262. doi: 10.1109/TRO.2017.2705103
    [5] LONG Ran, RAUCH C, ZHANG Tianwei, et al. RigidFusion: Robot localisation and mapping in environments with large dynamic rigid objects[J]. IEEE Robotics and Automation Letters, 2021, 6(2): 3703–3710. doi: 10.1109/LRA.2021.3066375
    [6] JI Tete, WANG Chen, and XIE Lihua. Towards real-time semantic RGB-D SLAM in dynamic environments[C]. 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 2021: 11175–11181.
    [7] ZHANG Tianwei, ZHANG Huayan, LI Yang, et al. FlowFusion: Dynamic dense RGB-D SLAM based on optical flow[C]. 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 2020: 7322–7328.
    [8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788.
    [9] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37.
    [10] HE Kaiming, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]. The IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2980–2988.
    [11] BADRINARAYANAN V, KENDALL A, and CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481–2495. doi: 10.1109/TPAMI.2016.2644615
    [12] RUNZ M, BUFFIER M, and AGAPITO L. MaskFusion: Real-time recognition, tracking and reconstruction of multiple moving objects[C]. 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany, 2018: 10–20.
    [13] BESCOS B, FÁCIL J M, CIVERA J, et al. DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 4076–4083. doi: 10.1109/LRA.2018.2860039
    [14] YU Chao, LIU Zuxin, LIU Xinjun, et al. DS-SLAM: A semantic visual SLAM towards dynamic environments[C]. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 2018: 1168–1174.
    [15] LIU Yubao and MIURA J. RDS-SLAM: Real-time dynamic SLAM using semantic segmentation methods[J]. IEEE Access, 2021, 9: 23772–23785. doi: 10.1109/ACCESS.2021.3050617
    [16] MCCORMAC J, HANDA A, DAVISON A, et al. SemanticFusion: Dense 3D semantic mapping with convolutional neural networks[C]. 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017: 4628–4635.
    [17] FAN Yingchun, ZHANG Qichi, LIU Shaofeng, et al. Semantic SLAM with more accurate point cloud map in dynamic environments[J]. IEEE Access, 2020, 8: 112237–112252. doi: 10.1109/ACCESS.2020.3003160
    [18] CHENG Jiyu, WANG Chaoqun, MAI Xiaochun, et al. Improving dense mapping for mobile robots in dynamic environments based on semantic information[J]. IEEE Sensors Journal, 2021, 21(10): 11740–11747. doi: 10.1109/JSEN.2020.3023696
    [19] REDMON J and FARHADI A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.
    [20] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]. 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 740–755.
    [21] PHAM T T, EICH M, REID I, et al. Geometrically consistent plane extraction for dense indoor 3D maps segmentation[C]. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea (South), 2016: 4199–4204.
    [22] ZHONG Fangwei, WANG Sheng, ZHANG Ziqi, et al. Detect-SLAM: Making object detection and SLAM mutually beneficial[C]. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, USA, 2018: 1001–1010.
    [23] STURM J, ENGELHARD N, ENDRES F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 2012: 573–580.
    [24] HANDA A, WHELAN T, MCDONALD J, et al. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM[C]. 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 2014: 1524–1531.
    [25] KIM D H and KIM J H. Effective background model-based RGB-D dense visual odometry in a dynamic environment[J]. IEEE Transactions on Robotics, 2016, 32(6): 1565–1573. doi: 10.1109/TRO.2016.2609395
    [26] LI Shile and LEE D. RGB-D SLAM in dynamic environments using static point weighting[J]. IEEE Robotics and Automation Letters, 2017, 2(4): 2263–2270. doi: 10.1109/LRA.2017.2724759
    [27] DU Zhengjun, HUANG Shisheng, MU Taijiang, et al. Accurate dynamic SLAM using CRF-based long-term consistency[J]. IEEE Transactions on Visualization and Computer Graphics, 2022, 28(4): 1745–1757. doi: 10.1109/TVCG.2020.3028218
    [28] WHELAN T, SALAS-MORENO R F, GLOCKER B, et al. ElasticFusion: Real-time dense SLAM and light source estimation[J]. The International Journal of Robotics Research, 2016, 35(14): 1697–1716. doi: 10.1177/0278364916669237
    [29] SCONA R, JAIMEZ M, PETILLOT Y R, et al. StaticFusion: Background reconstruction for dense RGB-D SLAM in dynamic environments[C]. 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 2018: 3849–3856.
Publication history
  • Received: 2022-02-18
  • Revised: 2022-06-18
  • Published online: 2022-06-25
  • Issue published: 2023-05-10
