
Hybrid Scene Representation Method Integrating Neural Radiance Fields and Visual Simultaneous Localization and Mapping

ZHOU Fei, ZHOU Zhiyuan, ZHANG Yutong, XIE Yuanyuan

Citation: ZHOU Fei, ZHOU Zhiyuan, ZHANG Yutong, XIE Yuanyuan. Hybrid Scene Representation Method Integrating Neural Radiance Fields and Visual Simultaneous Localization and Mapping[J]. Journal of Electronics & Information Technology, 2024, 46(11): 4178-4187. doi: 10.11999/JEIT240316


doi: 10.11999/JEIT240316
Funds: The National Natural Science Foundation of China (62271096)
Details
    Author biographies:

    ZHOU Fei: Male, Professor. His research interests include image processing, machine vision, and information security

    ZHOU Zhiyuan: Male, Master's student. His research interests include image processing and visual SLAM

    ZHANG Yutong: Male, Master's student. His research interests include image processing

    XIE Yuanyuan: Male, Master's student. His research interests include image processing

    Corresponding author:

    ZHOU Zhiyuan, s220101223@stu.cqupt.edu.cn

  • CLC number: TN911.73; TP391.41

  • Abstract: Traditional SLAM (Simultaneous Localization And Mapping) systems built on explicit scene representations discretize the scene and are therefore ill-suited to reconstructing continuous scenes. This paper proposes an RGB-D SLAM system with a hybrid scene representation based on Neural Radiance Fields (NeRF). An extended explicit octree Signed Distance Function (SDF) prior represents the scene coarsely, while multi-resolution hash encoding represents it at several levels of detail, enabling fast initialization of scene geometry and making the geometry easier to learn. In addition, an appearance color decomposition method uses the view direction to split color into diffuse and specular components, yielding illumination-consistent and more realistic reconstructions. In experiments on the Replica and TUM RGB-D datasets, the scene reconstruction completion ratio on Replica reaches 93.65%, and localization accuracy surpasses Vox-Fusion by 87.50% on average on Replica and by 81.99% on average on TUM RGB-D.
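    As an illustration of the appearance color decomposition summarized above, here is a minimal sketch in PyTorch. It is a hypothetical two-branch network, not the paper's architecture: the module name `ColorDecomposition`, the feature dimensions, and the layer sizes are all assumptions. The idea it demonstrates is that the diffuse branch never sees the view direction, so view-dependent highlights must be explained by the specular branch.

```python
# Hypothetical sketch of appearance color decomposition (not the paper's exact network).
import torch
import torch.nn as nn

class ColorDecomposition(nn.Module):
    """Predicts a view-independent diffuse color and a view-dependent
    specular component; the rendered color is their sum."""
    def __init__(self, feat_dim: int = 32, hidden: int = 64):
        super().__init__()
        # Diffuse branch: no view direction input, so highlights
        # cannot be "baked" into the diffuse color.
        self.diffuse = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())
        # Specular branch: conditioned on the (unit) viewing direction.
        self.specular = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, geo_feat: torch.Tensor, view_dir: torch.Tensor):
        c_diffuse = self.diffuse(geo_feat)
        c_specular = self.specular(torch.cat([geo_feat, view_dir], dim=-1))
        # Final color clamped to the valid RGB range.
        return (c_diffuse + c_specular).clamp(0.0, 1.0), c_diffuse, c_specular
```

    Because only the specular branch is view-conditioned, the recovered diffuse component stays consistent across viewpoints, which is what makes the reconstruction illumination-consistent.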
  • Figure 1  System framework

    Figure 2  Octree SDF prior

    Figure 3  Extended voxel allocation

    Figure 4  Rendered images

    Figure 5  Reconstruction results on the Replica dataset

    Figure 6  Reconstructed object meshes on the Replica dataset

    Figure 7  Reconstruction results on the Apartment dataset

    Figure 8  Ablation of the octree SDF prior

    Figure 9  Ablation of extended voxel allocation

    Figure 10  Ablation of appearance color decomposition

    Table 1  Hyperparameter settings

    Hyperparameter | Value    | Hyperparameter | Value | Hyperparameter | Value | Hyperparameter | Value
    $L$            | 16       | $F_2$          | 2     | $M_f$          | 11    | $\alpha_3$     | 0.00001
    $T$            | $2^{16}$ | $N_t$          | 1024  | $\alpha_1$     | 5.0   | $\alpha_4$     | 1000
    $F_1$          | 1        | $M$            | 32    | $\alpha_2$     | 0.1   | $\alpha_5$     | 10
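    The first group of hyperparameters in Table 1 has the shape of an Instant-NGP-style multi-resolution hash encoding [20]; reading $L$ as the number of levels, $T$ as the hash table size, and $F_1$/$F_2$ as per-level feature widths is our assumption. The sketch below implements a bare-bones version of such an encoding with a single feature width; the resolutions, growth factor, and nearest-corner lookup (instead of trilinear interpolation over 8 corners) are illustrative simplifications.

```python
# Minimal sketch of a multi-resolution hash encoding, after Müller et al. [20].
import torch
import torch.nn as nn

class HashEncoding(nn.Module):
    def __init__(self, L=16, log2_T=16, F=2, base_res=16, growth=1.5):
        super().__init__()
        self.L, self.T, self.F = L, 2 ** log2_T, F
        self.res = [int(base_res * growth ** l) for l in range(L)]
        # One learnable feature table per resolution level.
        self.tables = nn.Parameter(torch.randn(L, self.T, F) * 1e-4)
        # Large primes for spatial hashing, as in Instant-NGP.
        self.register_buffer("primes", torch.tensor([1, 2654435761, 805459861]))

    def hash(self, grid_xyz: torch.Tensor) -> torch.Tensor:
        # XOR the prime-multiplied integer coordinates, fold into the table.
        h = (grid_xyz * self.primes).unbind(-1)
        return (h[0] ^ h[1] ^ h[2]) % self.T

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, 3) points in [0, 1]^3; nearest-corner lookup for brevity.
        feats = []
        for l, r in enumerate(self.res):
            idx = self.hash((x * r).long())
            feats.append(self.tables[l][idx])
        return torch.cat(feats, dim=-1)  # (N, L * F)
```

    With $L = 16$ levels and $F = 2$ features per level, each query point yields a 32-dimensional feature: coarse levels give the fast, smooth geometry initialization, while fine levels add detail.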

    Table 2  Reconstruction quality comparison on the Replica dataset

    Method     | Depth L1 (cm)↓ | Acc. (cm)↓ | Comp. (cm)↓ | Comp. Ratio (%)↑
    iMAP       | 4.64 | 3.62 | 4.93 | 80.50
    NICE-SLAM  | 3.53 | 2.85 | 3.00 | 89.33
    Vox-Fusion | 2.91 | 2.37 | 2.28 | 92.86
    vMAP       | 3.33 | 3.20 | 2.39 | 92.99
    DNS SLAM   | 3.16 | 2.76 | 2.74 | 91.73
    Ours       | 1.76 | 2.29 | 2.11 | 93.65

    Table 3  Trajectory error on the Replica dataset

    Method     | room0 | room1 | office0 | office1 | office3 | office4 | Avg.
    iMAP       | 70.00 | 4.53  | 2.32    | 1.74    | 58.40   | 2.62    | 23.27
    NICE-SLAM  | 1.69  | 2.04  | 0.99    | 0.90    | 3.97    | 3.08    | 2.11
    Vox-Fusion | 1.37  | 4.70  | 8.48    | 2.04    | 1.11    | 2.94    | 3.44
    vMAP       | /     | /     | /       | /       | /       | /       | /
    DNS SLAM   | 0.49  | 0.46  | 0.34    | 0.35    | 0.62    | 0.60    | 0.48
    Ours       | 0.41  | 0.52  | 0.31    | 0.37    | 0.46    | 0.53    | 0.43

    Table 4  Trajectory error on the TUM RGB-D dataset

    Method     | fr1/desk | fr2/xyz | fr3/office | Avg.
    iMAP       | 4.9 | 2.0 | 5.8  | 4.23
    NICE-SLAM  | 2.7 | 1.8 | 3.0  | 2.50
    Vox-Fusion | 3.5 | 1.5 | 26.0 | 10.33
    vMAP       | 2.6 | 1.6 | 3.0  | 2.40
    DNS SLAM   | /   | /   | /    | /
    Ours       | 2.0 | 1.5 | 2.1  | 1.86
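    Tables 3 and 4 report trajectory error in the style of the TUM RGB-D benchmark [27], whose standard measure is the RMSE of the Absolute Trajectory Error (ATE). A minimal sketch of that metric follows, assuming the estimated and ground-truth trajectories are already time-synchronized and rigidly aligned (the benchmark tooling performs the alignment with Horn's method first):

```python
# Minimal ATE RMSE sketch: both trajectories are assumed time-synchronized
# and already rigidly aligned, as in the TUM RGB-D evaluation tooling [27].
import numpy as np

def ate_rmse(est_xyz: np.ndarray, gt_xyz: np.ndarray) -> float:
    """Root-mean-square of per-frame translational errors; inputs are (N, 3)."""
    errors = np.linalg.norm(est_xyz - gt_xyz, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```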

    Table 5  Quantitative ablation results on the Replica dataset

    Setting                            | Acc. (cm)↓ | Comp. (cm)↓ | Comp. Ratio (%)↑
    w/o octree SDF prior               | 2.99 | 2.20 | 93.88
    w/o extended voxel allocation      | 2.88 | 2.10 | 95.05
    w/o appearance color decomposition | 2.36 | 1.95 | 95.36
    Ours (full)                        | 2.27 | 1.92 | 95.75

    Table 6  Threshold analysis of the point count for allocating a voxel

    Point count threshold | Acc. (cm)↓ | Comp. (cm)↓ | Comp. Ratio (%)↑
    5         | 5.00 | 2.08 | 93.11
    10 (Ours) | 2.27 | 1.92 | 95.75
    15        | 2.37 | 1.94 | 95.67
    20        | 2.29 | 1.93 | 95.63
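    Table 6 varies the minimum number of observed points that must fall inside a voxel before it is allocated, with 10 as the adopted value. Below is a minimal sketch of such a gating rule; the function name, the uniform voxel grid, and the data layout are assumptions for illustration (the paper's extended voxel allocation operates on its octree structure):

```python
# Hypothetical voxel-allocation gate: a voxel is only allocated once enough
# back-projected depth points land in it (threshold of 10, per Table 6).
from collections import Counter

def voxels_to_allocate(points, voxel_size=0.1, threshold=10):
    """points: iterable of (x, y, z); returns voxel indices meeting the threshold."""
    counts = Counter(
        (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        for x, y, z in points)
    return [v for v, n in counts.items() if n >= threshold]
```

    Gating allocation on a point count filters out voxels triggered by isolated depth outliers, which matches the accuracy gain Table 6 shows when moving from a threshold of 5 to 10.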

    Table 7  Loss function ablation on the Replica dataset

    Setting                     | Acc. (cm)↓ | Comp. (cm)↓ | Comp. Ratio (%)↑
    w/o $L_{\mathrm{rgb}}$      | 2.47 | 1.94 | 95.68
    w/o $L_{\mathrm{d}}$        | 2.48 | 1.93 | 95.54
    w/o $L_{\mathrm{specular}}$ | 2.45 | 1.95 | 95.64
    w/o $L_{\mathrm{sdf}}$      | 2.68 | 2.04 | 94.68
    w/o $L_{f_{\mathrm{s}}}$    | 2.30 | 1.95 | 95.46
    Ours (full)                 | 2.27 | 1.92 | 95.75
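    Table 7's five terms suggest the training objective is a weighted sum. As a sketch only: pairing the weights $\alpha_1, \dots, \alpha_5$ from Table 1 with these five terms in this order is our assumption, not something this excerpt confirms.

```python
# Hypothetical total loss: the mapping of Table 1's weights to Table 7's
# terms is assumed for illustration, not taken from the paper.
def total_loss(l_rgb, l_depth, l_specular, l_sdf, l_fs,
               a1=5.0, a2=0.1, a3=0.00001, a4=1000.0, a5=10.0):
    # Weighted sum of photometric, depth, specular, SDF, and free-space terms.
    return a1 * l_rgb + a2 * l_depth + a3 * l_specular + a4 * l_sdf + a5 * l_fs
```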

    Table 8  Performance comparison on the Replica dataset

    Method     | Avg. fps↑ | GPU Mem. (GB)↓ | Param. (M)↓
    iMAP       | 0.13 | 6.44  | 0.32
    NICE-SLAM  | 0.61 | 4.70  | 17.4
    Vox-Fusion | 0.74 | 21.22 | 0.87
    vMAP       | 4.03 | \     | 0.66
    DNS SLAM   | 0.13 | \     | \
    Ours       | 4.93 | 2.93  | 0.34
  • [1] HORNUNG A, WURM K M, BENNEWITZ M, et al. OctoMap: An efficient probabilistic 3D mapping framework based on octrees[J]. Autonomous Robots, 2013, 34(3): 189–206. doi: 10.1007/s10514-012-9321-0.
    [2] OLEYNIKOVA H, TAYLOR Z, FEHR M, et al. Voxblox: Incremental 3D Euclidean signed distance fields for on-board MAV planning[C]. 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, 2017: 1366–1373. doi: 10.1109/IROS.2017.8202315.
    [3] NEWCOMBE R A, IZADI S, HILLIGES O, et al. KinectFusion: Real-time dense surface mapping and tracking[C]. 2011 10th IEEE International Symposium on Mixed and Augmented Reality, Basel, Switzerland, 2011: 127–136. doi: 10.1109/ISMAR.2011.6092378.
    [4] FEHR M, FURRER F, DRYANOVSKI I, et al. TSDF-based change detection for consistent long-term dense reconstruction and dynamic object discovery[C]. 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, Singapore, 2017: 5237–5244. doi: 10.1109/ICRA.2017.7989614.
    [5] DAI A, NIEßNER M, ZOLLHÖFER M, et al. BundleFusion: Real-time globally consistent 3D reconstruction using on-the-fly surface reintegration[J]. ACM Transactions on Graphics (ToG), 2017, 36(4): 76a. doi: 10.1145/3072959.3054739.
    [6] NIEßNER M, ZOLLHÖFER M, IZADI S, et al. Real-time 3D reconstruction at scale using voxel hashing[J]. ACM Transactions on Graphics (ToG), 2013, 32(6): 169. doi: 10.1145/2508363.2508374.
    [7] KÄHLER O, PRISACARIU V A, REN C Y, et al. Very high frame rate volumetric integration of depth images on mobile devices[J]. IEEE Transactions on Visualization and Computer Graphics, 2015, 21(11): 1241–1250. doi: 10.1109/TVCG.2015.2459891.
    [8] WANG Kaixuan, GAO Fei, and SHEN Shaojie. Real-time scalable dense surfel mapping[C]. 2019 International Conference on Robotics and Automation (ICRA), Montreal, Canada, 2019: 6919–6925. doi: 10.1109/ICRA.2019.8794101.
    [9] WHELAN T, SALAS-MORENO R F, GLOCKER B, et al. ElasticFusion: Real-time dense SLAM and light source estimation[J]. The International Journal of Robotics Research, 2016, 35(14): 1697–1716. doi: 10.1177/0278364916669237.
    [10] RUETZ F, HERNÁNDEZ E, PFEIFFER M, et al. OVPC mesh: 3D free-space representation for local ground vehicle navigation[C]. 2019 International Conference on Robotics and Automation (ICRA), Montreal, Canada, 2019: 8648–8654. doi: 10.1109/ICRA.2019.8793503.
    [11] ZHONG Xingguang, PAN Yue, BEHLEY J, et al. SHINE-mapping: Large-scale 3D mapping using sparse hierarchical implicit neural representations[C]. 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 2023: 8371–8377. doi: 10.1109/ICRA48891.2023.10160907.
    [12] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: Representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2021, 65(1): 99–106. doi: 10.1145/3503250.
    [13] SUCAR E, LIU Shikun, ORTIZ J, et al. iMAP: Implicit mapping and positioning in real-time[C]. The 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 6209–6218. doi: 10.1109/ICCV48922.2021.00617.
    [14] ZHU Zihan, PENG Songyou, LARSSON V, et al. NICE-SLAM: Neural implicit scalable encoding for SLAM[C]. The 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 12776–12786. doi: 10.1109/CVPR52688.2022.01245.
    [15] YANG Xingrui, LI Hai, ZHAI Hongjia, et al. Vox-Fusion: Dense tracking and mapping with voxel-based neural implicit representation[C]. 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Singapore, 2022: 499–507. doi: 10.1109/ISMAR55827.2022.00066.
    [16] KONG Xin, LIU Shikun, TAHER M, et al. vMAP: Vectorised object mapping for neural field SLAM[C]. The 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 952–961. doi: 10.1109/CVPR52729.2023.00098.
    [17] LI Kunyi, NIEMEYER M, NAVAB N, et al. DNS SLAM: Dense neural semantic-informed SLAM[J]. arXiv preprint arXiv: 2312.00204, 2023. doi: 10.48550/arXiv.2312.00204.
    [18] WU Xingming, LIU Zimeng, TIAN Yuxin, et al. KN-SLAM: Keypoints and neural implicit encoding SLAM[J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73: 2512712. doi: 10.1109/TIM.2024.3378264.
    [19] WANG Haocheng, CAO Yanlong, WEI Xiaoyao, et al. Structerf-SLAM: Neural implicit representation SLAM for structural environments[J]. Computers & Graphics, 2024, 119: 103893. doi: 10.1016/j.cag.2024.103893.
    [20] MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM Transactions on Graphics (ToG), 2022, 41(4): 102. doi: 10.1145/3528223.3530127.
    [21] TANG Jiaxiang, ZHOU Hang, CHEN Xiaokang, et al. Delicate textured mesh recovery from NeRF via adaptive surface refinement[C]. The 2023 IEEE/CVF International Conference on Computer Vision, Paris, France, 2023: 17693–17703. doi: 10.1109/ICCV51070.2023.01626.
    [22] ZHANG Xiuming, SRINIVASAN P P, DENG Boyang, et al. NeRFactor: Neural factorization of shape and reflectance under an unknown illumination[J]. ACM Transactions on Graphics (ToG), 2021, 40(6): 237. doi: 10.1145/3478513.3480496.
    [23] WANG Peng, LIU Lingjie, LIU Yuan, et al. NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction[C]. The 35th International Conference on Neural Information Processing Systems, 2021: 2081.
    [24] YARIV L, GU Jiatao, KASTEN Y, et al. Volume rendering of neural implicit surfaces[C]. The 35th International Conference on Neural Information Processing Systems, 2021: 367.
    [25] AZINOVIĆ D, MARTIN-BRUALLA R, GOLDMAN D B, et al. Neural RGB-D surface reconstruction[C]. The 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 6280–6291. doi: 10.1109/CVPR52688.2022.00619.
    [26] STRAUB J, WHELAN T, MA Lingni, et al. The Replica dataset: A digital replica of indoor spaces[J]. arXiv preprint arXiv: 1906.05797, 2019. doi: 10.48550/arXiv.1906.05797.
    [27] STURM J, ENGELHARD N, ENDRES F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 2012: 573–580. doi: 10.1109/IROS.2012.6385773.
Publication history
  • Received: 2024-04-22
  • Revised: 2024-08-26
  • Published online: 2024-08-30
  • Issue date: 2024-11-10
