Geometrically Consistent Based Neural Radiance Field for Satellite City Scene Rendering and Digital Surface Model Generation in Sparse Viewpoints

SUN Wenbo, GAO Zhi, ZHANG Yichen, ZHU Jun, LI Yanzhang, LU Yao

孙文博, 高智, 张依晨, 朱军, 李衍璋, 路遥. 稀疏视角下基于几何一致性的神经辐射场卫星城市场景渲染与数字表面模型生成[J]. 电子与信息学报, 2025, 47(6): 1679-1689. doi: 10.11999/JEIT240898
Citation: SUN Wenbo, GAO Zhi, ZHANG Yichen, ZHU Jun, LI Yanzhang, LU Yao. Geometrically Consistent Based Neural Radiance Field for Satellite City Scene Rendering and Digital Surface Model Generation in Sparse Viewpoints[J]. Journal of Electronics & Information Technology, 2025, 47(6): 1679-1689. doi: 10.11999/JEIT240898


doi: 10.11999/JEIT240898 cstr: 32379.14.JEIT240898
Funds: Civilian Space Project (D040103)
Article information
    About the authors:

    SUN Wenbo: Male, Ph.D. candidate. Research interest: intelligent interpretation of remote sensing imagery

    GAO Zhi: Male, Professor. Research interests: intelligent interpretation of remote sensing imagery and computer vision

    ZHANG Yichen: Male, M.S. candidate. Research interest: intelligent interpretation of remote sensing imagery

    ZHU Jun: Male, Research Fellow. Research interest: intelligent interpretation of remote sensing imagery

    LI Yanzhang: Male, Engineer. Research interest: intelligent interpretation of remote sensing imagery

    LU Yao: Male, Assistant Research Fellow. Research interest: intelligent interpretation of remote sensing imagery

    Corresponding author: GAO Zhi, gaozhinus@gmail.com

  • CLC number: TN911.6; TP391.4

  • Abstract: Satellite remote sensing provides global, continuous, multi-scale observation of the Earth's surface. In recent years, Neural Radiance Fields (NeRF) have attracted wide attention for their continuous rendering and implicit reconstruction properties, showing good robustness in areas such as autonomous driving and large-scale scene reconstruction. However, NeRF has so far been of limited use in satellite Earth observation, mainly because its training requires a large number of multi-view images while satellite image acquisition is constrained. With sparse viewpoints, the model easily overfits the training views, degrading performance on novel views. To address this problem, this paper proposes a new method that introduces geometric constraints on scene depth and surface normals into NeRF training, aiming to improve rendering and Digital Surface Model (DSM) generation under sparse-view conditions. Extensive experiments on the DFC2019 dataset verify the effectiveness of the proposed method. The results show that the geometrically constrained NeRF achieves leading results on both novel view synthesis and DSM generation under sparse views, demonstrating its application potential for satellite observation scenes with sparse viewpoints.
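
    As a reading aid, the abstract's core idea (supervising NeRF's rendered geometry with depth and surface-normal constraints) might look roughly like the following. This is a minimal PyTorch sketch with hypothetical names throughout (rendered_depth, depth_prior, normal_prior and both loss helpers are illustrative), not the authors' released code:

        # Sketch of depth and surface-normal geometric constraints for NeRF
        # training; shapes and priors are assumptions, not the paper's code.
        import torch

        def rendered_depth(weights, z_vals):
            # Expected ray termination depth from volume-rendering weights.
            # weights: (N_rays, N_samples) compositing weights
            # z_vals:  (N_rays, N_samples) sample depths along each ray
            return (weights * z_vals).sum(dim=-1)

        def depth_loss(weights, z_vals, depth_prior, mask):
            # L_DS-style term: pull rendered depth toward a sparse depth prior
            # (e.g. depths triangulated from image correspondences).
            # mask is a 0/1 float tensor marking rays that have a prior.
            d = rendered_depth(weights, z_vals)
            return ((d - depth_prior).abs() * mask).sum() / mask.sum().clamp(min=1)

        def normals_from_depth(depth_map):
            # Finite-difference surface normals from a rendered (H, W) depth map.
            dzdx = (depth_map[:, 1:] - depth_map[:, :-1])[:-1, :]  # (H-1, W-1)
            dzdy = (depth_map[1:, :] - depth_map[:-1, :])[:, :-1]  # (H-1, W-1)
            n = torch.stack([-dzdx, -dzdy, torch.ones_like(dzdx)], dim=-1)
            return torch.nn.functional.normalize(n, dim=-1)

        def normal_consistency_loss(depth_map, normal_prior):
            # L_GC-style term: 1 - cosine similarity between depth-derived
            # normals and a (H-1, W-1, 3) normal prior, averaged over pixels.
            n = normals_from_depth(depth_map)
            return (1.0 - (n * normal_prior).sum(dim=-1)).mean()

    In such a setup the depth prior would typically come from sparse triangulated correspondences and the normal prior from the depth itself or a monocular estimator; the paper's actual constraint construction may differ.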
  • Figure 1. The baseline model used in this paper

    Figure 2. Novel view synthesis results of the compared methods on the JAX068 scene

    Figure 3. DSM generation results of the compared methods on the JAX068 scene

    Table 1. Details of the DFC2019 dataset

                              JAX 004     JAX 068     JAX 214     JAX 260
    Training/test samples     9/2         17/2        21/3        15/2
    Altitude range (m)        [–24, 1]    [–27, 30]   [–29, 73]   [–30, 13]

    Table 2. Experimental results of the compared methods on each AOI

    (a) JAX004
                      PSNR                      SSIM                      Altitude MAE (m)
    Training views    2       3       5         2       3       5         2       3       5
    NeRF[1]           12.168  13.376  14.279    0.421   0.474   0.493     5.273   5.137   4.493
    S-NeRF[2]         15.867  16.972  18.687    0.691   0.717   0.757     4.513   4.337   4.001
    Sat-NeRF[3]       20.619  20.831  22.186    0.794   0.794   0.814     2.917   2.907   2.738
    Baseline[29]      20.614  20.943  22.167    0.786   0.781   0.804     2.614   2.437   1.986
    Ours              21.365  21.619  23.681    0.801   0.812   0.848     2.414   2.198   1.602
    S2P[29]           –       –       –         –       –       –         1.932   1.804   1.437

    (b) JAX068
                      PSNR                      SSIM                      Altitude MAE (m)
    Training views    2       3       5         2       3       5         2       3       5
    NeRF[1]           7.329   7.845   8.772     0.411   0.420   0.456     9.132   6.467   4.545
    S-NeRF[2]         15.874  17.129  20.191    0.781   0.799   0.816     6.179   4.777   3.826
    Sat-NeRF[3]       16.536  18.425  21.215    0.803   0.810   0.843     5.842   4.387   3.411
    Baseline[29]      18.473  18.431  22.017    0.800   0.831   0.843     4.539   3.048   2.619
    Ours              19.837  19.493  23.271    0.807   0.833   0.843     3.316   2.531   2.249
    S2P[29]           –       –       –         –       –       –         2.883   2.314   2.032

    (c) JAX214
                      PSNR                      SSIM                      Altitude MAE (m)
    Training views    2       3       5         2       3       5         2       3       5
    NeRF[1]           10.892  12.543  12.632    0.421   0.433   0.487     9.964   9.176   8.721
    S-NeRF[2]         12.816  12.997  14.573    0.557   0.571   0.599     8.142   8.157   7.134
    Sat-NeRF[3]       15.834  16.214  17.691    0.804   0.806   0.814     5.101   5.147   4.721
    Baseline[29]      16.117  16.924  18.871    0.817   0.863   0.840     4.617   4.394   3.466
    Ours              18.991  20.505  21.317    0.820   0.867   0.831     3.217   2.496   2.408
    S2P[29]           –       –       –         –       –       –         3.014   2.837   2.414

    (d) JAX260
                      PSNR                      SSIM                      Altitude MAE (m)
    Training views    2       3       5         2       3       5         2       3       5
    NeRF[1]           8.762   8.712   10.753    0.318   0.346   0.384     7.816   7.591   6.622
    S-NeRF[2]         11.259  11.573  13.836    0.543   0.551   0.654     6.165   6.189   5.432
    Sat-NeRF[3]       16.007  15.715  16.314    0.641   0.644   0.694     3.857   3.423   2.995
    Baseline[29]      16.179  16.197  18.423    0.688   0.689   0.735     3.386   3.214   2.887
    Ours              16.728  17.150  19.934    0.690   0.690   0.742     3.016   2.926   2.119
    S2P[29]           –       –       –         –       –       –         3.423   2.873   2.317
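
    For reference, the two headline metrics above can be computed as follows; a minimal NumPy sketch under assumed conventions (images scaled to [0, 1], DSMs as metre-valued height rasters with NaN marking invalid pixels), not the paper's evaluation code:

        # Sketch of the Table 2 metrics: PSNR on rendered views and mean
        # absolute altitude error between a generated DSM and the reference.
        import numpy as np

        def psnr(rendered, target, max_val=1.0):
            # Peak signal-to-noise ratio; assumes pixel values in [0, max_val].
            mse = np.mean((rendered - target) ** 2)
            return 10.0 * np.log10(max_val ** 2 / mse)

        def altitude_mae(dsm, dsm_gt):
            # Mean absolute altitude error (m) over pixels valid in both DSMs.
            valid = ~(np.isnan(dsm) | np.isnan(dsm_gt))
            return np.abs(dsm[valid] - dsm_gt[valid]).mean()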

    Table 3. Ablation results with different loss terms on the JAX068 scene

                                  PSNR                      SSIM                      Altitude MAE (m)
    Training views                2       3       5         2       3       5         2       3       5
    Baseline                      18.473  18.431  22.017    0.800   0.831   0.843     4.539   3.048   2.619
    Baseline + $L_{\text{DS}}$    18.857  19.011  22.719    0.796   0.831   0.837     3.817   2.764   2.418
    Baseline + $L_{\text{GC}}$    19.017  19.082  22.947    0.804   0.831   0.840     3.782   2.716   2.298
    Baseline + $L_{\text{full}}$  19.837  19.493  23.271    0.807   0.833   0.843     3.316   2.531   2.249
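
    The ablation pattern above (each geometric term helps on its own, and the full loss helps most) is consistent with a combined objective of the form

    $L_{\text{full}} = L_{\text{RGB}} + \lambda_{\text{DS}} L_{\text{DS}} + \lambda_{\text{GC}} L_{\text{GC}}$,

    where $L_{\text{RGB}}$ is the photometric rendering loss and the weights $\lambda_{\text{DS}}$, $\lambda_{\text{GC}}$ are assumed hyperparameters, not values taken from the paper.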
  • [1] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: Representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2021, 65(1): 99–106. doi: 10.1145/3503250.
    [2] DERKSEN D and IZZO D. Shadow neural radiance fields for multi-view satellite photogrammetry[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Nashville, USA, 2021: 1152–1161. doi: 10.1109/CVPRW53098.2021.00126.
    [3] MARÍ R, FACCIOLO G, and EHRET T. Sat-NeRF: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using RPC cameras[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, New Orleans, USA, 2022: 1310–1320. doi: 10.1109/CVPRW56347.2022.00137.
    [4] LE SAUX B, YOKOYA N, HANSCH R, et al. 2019 Data fusion contest [technical committees][J]. IEEE Geoscience and Remote Sensing Magazine, 2019, 7(1): 103–105. doi: 10.1109/MGRS.2019.2893783.
    [5] TANCIK M, MILDENHALL B, WANG T, et al. Learned initializations for optimizing coordinate-based neural representations[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 2846–2855. doi: 10.1109/CVPR46437.2021.00287.
    [6] JAIN A, TANCIK M, and ABBEEL P. Putting NeRF on a diet: Semantically consistent few-shot view synthesis[C]. The IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 5865–5874. doi: 10.1109/ICCV48922.2021.00583.
    [7] HIRSCHMULLER H. Accurate and efficient stereo processing by semi-global matching and mutual information[C]. The IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 807–814. doi: 10.1109/CVPR.2005.56.
    [8] D'ANGELO P and KUSCHK G. Dense multi-view stereo from satellite imagery[C]. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 2012: 6944–6947. doi: 10.1109/IGARSS.2012.6352565.
    [9] BEYER R A, ALEXANDROV O, and MCMICHAEL S. The Ames Stereo Pipeline: NASA’s open source software for deriving and processing terrain data[J]. Earth and Space Science, 2018, 5(9): 537–548. doi: 10.1029/2018EA000409.
    [10] DE FRANCHIS C, MEINHARDT-LLOPIS E, MICHEL J, et al. An automatic and modular stereo pipeline for pushbroom images[C]. The ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Zürich, Switzerland, 2014: 49–56. doi: 10.5194/isprsannals-ii-3-49-2014.
    [11] GONG K and FRITSCH D. DSM generation from high resolution multi-view stereo satellite imagery[J]. Photogrammetric Engineering & Remote Sensing, 2019, 85(5): 379–387. doi: 10.14358/PERS.85.5.379.
    [12] HIRSCHMULLER H. Stereo processing by semiglobal matching and mutual information[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(2): 328–341. doi: 10.1109/TPAMI.2007.1166.
    [13] YAO Yao, LUO Zixin, LI Shiwei, et al. MVSNet: Depth inference for unstructured multi-view stereo[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 785–801. doi: 10.1007/978-3-030-01237-3_47.
    [14] GU Xiaodong, FAN Zhiwen, ZHU Siyu, et al. Cascade cost volume for high-resolution multi-view stereo and stereo matching[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 2492–2501. doi: 10.1109/CVPR42600.2020.00257.
    [15] CHENG Shuo, XU Zexiang, ZHU Shilin, et al. Deep stereo using adaptive thin volume representation with uncertainty awareness[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 2521–2531. doi: 10.1109/CVPR42600.2020.00260.
    [16] YANG Jiayu, MAO Wei, ALVAREZ J M, et al. Cost volume pyramid based depth inference for multi-view stereo[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 4876–4885. doi: 10.1109/CVPR42600.2020.00493.
    [17] GAO Jian, LIU Jin, and JI Shunping. A general deep learning based framework for 3D reconstruction from multi-view stereo satellite images[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2023, 195: 446–461. doi: 10.1016/j.isprsjprs.2022.12.012.
    [18] MARÍ R, FACCIOLO G, and EHRET T. Multi-date earth observation NeRF: The detail is in the shadows[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Vancouver, Canada, 2023: 2035–2045. doi: 10.1109/CVPRW59228.2023.00197.
    [19] SUN Wenbo, LU Yao, ZHANG Yichen, et al. Neural radiance fields for multi-view satellite photogrammetry leveraging intrinsic decomposition[C]. The IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 2024: 4631–4635. doi: 10.1109/IGARSS53475.2024.10641455.
    [20] YU A, YE V, TANCIK M, et al. pixelNeRF: Neural radiance fields from one or few images[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 4576–4585. doi: 10.1109/CVPR46437.2021.00455.
    [21] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. Proceedings of the 9th International Conference on Learning Representations, 2021.
    [22] XU Dejia, JIANG Yifan, WANG Peihao, et al. SinNeRF: Training neural radiance fields on complex scenes from a single image[C]. The 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 736–753. doi: 10.1007/978-3-031-20047-2_42.
    [23] AMIR S, GANDELSMAN Y, BAGON S, et al. Deep ViT features as dense visual descriptors[J]. arXiv preprint arXiv:2112.05814, 2021.
    [24] DENG Kangle, LIU A, ZHU Junyan, et al. Depth-supervised NeRF: Fewer views and faster training for free[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 12872–12881. doi: 10.1109/CVPR52688.2022.01254.
    [25] TRUONG P, RAKOTOSAONA M J, MANHARDT F, et al. SPARF: Neural radiance fields from sparse and noisy poses[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 4190–4200. doi: 10.1109/CVPR52729.2023.00408.
    [26] WANG Guangcong, CHEN Zhaoxi, LOY C C, et al. SparseNeRF: Distilling depth ranking for few-shot novel view synthesis[C]. The IEEE/CVF International Conference on Computer Vision, Paris, France, 2023: 9031–9042. doi: 10.1109/ICCV51070.2023.00832.
    [27] SHI Ruoxi, WEI Xinyue, WANG Cheng, et al. ZeroRF: Fast sparse view 360° reconstruction with zero pretraining[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 21114–21124. doi: 10.1109/CVPR52733.2024.01995.
    [28] ZHANG Lulin and RUPNIK E. SparseSat-NeRF: Dense depth supervised neural radiance fields for sparse satellite images[J]. arXiv preprint arXiv:2309.00277, 2023.
    [29] FACCIOLO G, DE FRANCHIS C, and MEINHARDT-LLOPIS E. Automatic 3D reconstruction from multi-date satellite images[C]. The IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, USA, 2017: 1542–1551. doi: 10.1109/CVPRW.2017.198.
    [30] DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 248–255. doi: 10.1109/CVPR.2009.5206848.
Publication history
  • Received: 2024-10-22
  • Revised: 2025-04-14
  • Published online: 2025-04-30
  • Issue date: 2025-06-30
