Depth Estimation Based on Semantic Guidance for Light Field Image

DENG Huiping, SHENG Zhichao, XIANG Sen, WU Jing

Citation: DENG Huiping, SHENG Zhichao, XIANG Sen, WU Jing. Depth Estimation Based on Semantic Guidance for Light Field Image[J]. Journal of Electronics & Information Technology, 2022, 44(8): 2940-2948. doi: 10.11999/JEIT210545

doi: 10.11999/JEIT210545
Article information
    About the authors:

    DENG Huiping: Female, born in 1983, Associate Professor. Her research interests include 3D video and image processing, machine learning, 3D information measurement, and video/image quality assessment

    SHENG Zhichao: Male, born in 1998, Master's student. His research interests include graphics and image processing, and depth estimation

    XIANG Sen: Male, born in 1989, Associate Professor. His research interests include 3D video and image processing, machine learning, 3D information measurement, and video/image quality assessment

    WU Jing: Female, born in 1967, Professor. Her research interests include image processing and pattern recognition, signal processing and multimedia communication, and detection technology and automation devices

    Corresponding author:

    SHENG Zhichao, 1603287154@qq.com

  • CLC number: TN911.73

  • Abstract: Depth estimation from light field images is a key technique in applications such as 3D reconstruction, autonomous driving, and object tracking. However, existing deep learning methods neglect the geometric characteristics of light field images and learn poorly in edge and weakly textured regions, so detail is lost in the estimated depth maps. This paper proposes a semantic-guided depth estimation network for light field images that exploits contextual information to handle such difficult regions. A semantic perception module with an encoder-decoder structure reconstructs spatial information to capture object boundaries better, and its spatial pyramid pooling structure uses dilated convolutions to enlarge the receptive field and mine multi-scale contextual information. A feature attention module performs local cross-channel interaction without dimensionality reduction, eliminating redundant information while effectively fusing multi-path features. Finally, a stacked hourglass network chains multiple hourglass modules, whose encoder-decoder structure yields richer contextual information. Experimental results on the HCI 4D light field dataset show that the proposed method achieves high accuracy and generalization ability, outperforms the compared depth estimation methods, and preserves edge details well.
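As a concrete illustration of the dilated-convolution spatial pyramid pooling mentioned in the abstract, the PyTorch sketch below shows the general pattern: parallel 3×3 convolutions at increasing dilation rates enlarge the receptive field, and a 1×1 convolution fuses the multi-scale branches. The branch count, dilation rates, and channel widths are assumptions, not the paper's actual settings.

```python
import torch
import torch.nn as nn

class SpatialPyramidPooling(nn.Module):
    """Parallel dilated convolutions mine multi-scale context; a 1x1
    convolution fuses the concatenated branches. Illustrative settings only."""
    def __init__(self, in_ch=64, out_ch=64, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # padding == dilation keeps the spatial size unchanged
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

# Example: multi-scale context from a 64-channel feature map
feats = torch.randn(1, 64, 128, 128)
print(SpatialPyramidPooling()(feats).shape)  # torch.Size([1, 64, 128, 128])
```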
  • Figure 1  Overall network architecture

    Figure 2  Network structure of the semantic perception module SP_module

    Figure 3  Network structure of the feature attention module FA_module (a channel-attention sketch follows this figure list)

    Figure 4  Network structure of the stacked hourglass module SH_module (an hourglass sketch also follows this list)

    Figure 5  Depth maps and bad-pixel maps for four scenes of the test dataset

    Figure 6  Local zoom-ins of the experimental results
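The feature attention module of Fig. 3 performs local cross-channel interaction without dimensionality reduction, which matches the efficient channel attention (ECA) pattern rather than the squeeze-and-excitation bottleneck of [19]. The PyTorch sketch below assumes that pattern; the 1D kernel size k = 3 is illustrative, not the paper's setting.

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """ECA-style channel attention: global average pooling yields one
    descriptor per channel; a 1D convolution over neighboring channels
    (no dimensionality reduction) produces the channel weights."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                      # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                 # squeeze -> (B, C)
        y = self.conv(y.unsqueeze(1))          # cross-channel interaction -> (B, 1, C)
        w = torch.sigmoid(y).squeeze(1)        # channel weights -> (B, C)
        return x * w[:, :, None, None]         # reweight the feature map

feats = torch.randn(2, 64, 32, 32)
print(FeatureAttention()(feats).shape)         # torch.Size([2, 64, 32, 32])
```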
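Similarly, the stacked hourglass of Fig. 4 chains several encoder-decoder units so that each refines the output of the previous one. A structural sketch only, assuming one downsampling level, 64 channels, and three hourglasses (none of which are stated on this page):

```python
import torch
import torch.nn as nn

class Hourglass(nn.Module):
    """One encoder-decoder hourglass: downsample to gather context,
    upsample back, with a residual skip preserving spatial detail."""
    def __init__(self, ch=64):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.up = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True))

    def forward(self, x):
        return x + self.up(self.down(x))       # skip around the bottleneck

class StackedHourglass(nn.Module):
    """Several hourglasses in series, each refining the previous output."""
    def __init__(self, ch=64, n=3):
        super().__init__()
        self.stack = nn.Sequential(*[Hourglass(ch) for _ in range(n)])

    def forward(self, x):
        return self.stack(x)

feats = torch.randn(1, 64, 64, 64)
print(StackedHourglass()(feats).shape)         # torch.Size([1, 64, 64, 64])
```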

    Table 1  MSE comparison

    Algorithm | Boxes | Cotton | Dino   | Sideboard | Backgammon | Pyramids | Stripes | Avg
    LF        | 17.43 | 9.168  | 1.163  | 5.071     | 13.01      | 0.273    | 17.45   | 9.08
    SPO       | 9.107 | 1.313  | 0.311  | 1.024     | 4.587      | 0.043    | 6.955   | 3.33
    LF_OCC    | 9.593 | 1.074  | 0.9441 | 2.073     | 22.78      | 0.077    | 7.942   | 6.35
    CAE       | 8.427 | 1.506  | 0.382  | 0.876     | 6.074      | 0.048    | 3.556   | 2.98
    FSNET     | 11.82 | 0.881  | 0.893  | 1.961     | 6.585      | 0.015    | 1.798   | 3.42
    EPINET    | 5.904 | 0.282  | 0.169  | 0.849     | 2.579      | 0.012    | 0.286   | 1.44
    Proposed  | 4.739 | 0.259  | 0.125  | 0.615     | 1.541      | 0.007    | 0.516   | 1.12

    Table 2  BP comparison

    Algorithm | Boxes | Cotton | Dino  | Sideboard | Backgammon | Pyramids | Stripes | Avg
    LF        | 23.02 | 7.829  | 19.03 | 21.98     | 5.516      | 12.35    | 35.74   | 17.9
    SPO       | 15.89 | 2.594  | 2.184 | 9.297     | 3.781      | 0.861    | 14.98   | 7.08
    LF_OCC    | 26.52 | 6.218  | 14.91 | 18.49     | 19.07      | 3.172    | 18.41   | 15.2
    CAE       | 17.88 | 3.369  | 4.968 | 9.845     | 3.924      | 1.681    | 7.872   | 7.07
    FSNET     | 14.34 | 0.575  | 2.526 | 5.402     | 4.341      | 0.288    | 3.722   | 4.45
    EPINET    | 12.24 | 0.543  | 1.319 | 4.921     | 2.231      | 0.283    | 1.063   | 3.23
    Proposed  | 12.32 | 0.342  | 1.346 | 3.941     | 1.722      | 0.231    | 2.359   | 3.18
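For reference, on the HCI benchmark MSE is conventionally reported as the mean squared disparity error multiplied by 100, and BP (BadPix) as the percentage of pixels whose absolute disparity error exceeds a threshold, 0.07 by default; the exact threshold behind Table 2 is not stated on this page, so it is an assumption here. A minimal NumPy sketch under those conventions:

```python
import numpy as np

def mse_x100(est, gt):
    """Mean squared disparity error scaled by 100 (HCI benchmark convention)."""
    return 100.0 * np.mean((est - gt) ** 2)

def badpix(est, gt, thresh=0.07):
    """Percentage of pixels whose absolute disparity error exceeds `thresh`.
    0.07 is the HCI default; the threshold used for Table 2 is assumed."""
    return 100.0 * np.mean(np.abs(est - gt) > thresh)

# Example on synthetic disparity maps
gt = np.random.rand(512, 512)
est = gt + np.random.normal(scale=0.05, size=gt.shape)
print(mse_x100(est, gt), badpix(est, gt))
```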

    Table 3  Runtime comparison of the algorithms (s)

    Algorithm | Boxes | Cotton | Dino  | Sideboard
    LF        | 962.1 | 984.5  | 1130  | 987.4
    SPO       | 2128  | 2025   | 2024  | 2073
    LF_OCC    | 10408 | 6325   | 10099 | 13531
    CAE       | 826.9 | 814.2  | 832.8 | 861.5
    FSNET     | 85.05 | 84.91  | 85.63 | 84.78
    EPINET    | 2.031 | 2.036  | 2.035 | 2.046
    Proposed  | 6.001 | 5.874  | 5.981 | 5.856

    Table 4  Quantitative comparison for the module ablation study

    SP_block | FA_block | SH_block | MSE  | BP
    ×        | ×        | ×        | 1.47 | 4.07
    √        | ×        | ×        | 1.36 | 3.53
    √        | √        | ×        | 1.21 | 3.67
    √        | √        | √        | 1.12 | 3.18
  • [1] MENG N, LI K, LIU J Z, et al. Light field view synthesis via aperture disparity and warping confidence map[J]. IEEE Transactions on Image Processing, 2021, 30: 3908–3921. doi: 10.1109/TIP.2021.3066293
    [2] ZHANG M, JI W, PIAO Y R, et al. LFNet: Light field fusion network for salient object detection[J]. IEEE Transactions on Image Processing, 2020, 29: 6276–6287. doi: 10.1109/TIP.2020.2990341
    [3] LI X, YANG Y B, ZHAO Q J, et al. Spatial pyramid based graph reasoning for semantic segmentation[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 8947–8956.
    [4] WU Yingchun, WANG Yumei, WANG Anhong, et al. Light field all-in-focus image fusion based on edge enhanced guided filtering[J]. Journal of Electronics & Information Technology, 2020, 42(9): 2293–2301. doi: 10.11999/JEIT190723
    [5] JEON H G, PARK J, CHOE G, et al. Accurate depth map estimation from a lenslet light field camera[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1547–1555.
    [6] CHEN C, LIN H T, YU Z, et al. Light field stereo matching using bilateral statistics of surface cameras[C]. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 1518–1525.
    [7] WANNER S and GOLDLUECKE B. Globally consistent depth labeling of 4D light fields[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 41–48.
    [8] ZHANG S, SHENG H, LI C, et al. Robust depth estimation for light field via spinning parallelogram operator[J]. Computer Vision and Image Understanding, 2016, 145: 148–159. doi: 10.1016/j.cviu.2015.12.007
    [9] TAO M W, SRINIVASAN P P, MALIK J, et al. Depth from shading, defocus, and correspondence using light-field angular coherence[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1940–1948.
    [10] WANG T C, EFROS A A, and RAMAMOORTHI R. Occlusion-aware depth estimation using light-field cameras[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3487–3495.
    [11] WILLIEM W and PARK I K. Robust light field depth estimation for noisy scene with occlusion[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4396–4404.
    [12] HEBER S, YU W, and POCK T. Neural EPI-volume networks for shape from light field[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2271–2279.
    [13] LUO Y X, ZHOU W H, FANG J P, et al. EPI-patch based convolutional neural network for depth estimation on 4D light field[C]. 24th International Conference on Neural Information Processing, Guangzhou, China, 2017: 642–652.
    [14] SHIN C, JEON H G, YOON Y, et al. EPINET: A fully-convolutional neural network using epipolar geometry for depth from light field images[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 4748–4757.
    [15] TSAI Y J, LIU Y L, OUHYOUNG M, et al. Attention-based view selection networks for light-field disparity estimation[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12095–12103. doi: 10.1609/aaai.v34i07.6888
    [16] ZHOU W H, ZHOU E C, YAN Y X, et al. Learning depth cues from focal stack for light field depth estimation[C]. 2019 IEEE International Conference on Image Processing, Taipei, China, 2019: 1074–1078.
    [17] SHI J L, JIANG X R, and GUILLEMOT C. A framework for learning depth from a flexible subset of dense and sparse light field views[J]. IEEE Transactions on Image Processing, 2019, 28(12): 5867–5880. doi: 10.1109/TIP.2019.2923323
    [18] GUO C L, JIN J, HOU J H, et al. Accurate light field depth estimation via an occlusion-aware network[C]. 2020 IEEE International Conference on Multimedia and Expo, London, UK, 2020: 1–6.
    [19] HU J, SHEN L, and SUN G. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
    [20] YE J W, WANG X C, JI Y X, et al. Amalgamating filtered knowledge: Learning task-customized student from multi-task teachers[C]. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, Macao, China, 2019: 4128–4134.
    [21] HONAUER K, JOHANNSEN O, KONDERMANN D, et al. A dataset and evaluation methodology for depth estimation on 4D light fields[C]. 13th Asian Conference on Computer Vision, Taipei, China, 2016: 19–34.
Publication history
  • Received: 2021-06-08
  • Revised: 2022-03-10
  • Accepted: 2022-03-15
  • Published online: 2022-03-21
  • Issue published: 2022-08-17
