
Image Harmonization via Multi-scale Feature Calibration

GAO Chenqiang, XIE Chengjuan, YANG Feng, ZHAO Yue, LI Pengcheng

Citation: GAO Chenqiang, XIE Chengjuan, YANG Feng, ZHAO Yue, LI Pengcheng. Image Harmonization via Multi-scale Feature Calibration[J]. Journal of Electronics & Information Technology, 2022, 44(4): 1495-1502. doi: 10.11999/JEIT210159


doi: 10.11999/JEIT210159
Funds: The National Natural Science Foundation of China (62176035, 61906025), Chongqing Research Program of Basic Research and Frontier Technology (cstc2020jcyj-msxmX0835, cstc2021jcyj-bsh0155), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201900607, KJZD-K202100606, KJQN202000647, KJQN202100646)
Details
    Author biographies:

    GAO Chenqiang: male, born in 1981, Ph.D., professor and doctoral supervisor. His research interests include image processing, computer vision, and pattern recognition

    XIE Chengjuan: female, born in 1997, master's student. Her research interests include image processing and image composition

    YANG Feng: female, born in 1990, Ph.D., lecturer and master's supervisor. Her research interests include deep learning, remote sensing image processing, and dynamic texture analysis

    ZHAO Yue: female, born in 1988, Ph.D., lecturer. Her research interests include image processing and machine learning

    LI Pengcheng: male, born in 1995, Ph.D. student. His research interests include intelligent medical image analysis, computer vision, and pattern recognition

    Corresponding author:

    GAO Chenqiang, gaocq@cqupt.edu.cn

  • 1) https://www.flickr.com/explore
  • CLC number: TN911.73; TP391

  • Abstract: Image composition is an important operation in image processing, yet the appearance mismatch between the foreground and background regions of a composite image makes it look unrealistic. Image harmonization, a crucial step of image composition, adjusts the appearance of the foreground region to be consistent with the background so that the composite looks visually realistic. However, existing methods consider only the overall appearance discrepancy between foreground and background, ignoring local variations in brightness, which leaves the overall illumination of the image inharmonious. To address this, a new Multi-scale Feature Calibration Module (MFCM) is proposed to learn subtle feature differences among receptive fields of different scales. Based on this module, a new encoder is designed to learn both the foreground-background appearance discrepancy and the local brightness variations of the composite image; a decoder then reconstructs the image, and a regression loss normalized over the foreground region guides the network to adjust the appearance of the foreground. Experiments on the widely used iHarmony4 dataset show that the proposed method outperforms the current state-of-the-art methods, verifying its effectiveness.
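The abstract mentions a regression loss normalized over the foreground region. A plausible sketch of such a loss is shown below: the squared error is summed over the whole image but divided by the foreground area, so composites with small foregrounds are not under-weighted (similar in spirit to the foreground-normalized MSE of [2]). This is an illustrative sketch, not the paper's exact formulation.

```python
import numpy as np

def foreground_normalized_loss(pred, target, mask, eps=1e-6):
    """Regression loss normalized by the foreground area.

    pred, target: harmonized / ground-truth images, shape (H, W, C), values in [0, 1]
    mask: binary foreground mask, shape (H, W), 1 marks the composited foreground

    The squared error is summed over the whole image but divided by the
    number of foreground pixels, so small foregrounds are not under-weighted.
    """
    sq_err = (pred - target) ** 2
    fg_area = mask.sum() + eps  # eps guards against an empty mask
    return sq_err.sum() / (fg_area * pred.shape[-1])
```

Because the background of a composite already matches the real image, almost all of the error lives in the foreground, so this normalization mainly rescales the loss by foreground size.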
  • Figure 1  Architecture of the proposed image harmonization network

    Figure 2  Architecture of the multi-scale feature calibration module

    Figure 3  Qualitative comparison of different methods on the iHarmony4 test set

    Table 1  Performance comparison of different methods on the iHarmony4 test set

    | Method      | HFlickr (MSE / PSNR) | Hday2night (MSE / PSNR) | HCOCO (MSE / PSNR) | HAdobe5k (MSE / PSNR) | iHarmony4 (MSE / PSNR) |
    |-------------|----------------------|-------------------------|--------------------|-----------------------|------------------------|
    | DIH [1]     | 163.38 / 29.55       | 82.34 / 34.62           | 51.85 / 34.69      | 92.65 / 32.28         | 76.77 / 33.41          |
    | S2AM [3]    | 143.45 / 30.03       | 76.61 / 34.50           | 41.07 / 35.47      | 63.40 / 33.77         | 59.67 / 34.35          |
    | DoveNet [4] | 133.14 / 30.21       | 54.05 / 35.18           | 36.72 / 35.83      | 52.32 / 34.34         | 52.36 / 34.75          |
    | FSRIH [2]   | 86.20 / 32.55        | 47.18 / 37.12           | 19.30 / 38.43      | 31.33 / 36.01         | 30.79 / 37.05          |
    | Ours        | 72.05 / 33.11        | 46.67 / 36.92           | 17.62 / 38.80      | 27.55 / 37.31         | 27.13 / 37.69          |
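Table 1 reports both MSE and PSNR. As a reminder of how the two columns relate, a minimal PSNR sketch (assuming images scaled to [0, 1]; the paper may compute it on the 0–255 range):

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB; higher means the harmonized
    image is closer to the ground-truth real image."""
    mse = ((np.asarray(pred, dtype=np.float64) - target) ** 2).mean()
    return 10.0 * np.log10(max_val ** 2 / mse)
```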

    Table 2  MSE and fMSE of different methods on the iHarmony4 test set, by foreground-area ratio

    | Method      | 0~5% (MSE / fMSE) | 5%~15% (MSE / fMSE) | 15%~100% (MSE / fMSE) | 0~100% (MSE / fMSE) |
    |-------------|-------------------|---------------------|-----------------------|---------------------|
    | DIH [1]     | 18.92 / 799.17    | 64.23 / 725.86      | 228.86 / 768.89       | 76.77 / 773.18      |
    | S2AM [3]    | 15.09 / 623.11    | 48.33 / 540.54      | 177.62 / 592.83       | 59.67 / 594.67      |
    | DoveNet [4] | 14.03 / 591.88    | 44.90 / 504.42      | 152.07 / 505.82       | 52.36 / 549.96      |
    | FSRIH [2]   | 8.48 / 371.47     | 25.85 / 294.64      | 89.68 / 296.80        | 30.79 / 334.89      |
    | Ours        | 7.68 / 341.13     | 23.15 / 264.23      | 78.06 / 256.03        | 27.13 / 302.25      |
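Table 2 additionally reports fMSE, i.e. the squared error averaged over foreground pixels only, so a large unchanged background cannot dilute the error the way plain MSE allows. A minimal sketch of the difference between the two metrics (a common definition; the paper's exact normalization is not reproduced here):

```python
import numpy as np

def fmse(pred, target, mask):
    """Foreground MSE: squared error averaged over foreground pixels only,
    so a large unchanged background cannot dilute the error."""
    fg = mask.astype(bool)
    return ((pred[fg] - target[fg]) ** 2).mean()
```

On a composite where only one of four pixels differs from the ground truth, plain MSE averages the error down while fMSE reports it in full, which is why fMSE values in Table 2 are much larger for small foregrounds.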

    Table 3  Ablation results for the components of the multi-scale feature calibration module

    | Variant | HFlickr (MSE / PSNR) | Hday2night (MSE / PSNR) | HCOCO (MSE / PSNR) | HAdobe5k (MSE / PSNR) | iHarmony4 (MSE / PSNR) |
    |---------|----------------------|-------------------------|--------------------|-----------------------|------------------------|
    | RF=3    | 83.39 / 32.54        | 56.48 / 36.60           | 20.63 / 38.26      | 32.36 / 36.66         | 31.71 / 37.12          |
    | RF=5    | 89.33 / 32.29        | 64.55 / 36.34           | 22.21 / 38.05      | 35.95 / 36.37         | 34.49 / 36.89          |
    | MFE     | 76.52 / 32.85        | 55.73 / 36.69           | 19.02 / 38.49      | 29.13 / 37.06         | 29.06 / 37.41          |
    | MFE+FC  | 72.05 / 33.11        | 46.67 / 36.92           | 17.62 / 38.80      | 27.55 / 37.31         | 27.13 / 37.69          |
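Table 3 ablates multi-scale feature extraction (MFE) and feature calibration (FC) against single receptive fields (RF=3, RF=5). One common way to combine branches with different receptive fields is attention-weighted fusion in the spirit of selective kernel networks [6]; the toy sketch below assumes that style and is not the paper's exact module (the `proj` weights are hypothetical stand-ins for learned parameters):

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_multiscale(branch3, branch5, proj):
    """Attention-weighted fusion of two branches with different receptive fields.

    branch3, branch5: (C, H, W) features from e.g. 3x3 and 5x5 branches
    proj: (2, C) hypothetical learned weights producing per-branch logits
    """
    pooled = (branch3 + branch5).mean(axis=(1, 2))  # (C,) cross-scale summary
    logits = proj * pooled                          # (2, C): one logit per branch/channel
    attn = softmax(logits, axis=0)                  # branch weights sum to 1 per channel
    return attn[0][:, None, None] * branch3 + attn[1][:, None, None] * branch5
```

The per-channel softmax lets the network emphasize the small or the large receptive field depending on content, which is the behavior the MFE rows in Table 3 benefit from relative to a fixed RF.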

    Table 4  Results of different feature calibration manners

    | Calibration manner              | HFlickr (MSE / PSNR) | Hday2night (MSE / PSNR) | HCOCO (MSE / PSNR) | HAdobe5k (MSE / PSNR) | iHarmony4 (MSE / PSNR) |
    |---------------------------------|----------------------|-------------------------|--------------------|-----------------------|------------------------|
    | Independent feature calibration | 72.55 / 32.99        | 49.55 / 37.20           | 17.70 / 38.75      | 28.69 / 37.23         | 27.61 / 37.63          |
    | Neighborhood feature interaction| 72.05 / 33.11        | 46.67 / 36.92           | 17.62 / 38.80      | 27.55 / 37.31         | 27.13 / 37.69          |
    | Global feature interaction      | 75.33 / 32.93        | 52.82 / 36.66           | 18.37 / 38.65      | 27.51 / 37.07         | 28.02 / 37.51          |

    表  5  以不同的跨通道范围进行特征校准的实验结果

    跨通道范围$ k $HFlickrHday2nightHCOCOHAdobe5kiHarmony4
    MSEPSNRMSEPSNRMSEPSNRMSEPSNRMSEPSNR
    375.5933.0848.2937.1018.2438.6827.6937.1527.9537.58
    571.6432.9454.5536.8917.9438.6830.7837.1728.3537.57
    772.0533.1146.6736.9217.6238.8027.5537.3127.1337.69
    974.6632.8552.1436.8218.1638.5830.5436.6828.7037.35
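Table 5 varies the cross-channel range k of the neighborhood feature interaction. A common realization of such interaction is a 1-D convolution of width k across the channel descriptor, so each channel's gate depends only on its k nearest channels. The sketch below assumes that style (the `weights` kernel stands in for learned parameters) and is not the paper's exact implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neighborhood_channel_calibration(feat, weights, k=7):
    """Gate each channel using only its k nearest channel neighbours.

    feat:    feature map, shape (C, H, W)
    weights: hypothetical learned 1-D kernel of length k, shared by all channels
    """
    assert k % 2 == 1 and weights.shape == (k,)
    desc = feat.mean(axis=(1, 2))               # global average pooling -> (C,)
    padded = np.pad(desc, k // 2, mode="edge")  # pad the channel axis
    # 1-D convolution across channels: channel c sees channels c-k//2 .. c+k//2
    mixed = np.array([padded[c:c + k] @ weights for c in range(desc.size)])
    gate = sigmoid(mixed)                       # per-channel attention in (0, 1)
    return feat * gate[:, None, None]
```

Growing k moves this from nearly independent calibration toward global interaction, matching the trend in Tables 4 and 5 where a moderate neighborhood (k=7) works best.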
  • [1] TSAI Y H, SHEN Xiaohui, LIN Zhe, et al. Deep image harmonization[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 2799–2807.
    [2] SOFIIUK K, POPENOVA P, and KONUSHIN A. Foreground-aware semantic representations for image harmonization[EB/OL]. https://arxiv.org/abs/2006.00809, 2020.
    [3] CUN Xiaodong and PUN C M. Improving the harmony of the composite image by spatial-separated attention module[J]. IEEE Transactions on Image Processing, 2020, 29: 4759–4771. doi: 10.1109/TIP.2020.2975979
    [4] CONG Wenyan, ZHANG Jianfu, NIU Li, et al. DoveNet: Deep image harmonization via domain verification[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 8391–8400.
    [5] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, USA, 2015: 1–9.
    [6] LI Xiang, WANG Wenhai, HU Xiaolin, et al. Selective kernel networks[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 510–519.
    [7] SUNKAVALLI K, JOHNSON M K, MATUSIK W, et al. Multi-scale image harmonization[J]. ACM Transactions on Graphics, 2010, 29(4): 1–10. doi: 10.1145/1778765.1778862
    [8] ZHU Junyan, KRÄHENBÜHL P, SHECHTMAN E, et al. Learning a discriminative model for the perception of realism in composite images[C]. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015: 3943–3951.
    [9] ISOLA P, ZHU Junyan, ZHOU Tinghui, et al. Image-to-image translation with conditional adversarial networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 5967–5976.
    [10] YIN Mengxiao, LIN Zhenfeng, and YANG Feng. Adaptive multi-scale information fusion based on dynamic receptive field for image-to-image translation[J]. Journal of Electronics & Information Technology, 2021, 43(8): 2386–2394. doi: 10.11999/JEIT200675
    [11] LEDIG C, THEIS L, HUSZÁR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 105–114.
    [12] WANG Xintao, YU Ke, WU Shixiang, et al. ESRGAN: Enhanced super-resolution generative adversarial networks[C]. Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 63–79.
    [13] XIONG Wei, YU Jiahui, LIN Zhe, et al. Foreground-aware image inpainting[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 5833–5841.
    [14] YI Shi, WU Zhijuan, ZHU Jingming, et al. Motion defocus infrared image restoration based on multi-scale generative adversarial network[J]. Journal of Electronics & Information Technology, 2020, 42(7): 1766–1773. doi: 10.11999/JEIT190495
    [15] KOTOVENKO D, SANAKOYEU A, MA Pingchuan, et al. A content transformation block for image style transfer[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 10024–10033.
    [16] ZHANG Jinglei and HOU Yawei. Image-to-image translation based on improved cycle-consistent generative adversarial network[J]. Journal of Electronics & Information Technology, 2020, 42(5): 1216–1222. doi: 10.11999/JEIT190407
    [17] ANOKHIN I, SOLOVEV P, KORZHENKOV D, et al. High-resolution daytime translation without domain labels[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 7485–7494.
    [18] HE Mingming, LIAO Jing, CHEN Dongdong, et al. Progressive color transfer with dense semantic correspondences[J]. ACM Transactions on Graphics, 2019, 38(2): 1–18. doi: 10.1145/3292482
    [19] ULYANOV D, VEDALDI A, and LEMPITSKY V. Instance normalization: The missing ingredient for fast stylization[EB/OL]. https://arxiv.org/abs/1607.08022, 2017.
    [20] IOFFE S and SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]. The 32nd International Conference on International Conference on Machine Learning - Volume 37, Lille, France, 2015: 448–456.
    [21] HU Jie, SHEN Li, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011–2023. doi: 10.1109/TPAMI.2019.2913372
    [22] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. The 15th European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 3–19.
    [23] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]. The 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014: 740–755.
    [24] BYCHKOVSKY V, PARIS S, CHAN E, et al. Learning photographic global tonal adjustment with a database of input/output image pairs[C]. CVPR 2011, Colorado Springs, USA, 2011: 97–104.
    [25] ZHOU Hao, SATTLER T, and JACOBS D W. Evaluating local features for day-night matching[C]. The 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 2016: 724–736.
    [26] DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 248–255.
Figures(3) / Tables(5)
Metrics
  • Article views: 1049
  • Full-text HTML views: 522
  • PDF downloads: 86
  • Citations: 0
Publication history
  • Received: 2021-02-25
  • Revised: 2021-08-22
  • Available online: 2021-09-08
  • Published: 2022-04-18
