SR-FDN: A Frequency-Domain Diffusion Network for Image Detail Restoration in Super-Resolution

LI Xiumei, DING Linlin, SUN Junmei, BAI Huang

Citation: LI Xiumei, DING Linlin, SUN Junmei, BAI Huang. SR-FDN: A Frequency-Domain Diffusion Network for Image Detail Restoration in Super-Resolution[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250224

doi: 10.11999/JEIT250224 cstr: 32379.14.JEIT250224
Article information
    Author biographies:

    LI Xiumei: female, Ph.D., professor. Her research interests include time-frequency analysis and its applications, compressed sensing, and deep learning

    DING Linlin: female, master's student. Her research interests include deep learning and image super-resolution reconstruction

    SUN Junmei: female, Ph.D., associate professor. Her research interests include deep learning and intelligent software systems

    BAI Huang: male, Ph.D., associate professor. His research interests include signal processing, matrix analysis, and deep learning

    Corresponding author:

    BAI Huang, baihuang@hznu.edu.cn

  • CLC number: TP391.41

Funds: The China-Croatia Bilateral Science & Technology Cooperation Project
  • Abstract: Existing image super-resolution methods that exploit frequency-domain information still leave room for improvement in recovering high-frequency details, and in some scenes blurring or distortion is hard to avoid. To address this problem, this paper proposes SR-FDN, a super-resolution reconstruction network based on a frequency-domain diffusion model. Specifically, SR-FDN introduces a dual-branch frequency-domain attention mechanism that fuses features in the frequency and spatial domains, which effectively captures frequency-domain features, restores high-frequency information, and further improves detail recovery. SR-FDN also replaces the convolutional downsampling in the conventional U-Net noise predictor with wavelet downsampling, reducing the spatial size while preserving more detail. In addition, under the constraint of a frequency-domain loss function and the guidance of the conditioning image, SR-FDN generates high-resolution images with more accurate details and textures. Experiments on multiple benchmark datasets show that the proposed SR-FDN reconstructs images of higher quality with richer details and shows clear advantages in both qualitative and quantitative comparisons.
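To make the two frequency-oriented components described above concrete, the following PyTorch sketch illustrates (i) a Haar wavelet downsampling block that can replace strided-convolution downsampling inside a U-Net noise predictor and (ii) a frequency-domain loss computed as an L1 distance between 2D FFT spectra. It is a minimal sketch under assumed names (HaarDownsample, frequency_loss) and design choices, not the authors' released implementation.

    # Minimal PyTorch sketch of two ideas from the abstract: Haar wavelet
    # downsampling as a drop-in replacement for strided-convolution downsampling,
    # and a simple frequency-domain loss. Names and details are assumptions.
    import torch
    import torch.nn as nn

    class HaarDownsample(nn.Module):
        """Halve the spatial size with a Haar DWT, then fuse sub-bands back to C channels."""
        def __init__(self, channels):
            super().__init__()
            # The four sub-bands (LL, LH, HL, HH) give 4*C channels; a 1x1 conv fuses them.
            self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)

        def forward(self, x):
            # 2x2 Haar analysis implemented by polyphase slicing of the input grid.
            a = x[..., 0::2, 0::2]
            b = x[..., 0::2, 1::2]
            c = x[..., 1::2, 0::2]
            d = x[..., 1::2, 1::2]
            ll = (a + b + c + d) / 2   # low-frequency approximation
            lh = (a - b + c - d) / 2   # horizontal detail
            hl = (a + b - c - d) / 2   # vertical detail
            hh = (a - b - c + d) / 2   # diagonal detail
            return self.fuse(torch.cat([ll, lh, hl, hh], dim=1))

    def frequency_loss(sr, hr):
        """L1 distance between the 2D FFT spectra of the prediction and the target."""
        sr_f = torch.fft.fft2(sr, norm="ortho")
        hr_f = torch.fft.fft2(hr, norm="ortho")
        return (sr_f - hr_f).abs().mean()

Because the 1x1 convolution restores the original channel count while the DWT halves the spatial size without discarding the high-frequency sub-bands, such a block slots into a U-Net encoder where a stride-2 convolution would otherwise sit.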
  • Figure 1  Overall architecture of SR-FDN

    Figure 2  Performance comparison between the proposed method and other diffusion-based image super-resolution methods trained with different numbers of iterations

    Figure 3  Visual results of ×4 super-resolution reconstruction on the CelebA dataset

    Figure 4  Visual results of ×4 super-resolution reconstruction on the DIV2K dataset

    Figure 5  Feature visualization

    1  Training stage

     1: repeat
     2: $ ({{\boldsymbol{x}}_{{\mathrm{LR}}}},{{\boldsymbol{x}}_{{\mathrm{HR}}}}) \sim \{ ({\boldsymbol{x}}_{{\mathrm{LR}}}^i,{\boldsymbol{x}}_{{\mathrm{HR}}}^i)\} _{i = 1}^N $
     3: $ \varepsilon \sim \mathcal{N}(0,{\boldsymbol{I}}) $
     4: $ t \sim U(\{ 1,2,\cdots ,T\} ) $
     5: Take a gradient descent step on the loss function $ {\mathcal{L}_\theta } $
     6: until converged
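Algorithm 1 corresponds to one conditional-diffusion training step. The sketch below follows that reading under stated assumptions: the diffusion acts on the residual between the HR image and the bicubically upsampled LR image (consistent with the $ {{\boldsymbol{x}}_0} + {\mathrm{up}}({{\boldsymbol{x}}_{{\mathrm{LR}}}}) $ returned in Algorithm 2), the signature model(x_t, t, cond) is hypothetical, and only the MSE noise-prediction part of $ {\mathcal{L}_\theta } $ is shown (the frequency-domain term is omitted).

    # Sketch of one training step of Algorithm 1 (standard conditional DDPM
    # objective on the HR-minus-upsampled-LR residual). `model`, `alpha_bar`
    # (cumulative product of alpha_t) and the conditioning scheme are assumptions.
    import torch
    import torch.nn.functional as F

    def training_step(model, optimizer, x_lr, x_hr, alpha_bar, T):
        b = x_hr.size(0)
        # Condition on the bicubically upsampled LR image; diffuse the residual.
        cond = F.interpolate(x_lr, size=x_hr.shape[-2:], mode="bicubic", align_corners=False)
        x0 = x_hr - cond
        t = torch.randint(1, T + 1, (b,), device=x_hr.device)   # t ~ U({1, ..., T})
        eps = torch.randn_like(x0)                               # eps ~ N(0, I)
        ab = alpha_bar[t - 1].view(b, 1, 1, 1)                   # cumulative alpha at step t
        x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps             # forward (noising) process
        loss = F.mse_loss(model(x_t, t, cond), eps)              # noise-prediction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()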

    2  Sampling stage

     1: $ {{\boldsymbol{x}}_T} \sim \mathcal{N}(0,{\boldsymbol{I}}) $
     2: for $ t = T, \cdots ,2,1 $ do
     3: $ z \sim \mathcal{N}(0,{\boldsymbol{I}}) $ if $ t > 1 $, else $ z = 0 $
     4: $ {{\boldsymbol{x}}_{t - 1}} = \dfrac{1}{{\sqrt {{\alpha _t}} }}\left({{\boldsymbol{x}}_t} - \dfrac{{1 - {\alpha _t}}}{{\sqrt {1 - \overline {{\alpha _t}} } }}{\varepsilon _\theta }({{\boldsymbol{x}}_t},t)\right) + {\sigma _t}z $
     5: end for
     6: return $ {{\boldsymbol{x}}_0} + {\mathrm{up}}({{\boldsymbol{x}}_{{\mathrm{LR}}}}) $
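A matching sketch of Algorithm 2: ancestral DDPM sampling of the residual followed by adding back the upsampled LR image (step 6). The schedule tensors alpha, alpha_bar and sigma and the model signature are assumptions carried over from the training sketch.

    # Sketch of Algorithm 2: reverse diffusion of the residual, then adding the
    # upsampled LR image back. Schedules and the model signature are assumptions.
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def sample(model, x_lr, hr_size, alpha, alpha_bar, sigma, T):
        cond = F.interpolate(x_lr, size=hr_size, mode="bicubic", align_corners=False)
        # x_T ~ N(0, I)
        x_t = torch.randn(x_lr.size(0), x_lr.size(1), *hr_size, device=x_lr.device)
        for t in range(T, 0, -1):
            z = torch.randn_like(x_t) if t > 1 else torch.zeros_like(x_t)
            t_batch = torch.full((x_t.size(0),), t, device=x_t.device, dtype=torch.long)
            eps = model(x_t, t_batch, cond)
            coef = (1 - alpha[t - 1]) / (1 - alpha_bar[t - 1]).sqrt()
            x_t = (x_t - coef * eps) / alpha[t - 1].sqrt() + sigma[t - 1] * z   # step 4
        return x_t + cond   # x_0 + up(x_LR)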

    Table 1  Quantitative comparison of 4× image super-resolution to 128×128 resolution (32×32 → 128×128) on two face datasets

               FFHQ                        CelebA
    Method     PSNR(dB)  SSIM    LPIPS     PSNR(dB)  SSIM    LPIPS
    Reference  +∞        1.0000  –         +∞        1.0000  –
    SRCNN      24.360    0.658   0.296     25.760    0.682   0.312
    SRGAN      24.865    0.751   0.141     26.299    0.788   0.104
    ESRGAN     25.348    0.765   0.117     26.457    0.799   0.099
    SR3        19.285    0.511   0.143     20.066    0.532   0.120
    DiWa       24.958    0.738   0.081     27.620    0.813   0.050
    IR-SDE     25.830    0.770   0.023     27.212    0.809   0.046
    ResDiff    25.679    0.767   0.113     27.741    0.822   0.045
    SR-FDN     26.156    0.787   0.142     28.246    0.835   0.088
    Note: the best results are highlighted in bold and the second-best results are underlined.

    Table 2  Quantitative comparison of diffusion-based image super-resolution models for 4× super-resolution at 256×256 resolution (64×64 → 256×256)

               DIV2K                      Urban100                   Set5                       Set14                      BSD100
    Method     PSNR(dB)  SSIM    LPIPS    PSNR(dB)  SSIM    LPIPS    PSNR(dB)  SSIM    LPIPS    PSNR(dB)  SSIM    LPIPS    PSNR(dB)  SSIM    LPIPS
    Reference  +∞        1.0000  –        +∞        1.0000  –        +∞        1.0000  –        +∞        1.0000  –        +∞        1.0000  –
    SR3        17.010    0.541   0.387    16.512    0.498   0.305    16.158    0.584   0.148    17.202    0.546   0.189    17.357    0.544   0.256
    DiWa       20.238    0.523   0.172    18.212    0.461   0.262    24.176    0.637   0.090    20.414    0.588   0.163    22.336    0.566   0.190
    IR-SDE     19.721    0.451   0.253    17.786    0.385   0.254    21.999    0.604   0.177    19.282    0.422   0.271    20.817    0.436   0.283
    ResDiff    20.975    0.588   0.140    18.411    0.504   0.170    25.682    0.766   0.082    20.973    0.571   0.135    22.544    0.570   0.163
    SR-FDN     20.967    0.593   0.250    18.645    0.499   0.251    24.941    0.748   0.144    21.179    0.592   0.241    22.887    0.597   0.292
    Note: the best results are highlighted in bold and the second-best results are underlined.

    Table 3  Quantitative comparison of 8× image super-resolution to 256×256 resolution (32×32 → 256×256) on two face datasets

               FFHQ                        CelebA
    Method     PSNR(dB)  SSIM    LPIPS     PSNR(dB)  SSIM    LPIPS
    Reference  +∞        1.0000  –         +∞        1.0000  –
    SRCNN      23.390    0.507   0.628     22.81     0.535   0.625
    SRResNet   23.883    0.564   0.587     23.307    0.601   0.129
    SR3        17.773    0.593   0.221     18.346    0.621   0.196
    DiWa       22.179    0.516   0.216     23.240    0.563   0.190
    IR-SDE     24.146    0.628   0.187     25.438    0.685   0.143
    ResDiff    24.220    0.660   0.161     23.846    0.686   0.153
    SR-FDN     25.441    0.664   0.198     25.577    0.583   0.287
    Note: the best results are highlighted in bold and the second-best results are underlined.

    Table 4  Comparison of sampling time on the CelebA dataset for diffusion-based image super-resolution methods

    Method    Sampling time (s)
    SR3       17
    DiWa      7
    ResDiff   24
    IR-SDE    4
    SR-FDN    23

    Table 5  Ablation study

                         Components                  Metrics
    Model                FDL-SFT  HWDB  DFDA         PSNR(dB)  SSIM
    Model 1              ×        ×     ×            27.509    0.803
    Model 2              √        ×     ×            27.843    0.833
    Model 3              √        √     ×            28.061    0.806
    Model 4 (SR-FDN)     √        √     √            28.246    0.835
Publication history
  • Received: 2025-04-01
  • Revised: 2025-07-23
  • Available online: 2025-08-05
