HWT-SRNet: Heterogeneous Windows Transformer Network for Image Super-Resolution

LU Di, DANG Anyuan

Citation: LU Di, DANG Anyuan. HWT-SRNet: Heterogeneous Windows Transformer Network for Image Super-Resolution[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250868


doi: 10.11999/JEIT250868 cstr: 32379.14.JEIT250868
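Details
    Author biographies:

    LU Di: female, professor, Ph.D.; research interests include data fusion and image processing

    DANG Anyuan: male, master's student; research interests include image processing and super-resolution reconstruction

    Corresponding author:

    LU Di, ludizeng@hrbust.edu.cn

  • CLC number: TN911.73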


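  • Abstract: In the era of big data, image quality varies widely, and reconstructing high-resolution images from low-quality inputs has significant research and practical value. Transformer-based single-image super-resolution methods typically restrict self-attention to local non-overlapping windows, which leads to a limited receptive field, distortion at window boundaries, and insufficient reconstruction of high-frequency detail. To address these problems, this paper proposes a heterogeneous window attention network built on SwinIR, the Heterogeneous Windows Transformer Network for Image Super-Resolution (HWT-SRNet). First, a heterogeneous window attention mechanism is designed that fuses multi-scale features, alleviating window-boundary distortion and effectively enlarging the receptive field. Second, to compensate for the Transformer's weakness in reconstructing high-frequency information, a high-frequency prior feature extraction network is proposed that strengthens the recovery of edges and texture details. Experimental results show that on five benchmark test sets (Set5, Set14, BSD100, Urban100, and Manga109), HWT-SRNet improves PSNR over the baseline SwinIR by 0.10 dB to 0.37 dB. Compared with other representative super-resolution models such as CAT, ACT, and ART, it also achieves better visual quality in image detail and texture.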
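
The limitation named in the abstract, self-attention confined to local non-overlapping windows, can be illustrated with a minimal NumPy sketch. This is not the HWT-SRNet or SwinIR implementation: the function names `window_partition` and `window_self_attention` are hypothetical, the attention omits learned query/key/value projections, and the 8×8 window on a 16×16 feature map is only an example.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping win x win windows."""
    h, w, c = x.shape
    assert h % win == 0 and w % win == 0, "H and W must be multiples of win"
    x = x.reshape(h // win, win, w // win, win, c)
    # Result: (num_windows, win*win, C), windows ordered row by row.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, c)

def window_self_attention(x, win):
    """Softmax self-attention computed independently inside each window.

    Assumption: no learned projections; tokens act as query, key and value.
    """
    tokens = window_partition(x, win)
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(tokens.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 16, 4))
out = window_self_attention(feat, win=8)

perturbed = feat.copy()
perturbed[0, 0] += 1.0  # change one pixel inside the top-left window
out_perturbed = window_self_attention(perturbed, win=8)

# Flag which of the four windows produced a different output.
changed = [not np.allclose(a, b) for a, b in zip(out, out_perturbed)]
print(changed)
```

In this sketch, pixel (0, 0) cannot influence pixel (0, 8) even though the two are only eight columns apart, because they fall in different windows. That is the receptive-field limit that larger or differently shaped windows aim to relax.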
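  • Figure 1  Different attention mechanisms

    Figure 2  Heterogeneous window image super-resolution reconstruction network

    Figure 3  Heterogeneous window attention module

    Figure 4  High-frequency prior extraction network

    Figure 5  Comparison of ×4 super-resolution reconstructions by different methods on the Set14 dataset

    Figure 6  Comparison of ×4 super-resolution reconstructions by different methods on the Urban100 dataset

    Figure 7  Visual comparison of ×4 super-resolution reconstructions of the "baby" image from Set5 with different modules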

    Table 1  Mean PSNR and SSIM of different methods on each dataset

    Method      Scale  Set5          Set14         BSD100        Urban100      Manga109
                       PSNR/SSIM     PSNR/SSIM     PSNR/SSIM     PSNR/SSIM     PSNR/SSIM
    ESWT[8]     ×2     38.33/0.9615  34.22/0.9233  32.47/0.9034  33.27/0.9397  39.79/0.9790
    CAT-R[7]    ×2     38.48/0.9625  34.53/0.9251  32.56/0.9045  34.08/0.9443  40.09/0.9804
    SwinIR[6]   ×2     38.42/0.9623  34.46/0.9250  32.53/0.9041  33.81/0.9427  39.92/0.9797
    ACT[14]     ×2     38.46/0.9626  34.60/0.9256  32.56/0.9048  34.07/0.9443  39.95/0.9804
    CRAFT[18]   ×2     38.23/0.9615  33.92/0.9211  32.33/0.9016  32.86/0.9343  39.39/0.9786
    ART[11]     ×2     38.56/0.9629  34.59/0.9267  32.58/0.9048  34.30/0.9452  40.24/0.9808
    DFDN[20]    ×2     38.19/0.9612  33.85/0.9199  32.30/0.9013  32.68/0.9335  -
    MDIESR[21]  ×2     38.17/0.9613  33.83/0.9200  32.31/0.9013  32.65/0.9331  -
    HWT-SRNet   ×2     38.59/0.9632  34.81/0.9287  32.58/0.9050  34.42/0.9453  40.25/0.9812
    ESWT[8]     ×3     34.63/0.9290  30.55/0.8464  29.23/0.8088  28.70/0.8628  34.05/0.9479
    CAT-R[7]    ×3     34.99/0.9320  31.00/0.8539  29.49/0.8154  29.91/0.8848  35.29/0.9542
    SwinIR[6]   ×3     34.97/0.9318  30.93/0.8534  29.46/0.8145  29.75/0.8826  35.12/0.9537
    ACT[14]     ×3     35.03/0.9321  31.08/0.8541  29.51/0.8164  30.08/0.8858  35.27/0.954
    CRAFT[18]   ×3     34.71/0.9295  30.61/0.8469  29.24/0.8093  28.77/0.8635  34.29/0.9491
    ART[11]     ×3     35.07/0.9325  31.02/0.8541  29.51/0.8159  30.10/0.8871  35.39/0.9548
    DFDN[20]    ×3     34.69/0.9293  30.55/0.8464  29.25/0.8089  28.70/0.8630  -
    MDIESR[21]  ×3     34.69/0.9295  30.58/0.8465  29.25/0.8087  28.72/0.8634  -
    HWT-SRNet   ×3     35.12/0.9344  31.06/0.8551  29.57/0.8173  30.18/0.8889  35.48/0.9552
    ESWT[8]     ×4     32.46/0.8979  28.80/0.7866  27.70/0.7410  26.56/0.8006  30.94/0.9136
    CAT-R[7]    ×4     32.89/0.9044  29.13/0.7955  27.95/0.7500  27.62/0.8292  32.16/0.9269
    SwinIR[6]   ×4     32.92/0.9044  29.09/0.7950  27.92/0.7489  27.45/0.8254  32.03/0.9260
    ACT[14]     ×4     32.97/0.9031  29.18/0.7954  27.95/0.7507  27.74/0.8305  32.20/0.9267
    CRAFT[18]   ×4     32.52/0.8989  28.85/0.7872  27.72/0.7418  26.56/0.7995  31.18/0.9168
    ART[11]     ×4     33.04/0.9051  29.16/0.7958  27.97/0.7510  27.77/0.8321  32.31/0.9283
    DFDN[20]    ×4     32.56/0.8989  28.87/0.7880  27.73/0.7414  26.59/0.8008  -
    MDIESR[21]  ×4     32.49/0.8986  28.84/0.7867  27.73/0.7399  26.59/0.8007  -
    HWT-SRNet   ×4     33.08/0.9060  29.23/0.7975  28.02/0.7520  27.82/0.8370  32.35/0.9296
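
The PSNR gain over SwinIR quoted in the abstract (0.10 dB to 0.37 dB) can be checked against the ×4 rows of Table 1. The short script below is only an arithmetic check with values copied from the table; the dictionary layout and the helper name `psnr_gains` are illustrative choices, not part of the paper.

```python
# PSNR (dB) at ×4, copied from Table 1 as (SwinIR baseline, HWT-SRNet).
psnr_x4 = {
    "Set5": (32.92, 33.08),
    "Set14": (29.09, 29.23),
    "BSD100": (27.92, 28.02),
    "Urban100": (27.45, 27.82),
    "Manga109": (32.03, 32.35),
}

def psnr_gains(table):
    """Per-dataset PSNR gain of the second entry over the first, in dB."""
    return {name: round(ours - base, 2) for name, (base, ours) in table.items()}

gains = psnr_gains(psnr_x4)
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} dB")
print(f"range: {min(gains.values()):.2f} to {max(gains.values()):.2f} dB")
```

The smallest gain falls on BSD100 and the largest on Urban100. The ×2 and ×3 rows give different spreads, so the range in the abstract appears to refer to the ×4 setting.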

    Table 2  Mean LPIPS of different methods on each dataset (×4)

    Method      Set5    Set14   BSD100  Urban100  Manga109
    ESWT[8]     0.2078  0.2977  0.3383  0.2812    0.1912
    CAT-R[7]    0.2061  0.2927  0.3279  0.2496    0.1819
    SwinIR[6]   0.2079  0.2957  0.3321  0.2602    0.1847
    ACT[14]     0.2078  0.2904  0.3235  0.2506    0.1840
    CRAFT[18]   0.2136  0.3044  0.3389  0.2816    0.1920
    ART[11]     0.2068  0.2913  0.3259  0.2464    0.1804
    HWT-SRNet   0.2050  0.2907  0.3255  0.2448    0.1799

    Table 3  Comparison of parameter count and reconstruction time

    Method      Parameters  Reconstruction time (s)
    ESWT[8]     589K        1.20
    CAT-R[7]    16.6M       4.41
    SwinIR[6]   11.9M       1.38
    ACT[14]     46M         10.51
    CRAFT[18]   753K        1.92
    ART[11]     16.55M      3.85
    HWT-SRNet   16.63M      2.96

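    Table 4  Comparison of different window sizes

    No.  Window shape          Window size  Multi-adds (GMac)  PSNR/SSIM
    1    Square window         (8,8)        53.6               32.92/0.9044
    1    Square window         (16,16)      63.8               32.98/0.9050
    1    Square window         (32,32)      119.3              33.01/0.9051
    2    Fence-shaped window   2            79.5               32.56/0.8989
    2    Fence-shaped window   4            82.4               32.82/0.9029
    2    Fence-shaped window   8            94.1               32.99/0.9049
    2    Fence-shaped window   16           120.3              33.01/0.9050
    3    Heterogeneous window  (8,8),8      81.4               33.00/0.9050
    3    Heterogeneous window  (8,8),16     87.5               33.01/0.9052
    3    Heterogeneous window  (16,16),4    73.1               32.90/0.9040
    3    Heterogeneous window  (16,16),8    87.0               33.03/0.9054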

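    Table 5  Comparison of different modules

    No.  SwinIR  Heterogeneous window  High-frequency prior feature extraction network  PSNR/SSIM
    1    ✓       ×                     ×                                                32.92/0.9044
    2    ✓       ✓                     ×                                                33.03/0.9054
    3    ✓       ×                     ✓                                                32.98/0.9049
    4    ✓       ✓                     ✓                                                33.08/0.9060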
  • [1] DONG Chao, LOY C C, HE Kaiming, et al. Learning a deep convolutional network for image super-resolution[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 184–199. doi: 10.1007/978-3-319-10593-2_13.
    [2] DONG Chao, LOY C C, and TANG Xiaoou. Accelerating the super-resolution convolutional neural network[C]. 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 391–407. doi: 10.1007/978-3-319-46475-6_25.
    [3] ZHANG Yulun, LI Kunpeng, LI Kai, et al. Image super-resolution using very deep residual channel attention networks[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 294–310. doi: 10.1007/978-3-030-01234-2_18.
    [4] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. 9th International Conference on Learning Representations, Vienna, Austria, 2021: 1–21.
    [5] LIU Ze, LIN Yutong, CAO Yue, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]. Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021: 9992–10002. doi: 10.1109/ICCV48922.2021.00986.
    [6] LIANG Jingyun, CAO Jiezhang, SUN Guolei, et al. SwinIR: Image restoration using Swin transformer[C]. Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, Canada, 2021: 1833–1844. doi: 10.1109/ICCVW54120.2021.00210.
    [7] CHEN Zheng, ZHANG Yulun, GU Jinjin, et al. Cross aggregation transformer for image restoration[C]. Proceedings of the 36th International Conference on Neural Information Processing Systems, New Orleans, USA, 2022: 1847.
    [8] SHI Jinpeng, LI Hui, LIU Tianle, et al. Image super-resolution using efficient striped window transformer[EB/OL]. https://arxiv.org/abs/2301.09869, 2023.
    [9] WU Sitong, WU Tianyi, TAN Haoru, et al. Pale transformer: A general vision transformer backbone with pale-shaped attention[C]. Proceedings of the 36th AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2022: 2731–2739. doi: 10.1609/aaai.v36i3.20176.
    [10] WU Gang, JIANG Junjun, JIANG Kui, et al. Content-aware transformer for all-in-one image restoration[EB/OL]. https://arxiv.org/abs/2504.04869v1, 2025.
    [11] ZHANG Jiale, ZHANG Yulun, GU Jinjin, et al. Accurate image restoration with attention retractable transformer[C]. The Eleventh International Conference on Learning Representations, Kigali, Rwanda, 2023: 1–13.
    [12] CHEN Zheng, ZHANG Yulun, GU Jinjin, et al. Recursive generalization transformer for image super-resolution[C]. The Twelfth International Conference on Learning Representations, Vienna, Austria, 2024: 1–12.
    [13] CHU Shuchuan, DOU Zhichao, PAN J S, et al. HMANet: Hybrid multi-axis aggregation network for image super-resolution[C]. Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, USA, 2024: 6257–6266. doi: 10.1109/CVPRW63382.2024.00629.
    [14] YOO J, KIM T, LEE S, et al. Enriched CNN-transformer feature aggregation networks for super-resolution[C]. Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, USA, 2023: 4945–4954. doi: 10.1109/WACV56688.2023.00493.
    [15] SI Chenyang, YU Weihao, ZHOU Pan, et al. Inception transformer[C]. Proceedings of the 36th International Conference on Neural Information Processing Systems, New Orleans, USA, 2022: 1707.
    [16] KORKMAZ C, TEKALP A M, and DOGAN Z. Training generative image super-resolution models by wavelet-domain losses enables better control of artifacts[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2024: 5926–5936. doi: 10.1109/CVPR52733.2024.00566.
    [17] HAN Yulan, CUI Yujie, LUO Yihong, et al. Frequency separation generative adversarial super-resolution reconstruction network based on dense residual and quality assessment[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4563–4574. doi: 10.11999/JEIT240388.
    [18] LI Ao, ZHANG Le, LIU Yun, et al. Feature modulation transformer: Cross-refinement of global representation via high-frequency prior for image super-resolution[C]. Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2023: 12480–12490. doi: 10.1109/ICCV51070.2023.01150.
    [19] YAO Hongdou, HAN Pengfei, WANG Xiaofen, et al. Super-resolution via hierarchical attention and detail enhancement transformer network[J]. Optics & Laser Technology, 2025, 188: 112836. doi: 10.1016/j.optlastec.2025.112836.
    [20] CHENG Deqiang, YUAN Hang, QIAN Jiansheng, et al. Image super-resolution algorithms based on deep feature differentiation network[J]. Journal of Electronics & Information Technology, 2024, 46(3): 1033–1042. doi: 10.11999/JEIT230179.
    [21] KOU Qiqi, LIU Gui, JIANG He, et al. Lightweight image super-resolution network based on multi-domain information enhancement[J]. Journal on Communications, 2025, 46(4): 144–159. doi: 10.11959/j.issn.1000-436x.2025059.
    [22] WANG Xintao, XIE Liangbin, DONG Chao, et al. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data[C]. Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, Canada, 2021: 1905–1914. doi: 10.1109/ICCVW54120.2021.00217.
    [23] WANG Yufei, YANG Wenhan, CHEN Xinyuan, et al. SinSR: Diffusion-based image super-resolution in a single step[C]. Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 25796–25805. doi: 10.1109/CVPR52733.2024.02437.
Publication history
  • Revised:  2026-05-29
  • Accepted:  2026-05-29
  • Published online:  2026-06-08
