Adaptive Image Interpolation Algorithm and Acceleration Engine Co-Design

YAN Xinkai, DING Sheng

Citation: YAN Xinkai, DING Sheng. Adaptive Image Interpolation Algorithm and Acceleration Engine Co-Design[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3284-3294. doi: 10.11999/JEIT221503

doi: 10.11999/JEIT221503
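Funds: The Natural Science Foundation of the Jiangsu Higher Education Institutions of China (19KJB510027), Jiangsu "333" Scientific Research Project (BRA2020318), The Development Foundation of Jiangsu Key Laboratory of ASIC Design (2020KLOP005)
Details
    About the authors:

    YAN Xinkai: male, lecturer, Ph.D. candidate; research interests include intelligent graphics chip design

    DING Sheng: male, associate professor, Ph.D.; research interests include FPGA design

    Corresponding author:

    YAN Xinkai, yanxinkai@zju.edu.cn

  • CLC number: TN492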

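  • Abstract: To improve super-resolution reconstruction of high-definition color images, this paper proposes a new adaptive image interpolation algorithm based on edge contrast. Edge contrast detection and receptive fields of different scales are used to adaptively select the Lanczos interpolation coefficients; the adaptivity and the varying receptive fields further improve upscaling quality. Compared with bilinear interpolation, the average Peak Signal-to-Noise Ratio (PSNR) improves by 1.1 dB, the Structural SIMilarity (SSIM) by 0.025, and the Learned Perceptual Image Patch Similarity (LPIPS) by 0.051; compared with bicubic interpolation, the average PSNR improves by 0.34 dB, SSIM by 0.01, and LPIPS by 0.033. To reduce hardware resources and improve storage efficiency, a highly parallel, energy-efficient interpolation acceleration engine is co-designed, in which two-level data reuse and a systolic coefficient mechanism greatly raise the ratio of computation to memory access. Synthesized with a 16 nm process library, the engine reaches a 2 GHz clock frequency; on a Xilinx Zynq UltraScale+ xczu15eg FPGA it runs at 200 MHz and delivers real-time performance of 60 frames per second (fps).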
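The abstract describes selecting Lanczos coefficients adaptively from edge contrast and receptive-field size. The following is a minimal sketch of that idea in one dimension, not the authors' implementation: the Lanczos kernel itself is standard, but `choose_support`, its thresholds (`low`, `high`), the direction of the mapping from contrast to support, and the border clamping in `interpolate_1d` are all assumptions made for illustration.

```python
import math

def lanczos(x, a):
    """Standard Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def lanczos_weights(frac, a):
    """Normalized 2a tap weights for a sample `frac` in [0, 1) past the left center pixel."""
    taps = [lanczos(frac - k, a) for k in range(-a + 1, a + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def choose_support(contrast, low=16, high=64):
    """Hypothetical rule: map an edge-contrast value to a kernel support a.

    The thresholds and the direction of the mapping are illustrative only;
    the page does not state the rule the paper uses.
    """
    if contrast < low:
        return 2
    if contrast < high:
        return 3
    return 4

def interpolate_1d(pixels, pos, a):
    """Interpolate a 1-D signal at fractional position `pos` with support a."""
    base = math.floor(pos)
    weights = lanczos_weights(pos - base, a)
    acc = 0.0
    for i, w in enumerate(weights):
        # Clamp indices at the borders (an assumption; other paddings are possible).
        idx = min(max(base - a + 1 + i, 0), len(pixels) - 1)
        acc += w * pixels[idx]
    return acc
```

A 2-D version would apply the same weights along rows and then along columns; with support a = 4 that touches an 8×8 neighborhood of 64 pixels.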
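  • Fig. 1  Flowchart of the edge-optimized image interpolation algorithm

    Fig. 2  Merged interpolation of pixels across receptive-field levels

    Fig. 3  Overall architecture of the acceleration engine

    Fig. 4  Block diagram of the interpolation engine

    Fig. 5  Structure of the edge detection unit

    Fig. 6  Structure of the interpolation compute unit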

    Table 1  Number of multiply-accumulate units

    Module                                 Count  Notes
    Horizontal interpolation compute unit  8×3    int8×int16+int24 / int16×int16+int32
    Vertical interpolation compute unit    1×3    int8×int16+int24 / int16×int16+int32

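    Table 2  Number of adder units

    Module                            Count                 Bit width
    Threshold calculation unit        24                    Int8
    Gradient calculation unit         4                     Int8
                                      8                     Int9
                                      4+4 (absolute value)  Int10
                                      2                     Int11
    Edge processing unit              4                     Int12
    Approximate grayscale conversion  21                    Int8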

    Table 3  RAM capacity of the interpolation engine

    Module                      Count  Capacity
    Interpolation coefficients  4      8×4×2 B
    Total                              256 B

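    Table 4  Register count of the interpolation engine

    Module                       Count  Width    Total
    Pixel register array         88     3 B      264 B
    Approximate grayscale array  42     1 B      42 B
    MAC result buffer            51     4 B+2 B  306 B
    Control + edge detection                     80 B
    Total                                        692 B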
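The totals in Tables 3 and 4 can be cross-checked from the per-module counts and widths. The short sketch below simply redoes that arithmetic; the dictionary keys are descriptive labels chosen here, not identifiers from the paper.

```python
# Register storage per module in bytes (count * width), from Table 4.
registers = {
    "pixel register array": 88 * 3,
    "approximate grayscale array": 42 * 1,
    "MAC result buffer": 51 * (4 + 2),
    "control + edge detection": 80,
}
register_bytes = sum(registers.values())

# Coefficient RAM from Table 3: 4 instances of 8 x 4 entries, 2 bytes each.
coeff_ram_bytes = 4 * (8 * 4 * 2)

print(register_bytes, coeff_ram_bytes)
```

Both sums agree with the table totals of 692 B and 256 B.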

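    Table 5  Complexity comparison of different algorithms

    Algorithm               Time complexity  Multiplications  Pixels used
    Bilinear interpolation  O(n)             6                4
    Bicubic interpolation   O(n)             20               16
    Lanczos3 interpolation  O(n)             42               36
    Lanczos4 interpolation  O(n)             72               64
    Proposed algorithm      O(n)             72               64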
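The multiplication counts in Table 5 are consistent with evaluating each kernel separably: for an n-tap kernel, n rows each take n horizontal multiplications, followed by n vertical multiplications, giving n·n + n per output pixel. The page does not state how the counts were derived, so this reading is an assumption; the sketch below only checks that it reproduces the table.

```python
def separable_multiplications(taps):
    """Multiplications per output pixel for a separable taps x taps kernel:
    `taps` horizontal passes of `taps` products each, then one vertical pass."""
    return taps * taps + taps

# Taps per axis for each kernel (2a taps for a Lanczos kernel of support a).
kernels = {"bilinear": 2, "bicubic": 4, "Lanczos3": 6, "Lanczos4": 8}

# name -> (multiplications, pixels used)
counts = {name: (separable_multiplications(n), n * n) for name, n in kernels.items()}

for name, (mults, pixels) in counts.items():
    print(f"{name}: {mults} multiplications over {pixels} pixels")
```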

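    Table 6  PSNR comparison of different algorithms (dB)

    Algorithm               Average PSNR  Best PSNR  Worst PSNR
    Bilinear interpolation  30.88         35.08      23.15
    Bicubic interpolation   31.59         36.00      23.66
    Lanczos3 interpolation  31.82         36.34      23.82
    Lanczos4 interpolation  31.84         36.39      23.83
    Proposed algorithm      31.93         36.42      23.94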
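For readers unfamiliar with the metric in Table 6, PSNR is defined as 10·log10(peak² / MSE). The sketch below computes it for two flat sequences of pixel values; `peak=255.0` assumes 8-bit samples, and the page gives no details of the paper's evaluation setup (dataset, color handling, scaling factor), so this illustrates the formula only.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Because the scale is logarithmic, the 0.34 dB gain over bicubic interpolation reported in the abstract corresponds to a mean squared error roughly 8% lower.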

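    Table 7  SSIM comparison of different algorithms

    Algorithm               Average SSIM  Best SSIM  Worst SSIM
    Bilinear interpolation  0.849         0.920      0.599
    Bicubic interpolation   0.864         0.936      0.625
    Lanczos3 interpolation  0.868         0.942      0.631
    Lanczos4 interpolation  0.869         0.944      0.632
    Proposed algorithm      0.874         0.947      0.650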

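    Table 8  LPIPS comparison of different algorithms

    Algorithm               Average LPIPS  Best LPIPS  Worst LPIPS
    Bilinear interpolation  0.291          0.133       0.517
    Bicubic interpolation   0.273          0.104       0.513
    Lanczos3 interpolation  0.276          0.098       0.523
    Lanczos4 interpolation  0.275          0.096       0.520
    Proposed algorithm      0.244          0.079       0.479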

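    Table 9  Comparison of FPGA implementation metrics

    Parameter                Ref. [19]          Ref. [25]          This work
    Image size               256×256 grayscale  256×256 grayscale  960×540 color
    Interpolation algorithm  BICUBIC            NEDI               Proposed
    FPGA platform            Artix-7            Virtex-7           Xilinx Zynq
    Frequency (MHz)          289.2              100                200
    Slice LUTs               359488319038 (column split lost in source)
    Slice Reg                16227056492 (column split lost in source)
    DSPs                     0                  42                 27
    * Note: FPGA resources are for a single interpolation engine.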

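    Table 10  Comparison of ASIC implementation metrics

    Metric            VLSI'18 [26]  ISSCC'21 [22]                 This work
    Process           65 nm         40 nm                         16 nm
    Algorithm         CNN           Interpolation + pre-learning  Interpolation
    Throughput (fps)  60            90                            60*
    Data precision    INT8/INT16    INT8/INT16                    INT8/INT16
    Frequency (MHz)   200           200                           2000
    SRAM (KB)         5723712.23 (column split lost in source)
    Gate count (M)    3.110.23 (column split lost in source)
    * Note: the 60 fps throughput of this work is achieved by instantiating four interpolation engines.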
  • [1] MEIJERING E H W, ZUIDERVELD K J, and VIERGEVER M A. Image reconstruction by convolution with symmetrical piecewise nth-order polynomial kernels[J]. IEEE Transactions on Image Processing, 1999, 8(2): 192–201. doi: 10.1109/83.743854
    [2] KEYS R G. Cubic convolution interpolation for digital image processing[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981, 29(6): 1153–1160. doi: 10.1109/TASSP.1981.1163711
    [3] KWOK W and SUN H. Multi-directional interpolation for spatial error concealment[J]. IEEE Transactions on Consumer Electronics, 1993, 39(3): 455–460. doi: 10.1109/30.234620
    [4] LI Xin and ORCHARD M T. New edge-directed interpolation[J]. IEEE Transactions on Image Processing, 2001, 10(10): 1521–1527. doi: 10.1109/83.951537
    [5] CHEN Meijuan, HUANG C H, and LEE W L. A fast edge-oriented algorithm for image interpolation[J]. Image and Vision Computing, 2005, 23(9): 791–798. doi: 10.1016/j.imavis.2005.05.005
    [6] ZHANG Xiangjun and WU Xiaolin. Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation[J]. IEEE Transactions on Image Processing, 2008, 17(6): 887–896. doi: 10.1109/TIP.2008.924279
    [7] JAKHETIYA V, KUMAR A, and TIWARI A K. Image interpolation by adaptive 2-D autoregressive modeling[C]. Proceedings of SPIE 7546, Second International Conference on Digital Image Processing, Singapore, 2010.
    [8] LIU Yiwei, JIANG Zhuqing, WANG Yibo, et al. Single-frame reconstruction for improvement of off-axis digital holographic imaging based on image interpolation[J]. Optics Letters, 2020, 45(24): 6623–6626. doi: 10.1364/OL.405578
    [9] WANG Qiang, TANG Xiaoou, and SHUM H. Patch based blind image super resolution[C]. Proceedings of the Tenth IEEE International Conference on Computer Vision, Beijing, China, 2005: 709–716.
    [10] CHAN T M, ZHANG Junping, PU Jian, et al. Neighbor embedding based super-resolution algorithm through edge detection and feature selection[J]. Pattern Recognition Letters, 2009, 30(5): 494–502. doi: 10.1016/j.patrec.2008.11.008
    [11] GAO Xinbo, ZHANG Kaibing, TAO Dacheng, et al. Joint learning for single-image super-resolution via a coupled constraint[J]. IEEE Transactions on Image Processing, 2012, 21(2): 469–480. doi: 10.1109/TIP.2011.2161482
    [12] JI Jiahuan, ZHONG Baojiang, and MA Kaikuang. Image interpolation using multi-scale attention-aware inception network[J]. IEEE Transactions on Image Processing, 2020, 29: 9413–9428. doi: 10.1109/TIP.2020.3026632
    [13] NIU Ben, WEN Weilei, REN Wenqi, et al. Single image super-resolution via a holistic attention network[C]. Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK, 2020: 191–207.
    [14] WEI Pengxu, XIE Ziwei, LU Hannan, et al. Component divide-and-conquer for real-world image super-resolution[C]. Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK, 2020: 101–117.
    [15] DENG Xin, ZHANG Yutong, XU Mai, et al. Deep coupled feedback network for joint exposure fusion and image super-resolution[J]. IEEE Transactions on Image Processing, 2021, 30: 3098–3112. doi: 10.1109/TIP.2021.3058764
    [16] LIN Yuting, LIU Wei, CAI Xiaowen, et al. A CNN-based quality model for image interpolation[C]. Proceedings of 2020 Cross Strait Radio Science & Wireless Technology Conference, Fuzhou, China, 2020: 1–3.
    [17] AMD. AMD FidelityFX super resolution (FSR): Changing the game in just 4 months[EB/OL]. https://www.amd.com/zh-hans/technologies/fidelityfx-super-resolution, 2021.
    [18] BURNES A. nvidia-image-scaler-dlss-rtx-november-2021-updates[EB/OL]. https://www.nvidia.com/en-us/geforce/news/gfecnt/202111/nvidia-image-scaler-dlss-rtx-november-2021-updates/, 2021.
    [19] KHALEDYAN D, AMIRANY A, JAFARI K, et al. Low-cost implementation of bilinear and bicubic image interpolation for real-time image super-resolution[C]. Proceedings of 2020 IEEE Global Humanitarian Technology Conference, Seattle, USA, 2020: 1–5.
    [20] WANG Kang, YANG Ruiqi, YANG Yizhong, et al. Design and implementation of image adaptive scaling based on second order Newton interpolation[J]. Computer Applications and Software, 2020, 37(9): 126–132,138. doi: 10.3969/j.issn.1000-386x.2020.09.021
    [21] BOUKHTACHE S, BLAYSAT B, GRÉDIAC M, et al. FPGA-based architecture for bi-cubic interpolation: The best trade-off between precision and hardware resource consumption[J]. Journal of Real-Time Image Processing, 2021, 18(3): 901–911. doi: 10.1007/s11554-020-01035-1
    [22] SHEN H Y, LEE Y C, TONG T W, et al. 4.7 A 91mW 90fps super-resolution processor for full HD images[C]. Proceedings of 2021 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2021: 66–68.
    [23] LU Zhifang and ZHONG Baojiang. Image interpolation with predicted gradients[J]. Acta Automatica Sinica, 2018, 44(6): 1072–1085. doi: 10.16383/j.aas.2017.c160793
    [24] ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]. Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 586–595.
    [25] WU Shihao, LUO Xiaohua, ZHANG Jianwei, et al. FPGA-based hardware implementation of new edge-directed interpolation algorithm[J]. Journal of Zhejiang University: Engineering Science, 2018, 52(11): 2226–2232. doi: 10.3785/j.issn.1008-973X.2018.11.022
    [26] LEE J, SHIN D, LEE J, et al. A full HD 60 fps CNN super resolution processor with selective caching based layer fusion for mobile devices[C]. Proceedings of 2019 Symposium on VLSI Circuits, Kyoto, Japan, 2019: C302–C303.
Publication history
  • Received:  2022-12-02
  • Revised:  2023-04-12
  • Published online:  2023-04-19
  • Issue date:  2023-09-27
