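Design of High Precision Low Power Approximate Floating-point Multiplier Based on Partial Product Probability Analysis

YAN Chenggang, ZHAO Xuan, XU Chenyu, CHEN Ke, GE Jipeng, WANG Chenghua, LIU Weiqiang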

Citation: YAN Chenggang, ZHAO Xuan, XU Chenyu, CHEN Ke, GE Jipeng, WANG Chenghua, LIU Weiqiang. Design of High Precision Low Power Approximate Floating-point Multiplier Based on Partial Product Probability Analysis[J]. Journal of Electronics & Information Technology, 2023, 45(1): 87-95. doi: 10.11999/JEIT211485

doi: 10.11999/JEIT211485
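Funds: The National Natural Science Foundation of China (62101246, 62022041, 62101252), The Natural Science Foundation of Jiangsu Province (BK20200417), The Innovative and Entrepreneurial Talents of Jiangsu Province (2020-30377)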
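Details
    About the authors:

    YAN Chenggang: male, lecturer. Research interests: digital-analog mixed-signal integrated circuit design and approximate communication integrated circuit design.

    ZHAO Xuan: male, master's student. Research interests: approximate arithmetic unit design and approximate FFT design.

    XU Chenyu: male, master's student. Research interests: approximate computing integrated circuit design.

    CHEN Ke: male, associate researcher. Research interests: approximate computing integrated circuit design.

    GE Jipeng: male, master's student. Research interests: approximate computing integrated circuit design.

    WANG Chenghua: male, professor. Research interests: information security chips and physical unclonable functions.

    LIU Weiqiang: male, professor. Research interests: digital integrated circuit design and mixed-signal integrated circuit design.

    Corresponding author:

    LIU Weiqiang liuweiqiang@nuaa.edu.cn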

  • CLC number: TN911; TP331.2

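  • Abstract: The floating-point multiplier is a key arithmetic unit in systems such as High Dynamic Range (HDR) image processing and wireless communication. It offers a wider dynamic range than a fixed-point multiplier, at the cost of higher complexity. Approximate computing, an emerging paradigm, can substantially reduce hardware resources and power consumption within a bounded loss of accuracy. This paper proposes a 16-bit half-precision approximate floating-point multiplier (App-Fp-Mul). For the mantissa multiplication module, an approximate 4-2 compressor that is insensitive to input order and a low-order OR-gate compression method are proposed, both based on the probability of a 1 occurring in the partial product array; together they reduce the resources and power of the floating-point multiplier with only a small loss of accuracy. Compared with the exact design, the proposed approximate floating-point multiplier reduces area and Power-Delay Product (PDP) by 20% and 58%, respectively, at a Normalized Mean Error Distance (NMED) of 0.0014. Compared with existing approximate designs, it achieves higher accuracy and a smaller PDP at the same approximate bit width. Finally, the proposed multiplier is applied to HDR image processing, where it reaches a Peak Signal-to-Noise Ratio (PSNR) of 83.16 dB and a Structural SIMilarity (SSIM) of 99.9989%, a marked improvement over existing mainstream schemes.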
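The mantissa multiplication targeted in the abstract is an 11×11-bit product, because IEEE 754 half precision stores 10 fraction bits plus an implicit leading 1 for normal numbers. The following is a minimal Python sketch of that field-level structure, not the paper's hardware design: it handles only normal operands, performs no rounding back to half precision, and the names `split_half` and `multiply_fields` are illustrative.

```python
import struct

BIAS = 15  # exponent bias of IEEE 754 half precision

def split_half(x):
    """Return (sign, biased exponent, 11-bit significand) of x as a normal half float."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    sign = bits >> 15
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent in (0, 0x1F):
        raise ValueError("zero, subnormal, infinity and NaN are not handled in this sketch")
    # Prepend the implicit leading 1 to the 10 stored fraction bits.
    return sign, exponent, (1 << 10) | fraction

def multiply_fields(a, b):
    """Multiply two normal half floats field by field, without final rounding."""
    sa, ea, ma = split_half(a)
    sb, eb, mb = split_half(b)
    sign = sa ^ sb                 # sign of the product: XOR of the input signs
    exponent = ea + eb - 2 * BIAS  # unbiased exponent of the product
    mantissa = ma * mb             # exact 11x11-bit product, at most 22 bits
    # Each significand is scaled by 2**10, so the product is scaled by 2**20.
    value = mantissa * 2.0 ** (exponent - 20)
    return -value if sign else value
```

For example, `multiply_fields(1.5, 2.5)` multiplies the significands 1536 and 1280 and rescales the result to 3.75. An approximate design of the kind described in the abstract would replace only the `ma * mb` step.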
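  • Figure 1  Structure of the floating-point multiplier

    Figure 2  Conventional exact 4-2 compressor

    Figure 3  Full adder

    Figure 4  Ahma approximate 4-2 compressor [22]

    Figure 5  Proposed approximate 4-2 compressor

    Figure 6  11×11 partial product array

    Figure 7  Probability-based OR-gate compression

    Figure 8  PDP versus NMED

    Figure 9  PDP versus MRED

    Figure 10  PDP versus MSE

    Figure 11  Image processing results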

    Table 1  Probability of a 1 in each mantissa bit of floating-point numbers

    Weight                 A[10]  A[9]  A[8]  A[7]  A[6]  A[5]  A[4]  A[3]  A[2]  A[1]  A[0]
    Gaussian distribution  0.97   0.42  0.46  0.48  0.49  0.49  0.50  0.50  0.50  0.50  0.50
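Table 1 gives the probability that each mantissa bit is 1. As a rough sketch of how such figures could feed a partial-product probability analysis, the snippet below assumes that both operands follow the Table 1 distribution and that their bits are independent, so the partial product A[i] AND B[j] is 1 with probability p[i]·p[j]. The independence assumption and the function name are illustrative and are not taken from the paper.

```python
# Probability that each mantissa bit is 1 (Table 1); P_ONE[i] belongs to bit A[i].
P_ONE = [0.50, 0.50, 0.50, 0.50, 0.50, 0.49, 0.49, 0.48, 0.46, 0.42, 0.97]

def partial_product_probabilities(p_a, p_b):
    """Return a grid where grid[i][j] = P(A[i] AND B[j] = 1).

    Assumes the bits of the two operands are statistically independent.
    """
    return [[pa * pb for pb in p_b] for pa in p_a]

grid = partial_product_probabilities(P_ONE, P_ONE)
lowest = grid[0][0]     # partial product of the two least significant bits
highest = grid[10][10]  # partial product of the two leading bits
```

Under these assumptions, most entries of the 11×11 array are close to 0.25, which is why a 1 is comparatively rare in any single partial product.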

    Table 2  Delay of logic gates [24]

    Gate              AND   OR    XOR
    Normalized delay  0.7   0.7   1.0
    Transistor count  2N+2  2N+2  4N+2

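    Table 3  Truth table of the approximate 4-2 compressor

    P1  P2  P3  P4  Sum  Carry  Error
    0   0   0   0   0    0      0
    0   0   0   1   1    0      0
    0   0   1   0   1    0      0
    0   0   1   1   0    1      0
    0   1   0   0   1    0      0
    0   1   0   1   0    1      0
    0   1   1   0   0    1      0
    0   1   1   1   1    1      0
    1   0   0   0   1    0      0
    1   0   0   1   0    1      0
    1   0   1   0   0    1      0
    1   0   1   1   1    1      0
    1   1   0   0   0    1      0
    1   1   0   1   1    1      0
    1   1   1   0   1    1      0
    1   1   1   1   0    1      -2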
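The behaviour in Table 3 can be checked with a short script. The sketch below uses one Boolean description that reproduces the table (Sum as the parity of the inputs, Carry as "at least two inputs are 1"); the gate-level circuit of Figure 5 is not shown in this excerpt and may be built differently. The helper names and the independent-input assumption in `expected_error` are illustrative.

```python
from itertools import product

def approx_compressor(p1, p2, p3, p4):
    """Return (sum, carry) as listed in Table 3.

    Both outputs depend only on how many inputs are 1,
    so the result does not change when the inputs are reordered.
    """
    ones = p1 + p2 + p3 + p4
    return ones % 2, int(ones >= 2)

def error_table():
    """Map each input combination to (approximate value) - (exact number of ones)."""
    errors = {}
    for bits in product((0, 1), repeat=4):
        s, c = approx_compressor(*bits)
        errors[bits] = (s + 2 * c) - sum(bits)
    return errors

ERRORS = error_table()

def expected_error(probs):
    """Expected error when input k is 1 with independent probability probs[k]."""
    total = 0.0
    for bits, err in ERRORS.items():
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else 1.0 - p
        total += weight * err
    return total
```

The only erroneous row is the all-ones input, with error -2. If each input is 1 with probability 0.25, that row occurs with probability 0.25⁴ = 1/256, so the expected error is -2/256.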

    Table 4  Truth table of the OR gate

    P1  P2  Out  Error
    0   0   0    0
    0   1   1    0
    1   0   1    0
    1   1   1    -1
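Table 4 shows that replacing the exact sum of two same-weight bits with a single OR output is wrong only when both bits are 1. A small sketch of that error model follows; the function names are illustrative, and `expected_or_error` assumes the two inputs are independent.

```python
def or_compress(p1, p2):
    """Compress two same-weight partial-product bits into one bit with an OR gate."""
    return p1 | p2

def or_error(p1, p2):
    """Error of the OR output relative to the exact sum p1 + p2."""
    return or_compress(p1, p2) - (p1 + p2)

def expected_or_error(q1, q2):
    """Expected error when the inputs are 1 with independent probabilities q1 and q2.

    The only non-zero error is -1, and it occurs when both inputs are 1.
    """
    return -q1 * q2
```

With both inputs at probability 0.25, the error of -1 occurs one time in sixteen on average, which suggests why the method is confined to low-order columns where an occasional error carries little weight.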

    Table 5  Accuracy metrics of the approximate mantissa multipliers

    Design             NMED (10^-3)  MRED (10^-2)  MSE (10^9)
    App-Man-Mul1       2.7121        3.1512        0.3445
    App-Man-Mul2 [9]   92.1184       5797.7764     1328.3386
    App-Man-Mul3 [10]  5.1695        4.6301        0.7751
    App-Man-Mul4 [11]  6.7848        5.7602        1.8995

    Table 6  Accuracy metrics of the approximate floating-point multipliers

    Design            NMED (10^-3)  MRED (10^-2)  MSE (10^5)
    App-Fp-Mul1       1.4094        1.0233        0.4118
    App-Fp-Mul2 [9]   7.8399        6.3438        30.8504
    App-Fp-Mul3 [10]  2.1112        1.5094        0.6676
    App-Fp-Mul4 [11]  3.2446        2.3177        1.8812
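NMED, MRED and MSE in Tables 5 and 6 are standard error metrics in approximate arithmetic. The sketch below uses their commonly cited definitions; the exact normalization and test vectors used in the paper are not given in this excerpt, so the `max_output` default and the handling of zero-valued exact results are assumptions.

```python
def error_metrics(exact, approx, max_output=None):
    """Return (NMED, MRED, MSE) for paired exact and approximate results.

    NMED: mean error distance divided by max_output (default: largest |exact|).
    MRED: mean of |approx - exact| / |exact|, skipping pairs whose exact value is 0.
    MSE:  mean of the squared error distance.
    """
    if len(exact) != len(approx) or not exact:
        raise ValueError("need two non-empty sequences of equal length")
    distances = [abs(a - e) for e, a in zip(exact, approx)]
    if max_output is None:
        max_output = max(abs(e) for e in exact)
    if max_output == 0:
        raise ValueError("max_output must be non-zero")
    n = len(distances)
    nmed = sum(distances) / n / max_output
    relative = [d / abs(e) for d, e in zip(distances, exact) if e != 0]
    mred = sum(relative) / len(relative) if relative else 0.0
    mse = sum(d * d for d in distances) / n
    return nmed, mred, mse
```

For instance, `error_metrics([4, 8], [4, 6])` returns `(0.125, 0.125, 2.0)`: the mean error distance is 1, the largest exact value is 8, and the single non-zero relative error is 2/8.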

    Table 7  Hardware metrics of the approximate multipliers (simulation frequency 500 MHz)

    Design             Area (μm²)  Power (mW)  Delay (ns)  PDP (pJ)
    Ex-Man-Mul         301.644     0.1757      1.86        0.3268
    App-Man-Mul1       156.114     0.0568      1.08        0.0613
    App-Man-Mul2 [9]   219.996     0.0762      1.11        0.0846
    App-Man-Mul3 [10]  222.894     0.0639      1.13        0.0722
    App-Man-Mul4 [11]  162.666     0.0532      1.10        0.0585

    Table 8  Hardware metrics of the approximate floating-point multipliers (simulation frequency 200 MHz)

    Design            Area (μm²)  Power (mW)  Delay (ns)  PDP (pJ)
    Ex-Fp-Mul         713.1600    0.1562      4.90        0.7654
    App-Fp-Mul1       568.7640    0.0779      4.17        0.3249
    App-Fp-Mul2 [9]   631.3860    0.0898      4.33        0.3888
    App-Fp-Mul3 [10]  633.4020    0.0834      4.21        0.3511
    App-Fp-Mul4 [11]  573.1740    0.0775      4.14        0.3209
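The PDP column is the product of the power and delay columns, and the savings quoted in the abstract can be recomputed from the first two rows of Table 8. The short check below copies those values; the variable and function names are illustrative.

```python
# (area in um^2, power in mW, delay in ns), copied from Table 8.
EXACT = (713.1600, 0.1562, 4.90)     # Ex-Fp-Mul
PROPOSED = (568.7640, 0.0779, 4.17)  # App-Fp-Mul1

def pdp(power_mw, delay_ns):
    """Power-delay product in pJ (mW x ns = pJ)."""
    return power_mw * delay_ns

def reduction(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

area_saving = reduction(EXACT[0], PROPOSED[0])
pdp_saving = reduction(pdp(*EXACT[1:]), pdp(*PROPOSED[1:]))
```

Here `area_saving` comes out at about 20.2% and `pdp_saving` at about 57.6%, consistent with the rounded 20% and 58% stated in the abstract.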

    Table 9  Quantitative metrics of the processed images

    Metric     App-Fp-Mul1  App-Fp-Mul2 [9]  App-Fp-Mul3 [10]  App-Fp-Mul4 [11]
    PSNR (dB)  83.1639      68.1124          54.8700           76.0141
    SSIM (%)   99.9989      94.8648          99.4831           99.9949
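PSNR in Table 9 is conventionally derived from the mean squared error between a reference image and the processed image. The sketch below shows that standard formula on flat pixel sequences; which reference image and peak value the paper uses is not stated in this excerpt, so `peak` is left as a parameter.

```python
import math

def psnr(reference, test, peak):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher PSNR means the output of the approximate multiplier is closer to the reference; SSIM, the second row of Table 9, is a separate structural measure that this sketch does not cover.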
  • [1] LIU Weiqiang, LOMBARDI F, and SCHULTE M. A retrospective and prospective view of approximate computing[J]. Proceedings of the IEEE, 2020, 108(3): 394–399. doi: 10.1109/JPROC.2020.2975695
    [2] WILSON L. International technology roadmap for semiconductors (ITRS)[EB/OL]. https://www.semiconductors.org/resources/2013-international-technology-roadmap-for-semiconductors-itrs/, 2013.
    [3] VENKATARAMANI S, CHAKRADHAR S T, ROY K, et al. Computing approximately, and efficiently[C]. 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2015: 748–751.
    [4] CHIPPA V K, CHAKRADHAR S T, ROY K, et al. Analysis and characterization of inherent application resilience for approximate computing[C]. The 50th Annual Design Automation Conference (DAC), Austin, USA, 2013: 113.
    [5] LIU Bo, CAI Hao, WANG Zhen, et al. A 22nm, 10.8 μW/15.1 μW dual computing modes high power-performance-area efficiency domained background noise aware keyword-spotting processor[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2020, 67(12): 4733–4746. doi: 10.1109/TCSI.2020.2997913
    [6] LIU Bo, DING Xiaoling, CAI Hao, et al. Precision adaptive MFCC based on R2SDF-FFT and approximate computing for low-power speech keywords recognition[J]. IEEE Circuits and Systems Magazine, 2021, 21(4): 24–39. doi: 10.1109/MCAS.2021.3118175
    [7] WARIS H, WANG Chenghua, LIU Weiqiang, et al. Hybrid low radix encoding-based approximate booth multipliers[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2020, 67(12): 3367–3371. doi: 10.1109/TCSII.2020.2975094
    [8] VENKATACHALAM S, ADAMS E, LEE H J, et al. Design and analysis of area and power efficient approximate booth multipliers[J]. IEEE Transactions on Computers, 2019, 68(11): 1697–1703. doi: 10.1109/TC.2019.2926275
    [9] LIU Weiqiang, QIAN Liangyu, WANG Chenghua, et al. Design of approximate radix-4 booth multipliers for error-tolerant computing[J]. IEEE Transactions on Computers, 2017, 66(8): 1435–1441. doi: 10.1109/TC.2017.2672976
    [10] YI Xilin, PEI Haoran, ZHANG Ziji, et al. Design of an energy-efficient approximate compressor for error-resilient multiplications[C]. 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Sapporo, Japan, 2019: 1–5.
    [11] FANG Bao, LIANG Huaguo, XU Dawen, et al. Approximate multipliers based on a novel unbiased approximate 4–2 compressor[J]. Integration, 2021, 81: 17–24. doi: 10.1016/j.vlsi.2021.05.003
    [12] HA M and LEE S. Multipliers with approximate 4–2 compressors and error recovery modules[J]. IEEE Embedded Systems Letters, 2018, 10(1): 6–9. doi: 10.1109/LES.2017.2746084
    [13] AKBARI O, KAMAL M, AFZALI-KUSHA A, et al. Dual-quality 4:2 compressors for utilizing in dynamic accuracy configurable multipliers[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017, 25(4): 1352–1361. doi: 10.1109/TVLSI.2016.2643003
    [14] SABETZADEH F, MOAIYERI M H, and AHMADINEJAD M. A majority-based imprecise multiplier for ultra-efficient approximate image multiplication[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2019, 66(11): 4200–4208. doi: 10.1109/TCSI.2019.2918241
    [15] PEI Haoran, YI Xilin, ZHOU Hang, et al. Design of ultra-low power consumption approximate 4–2 compressors based on the compensation characteristic[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2021, 68(1): 461–465. doi: 10.1109/TCSII.2020.3004929
    [16] NIU Zijing, JIANG Honglan, ANSARI M S, et al. A logarithmic floating-point multiplier for the efficient training of neural networks[C]. The 2021 on Great Lakes Symposium on VLSI, New York, USA, 2021: 65–70.
    [17] JHA C K, WALIA S, KANOJIA G, et al. FPCAM: Floating point configurable approximate multiplier for error resilient applications[C]. 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Korea, 2021: 1–5.
    [18] YIN Peipei, WANG Chenghua, LIU Weiqiang, et al. Design and performance evaluation of approximate floating-point multipliers[C]. 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Pittsburgh, USA, 2016: 296–301.
    [19] IEEE. IEEE Std 754–2008 IEEE standard for floating-point arithmetic[S]. IEEE, 2008.
    [20] TONG J Y F, NAGLE D, and RUTENBAR R A. Reducing power by optimizing the necessary precision/range of floating-point arithmetic[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2000, 8(3): 273–286. doi: 10.1109/92.845894
    [21] HSIAO S F, JIANG M R, and YEH J S. Design of high-speed low-power 3–2 counter and 4–2 compressor for fast multipliers[J]. Electronics Letters, 1998, 34(4): 341–343. doi: 10.1049/el:19980306
    [22] AHMADINEJAD M, MOAIYERI M H, and SABETZADEH F. Energy and area efficient imprecise compressors for approximate multiplication at nanoscale[J]. AEU - International Journal of Electronics and Communications, 2019, 110: 152859. doi: 10.1016/j.aeue.2019.152859
    [23] STROLLO A G M, NAPOLI E, DE CARO D, et al. Comparison and extension of approximate 4–2 compressors for low-power approximate multipliers[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2020, 67(9): 3021–3034. doi: 10.1109/TCSI.2020.2988353
    [24] ZHU Yuying. Design and evaluation of improved approximate Booth multipliers and probabilistic error model analysis[D]. [Master dissertation], Nanjing University of Aeronautics and Astronautics, 2020: 13–15.
Publication history
  • Received: 2021-12-10
  • Revised: 2022-02-24
  • Accepted: 2022-03-03
  • Available online: 2022-03-08
  • Issue published: 2023-01-17
