高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

应用于格密码的可重构多通道数论变换硬件设计

刘冬生 赵文定 刘子龙 张聪 刘星杰

刘冬生, 赵文定, 刘子龙, 张聪, 刘星杰. 应用于格密码的可重构多通道数论变换硬件设计[J]. 电子与信息学报, 2022, 44(2): 566-572. doi: 10.11999/JEIT210114
引用本文: 刘冬生, 赵文定, 刘子龙, 张聪, 刘星杰. 应用于格密码的可重构多通道数论变换硬件设计[J]. 电子与信息学报, 2022, 44(2): 566-572. doi: 10.11999/JEIT210114
LIU Dongsheng, ZHAO Wending, LIU Zilong, ZHANG Cong, LIU Xingjie. Reconfigurable Hardware Design of Multi-lanes Number Theoretic Transform for Lattice-based Cryptography[J]. Journal of Electronics & Information Technology, 2022, 44(2): 566-572. doi: 10.11999/JEIT210114
Citation: LIU Dongsheng, ZHAO Wending, LIU Zilong, ZHANG Cong, LIU Xingjie. Reconfigurable Hardware Design of Multi-lanes Number Theoretic Transform for Lattice-based Cryptography[J]. Journal of Electronics & Information Technology, 2022, 44(2): 566-572. doi: 10.11999/JEIT210114

应用于格密码的可重构多通道数论变换硬件设计

doi: 10.11999/JEIT210114
基金项目: 国家自然科学基金(61874163);国家科技重大专项基金(2017ZX01032-101);中央高校基本科研基金(HUST: 2018KFYYXJJ056)
详细信息
    作者简介:

    刘冬生:男,1979年生,博士,教授,研究方向为能效无线传输技术及芯片设计、后量子密码算法及密码芯片设计

    赵文定:男,1997年生,硕士生,研究方向为数论变换、格密码

    刘子龙:男,1990年生,博士,研究方向为硬件安全

    张聪:男,1994年生,博士生,研究方向为后量子密码

    刘星杰:男,1997年生,硕士生,研究方向为后量子密码硬件实现

    通讯作者:

    赵文定 zhaowend@foxmail.com

  • 中图分类号: TN918.2; TN402

Reconfigurable Hardware Design of Multi-lanes Number Theoretic Transform for Lattice-based Cryptography

Funds: The National Natural Science Foundation of China (61874163), The National Science and Technology Major Project (2017ZX01032-101), The Fundamental Research Funds for the Central Universities (HUST: 2018KFYYXJJ056)
  • 摘要: 针对不同格密码体制带来的数论变换参数多样性,以及数论变换的性能优化设计,该文提出一种基于随机存取存储器(RAM)的可重构多通道数论变换单元。在数论变换单元设计中,在按时间抽取的基础上改进多通道架构,并提出一种优化地址分配方法。最后基于Xilinx Artix-7现场可编程逻辑门阵列(FPGA)平台进行原型实现,结果显示,所设计的数论变换单元消耗的资源为1744 Slices, 16 DSP,完成1次多项式乘法的时间为2.01 μs(n=256), 3.57 μs(n=512), 6.71 μs(n=1024)和13.43 μs(n=2048),支持256~2048的不同参数n和13~32 bit模q的可重构配置,工作频率最高可达232 MHz。
  • 图  1  基于RAM的模乘器结构图

    图  2  4通道NTT数据流向图

    图  3  可重构NTT结构图

    表  1  基2 DIT-NTT算法

     输入:$ {\boldsymbol{A}}\left(x\right),{\omega }_{N} \in {R}_{q},N $
     输出:$ {\boldsymbol{A}}\left(x\right) $
     (1) for m = 1 to N/2 by m = 2m do
     (2) for i = 0 to N–1 do
     (3) for j = 0 to N–1 do
     (4) $ u={\boldsymbol{A}}[k+j] $
     (5) $ t={\boldsymbol{A}}[k+j+m]\times {\omega }_{N} $
     (6) $ {\boldsymbol{A}}[k+j]\mathrm{ }=\mathrm{ }(u+t)\mathrm{ }\mathrm{m}\mathrm{o}\mathrm{d}\;q $
     (7) $ {\boldsymbol{A}}[k+j+N]\mathrm{ }=\mathrm{ }(u-t)\mathrm{ }\mathrm{m}\mathrm{o}\mathrm{d}\;q $
     (8) end for
     (9) end for
     (10) end for
     (11) return $ A\left(x\right) $
    下载: 导出CSV

    表  2  可重构参数表

    n
    25651210242048
    q76811228940961655377864335767169764003323068673
    10485760116772161469762049998244353100453580919985858572013265921
    下载: 导出CSV

    表  3  多项式乘法结果比较表

    文献nq位数LUTsSlicesMemory(36 kB)DSPsMHz周期数时延(μs)ATP1) (×105)
    NTT[5]102430231799711.5 BRAMs419421405110.341.100
    NTT[5]2048573846131022.5 BRAMs1616145453282.323.698
    NTT[6]2048172323287820 Bits22228.991028944.931.9652)
    NTT[11]5122318 K2.5 BRAMs128233.14121.770.4483)
    FFT[12]204822440650 BRAMs12208.121740283.6153.684
    本设计256324780174424 BRAMs/655360 Bits1623216277.000.122
    512279812.030.210
    1024525122.580.394
    20481045544.960.784
    1)ATP是通过将消耗的Slices乘以时延来计算的。
    2)NTT[3]由于未提供消耗的Slices数量,因此ATP的计算采用LUTs×Time,而本设计对应的ATP为2.149。
    3)为了合理地比较,多通道NTT[7]的ATP计算公式为(1.5n×9+2n) / (n×9+2n) ×1.77×18k=0.448×105
    下载: 导出CSV
  • [1] SHOR P W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer[J]. SIAM Journal on Computing, 1997, 26(5): 1484–1509. doi: 10.1137/S0097539795293172
    [2] ARUTE F, ARYA K, BABBUSH R, et al. Quantum supremacy using a programmable superconducting processor[J]. Nature, 2019, 574(7779): 505–510. doi: 10.1038/s41586-019-1666-5
    [3] 赵勇, 戚巍, 徐兵杰, 等. 量子安全技术白皮书(2020)[R]. 2020.
    [4] CHEN Zhaohui, MA Yuan, CHEN Tianyu, et al. Towards efficient kyber on FPGAs: A processor for vector of polynomials[C]. The 2020 25th Asia and South Pacific Design Automation Conference, Beijing, China, 2020. doi: 10.1109/ASP-DAC47756.2020.9045459.
    [5] PÖPPELMANN T and GÜNEYSU T. Towards efficient arithmetic for lattice-based cryptography on reconfigurable hardware[C]. The 2nd International Conference on Cryptology and Information Security in Latin America, Santiago, Chile, 2012. doi: 10.1007/978-3-642-33481-8_8" target="_blank">href="http://dx.doi.org/10.1007/978-3-642-33481-8_8">10.1007/978-3-642-33481-8_8.
    [6] RENTERÍA-MEJÍA C P and VELASCO-MEDINA J. Hardware design of an NTT-Based polynomial multiplier[C]. The 2014 IX Southern Conference on Programmable Logic, Buenos Aires, Argentina, 2014: 1–5. doi: 10.1109/SPL.2014.7002209.
    [7] YE J H and SHIEH M D. High-performance NTT Architecture for large integer multiplication[C]. 2018 International Symposium on VLSI Design, Automation and Test, Hsinchu, China, 2018: 1–4. doi: 10.1109/VLSI-DAT.2018.8373254.
    [8] ZHANG Neng, QIN Qiao, YUAN Hang, et al. NTTU: An area-efficient low-power NTT-uncoupled architecture for NTT-based multiplication[J]. IEEE Transactions on Computers, 2020, 69(4): 520–533. doi: 10.1109/TC.2019.2958334
    [9] AYSU A, PATTERSON C, and SCHAUMONT P. Low-cost and area-efficient FPGA implementations of lattice-based cryptography[C]. Proceedings of 2013 IEEE International Symposium on Hardware-Oriented Security and Trust, Austin, USA, 2013.
    [10] RENTERÍA-MEJÍA C R and VELASCO-MEDINA J. Lattice-based cryptoprocessor for CCA-Secure identity-based encryption[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2020, 67(7): 2331–2344. doi: 10.1109/TCSI.2020.2981089
    [11] FENG Xiang, LI Shuguo, and XU Sufen. RLWE-oriented high-speed polynomial multiplier utilizing multi-lane stockham NTT algorithm[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2020, 67(3): 556–559. doi: 10.1109/TCSII.2019.2917621
    [12] CHEN D D, MENTENS N, VERCAUTEREN F, et al. High-speed polynomial multiplication architecture for ring-LWE and SHE cryptosystems[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2015, 62(1): 157–166. doi: 10.1109/TCSI.2014.2350431
    [13] MERT A C, KARABULUT E, OZTURK E, et al. An extensive study of flexible design methods for the number theoretic transform[J/OL]. IEEE Transactions on Computers, 2020, 1–15.
    [14] LIU Dongsheng, ZHANG Cong, LIN Hui, et al. A resource-efficient and side-channel secure hardware implementation of ring-LWE cryptographic processor[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2019, 66(4): 1474–1483. doi: 10.1109/TCSI.2018.2883966
    [15] KIM S, LEE K, CHO W, et al. Hardware architecture of a number theoretic transform for a bootstrappable RNS-based homomorphic encryption scheme[C]. The 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines, Fayetteville, USA, 2020, 56–64. doi: 10.1109/FCCM48280.2020.00017.
  • 加载中
图(3) / 表(3)
计量
  • 文章访问数:  1243
  • HTML全文浏览量:  553
  • PDF下载量:  151
  • 被引次数: 0
出版历程
  • 收稿日期:  2021-02-01
  • 修回日期:  2021-06-10
  • 网络出版日期:  2021-06-22
  • 刊出日期:  2022-02-25

目录

    /

    返回文章
    返回