高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

面向格基后量子密码算法的可重构多项式乘法架构

陈韬 李慧琴 李伟 南龙梅 杜怡然

陈韬, 李慧琴, 李伟, 南龙梅, 杜怡然. 面向格基后量子密码算法的可重构多项式乘法架构[J]. 电子与信息学报, 2023, 45(9): 3380-3392. doi: 10.11999/JEIT230284
引用本文: 陈韬, 李慧琴, 李伟, 南龙梅, 杜怡然. 面向格基后量子密码算法的可重构多项式乘法架构[J]. 电子与信息学报, 2023, 45(9): 3380-3392. doi: 10.11999/JEIT230284
CHEN Tao, LI Huiqin, LI Wei, NAN Longmei, Du Yiran. Reconfigurable Polynomial Multiplication Architecture for Lattice-based Post-quantum Cryptography Algorithms[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3380-3392. doi: 10.11999/JEIT230284
Citation: CHEN Tao, LI Huiqin, LI Wei, NAN Longmei, Du Yiran. Reconfigurable Polynomial Multiplication Architecture for Lattice-based Post-quantum Cryptography Algorithms[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3380-3392. doi: 10.11999/JEIT230284

面向格基后量子密码算法的可重构多项式乘法架构

doi: 10.11999/JEIT230284
基金项目: 国家自然科学基金(61404175),国家科技重大专项(2018ZX01027101-004),河南省自然科学基金(232300421393)
详细信息
    作者简介:

    陈韬:男,副教授,研究方向为安全专用芯片设计、后量子公钥密码算法芯片设计

    李慧琴:女,硕士,研究方向为后量子公钥密码算法硬件实现

    李伟:男,教授,研究方向为密码处理器设计、ASIC专用芯片设计

    南龙梅:女,副教授,研究方向为大规模集成电路设计、专用集成电路设计

    杜怡然:男,讲师,研究方向为可重构密码芯片设计

    通讯作者:

    李慧琴 lihuiqinjiayou@163.com

  • 中图分类号: TN402; TP309

Reconfigurable Polynomial Multiplication Architecture for Lattice-based Post-quantum Cryptography Algorithms

Funds: The National Natural Science Foundation of China (61404175), The National Science and Technology Major Project (2018ZX01027101-004), The Natural Science Foundation of Henan Province (232300421393)
  • 摘要: 针对基于不同困难问题格基密码算法中的多项式乘法参数各异且实现架构不统一的现状,该文提出一种基于预处理型数论变换(PtNTT)算法的可重构架构。首先进行多项式乘法运算特征分析,综合了多项式参数(项数、模数及模多项式)对可重构架构的影响。其次,针对不同项数和模多项式设计了4×4串并行可转换型运算单元架构,可满足实现不同位宽基k-数论变换的可扩展设计。其中具体针对不同模数设计了可扩展实现16 bit模乘和32 bit乘法的可重构单元。在数据需求分析过程中,通过构建以系数地址生成、Bank划分以及实际与虚拟地址对应逻辑为主体的分配机制,设计了一种满足基k-数论变换的多Bank存储结构。实验结果表明,该文支持实现Kyber, Saber, Dilithium与NTRU等4种类型算法中的多项式乘法,与其余可重构架构相比,可采用统一架构实现4种算法中的多项式乘法。基于Xilinx Artix-7 FPGA 1.599 μs完成一组项数为256,模数为3329的多项式乘法运算,花费243个时钟。
  • 图  1  不同模多项式依据中国剩余定理映射示意图

    图  2  基4-NTT/INTT核心蝶形运算架构图

    图  3  串并行可转换型运算单元结构图

    图  4  适于Kyber算法的4-Bank存储系数情况

    图  5  Kyber算法中多项式乘法NTT/INTT地址互斥连通图

    图  6  12-Bank地址划分标识图

    图  7  多端口RAM单元硬件设计框图

    表  1  适于可重构架构的多项式乘法参数

    算法项数n模数q模多项式
    原始扩展原始扩展
    Kyber25625633293329xn+1
    Saber25625621310753xn+1
    Dilithium25625683804178380417xn+1
    NTRUNTRUhps204850950910242111043969${{\left( {{x^n} - 1} \right)} \mathord{\left/ {\vphantom {{\left( {{x^n} - 1} \right)} {\left( {x - 1} \right)}}} \right. } {\left( {x - 1} \right)}}$
    =xn–1+xn-2+···+1或xn–1
    NTRUhps204867767715362111389569
    NTRUhps409682182117282123365569
    NTRUhrss70170115362135747201
    下载: 导出CSV
     算法1 NTRUhps4096821采用混合基PtNTT的多项式乘法
     输入:a(x),b(x)$ {{ \in {Z_q}} \mathord{\left/ {\vphantom {{ \in {Z_q}} {\left( {{x^n} - 1} \right)}}} \right. } {\left( {{x^n} - 1} \right)}} $,模数q,项数n
     输出:c(x)$ {{ \in {Z_q}} \mathord{\left/ {\vphantom {{ \in {Z_q}} {\left( {{x^n} - 1} \right)}}} \right. } {\left( {{x^n} - 1} \right)}} $
     1.a(x)[i]转换为3维系数f[i/192][(imod192)/3][imod3]
     2.b(x)[i]转换为3维系数g[i/192][(imod192)/3][imod3]
     3.for j=0; j<64; j++
       for k=0; k<4; k++
        $ \widehat a = {\text{NTT(}}f[0 - 8][j][k]{\text{)}} $
        $ \widehat b = {\text{NTT(}}g[0 - 8][j][k]{\text{)}} $
       end for
      end for
     4.for l=0; l<9; l++
       for k=0; k<4; k++
         $ \widehat a = {\text{NTT(}}f[l][0 - 63][k]{\text{)}} $
        $ \widehat b = {\text{NTT(}}g[l][0 - 63][k]{\text{)}} $
       end for
      end for
     5.for i=0; i<4; i++
       for j=0; j<4; j++
        $ {R_{i,j}} = \widehat a{\text{[0 - 8][0 - 63][}}i{\text{]}} \circ \widehat b{\text{[0 - 8][0 - 63][}}j{\text{]}} $
       end for
      end for
     6.$ \widehat {{c_0}}(x) = {R_{0,0}} + \widehat x \circ \left( {{R_{1,3}} + {R_{3,1}} + {R_{2,2}}} \right) $
      $ \widehat{{c}_{1}}(x)={R}_{0,1}+{R}_{1,0}+\widehat{x}\circ ({R}_{2,3}+{R}_{3,2}) $
      $ \widehat {{c_2}}(x) = {R_{0,2}} + {R_{2,0}} + {R_{1,1}} + \widehat x \circ {R_{3,3}} $
      $ \widehat {{c_3}}(x) = {R_{0,3}} + {R_{3,0}} + {R_{1,2}} + {R_{2,1}} $
     7. for l=0; l<9; l++
       for k=0; k<4; k++
        $ c(x) = {\text{NTT(}}\hat c[l][0 - 63][k]{\text{)}} $
       end for
      end for
     8. for j=0; j<64; j++
       for k=0; k<4; k++
         $ c(x) = {\text{NTT(}}\hat c[0 - 8][j][k]{\text{)}} $
       end for
      end for
    下载: 导出CSV
    算法2 基2-NTT算法的核心运算
     $\hat a[k] = {{A} }[k] + {\zeta ^{2k + 1} }{{B} }[k]$
     $\hat a\left[ {k + \dfrac{n}{2} } \right] = {{A} }[k] - {\zeta ^{2k + 1} }{{B} }[k]$
    下载: 导出CSV
    算法3 基3-NTT算法的核心运算,其中$ \mu = {\omega ^{{n {\left/ {\vphantom {n 3}} \right. } 3}}} $
     $\hat a[k] = {{A} }[k] + {\omega ^k}{{B} }[k] + {\left( { {\omega ^k} } \right)^2}{ {C} }[k]$
     $\begin{aligned} \hat a\left[ {k + \frac{n}{3} } \right] = & {{A} }[k] + \mu {\omega ^k}{B}[k] + {\mu ^2}{\left( { {\omega ^k} } \right)^2}{ {C} }[k] \\ \;\;\;\;\;\;\;\;\;\;\; =& { {A} }[k] - {\left( { {\omega ^k} } \right)^2}{C}[k] + \mu \left( { {\omega ^k}{B}[k] - { {\left( { {\omega ^k} } \right)}^2}{C}[k]} \right) \end{aligned}$
     $ \begin{aligned} \hat a\left[ {k + \frac{{2n}}{3}} \right] =& {A}[k] + {\mu ^2}{\omega ^k}{B}[k] + \mu {\left( {{\omega ^k}} \right)^2}{C}[k] \\ \;\;\;\;\;\;\;\;\;\;\;\; =& {A}[k] - {\omega ^k}{B}[k] - \mu \left( {{\omega ^1}{B}[k] - {{\left( {{\omega ^k}} \right)}^2}{C}[k]} \right) \\ \end{aligned} $
    下载: 导出CSV
    算法4 基4-NTT算法的核心运算,其中$ \mu = {\omega ^{{n \mathord{\left/ {\vphantom {n 4}} \right. } 4}}} $
    $\hat a[k] = {A}[k] + {\zeta ^{2k + 1}}{B}[k] + {\left( {{\zeta ^{2k + 1}}} \right)^2}{C}[k] + {\left( {{\zeta ^{2k + 1}}} \right)^3}{D}[k] $
     $ \begin{aligned} \hat a\left[ {k + \frac{n}{4}} \right] =& {A}[k] + {\mu ^2}{\zeta ^{2k + 1}}{B}[k] + {\left( {{\mu ^2}{\zeta ^{2k + 1}}} \right)^2}{C}[k] + {\left( {{\mu ^2}{\zeta ^{2k + 1}}} \right)^3}{D}[k] \\ = &{A}[k] - {\zeta ^{2k + 1}}{B}[k] + {\left( {{\zeta ^{2k + 1}}} \right)^2}{C}[k] - {\left( {{\zeta ^{2k + 1}}} \right)^3}{D}[k] \end{aligned} $
     $ \begin{aligned} \hat a\left[ {k + \frac{n}{2}} \right] = &{A}[k] + \mu {\zeta ^{2k + 1}}{B}[k] + {\left( {\mu {\zeta ^{2k + 1}}} \right)^2}{C}[k] + {\left( {\mu {\zeta ^{2k + 1}}} \right)^3}{D}[k] \\ =& {A}[k] + \mu {\zeta ^{2k + 1}}{B}[k] - {\left( {{\zeta ^{2k + 1}}} \right)^2}{C}[k] - \mu {\left( {{\zeta ^{2k + 1}}} \right)^3}{D}[k] \\ \end{aligned} $
     $\begin{aligned} \hat a\left[ {k + \frac{{3n}}{4}} \right] =& {A}[k] + {\mu ^3}{\zeta ^{2k + 1}}{B}[k] + {\left( {{\mu ^3}{\zeta ^{2k + 1}}} \right)^2}{C}[k] + {\left( {{\mu ^3}{\zeta ^{2k + 1}}} \right)^3}{D}[k] \\ = & {A}[k] - \mu {\zeta ^{2k + 1}}{B}[k] - {\left( {{\zeta ^{2k + 1}}} \right)^2}{C}[k] + \mu {\left( {{\zeta ^{2k + 1}}} \right)^3}{D}[k] \end{aligned} $
    下载: 导出CSV
    算法5 基4-NTT算法的核心运算操作拆解,其中$ \mu = {\omega ^{{n \mathord{\left/ {\vphantom {n 4}} \right. } 4}}} $
     1. ${T_0} = {{A[} }k{\text{]} } + {\left( { {\zeta ^{2k + 1} } } \right)^2}{C[}k{\text{]} }$
      ${ {T}_1} = {{A[} }k{\text{]} } - {\left( { {\zeta ^{2k + 1} } } \right)^2}{C[}k{\text{]} }$
      ${ {T}_2} = {\zeta ^{2k + 1} }{{B[} }k{\text{]} } + {\left( { {\zeta ^{2k + 1} } } \right)^3}{D[}k{\text{]} }$
      ${ {T}_3} = {\zeta ^{2k + 1} }{{B[} }k{\text{]} } - {\left( { {\zeta ^{2k + 1} } } \right)^3}{D[}k{\text{]} }$
     2. $ \hat a[k] = {{T}_0} + {{T}_2} $
      $ \hat a[k + {n \mathord{\left/ {\vphantom {n 4}} \right. } 4}] = {{T}_0} - {{T}_2} $
      $ \hat a[k + {n \mathord{\left/ {\vphantom {n 2}} \right. } 2}] = {{T}_1} + {{T}_3} \cdot \mu $
      $ \hat a[k + {{3n} \mathord{\left/ {\vphantom {{3n} 4}} \right. } 4}] = {{T}_1} - {{T}_3} \cdot \mu $
    下载: 导出CSV
    算法6 基3-NTT算法的核心运算操作拆解,其中$ \mu = {\omega ^{n/3}} $
     1. $ {T_0} = {\omega ^k}B{\text{[}}k{\text{]}} + {\left( {{\omega ^k}} \right)^2}{C[}k{\text{]}} $
      $ {T_1} = {\omega ^k}B{\text{[}}k{\text{]}} - {\left( {{\omega ^k}} \right)^2}{C[}k{\text{]}} $
     2. $ {T}_{2}={\omega }^{k}B\text{[}k{{](T}}_{0}的中间结果\text{)}+{{T}}_{1}\cdot \mu $
      $ {T}_{3}={\left({\omega }^{k}\right)}^{2}C\text{[}k{{](T}}_{0}的中间结果\text{)}-{{T}}_{1}\cdot \mu $
     3. $ \hat a[k] = {{A}}[k] + {{{T}}_0} $
      $ \hat a[k + {n \mathord{\left/ {\vphantom {n 3}} \right. } 3}] = {{A}}[k] - {{{T}}_3} $
      $ \hat a[k + {{2n} \mathord{\left/ {\vphantom {{2n} 3}} \right. } 3}] = {{A}}[k] - {{{T}}_2} $
    下载: 导出CSV
    算法7 4路PtNTT中的点乘算法
     1. for i=0; i<4; i++
       for j=0; j<4; j++
        $ {R_{i,j}} = \widehat {{a_i}}{\text{(}}x{\text{)}} \circ \widehat {{b_j}}{\text{(}}x{\text{)}} $
       end for
      end for
     2.$ \widehat {{c_0}}(x) = {R_{0,0}} + \widehat x \circ \left( {{R_{1,3}} + {R_{3,1}} + {R_{2,2}}} \right) $
      $ \widehat {{c_1}}(x) = {R_{0,1}} + {R_{1,0}} + \widehat x \circ \left( {{R_{2,3}} + {R_{3,2}}} \right) $
      $ \widehat {{c_2}}(x) = {R_{0,2}} + {R_{2,0}} + {R_{1,1}} + \widehat x \circ {R_{3,3}} $
      $ \widehat {{c_3}}(x) = {R_{0,3}} + {R_{3,0}} + {R_{1,2}} + {R_{2,1}} $
    下载: 导出CSV
    算法8 3路PtNTT中的点乘算法
     1. for i=0; i<3; i++
       for j=0; j<3; j++
        $ {R_{i,j}} = \widehat {{a_i}}{\text{(}}x{\text{)}} \circ \widehat {{b_j}}{\text{(}}x{\text{)}} $
       end for
      end for
     2. $ \widehat {{c_0}}(x) = {R_{0,0}} + \widehat x \circ \left( {{R_{1,2}} + {R_{2,1}}} \right) $
      $ \widehat {{c_1}}(x) = {R_{0,1}} + {R_{1,0}} + \widehat x \circ {R_{2,2}} $
      $ \widehat {{c_2}}(x) = {R_{0,2}} + {R_{2,0}} + {R_{1,1}} $
    下载: 导出CSV
    算法9 16 bit模乘、32 bit乘法
     输入:32 bit数据[A1,A0],32 bit数据[B1,B0],16 bit模数q,选
        择数据sel(sel=0代表32比特相乘,sel=1代表16 bit模乘)
     预计算:m=$ \left\lfloor {{{{2^k}} \mathord{\left/ {\vphantom {{{2^k}} q}} \right. } q}} \right\rfloor $
     输 出:32 bit数据z
     1. ${ {{M} }_1} = {{A} }_0 \times {{B_0} }$
     2. if(sel) ${ {{M} }_2} = { {{M} }_1}[15:0] \times m$,${ {{M} }_3} = { {{M} }_1}[31:16] \times m$
      else ${ {{M} }_2} = {{A_1} } \times {{B_0} }$,${ {{M} }_3} = {{A} }0 \times {{B1} }$
     3. if(sel) ${ {{M} }_4} = \left( { { {{M} }_3} < < 16 + { {{M} }_2} } \right)[47:32] \times q$
      else ${ {{M} }_4} = {{A_1} } \times {{B_1} }$
     4. if(sel) ${\text{addsub} }_{1}={{M} }_{1}-{{M} }_{3}$
      else ${\text{addsu} }{ {\text{b} }_1} = { {{M} }_2} + { {{M} }_3}$
     5. if(sel) $ {\text{addsu}}{{\text{b}}_2} = {\text{addsu}}{{\text{b}}_1} - q $
      else ${\text{addsu} }{ {\text{b} }_2} = \left( { {\text{addsu} }{ {\text{b} }_1} < < 16} \right) + { {{M} }_1}$
     6. if(sel) $ z = \left( {{\text{addsu}}{{\text{b}}_2} < 0} \right)?{\text{addsu}}{{\text{b}}_1}:{\text{addsu}}{{\text{b}}_2} $
      else $z = \left( { { {{M} }_4} < < 32} \right) + {\text{addsu} }{ {\text{b} }_2}$
    下载: 导出CSV

    表  2  基于FPGA不同多项式乘法架构性能对比

    文献年份频率/
    MHz
    时钟
    周期数
    时延
    /μs
    工艺参数资源
    (LUT/FF/
    BRAM/DSP)
    时延性
    能对比
    支持
    算法
    [7]20203865Zynq-7000n=256, q=33292908/-/-/9Kyber
    226081n=256, q=213Saber
    [8]20222509323.728UltraScale+n=256, q=33292.33Kyber
    10244.096n=256, q=2132.56Saber
    [6]202223216277.013Artix-7n=256, q<2326976/-/24/16q≡1mod2n
    条件下
    279812.060n=512, q<2321.19
    525122.634n=1024, q<232
    1045545.065n=2048, q<232
    [9]2022224Artix-7n=256, q=332925674/3137/6/64Kyber
    n=256, q=8380417Dilithium
    [14]20222652460.928Artix-7n=256, q=33299211/9810/1/600.58Kyber
    261
    257
    454
    846
    1.740
    3.292
    Spartan-7
    Artix-7
    n=512, q=12289
    n=1024,q=12289
    22800/45600
    /1/68
    23261/14901
    /1/76

    NewHope
    本文1522431.599Artix-7n=256, q=332913794/-/3×36K/481Kyber
    8965.895n=256, q=8380417Dilithium
    2431.599n=256, q=213Saber
    3794
    3770
    6112
    24.961
    24.803
    40.211
    n=701, q=8192
    n=677, q=2048
    n=821, q=4096
    NTRU
    下载: 导出CSV
  • [1] SHOR P W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer[J]. SIAM Review, 1999, 41(2): 303–332. doi: 10.1137/S0036144598347011
    [2] ALAGIC G, ALPERIN-SHERIFF J, APON D, et al. Status report on the second round of the NIST post-quantum cryptography standardization process[R]. NIST IR 8309, 2020.
    [3] ALAGIC G, ALPERIN-SHERIFF J, APON D, et al. Status report on the first round of the NIST post-quantum cryptography standardization process[R]. NISTIR 8240, 2019.
    [4] BANERJEE U, UKYAB T S, and CHANDRAKASAN A P. Sapphire: A configurable crypto-processor for post-quantum lattice-based protocols[J]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2019, 2019(4): 17–61. doi: 10.13154/tches.v2019.i4.17-61
    [5] FRITZMANN T, SHARIF U, MÜLLER-GRITSCHNEDER D, et al. Towards reliable and secure post-quantum co-processors based on RISC-V[C]. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019: 1148–1153.
    [6] 刘冬生, 赵文定, 刘子龙, 等. 应用于格密码的可重构多通道数论变换硬件设计[J]. 电子与信息学报, 2022, 44(2): 566–572. doi: 10.11999/JEIT210114

    LIU Dongsheng, ZHAO Wending, LIU Zilong, et al. Reconfigurable hardware design of multi-lanes number theoretic transform for lattice-based cryptography[J]. Journal of Electronics &Information Technology, 2022, 44(2): 566–572. doi: 10.11999/JEIT210114
    [7] FRITZMANN T, SIGL G, and SEPÚLVEDA J. RISQ-V: Tightly coupled RISC-V accelerators for post-quantum cryptography[R]. Paper 2020/446, 2020.
    [8] LI Aobo, LIU Dongsheng, LI Xiang, et al. A flexible instruction-based post-quantum cryptographic processor with modulus reconfigurable arithmetic unit for module LWR&E[C]. 2022 IEEE Asian Solid-State Circuits Conference (A-SSCC), Taipei, China, 2022: 1–3.
    [9] ZHAO Yifan, XIE Ruiqi, XIN Guozhu, et al. A high-performance domain-specific processor with matrix extension of RISC-V for module-LWE Applications[J]. IEEE Transactions on Circuits and Systems I:Regular Papers, 2022, 69(7): 2871–2884. doi: 10.1109/TCSI.2022.3162593
    [10] ZHU Yihong, ZHU Wenping, ZHU Min, et al. A 28nm 48KOPS 3.4µJ/Op agile crypto-processor for post-quantum cryptography on multi-mathematical problems[C]. 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, USA, 2022: 514–516.
    [11] ALKIM E, CHENG D Y L, CHUNG C M M, et al. Polynomial multiplication in NTRU Prime: Comparison of optimization strategies on Cortex-M4[R]. Paper 2020/1216, 2020.
    [12] 曲英杰. 可重构密码协处理器的组成与结构[J]. 计算机工程与应用, 2003, 39(23): 32–34. doi: 10.3321/j.issn:1002-8331.2003.23.011

    QU Yingjie. Components and structure of reconfigurable cipher coprocessor[J]. Computer Engineering and Applications, 2003, 39(23): 32–34. doi: 10.3321/j.issn:1002-8331.2003.23.011
    [13] AIKATA A, MERT A C, JACQUEMIN D, et al. A unified cryptoprocessor for lattice-based signature and key-exchange[J]. IEEE Transactions on Computers, 2023, 72(6): 1568–1580. doi: 10.1109/TC.2022.3215064
    [14] DUONG-NGOC P and LEE H. Configurable mixed-radix number theoretic transform architecture for lattice-based cryptography[J]. IEEE Access, 2022, 10: 12732–12741. doi: 10.1109/ACCESS.2022.3145988
  • 加载中
图(7) / 表(11)
计量
  • 文章访问数:  454
  • HTML全文浏览量:  450
  • PDF下载量:  81
  • 被引次数: 0
出版历程
  • 收稿日期:  2023-04-17
  • 修回日期:  2023-08-19
  • 录用日期:  2023-08-21
  • 网络出版日期:  2023-08-24
  • 刊出日期:  2023-09-27

目录

    /

    返回文章
    返回