FPGA Design and Implementation of Large Integer Multiplier
-
摘要: 大整数乘法是公钥加密中最为核心的计算环节,实现运算快速的大数乘法单元是RSA, ElGamal,全同态等密码体制中急需解决的问题之一。针对全同态加密(FHE)应用需求,该文提出一种基于Schönhage-Strassen算法(SSA)的768 kbit大整数乘法器硬件架构。采用并行架构实现了其关键模块64k点有限域快速数论变换(NTT)的运算,并主要采用加法和移位操作以保证并行处理的最大化,有效提高了处理速度。该大整数乘法器在Stratix-V FPGA上进行了硬件验证,通过与CPU上使用数论库(NTL)和GMP库实现的大整数乘法运算结果对比,验证了该文设计方法的正确性和有效性。实验结果表明,该方法实现的大整数乘法器运算时间比CPU平台上的运算大约有8倍的加速。Abstract: Large integer multiplication is the most important part in public key encryption, which often consumes most of the computing time in RSA, ElGamal, Fully Homomorphic Encryption (FHE) and other cryptosystems. Based on Schönhage-Strassen Algorithm (SSA), a design of high-speed 768 kbit multiplier is proposed. As the key component, an 64k-point Number Theoretical Transform (NTT) is optimized by adopting parallel architecture, in which only addition and shift operations are employed and thus the processing speed is improved effectively. The large integer multiplier design is validated on Stratix-V FPGA. By comparing its results with CPU using Number Theory Library(NTL) and GMP library, the correctness of this design is proved. The results also show that the FPGA implementation is about eight times faster than the same algorithm executed on the CPU.
-
Key words:
- Fully Homomorphic Encryption (FHE) /
- FPGA /
- Large number multiplication /
- GMP library
-
表 1 主要大整数乘法算法时间复杂度
算法 时间复杂度 grammar-school $O({N^2})$ Karatsuba-Ofman $O({N^{\ln 3/\ln 2}})$ Toom-Cook $O({N^{\ln (2k - 1)/\ln (k)}})$ Schönhage-Strassen (SSA) $O(N\log N\log \log N)$ 表 2 素数p的单位根
${r_N}$ NTT点数 单位根${r_N}$ 2 296 4 248 8 224 16 212 32 26 64 23 表 3 Stratix-V FGPA综合结果
逻辑单元 Stratix V 占用资源数 总资源数 利用率(%) ALUTs 240229 718400 33 Logic registers 236088 1436800 16 Total block Memory (bit) 16252928 54067200 30 Total DSP blocks 288 352 82 最大频率(MHz) 98.02 表 4 CPU和FPGA上性能比较
计算时间(ms) 加速倍数 I7-7700 CPU 3.350 1 本文设计 0.419 8 -
光炎, 祝跃飞, 顾纯祥, 等. 一种针对全同态加密体制的密钥恢复攻击[J]. 电子与信息学报, 2013, 35(12): 2999–3004. doi: 10.3724/SP.J.1146.2013.00300GUANG Yan, ZHU Yuefei, GU Chunxiang, et al. A key recovery attack on fully homomorphic encryption scheme[J]. Journal of Electronics &Information Technology, 2013, 35(12): 2999–3004. doi: 10.3724/SP.J.1146.2013.00300 FENG Xiang and LI Shuguo. Design of an area-effcient million-bit integer multiplier using double modulus NTT[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017, 25(9): 2658–2662. doi: 10.1109/TVLSI.2017.2691727 陈智罡. 基于格的全同态加密研究与设计[D]. [博士论文], 南京航空航天大学, 2015: 1–5.CHEN Zhigang. Research and design of fully homomorphic encryption based on lattice[D]. [Ph.D. dissertation], Nanjing University of Aeronautics and Astronautics, 2015: 1–5. GENTRY C and HALEVI S. Implementing Gentry’s fully-homomorphic encryption scheme[C]. The 30th Annual International Conference on Theory and Applications of Cryptographic Techniques: Advances in Cryptology, Tallinn, Estonia, 2011: 129–148. GENTRY C. A fully homomorphic encryption scheme[D]. [Ph.D. dissertation], Stanford University, 2009. 施佺, 韩赛飞, 黄新明, 等. 面向全同态加密的有限域FFT算法FPGA设计[J]. 电子与信息学报, 2018, 40(1): 57–62. doi: 10.11999/JEIT170312SHI Quan, HAN Saifei, HUANG Xinming, et al. Design of finite field FFT for fully homomorphic encryption based on FPGA[J]. Journal of Electronics &Information Technology, 2018, 40(1): 57–62. doi: 10.11999/JEIT170312 ÖZTÜRK E, DORÖZ Y, SAVAŞ E, et al. A custom accelerator for homomorphic encryption applications[J]. IEEE Transactions on Computers, 2017, 66(1): 3–16. doi: 10.1109/TC.2016.2574340 YE J H and SHIEH M D. Low-complexity VLSI design of large integer multipliers for fully homomorphic encryption[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018, 26(9): 1727–1736. doi: 10.1109/TVLSI.2018.2829539 POLLARD J M. The fast Fourier transform in a finite field[J]. Mathematics of Computation, 1971, 25(114): 365–374. doi: 10.1090/S0025-5718-1971-0301966-0 WANG Wei, HUANG Xinming, EMMART N, et al. VLSI design of a large-number multiplier for fully homomorphic encryption[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2014, 22(9): 1879–1887. doi: 10.1109/TVLSI.2013.2281786 RAFFERTY C, O’NEILL M, and HANLEY N. Evaluation of large integer multiplication methods on hardware[J]. IEEE Transactions on Computers, 2017, 66(8): 1369–1382. doi: 10.1109/TC.2017.2677426 ROY S S, VERCAUTEREN F, VLIEGEN J, et al. Hardware assisted fully homomorphic function evaluation and encrypted search[J]. IEEE Transactions on Computers, 2017, 66(9): 1562–1572. doi: 10.1109/TC.2017.2686385 DORÖZ Y, ÖZTÜRK E, and SUNAR B. Accelerating fully homomorphic encryption in hardware[J]. IEEE Transactions on Computers, 2015, 64(6): 1509–1521. doi: 10.1109/TC.2014.2345388 WANG Wei, HU Yin, CHEN Lianmu, et al. Accelerating fully homomorphic encryption using GPU[C]. 2012 IEEE Conference on High Performance Extreme Computing, Waltham, USA, 2012: 1–5. HUANG Xinming and WANG Wei. A novel and efficient design for an RSA cryptosystem with a very large key size[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2015, 62(10): 972–976. doi: 10.1109/TCSII.2015.2458033 JOHNSON L G. Conflict free memory addressing for dedicated FFT hardware[J]. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 1992, 39(5): 312–316. doi: 10.1109/82.142032 FENG Xiang and LI Shuguo. Accelerating an FHE integer multiplier using negative wrapped convolution and Ping-Pong FFT[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2019, 66(1): 121–125. doi: 10.1109/TCSII.2018.2840108 WANG Wei and HUANG Xinming. FPGA implementation of a large-number multiplier for fully homomorphic encryption[C]. Proceedings of 2013 IEEE International Symposium on Circuits and Systems, Beijing, China, 2013: 2589–2592.