A Hybrid Granularity Parallel Arithmetical Unit for Stream Cipher
-
Abstract: Reconfigurable cryptographic processors show poor compatibility and low implementation performance for stream cipher algorithms of different granularity, that is, algorithms defined over different finite fields. This paper analyzes the multi-level parallelism of stream cipher algorithms and proposes a pre-extraction update model for the Feedback Shift Register (FSR). Based on this model, a Reconfigurable Feedback-shift-register Arithmetic Unit (RFAU) is designed for cryptographic array architectures. The RFAU is compatible with stream cipher algorithms over different Galois fields, and it uses parallel extraction and pipelined processing to exploit their FSR-level parallelism, which effectively improves the performance of stream cipher algorithms on the Coarse-Grained Reconfigurable Array (CGRA) platform. Experimental results show that, compared with other reconfigurable processors, the RFAU improves performance by 23%~186% for stream ciphers over the Galois Field GF(2). For stream ciphers over GF($2^u$), throughput improves by about 66%~79% and area efficiency by about 64%~91%.
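The idea of FSR-level parallelism can be illustrated with a minimal sketch. It is not the RFAU design or the paper's pre-extraction update model, whose details are not given here. It assumes a Fibonacci-style FSR over GF(2) with an XOR feedback function, and it takes the distance `d` between the feedback end and the nearest tap to be `n - max(taps)`. The function names `step_serial`, `step_parallel` and `run_demo` are hypothetical. Under these assumptions, up to `d` feedback values depend only on the current state, so they can be computed together.

```python
from typing import List, Sequence, Tuple


def step_serial(state: List[int], taps: Sequence[int]) -> Tuple[List[int], int]:
    """Advance a GF(2) FSR by one step and return (new_state, output_bit)."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]  # XOR is addition over GF(2)
    # Shift out state[0] and append the feedback value at the feedback end.
    return state[1:] + [feedback], state[0]


def step_parallel(
    state: List[int], taps: Sequence[int], k: int
) -> Tuple[List[int], List[int]]:
    """Advance the same FSR by k steps in one call.

    Assumption: d = n - max(taps) is the distance from the feedback end to
    the nearest tap. For k <= d, none of the k feedback values depends on
    another, so all of them read only the current state.
    """
    n = len(state)
    d = n - max(taps)
    if k < 1 or k > d:
        raise ValueError(f"k must be in 1..{d} for these taps")
    feedbacks = []
    for j in range(k):  # each iteration is independent of the others
        fb = 0
        for t in taps:
            fb ^= state[t + j]
        feedbacks.append(fb)
    return state[k:] + feedbacks, state[:k]


def run_demo() -> bool:
    """Check that one 4-step parallel update equals four serial steps."""
    state = [1, 0, 0, 1, 0, 1, 1, 0]
    taps = (0, 3)  # hypothetical tap positions, chosen only for illustration
    serial_state, serial_out = list(state), []
    for _ in range(4):
        serial_state, bit = step_serial(serial_state, taps)
        serial_out.append(bit)
    parallel_state, parallel_out = step_parallel(state, taps, 4)
    return serial_state == parallel_state and serial_out == parallel_out
```

Calling `run_demo()` compares the two update paths on an 8-bit register. In hardware the inner loop of `step_parallel` would correspond to `k` feedback circuits working in the same cycle, which is why the tap distance bounds the usable degree of parallelism in this sketch.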
-
Table 1 Hardware performance parameters of the 4×4 RFAU
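
| Symbol | Definition |
|---|---|
| FP | Function-level parallelism of a stream cipher algorithm |
| RP | FSR-level parallelism of a stream cipher algorithm |
| OP | Operation-level parallelism of a stream cipher algorithm |
| $l$ | Bit width of each register in the FSR |
| $n$ | Number of registers in the FSR |
| $r_x^t$ | State of register Reg$_x$ in the FSR at time $t$ (likewise at time $t+i$) |
| $a_n$ | State value of register $n$ |
| $a_x^m$ | The $m$-th bit of Reg$_x$ |
| $S_t$ | State of the FSR at time $t$ |
| $F$ | State feedback function of the FSR |
| $r_{F_z}$ | State variable: the state of a register in the FSR that takes part in the feedback function |
| $k$ | Number of state variables that take part in the feedback function |
| $R_t$ | Set of all state variables at time $t$ |
| $d$ | Distance between the feedback end and the nearest state variable |

Table 2 Comparison of implementation performance and area efficiency of stream cipher algorithms over GF($2^{32}$)

| Algorithm | Architecture | Process (nm) | Area (mm²) | Frequency (MHz) | Throughput (Gbps) | Area efficiency (Gbps/mm²) |
|---|---|---|---|---|---|---|
| Snow 3G | This work | 55 | 12.35 | 250 | 7.81 | 0.63 |
| Snow 3G | PVHArray | 55 | 12.25 | 130 | 4.37 | 0.35 |
| Snow 3G | Anole | 65 | 7.75 | 400 | 6.4 | 0.83 |
| Snow 3G | Ref. [12] | 40 | 2.54 | 350 | 3.87 | 1.52 |
| ZUC | This work | 55 | 12.35 | 222 | 6.94 | 0.56 |
| ZUC | PVHArray | 55 | 12.25 | 125 | 3.91 | 0.32 |
| ZUC | Anole | 65 | 7.75 | 400 | 3.2 | 0.41 |
| ZUC | Ref. [12] | 40 | 2.54 | 350 | 4.68 | 1.84 |
| Sober-t32 | This work | 55 | 12.35 | 240 | 7.5 | 0.61 |
| Sober-t32 | PVHArray | 55 | 12.25 | 130 | 4.37 | 0.35 |
| SOSEMANUK | This work | 55 | 12.35 | 200 | 6.25 | 0.51 |
| SOSEMANUK | PVHArray | 55 | 12.25 | 120 | 3.75 | 0.31 |

-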
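The Table 2 figures can be cross-checked with a short script. It assumes that area efficiency is throughput divided by area (Gbps/mm²); this excerpt does not state the definition, but the reported values match it to within rounding. The names `TABLE2`, `area_efficiency` and `gain_over` are hypothetical helpers, not part of the paper.

```python
# (algorithm, architecture) -> (area in mm^2, throughput in Gbps, reported area efficiency)
TABLE2 = {
    ("Snow 3G", "This work"): (12.35, 7.81, 0.63),
    ("Snow 3G", "PVHArray"): (12.25, 4.37, 0.35),
    ("Snow 3G", "Anole"): (7.75, 6.4, 0.83),
    ("Snow 3G", "Ref. [12]"): (2.54, 3.87, 1.52),
    ("ZUC", "This work"): (12.35, 6.94, 0.56),
    ("ZUC", "PVHArray"): (12.25, 3.91, 0.32),
    ("ZUC", "Anole"): (7.75, 3.2, 0.41),
    ("ZUC", "Ref. [12]"): (2.54, 4.68, 1.84),
    ("Sober-t32", "This work"): (12.35, 7.5, 0.61),
    ("Sober-t32", "PVHArray"): (12.25, 4.37, 0.35),
    ("SOSEMANUK", "This work"): (12.35, 6.25, 0.51),
    ("SOSEMANUK", "PVHArray"): (12.25, 3.75, 0.31),
}


def area_efficiency(throughput_gbps: float, area_mm2: float) -> float:
    """Assumed definition: throughput per unit area, in Gbps/mm^2."""
    return throughput_gbps / area_mm2


def gain_over(algorithm: str, baseline: str = "PVHArray") -> float:
    """Throughput gain of 'This work' over a baseline architecture, in percent."""
    _, ours, _ = TABLE2[(algorithm, "This work")]
    _, base, _ = TABLE2[(algorithm, baseline)]
    return (ours / base - 1.0) * 100.0


if __name__ == "__main__":
    for (algorithm, arch), (area, tput, reported) in TABLE2.items():
        computed = area_efficiency(tput, area)
        print(f"{algorithm:10s} {arch:10s} computed={computed:.3f} reported={reported}")
    for algorithm in ("Snow 3G", "ZUC", "Sober-t32", "SOSEMANUK"):
        print(f"{algorithm:10s} throughput gain over PVHArray: {gain_over(algorithm):.1f}%")
```

With these inputs the throughput gains over PVHArray fall between roughly 66.7% (SOSEMANUK) and 78.7% (Snow 3G), which is consistent with the 66%~79% range quoted in the abstract.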
[1] KOTESHWARA S, KUMAR M, and PATTNAIK P. Performance optimization of lattice post-quantum cryptographic algorithms on many-core processors[C]. 2020 IEEE International Symposium on Performance Analysis of Systems and Software, Boston, USA, 2020: 223–225.
[2] JIAO Lin, HAO Yonglin, and FENG Dengguo. Stream cipher designs: A review[J]. Science China Information Sciences, 2020, 63(3): 131101. doi: 10.1007/s11432-018-9929-x
[3] DAI Zibin, LI Wei, CHEN Tao, et al. Design and implementation of a high-speed reconfigurable feedback shift register[C]. 2008 4th IEEE International Conference on Circuits and Systems for Communications, Shanghai, China, 2008: 338–342.
[4] XU Guangming, XU Jinfu, CHANG Zhongxiang, et al. Reconfigurability study on nonlinear feedback shift registers in stream cipher[J]. Application Research of Computers, 2015, 32(9): 2823–2826. doi: 10.3969/j.issn.1001-3695.2015.09.062
[5] NAN Longmei, ZENG Xiaoyang, WANG Zhouchuang, et al. Research of a reconfigurable coarse-grained cryptographic processing unit based on different operation similar structure[C]. The 2017 IEEE 12th International Conference on ASIC, Guiyang, China, 2017: 191–194.
[6] NAN Longmei, YANG Xuan, ZENG Xiaoyang, et al. A VLIW architecture stream cryptographic processor for information security[J]. China Communications, 2019, 16(6): 185–199. doi: 10.23919/JCC.2019.06.015
[7] GUAN Ziming. Research and design of sequence cipher reconfigurable processing architecture[D]. [Master dissertation], PLA Information Engineering University, 2009.
[8] DU Yiran, LI Wei, DAI Zibin, et al. PVHArray: An energy-efficient reconfigurable cryptographic logic array with intelligent mapping[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2020, 28(5): 1302–1315. doi: 10.1109/TVLSI.2020.2972392
[9] LIU Leibo, WANG Bo, DENG Chenchen, et al. Anole: A highly efficient dynamically reconfigurable crypto-processor for symmetric-key algorithms[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 37(12): 3081–3094. doi: 10.1109/TCAD.2018.2801229
[10] SAYILAR G and CHIOU D. Cryptoraptor: High throughput reconfigurable cryptographic processor[C]. 2014 IEEE/ACM International Conference on Computer-Aided Design, San Jose, USA, 2014: 155–161.
[11] IBRAHIM M I, KHAN M I W, JUVEKAR C S, et al. 29.8 THzID: A 1.6mm² package-less cryptographic identification tag with backscattering and beam-steering at 260GHz[C]. 2020 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, USA, 2020: 454–456.
[12] YANG Jinjiang. Research on key technologies of reconfigurable cryptographic processors[D]. [Ph.D. dissertation], Southeast University, 2018.
[13] XUE Yuqian and DAI Zibin. Reconfigurable multi-launch pipeline processing architecture for block cipher[J]. Application of Electronic Technique, 2020, 46(4): 40–44, 48. doi: 10.16157/j.issn.0258-7998.200005
[14] KITSOS P, SKLAVOS N, PROVELENGIOS G, et al. FPGA-based performance analysis of stream ciphers ZUC, Snow3g, grain V1, mickey V2, trivium and E0[J]. Microprocessors and Microsystems, 2013, 37(2): 235–245. doi: 10.1016/j.micpro.2012.09.007
[15] STILLMAKER A and BAAS B. Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm[J]. Integration, 2017, 58: 74–81. doi: 10.1016/j.vlsi.2017.02.002