Design of Hybrid-granularity Multifunctional Computing Unit Based on And-Xor-Inv Graph
Abstract: Coarse-Grained Reconfigurable Cryptographic logic Arrays (CGRCA) have difficulty accommodating fine-grained stream cipher algorithms, and their functional units are prone to competition conflicts during the encoding stage, which leads to low resource utilization and high latency. To address these issues, this paper exploits the dual-logic representation of the And-Xor-Inv Graph (AXIG) and proposes a Hybrid-grained Reconfigurable Multifunctional Cryptographic Arithmetic unit (RHMCA), implemented and verified at the transistor level. The unit is compatible with the nonlinear Boolean functions used in existing stream cipher algorithms and improves both delay and Area-Delay Product (ADP). First, a Reconfigurable And-Xor-Nand (RAXN) logic element is designed that can be reconfigured to perform the And, Xor, and Nand logic functions, and a transistor-level implementation and layout structure for RAXN are proposed to reduce the delay overhead. Second, a functional extension method based on RAXN is proposed to realize the full adder, three-input And/Xor logic, and multiplier partial-product generation functions; the extended element serves as the basic functional unit (RAXN_U). Third, drawing on dynamic configuration and dynamic scheduling, the interconnect resources in the array are combined with RAXN_Us to build the hybrid-grained RHMCA, which implements 32 bit addition, 8 bit multiplication, GF(2^8) finite field multiplication, and complex nonlinear Boolean functions, including S-boxes. A custom back-end design is carried out in a CMOS 40 nm process. Experimental results show that, compared with conventional implementations, the proposed unit reduces delay by up to 1.27 ns in the best case and improves the ADP by up to 44.8%.
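The functions named in the abstract can be illustrated with a small behavioural model. The sketch below is not the paper's transistor-level design: the `raxn` mode names, the function names, and the choice of the AES reduction polynomial `0x11B` for GF(2^8) are all assumptions made for illustration, since the abstract does not specify the configuration encoding or the field polynomial. It shows how a full adder, a 32 bit ripple adder, and a GF(2^8) multiplier can be composed purely from And and Xor operations.

```python
def raxn(mode: str, a: int, b: int) -> int:
    """Hypothetical 1-bit model of a reconfigurable And/Xor/Nand element."""
    if mode == "and":
        return a & b
    if mode == "xor":
        return a ^ b
    if mode == "nand":
        return (a & b) ^ 1
    raise ValueError(f"unknown mode: {mode}")


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Full adder in And-Xor form: sum = a^b^cin, cout = ab ^ cin(a^b)."""
    p = raxn("xor", a, b)        # propagate
    s = raxn("xor", p, cin)      # sum bit
    g = raxn("and", a, b)        # generate
    t = raxn("and", p, cin)      # propagated carry
    cout = raxn("xor", g, t)     # g and t are never both 1, so Xor equals Or
    return s, cout


def ripple_add(x: int, y: int, width: int = 32) -> int:
    """Add two integers modulo 2**width by chaining full adders."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result  # the final carry-out is discarded


def xtime(v: int, poly: int = 0x11B) -> int:
    """Multiply by x in GF(2^8); poly=0x11B (AES) is an assumed choice."""
    v <<= 1
    if v & 0x100:
        v ^= poly
    return v & 0xFF


def gf256_mul(x: int, y: int, poly: int = 0x11B) -> int:
    """GF(2^8) product: And selects partial products, Xor accumulates them."""
    acc = 0
    for i in range(8):
        if (y >> i) & 1:
            acc ^= x
        x = xtime(x, poly)
    return acc
```

Under the assumed polynomial, `gf256_mul(0x57, 0x83)` returns `0xC1`, and `ripple_add(0xFFFFFFFF, 1)` wraps to `0`. Writing the carry as an Xor of two And terms keeps the whole adder inside the And/Xor operation set, which is the property the AXIG representation relies on.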
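Table 1 Signal transmission path analysis of the RAXN circuit

| M | A | B | C | Signal path | M | A | B | C | Signal path |
|---|---|---|---|---|---|---|---|---|---|
| 0 | A | 0 | 0 | F=1 | 1 | A | 0 | 0 | F=0 |
| 0 | A | 0 | 1 | F=1 | 1 | A | 0 | 1 | N5→N2→P9→F / P5→N7→F |
| 0 | A | 1 | 0 | F=1 | 1 | A | 1 | 0 | N1→P9→F / P2→P4→N7→F |
| 0 | A | 1 | 1 | P6→P10(P9)→F / N10→N8→N6→F | 1 | A | 1 | 1 | P6→P9→F / N10→N8→N6→F |
| 0 | 0 | B | 0 | F=1 | 1 | 0 | B | 0 | F=0 |
| 0 | 0 | B | 1 | F=1 | 1 | 0 | B | 1 | N4→N2→P9→F / P3→P5→N7→F |
| 0 | 1 | B | 0 | F=1 | 1 | 1 | B | 0 | N3→N1→P9→F / P1→P4→N7→F |
| 0 | 1 | B | 1 | P7→P10(P9)→F / N8→N6→F | 1 | 1 | B | 1 | P7→P9→F / N8→N6→F |
| 0 | 0 | 0 | C | F=1 | 1 | 0 | 0 | C | F=0 |
| 0 | 0 | 1 | C | F=1 | 1 | 0 | 1 | C | N2→P9→F / P4→N7→F |
| 0 | 1 | 0 | C | F=1 | 1 | 1 | 0 | C | N2→P9→F / P4→N7→F |
| 0 | 1 | 1 | C | P8→P10(P9)→F / N6→F | 1 | 1 | 1 | C | P8→P9→F / N6→F |

Table 2 Delay and area under the two implementation approaches

| Implementation | 8 bit multiplication delay (ns) | Xtime delay (ns) | 8 bit addition delay (ns) | NLBF delay (ns) | Total area (μm²) |
|---|---|---|---|---|---|
| CMOS standard-cell implementation | 1.580 | 1.260 | 0.460 | 1.440 | 3634.57 |
| Custom optimization | 1.076 | 0.571 | 0.391 | 0.589 | 1512.12 |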
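The Table 2 figures can be turned into relative improvements with a few lines of arithmetic. The sketch below compares only the two rows of Table 2 (standard-cell versus custom-optimized), so its results are not expected to reproduce the 1.27 ns and 44.8% figures in the abstract, which are stated relative to other existing approaches. Pairing the total area with each function's delay to form an ADP is an assumption of this sketch and may differ from how the paper computes it; the helper names are hypothetical.

```python
# Delay in ns from Table 2: (standard-cell, custom-optimized)
DELAY_NS = {
    "8 bit multiplication": (1.580, 1.076),
    "Xtime": (1.260, 0.571),
    "8 bit addition": (0.460, 0.391),
    "NLBF": (1.440, 0.589),
}
# Total area in um^2 from Table 2: (standard-cell, custom-optimized)
AREA_UM2 = (3634.57, 1512.12)


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


def adp(area_um2: float, delay_ns: float) -> float:
    """Area-delay product; uses total area per function (an assumption)."""
    return area_um2 * delay_ns


if __name__ == "__main__":
    print(f"area reduction: {reduction_pct(*AREA_UM2):.1f}%")
    for name, (std, custom) in DELAY_NS.items():
        adp_std = adp(AREA_UM2[0], std)
        adp_custom = adp(AREA_UM2[1], custom)
        print(
            f"{name}: delay -{std - custom:.3f} ns "
            f"({reduction_pct(std, custom):.1f}%), "
            f"ADP reduction {reduction_pct(adp_std, adp_custom):.1f}%"
        )
```

Keeping the raw table values in one dictionary makes it easy to check each derived percentage against the published numbers.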