Key Technology and Development of Triple Modular Redundancy Tool for FPGA

CHEN Lei, ZHANG Yaowei, WANG Shuo, ZHOU Jing, TIAN Chunsheng, PANG Yongjiang, MA Xiaojing, ZHOU Chong, DU Zhong

Citation: CHEN Lei, ZHANG Yaowei, WANG Shuo, ZHOU Jing, TIAN Chunsheng, PANG Yongjiang, MA Xiaojing, ZHOU Chong, DU Zhong. Key Technology and Development of Triple Modular Redundancy Tool for FPGA[J]. Journal of Electronics & Information Technology, 2022, 44(6): 2230-2244. doi: 10.11999/JEIT210330


doi: 10.11999/JEIT210330
Funds: The National Science and Technology Major Project (2009ZYHJ0005)
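    About the authors:

    CHEN Lei: male, born in 1978, Researcher. Main research interests: research and development of VLSI, including FPGA, SoC, and ASIC

    ZHANG Yaowei: male, born in 1997, master's student. Main research interests: triple modular redundancy for FPGAs, high-level synthesis

    WANG Shuo: male, born in 1985, master's degree. Main research interests: FPGA CAD algorithms

    ZHOU Jing: female, born in 1986, master's degree. Main research interests: fault injection, scrubbing techniques, single event effect mitigation techniques

    TIAN Chunsheng: male, born in 1993, Ph.D. Main research interests: integrated circuit design automation

    PANG Yongjiang: male, born in 1991, master's degree. Main research interests: software applications, IDE design

    MA Xiaojing: female, born in 1993, master's degree. Main research interests: FPGA application and verification

    ZHOU Chong: male, born in 1995, master's degree. Main research interests: synthesis, placement, routing

    DU Zhong: male, born in 1975, Researcher. Main research interests: software applications, radiation hardening techniques, FPGA testing, FPGA EDA

    Corresponding author:

    ZHANG Yaowei zyw18810532787@163.com

  • CLC number: TN47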

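  • Abstract: SRAM-based Field-Programmable Gate Arrays (FPGAs) are susceptible to single event effects in the space radiation environment, which cause soft errors. Triple Modular Redundancy (TMR) is currently the most widely used circuit hardening technique for mitigating soft errors in FPGAs. This paper first reviews the state of research on TMR. It then summarizes four key techniques commonly used by TMR tools (fine-grained TMR, system partitioning, configuration scrubbing, and state synchronization) together with their implementation principles. As High-Level Synthesis (HLS) for FPGAs matures, HLS-based TMR tools are gradually becoming a new research branch. The paper surveys, by category, the current mainstream TMR tools based on the Register Transfer Level (RTL), TMR tools based on important soft-core resources, and the emerging HLS-based TMR tools. Finally, it summarizes and looks ahead to the future development trends of TMR tools for FPGAs.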
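    As a rough illustration of two ideas named in the abstract, majority voting and state synchronization, the following Python sketch simulates three redundant 8-bit counters with one injected bit flip. It is a behavioral model written for this summary, not code from the paper or from any of the tools it surveys; the function names, the counter width, and the injected upset are all assumptions.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def run_counters(steps: int, upset_step: int, synchronize: bool) -> list[int]:
    """Simulate three redundant 8-bit counters with one injected bit flip.

    With synchronize=True the voted value is fed back into every replica
    each cycle, so the corrupted replica is repaired. With
    synchronize=False the corrupted replica keeps its wrong state.
    """
    state = [0, 0, 0]
    for step in range(steps):
        if step == upset_step:
            # Hypothetical single event upset: flip one bit in replica 0.
            state[0] ^= 0b0100
        if synchronize:
            voted = majority(*state)
            state = [(voted + 1) & 0xFF] * 3
        else:
            state = [(s + 1) & 0xFF for s in state]
    return state


if __name__ == "__main__":
    print(run_counters(5, 2, synchronize=False))
    print(run_counters(5, 2, synchronize=True))
```

    Without synchronization, a voter placed on the outputs still masks the single faulty replica, but that replica stays wrong, so a later upset in a second replica could defeat the vote. Feeding the voted value back into the state registers removes the stored error.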
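  • Fig. 1  Classification of fine-grained TMR techniques

    Fig. 2  Applying TMR to a decomposed system

    Fig. 3  Fault tolerance and failure in a partitioned TMR system

    Fig. 4  TMR without state synchronization

    Fig. 5  TMR with state synchronization

    Fig. 6  Combined use of the key techniques

    Fig. 7  Top-level file structure generated by RASP-TMR

    Fig. 8  Implementation of TMRTool

    Fig. 9  State machine for SEU detection and recovery

    Fig. 10  Design time and application performance of RTL design versus HLS design

    Fig. 11  A new fault-tolerant hardware accelerator design

    Fig. 12  Partial block diagram of the MicroBlaze TMR subsystem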

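    Table 1  Existing TMR tools

    Category | Category characteristics | Tool | Tool characteristics
    RTL-based | Allows fine tuning of TMR implementation details; faces the problem of redundancy being optimized away during synthesis; requires detailed knowledge of the intermediate netlist files produced during synthesis | RASP-TMR | TMR for Verilog; developed in MATLAB; simple functionality
    RTL-based |  | TMRG | TMR for Verilog; written in Python; actively maintained; well suited to academic exchange
    RTL-based |  | Xilinx TMRTool | TMR on RTL-level .ngc netlist files; subject to the International Traffic in Arms Regulations
    RTL-based |  | BL-TMR | TMR on RTL-level .edif netlist files; the open-source version stopped being updated long ago
    RTL-based |  | Mentor Precision Hi-Rel | TMR at the RTL synthesis stage; uses fine-grained TMR; safe state machine strategy based on Hamming coding
    RTL-based |  | Synopsys Synplify Premier | TMR at the RTL synthesis stage; similar to the Mentor tool; little documentation available online
    HLS-based | Greatly shortens the design cycle; provides pipelined designs that reduce the negative timing impact of TMR; supports HLS design space exploration | TLegUp | TMR at the HLS stage; establishes the overall framework for this direction; updates have stalled because of commercialization constraints
    HLS-based |  | C-TMR | TMR for C; supports HLS design space exploration; not yet a complete tool
    Soft-core-based | Provides full-featured protection for the soft core, but offers TMR optimization only for MicroBlaze, so its scope of use is narrow | Xilinx Vivado MicroBlaze TMR | TMR for the soft core; a TMR subsystem made up of 5 IPs; automatically manages and masks faults that affect the MicroBlaze soft core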
Publication history
  • Received: 2021-04-20
  • Revised: 2022-03-23
  • Published online: 2022-04-12
  • Issue published: 2022-06-21
