Active Accelerated Recovery for Extended Chip Lifetime: Opportunities and Challenges
-
Abstract: Higher integration density and smaller feature sizes in advanced technology nodes have increased the electric field and current density inside devices, which makes chip aging progressively worse. Current design solutions against aging still rely on guardbands and extra timing margins, which can lead to overdesign. In recent years, several studies have demonstrated experimentally that the dominant aging mechanisms are recoverable to some extent and that this recovery can be accelerated. The timing margin reserved at design time can therefore be lowered significantly, which has inspired the active accelerated recovery design concept. This paper reviews existing design solutions against aging and the progress made in active accelerated recovery, and analyzes the potential advantages of this approach. It then discusses the bottlenecks facing on-chip implementation from the perspectives of modeling, circuit design, and system design, together with corresponding solutions, and proposes an adaptive aging-protection design concept centered on sensing and active accelerated recovery.
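The link between sensing, recovery, and a smaller timing margin can be illustrated with a toy simulation. The sketch below is not the paper's model: the linear stress accumulation, the fixed recovery fraction, the threshold policy, and every name and constant (`ToyAgingModel`, `simulate`, `stress_rate`, `recover_fraction`) are assumptions chosen only for illustration. It compares the peak delay shift, which is what a guardband would have to cover, with and without a sense-then-recover policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAgingModel:
    """Hypothetical aging model; constants are illustrative, not measured."""
    stress_rate: float = 1.0       # delay shift added per stressed step (arbitrary units)
    recover_fraction: float = 0.5  # share of the shift removed by one recovery step


def simulate(steps, threshold, model=ToyAgingModel(), recover=True):
    """Return (peak delay shift, number of steps spent in recovery)."""
    shift = 0.0
    peak = 0.0
    recovery_steps = 0
    for _ in range(steps):
        if recover and shift >= threshold:
            # The (assumed) aging sensor reports the threshold was reached,
            # so this step is spent in accelerated recovery instead of work.
            shift *= 1.0 - model.recover_fraction
            recovery_steps += 1
        else:
            # Normal operation: the circuit is under stress and ages.
            shift += model.stress_rate
        peak = max(peak, shift)
    return peak, recovery_steps


if __name__ == "__main__":
    for policy in (False, True):
        peak, spent = simulate(steps=100, threshold=10.0, recover=policy)
        print(f"recover={policy}: peak shift={peak}, recovery steps={spent}")
```

In this toy setting the peak shift without recovery grows with operating time, whereas the sense-then-recover policy keeps it near the chosen threshold at the cost of some steps spent recovering. Real aging and recovery behavior is nonlinear and mechanism-dependent, so the numbers carry no quantitative meaning.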
-
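Figure 11. Schematic of a negative bias voltage generator circuit that can be used for active accelerated recovery (adapted from Ref. [16])
Figure 12. Gating circuit supporting reverse current in the power delivery network, and its performance evaluation (adapted from Ref. [39])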
-
[1] HILL I, CHANAWALA P, SINGH R, et al. CMOS reliability from past to future: A survey of requirements, trends, and prediction methods[J]. IEEE Transactions on Device and Materials Reliability, 2022, 22(1): 1–18. doi: 10.1109/TDMR.2021.3131345
[2] GUO Xinfei, VERMA V, GONZALEZ-GUERRERO P, et al. When “things” get older: Exploring circuit aging in IoT applications[C]. 2018 19th International Symposium on Quality Electronic Design (ISQED), Santa Clara, USA, 2018: 296–301.
[3] TAN S, TAHOORI M, KIM T, et al. Long-Term Reliability of Nanometer VLSI Systems[M]. Cham: Springer, 2019.
[4] CAO Yu, VELAMALA J, SUTARIA K, et al. Cross-layer modeling and simulation of circuit reliability[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2013, 33(1): 8–23. doi: 10.1109/TCAD.2013.2289874
[5] WANG Runsheng, ZHANG Zuodong, SUN Zixuan, et al. Cross-layer design for reliability in advanced technology nodes: An EDA perspective[C]. 2022 IEEE 16th International Conference on Solid-State & Integrated Circuit Technology (ICSICT), Nanjing, China, 2022: 1–4.
[6] LIU Changze, REN Pengpeng, SUN Yongsheng, et al. Reliability challenges in advanced technology node: From transistor to circuit (invited)[C]. 2020 IEEE 15th International Conference on Solid-State & Integrated Circuit Technology (ICSICT), Kunming, China, 2020: 1–4.
[7] KUMAR P, KOLEY K, ASKARI S S A, et al. Assessment of negative bias temperature instability due to interface and oxide trapped charges in gate-all-around TFET devices[J]. IEEE Transactions on Nanotechnology, 2023, 22: 157–165. doi: 10.1109/TNANO.2023.3255012
[8] BRAVAIX A, GUÉRIN C, HUARD V, et al. Hot-carrier acceleration factors for low power management in DC-AC stressed 40nm NMOS node at high temperature[C]. 2009 IEEE International Reliability Physics Symposium, Montreal, Canada, 2009: 531–548.
[9] GARBA-SEYBOU T, FEDERSPIEL X, BRAVAIX A, et al. New modelling off-state TDDB for 130nm to 28nm CMOS nodes[C]. 2022 IEEE International Reliability Physics Symposium (IRPS), Dallas, USA, 2022: 11A.3-1–11A.3-7.
[10] ROTHE S and LIENIG J. Combined modeling of electromigration, thermal and stress migration in AC interconnect lines[C/OL]. The 2023 International Symposium on Physical Design, 2023: 107–114.
[11] YU Liting, REN Jianguo, LU Xian, et al. NBTI and HCI aging prediction and reliability screening during production test[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(10): 3000–3011. doi: 10.1109/TCAD.2019.2961329
[12] BERNSTEIN K, FRANK D J, GATTIKER A E, et al. High-performance CMOS variability in the 65-nm regime and beyond[J]. IBM Journal of Research and Development, 2006, 50(4/5): 433–449. doi: 10.1147/rd.504.0433
[13] JI Zhigang, CHEN Haibao, and LI Xiuyan. Design for reliability with the advanced integrated circuit (IC) technology: Challenges and opportunities[J]. Science China Information Sciences, 2019, 62(12): 226401. doi: 10.1007/s11432-019-2643-5
[14] LEE K D. Electromigration recovery and short lead effect under bipolar- and unipolar-pulse current[C]. 2012 IEEE International Reliability Physics Symposium (IRPS), Anaheim, USA, 2012: 6B.3.1–6B.3.4.
[15] MAHAPATRA S. Fundamentals of Bias Temperature Instability in MOS Transistors[M]. New Delhi: Springer, 2016.
[16] GUO Xinfei and STAN M R. Circadian Rhythms for Future Resilient Electronic Systems[M]. Cham: Springer, 2020.
[17] EVMORFOPOULOS N, SHOHEL M A A, AXELOU O, et al. Recent progress in the analysis of electromigration and stress migration in large multisegment interconnects[C/OL]. The 2023 International Symposium on Physical Design, 2023: 115–123.
[18] ZHU Jiong, YI Maoxiang, ZHANG Yao, et al. Improved gate replacement technique for mitigating circuit NBTI effect[J]. Journal of Electronic Measurement and Instrumentation, 2016, 30(7): 1029–1036. doi: 10.13382/j.jemi.2016.07.007
[19] ZHANG Xinfa, ZHANG Zuodong, LIN Yibo, et al. Efficient aging-aware standard cell library characterization based on sensitivity analysis[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2023, 70(2): 721–725. doi: 10.1109/TCSII.2022.3212123
[20] ZHANG Zuodong, WANG Runsheng, ZHANG Zhe, et al. Reliability-enhanced circuit design flow based on approximate logic synthesis[C]. The 2020 on Great Lakes Symposium on VLSI, Beijing, China, 2020: 71–76.
[21] CHEN Xiaodao, LIAO Chen, WEI Tongquan, et al. An interconnect reliability-driven routing technique for electromigration failure avoidance[J]. IEEE Transactions on Dependable and Secure Computing, 2012, 9(5): 770–776. doi: 10.1109/TDSC.2010.57
[22] GUO Xinfei and STAN M R. MCPENS: Multiple-critical-path embeddable NBTI sensors for dynamic wearout management[C]. The 11th IEEE Workshop on Silicon Errors in Logic System Effects, Austin, USA, 2015: 116–121.
[23] PARK G, YU Hanzhao, KIM M, et al. An all BTI (N-PBTI, N-NBTI, P-PBTI, P-NBTI) odometer based on a dual power rail ring oscillator array[C]. 2021 IEEE International Reliability Physics Symposium (IRPS), Monterey, USA, 2021: 1–5.
[24] ANGHEL L and CACHO F. Design-time exploration for process, environment and aging compensation techniques for low power reliable-aware design[J]. IEEE Transactions on Emerging Topics in Computing, 2022, 10(2): 581–590. doi: 10.1109/TETC.2021.3136288
[25] MINTARNO E, SKAF J, ZHENG Rui, et al. Self-tuning for maximized lifetime energy-efficiency in the presence of circuit aging[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2011, 30(5): 760–773. doi: 10.1109/TCAD.2010.2100531
[26] CHEN Yuguang, LIN I C, and WEI Yongche. A novel NBTI-aware chip remaining lifetime prediction framework using machine learning[C]. 2021 22nd International Symposium on Quality Electronic Design (ISQED), Santa Clara, USA, 2021: 476–481.
[27] BU Ranran, REN Zhipeng, GE Hao, et al. Online NBTI-induced partially depleted (PD) SOI degradation and recovery prediction utilizing long short-term memory (LSTM)[J]. Microelectronics Reliability, 2023, 142: 114932. doi: 10.1016/j.microrel.2023.114932
[28] LI Zeyu, HUANG Zhao, WANG Quan, et al. AMROFloor: An efficient aging mitigation and resource optimization floorplanner for virtual coarse-grained runtime reconfigurable FPGAs[J]. Electronics, 2022, 11(2): 273. doi: 10.3390/electronics11020273
[29] SUN Peng, YANG Zhiming, YU Yang, et al. NBTI and power reduction using an input vector control and supply voltage assignment method[J]. Algorithms, 2017, 10(3): 94. doi: 10.3390/a10030094
[30] WANG Shengcheng, SUN Zeyu, CHENG Yuan, et al. Leveraging recovery effect to reduce electromigration degradation in power/ground TSV[C]. 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Irvine, USA, 2017: 811–818.
[31] GUO Xinfei, BURLESON W, and STAN M. Modeling and experimental demonstration of accelerated self-healing techniques[C]. 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC), San Francisco, USA, 2014: 1–6.
[32] VELAMALA J B, SUTARIA K, SATO T, et al. Physics matters: Statistical aging prediction under trapping/detrapping[C]. The 49th Annual Design Automation Conference, San Francisco, USA, 2012: 139–144.
[33] STATHIS J H, MAHAPATRA S, and GRASSER T. Controversial issues in negative bias temperature instability[J]. Microelectronics Reliability, 2018, 81: 244–251. doi: 10.1016/j.microrel.2017.12.035
[34] PALUMBO F, KLEBANOV M, MONREAL G, et al. Physical origin of the permanent components of the positive charge buildup resulting from NBTI/PBTI stress in nMOS/pMOS transistors[C]. 2022 IEEE International Symposium on the Physical and Failure Analysis of Integrated Circuits (IPFA), Singapore, 2022: 1–5.
[35] TAO Jiang, CHEN J F, CHEUNG N W, et al. Modeling and characterization of electromigration failures under bidirectional current stress[J]. IEEE Transactions on Electron Devices, 1996, 43(5): 800–808. doi: 10.1109/16.491258
[36] RINGLER I J and LLOYD J R. Stress relaxation in pulsed DC electromigration measurements[J]. AIP Advances, 2016, 6(9): 095118. doi: 10.1063/1.4963669
[37] GUO Xinfei and STAN M R. Work hard, sleep well - avoid irreversible IC wearout with proactive rejuvenation[C]. 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), Macao, China, 2016: 649–654.
[38] KAVOUSI M, CHEN Liang, and TAN S X D. Fast electromigration stress analysis considering spatial joule heating effects[C]. 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC), Taipei, China, 2022: 208–213.
[39] GUO Xinfei and STAN M R. Deep healing: Ease the BTI and EM wearout crisis by activating recovery[C]. 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), Denver, USA, 2017: 184–191.
[40] GUO Xinfei and STAN M R. Implications of accelerated self-healing as a key design knob for cross-layer resilience[J]. Integration, 2017, 56: 167–180. doi: 10.1016/j.vlsi.2016.10.008
[41] TING L M, MAY J S, HUNTER W R, et al. AC electromigration characterization and modeling of multilayered interconnects[C]. 31st Annual Proceedings Reliability Physics 1993, Atlanta, USA, 1993: 311–316.