A Low-Power Network-on-Chip Power-Gating Design with Bypass Mechanism
Abstract: As technology nodes shrink, static power consumption comes to dominate the power overhead of a Network-on-Chip (NoC). Power gating, a widely used power-saving technique, turns off idle modules in the NoC to reduce static power. However, conventional power gating introduces problems such as packet wake-up latency and break-even time. To address these problems, this paper proposes the Partition Bypass Transmission Infrastructure (PBTI), which transmits packets in place of a power-gated router, and builds a low-latency, low-power power-gating scheme on top of this bypass mechanism. PBTI uses mutually independent bypasses to handle eastbound and westbound packets separately, and shares a common buffer within each bypass to improve buffer utilization. PBTI can inject, transmit, and eject packets while the router is powered off, so packets can travel from the source node to the destination node even when every router in the network is power gated. When traffic exceeds the transmission capacity of PBTI, routers are woken up together, one column at a time. Experimental results show that, compared with an NoC without power gating, the proposed scheme reduces static power consumption by 83.4% and packet latency by 17.2%, at an additional area overhead of only 6.2%. Compared with conventional power-gating schemes, the proposed design achieves lower power consumption and lower latency.
Key words:
- Network-on-Chip
- Power gating
- Bypass
- Static power
Algorithm 1 Buffer-balanced routing algorithm
Input: destination address of the packet, D; buffer-available signals from the neighboring disconnected routers, Available; address of the local router, R
Output: the packet routing port, Direction
BEGIN
    IF ((Available.E == 0 || Available.W == 0) && (Available.N == 1) && (R.y < D.y)) THEN
        // using YX routing algorithm
        Direction = North;
    ELSE IF ((Available.E == 0 || Available.W == 0) && (Available.S == 1) && (R.y > D.y)) THEN
        // using YX routing algorithm
        Direction = South;
    ELSE
        // using XY routing algorithm
        IF (R.x < D.x) THEN Direction = East;
        ELSE IF (R.x > D.x) THEN Direction = West;
        ELSE IF (R.y < D.y) THEN Direction = North;
        ELSE IF (R.y > D.y) THEN Direction = South;
        ELSE Direction = Local;
        END IF
    END IF
END

Table 1 Basic experimental parameter settings
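| Parameter | Setting |
| --- | --- |
| Network topology | 8×8 Mesh |
| Buffer size per port | 8 flits |
| Virtual channels per port | 2 |
| Packet size | 2 to 6 flits |
| Routing algorithm | XY, buffer-balanced routing algorithm |
| Link width | 32 bits |
| Router frequency | 1 GHz |
| Traffic pattern | Uniform random, transpose, shuffle |
| Router wake-up latency | 8 cycles |
| Break-even time | 10 cycles |
| Router power-off wait time | 4 cycles |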
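
The buffer-balanced routing pseudocode above can be read as a small decision function. The following Python sketch mirrors its branches one for one. The names `Coord`, `Available`, and `route` are illustrative and do not come from the paper. The sketch also assumes that a signal value of 1 means the neighboring bypass buffer has free space and 0 means it does not, and that a larger `y` lies to the north, as the comparisons in the pseudocode suggest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coord:
    """Mesh coordinate of a router (illustrative type, not from the paper)."""
    x: int
    y: int


@dataclass(frozen=True)
class Available:
    """Buffer-available signals from the four neighbors.

    Assumption: 1 = buffer space available, 0 = no space available.
    """
    E: int
    W: int
    N: int
    S: int


def route(dest: Coord, local: Coord, avail: Available) -> str:
    """Return the output port chosen by the buffer-balanced routing rule."""
    # True when the east or west bypass buffer reports no free space.
    east_west_blocked = avail.E == 0 or avail.W == 0

    # Detour vertically first (YX order) when that direction still makes
    # progress toward the destination and its buffer has space.
    if east_west_blocked and avail.N == 1 and local.y < dest.y:
        return "North"
    if east_west_blocked and avail.S == 1 and local.y > dest.y:
        return "South"

    # Otherwise fall back to plain XY routing.
    if local.x < dest.x:
        return "East"
    if local.x > dest.x:
        return "West"
    if local.y < dest.y:
        return "North"
    if local.y > dest.y:
        return "South"
    return "Local"


if __name__ == "__main__":
    here = Coord(1, 1)
    all_free = Available(E=1, W=1, N=1, S=1)
    east_full = Available(E=0, W=1, N=1, S=1)
    print(route(Coord(3, 3), here, all_free))   # XY order: East
    print(route(Coord(3, 3), here, east_full))  # YX detour: North
```

Note that the vertical detour is taken only when the destination lies in a different row. A packet already in its destination row still follows XY routing even if an east or west buffer is full.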