A Scalable CPU–FPGA Heterogeneous Cluster for Real-time Power System Simulation
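Summary: The large-scale integration of high-frequency switching devices, together with the deep integration of renewable energy sources and voltage source converters (VSCs), confronts power system simulation with the challenges of microsecond-level transient analysis and sub-microsecond timestep simulation. Existing simulators generally suffer from insufficient computational scalability and high communication latency when handling systems that contain hundreds of power electronic switches. To address this, this paper proposes a central processing unit–field programmable gate array (CPU–FPGA) heterogeneous cluster architecture for real-time power system simulation, capable of simulating a renewable energy system with 480 switching devices in real time at a 1 μs timestep. The system comprises three core innovations: (1) a computation load-aware scheduling strategy based on timestep decoupling, which schedules parallel computation across four FPGAs; (2) a combination of hybrid-precision quantization and matrix–vector restructuring that, compared with a conventional fixed-point method, substantially reduces resource usage within a 400 ns computation window, lowering Look-Up Table (LUT), Flip-Flop (FF), and Digital Signal Processor (DSP) usage by 32.0%, 24.2%, and 43.8%, respectively; and (3) a zero-copy communication mechanism based on the Data Plane Development Kit (DPDK), which achieves an end-to-end communication latency of 29 μs. Experiments confirm the effectiveness of the heterogeneous cluster architecture for large-scale, high-frequency power system simulation and show good scalability and potential for engineering application.

Abstract: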
Objective: This study aims to design and implement a scalable CPU–FPGA heterogeneous cluster for real-time simulation of high-frequency power electronic systems. With the increasing adoption of wide-bandgap semiconductor devices such as SiC and GaN, modern power systems exhibit complex switching dynamics that require sub-microsecond timestep resolution. This work focuses on the real-time modeling and simulation of 80 Voltage Source Converters (VSCs), equivalent to 480 switches, representing a typical scenario in renewable-integrated power grids with high switching frequency. Three major technical challenges are addressed: (1) enabling efficient task scheduling across multiple FPGAs to support large-scale parallel computation while maintaining load balance; (2) reducing hardware resource usage through precision-aware hybrid quantization that preserves accuracy with reduced bitwidth; and (3) minimizing CPU–FPGA communication latency via a high-throughput, low-latency data exchange framework to ensure stable synchronization between slow and fast subsystems. This work contributes to the development of a practical and extensible platform for simulating future power systems with complex electronic components.

Methods: To enable real-time simulation with sub-microsecond resolution, the system partitions the power system model into a slow subsystem (AC/DC network) and a fast subsystem (multiple VSCs), following a decoupled computation strategy. A Computation Load-Aware Scheduling (CLAS) strategy is employed to allocate tasks across four Xilinx XCKU060 FPGAs (Fig. 1 and Fig. 2), supporting parallel simulation of up to 80 VSCs. The slow subsystem is executed on the CPU using high-precision floating-point arithmetic with a 50 μs timestep. The fast subsystem is implemented on the FPGAs using fixed-point arithmetic at a 1 μs timestep (Fig. 3 and Fig. 4).
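The abstract does not spell out how CLAS assigns converters to boards, so the following is only an illustrative sketch of one common load-aware policy: a greedy longest-processing-time heuristic that always places the next VSC model on the least-loaded FPGA. The function name `schedule_vscs`, the cost units, and the heuristic itself are assumptions made for illustration, not the authors' algorithm.

```python
import heapq


def schedule_vscs(vsc_costs, num_fpgas=4):
    """Greedy load-aware assignment of VSC models to FPGAs (hypothetical sketch).

    vsc_costs: estimated per-step compute cost of each VSC (arbitrary units).
    Returns (assignment, loads); assignment[f] lists the VSC indices on FPGA f.
    """
    # Min-heap of (accumulated load, FPGA index): the least-loaded FPGA pops first.
    heap = [(0.0, f) for f in range(num_fpgas)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_fpgas)]

    # Place the most expensive models first so large tasks do not unbalance the tail.
    order = sorted(range(len(vsc_costs)), key=lambda i: vsc_costs[i], reverse=True)
    for i in order:
        load, f = heapq.heappop(heap)
        assignment[f].append(i)
        heapq.heappush(heap, (load + vsc_costs[i], f))

    loads = [sum(vsc_costs[i] for i in group) for group in assignment]
    return assignment, loads


# 80 identical VSCs on four boards, matching the configuration in the abstract.
assignment, loads = schedule_vscs([1.0] * 80)
print([len(group) for group in assignment])
```

With 80 equal-cost converters the heuristic degenerates to an even split, which is consistent with the 20 VSCs per FPGA reported in the results; unequal costs would yield an uneven count per board but similar total loads.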
A hybrid-precision quantization scheme is adopted: voltage-processing modules use the Q(48,30) format to retain numerical precision, whereas current-dominant modules use Q(48,20) to avoid overflow. The FPGA-based Matrix–Vector Multiplication (MVM) is partitioned into two sub-modules (Sub MVM1 and Sub MVM2), leveraging row-level parallelism and pipelined streaming to achieve 400 ns latency per cycle. For communication, a Data Plane Development Kit (DPDK)-based zero-copy framework with lock-free queues is implemented between the CPU and FPGA, reducing latency to 29 μs and enabling reliable synchronization between fast and slow subsystems.

Results and Discussions: The proposed system achieves real-time simulation of a wind farm model comprising 80 VSCs using four Xilinx XCKU060 FPGA boards. Each FPGA supports 20 VSCs operating at a 1 μs timestep with a computation latency of 400 ns, which leaves margin within each timestep and satisfies the real-time constraint. The hybrid-precision quantization strategy yields substantial resource savings relative to a 64-bit fixed-point baseline: Look-Up Table (LUT) usage is reduced by 32.0%, Flip-Flops (FFs) by 24.2%, and Digital Signal Processors (DSPs) by 43.8%, while preserving simulation accuracy (Table 1). These optimizations support scalable deployment without loss of fidelity. Communication between the CPU and FPGA is handled by a DPDK-based zero-copy framework with lock-free queues, achieving an end-to-end latency of 29 μs, which is shorter than the 50 μs timestep of the slow subsystem and so supports robust synchronization between the slow and fast subsystems. Compared with existing FPGA-based designs, the proposed architecture provides a more resource-efficient solution (Table 1), delivering sub-microsecond simulation performance with reduced hardware cost and enabling multi-VSC deployment per FPGA. These findings highlight the platform's applicability to large-scale industrial power system simulation (Fig. 6).

Conclusions: This study presents a CPU–FPGA heterogeneous cluster designed for real-time simulation of large-scale power systems. The system employs a decoupled Computation Load-Aware Scheduling (CLAS) strategy that enables efficient resource distribution across multiple FPGAs. Real-time requirements are fully met, and the use of hybrid-precision quantization substantially reduces FPGA resource consumption without sacrificing accuracy. The system demonstrates scalability and efficiency by supporting up to 80 VSCs across four FPGA boards. Compared with existing solutions, the proposed architecture achieves the lowest LUT and block-RAM usage among the designs in Table 1 while maintaining sub-microsecond resolution, making it a practical platform for industrial-grade power system simulation.
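The trade-off behind the two fixed-point formats named in the Methods can be illustrated with a short sketch. It assumes that Q(m,n) denotes a signed two's-complement word of m total bits with n fractional bits; the abstract does not define the notation, so this reading, the helper names `quantize` and `q_range`, and the saturating overflow behaviour are all assumptions.

```python
def quantize(x, total_bits, frac_bits):
    """Round x to a signed fixed-point grid and saturate at the word limits.

    Assumes Q(total_bits, frac_bits): two's complement, frac_bits fractional bits.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))        # most negative raw integer
    hi = (1 << (total_bits - 1)) - 1     # most positive raw integer
    raw = round(x * scale)               # nearest representable step
    raw = max(lo, min(hi, raw))          # saturate instead of wrapping
    return raw / scale


def q_range(total_bits, frac_bits):
    """Return (min value, max value, resolution) of the assumed format."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1)) / scale
    hi = ((1 << (total_bits - 1)) - 1) / scale
    return lo, hi, 1 / scale


for name, frac in (("Q(48,30)", 30), ("Q(48,20)", 20)):
    lo, hi, step = q_range(48, frac)
    print(f"{name}: range [{lo:.6g}, {hi:.6g}], resolution {step:.3g}")

# A small per-unit quantity keeps more digits with 30 fractional bits,
# while a large intermediate value only fits with 20 fractional bits.
print(quantize(0.1, 48, 30), quantize(0.1, 48, 20))
print(quantize(1e6, 48, 30), quantize(1e6, 48, 20))
```

Under this reading, moving ten bits from the fractional part to the integer part widens the representable range by a factor of 1024 at the cost of a 1024 times coarser step, which matches the stated motivation of precision for voltage paths and overflow headroom for current paths.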
Keywords:
- Real-time simulation
- FPGA cluster
- Quantization method
- Power system
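Table 1. Performance comparison of FPGA-based real-time power-grid simulation accelerators

| Metric | L/C [25] | G-ADC [26] | SNP [10] | On-off [26] | MNA [27] | IEM [16] | This work | This work | This work | This work |
|---|---|---|---|---|---|---|---|---|---|---|
| Year | 2019 | 2023 | 2019 | 2023 | 2023 | 2024 | 2025 | 2025 | 2025 | 2025 |
| Number of VSCs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 20 | 14 |
| Number of switches | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 120 | 84 |
| Chip | 7K325T | 7K325T | 7V485T | KU060 | 7V485T | 7K325T | KU060 | KU060 | KU060 | PG3T1300 |
| Simulation architecture | FPGA | FPGA | FPGA | FPGA | FPGA | FPGA | CPU-FPGA | CPU-FPGA | CPU-FPGA | CPU-FPGA |
| Clock frequency (MHz) | - | - | 175 | - | - | 100 | 100 | 100 | 100 | 100 |
| LUTs | 48 374 | 50 734 | 142 296 | 16 773 | 16 988 | 23 731 | 11 891 | 8 086 | 196 701 | 173 623 |
| FFs | 51 874 | 53 350 | 147 317 | NA | 16 024 | 15 753 | 21 237 | 16 084 | 325 505 | 326 389 |
| Block RAM | 91 | 91 | 258 | 60 | NA | 31 | 3 | 3 | 60 | 24 |
| DSP/APM | 157 | 211 | 361 | 33 | 468 | 128 | 256 | 144 | 2 760 | 3 124 |
| Latency (ns) | 463 | 475 | 800 | 1000 | 100 | 500 | 400 | 400 | 400 | 400 |
| Error (%) | >10 | >5 | 5 | 1 | NA | NA | 0.7 | 0.9 | 0.9 | 0.9 |
| Quantization method | Fixed-point | Fixed-point | Fixed-point | Fixed-point | Fixed-point | Fixed-point | Fixed-point | Hybrid | Hybrid | Hybrid |
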
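The resource savings quoted in the abstract can be cross-checked against the two single-VSC "This work" columns of Table 1. The sketch below assumes that the fixed-point column is the 64-bit baseline mentioned in the abstract and that the hybrid-quantization column is the optimized design; the dictionary and function names are illustrative only.

```python
# Single-VSC resource counts copied from Table 1 ("This work" columns).
# Assumption: the fixed-point column is the baseline referred to in the abstract.
fixed_point = {"LUT": 11891, "FF": 21237, "DSP": 256}
hybrid = {"LUT": 8086, "FF": 16084, "DSP": 144}


def reduction_percent(before, after):
    """Relative saving of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


reductions = {k: reduction_percent(fixed_point[k], hybrid[k]) for k in fixed_point}
for name, value in reductions.items():
    print(f"{name}: {value:.2f}% fewer than the fixed-point baseline")
```

The computed values agree with the stated 32.0%, 24.2%, and 43.8% to within about 0.1 percentage point, the small differences being attributable to rounding.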

[1] CHAUHAN S and TUMMURU N R. An improvised modulation and control approach for dual active bridge DC–DC converter system[J]. IEEE Transactions on Industrial Electronics, 2024, 71(4): 3572–3582. doi: 10.1109/TIE.2023.3273248.
[2] SPÍN-SARZOSA D, PALMA-BEHNKE R, CAÑIZARES C A, et al. Microgrid modeling for stability analysis[J]. IEEE Transactions on Smart Grid, 2024, 15(3): 2459–2479. doi: 10.1109/TSG.2023.3326063.
[3] PATRA S and SINGHA A K. An event-driven sampling mechanism for digital average current-mode controlled boost converter[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71(3): 1456–1460. doi: 10.1109/TCSII.2023.3321890.
[4] ZHENG Jiain, ZENG Yangbin, ZHAO Zhengming, et al. A semi-implicit parallel leapfrog solver with half-step sampling technique for FPGA-based real-time HIL simulation of power converters[J]. IEEE Transactions on Industrial Electronics, 2024, 71(3): 2454–2464. doi: 10.1109/TIE.2023.3265042.
[5] XU Zhenyu, YU Miaoxiang, CAI J, et al. A novel FPGA-based circuit simulator for accelerating reinforcement learning-based design of power converters[C]. 2023 IEEE 34th International Conference on Application-specific Systems, Architectures and Processors (ASAP), Porto, Portugal, 2023: 1–9. doi: 10.1109/ASAP57973.2023.00013.
[6] CHEN Zibo and HUANG A Q. High performance SiC power module based on repackaging of discrete SiC devices[J]. IEEE Transactions on Power Electronics, 2023, 38(8): 9306–9310. doi: 10.1109/TPEL.2023.3263466.
[7] JOLLY N and MALLIK A. Parasitic mismatch mitigation for fast switching modular power semiconductor devices[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024, 71(1): 485–498. doi: 10.1109/TCSI.2023.3316208.
[8] DEBNATH S and CHOI J. Electromagnetic transient (EMT) simulation algorithms for evaluation of large-scale extreme fast charging systems (T&D models)[J]. IEEE Transactions on Power Systems, 2023, 38(5): 4069–4079. doi: 10.1109/TPWRS.2022.3212639.
[9] GROBE J, WEIHS L, HANHART M, et al. Monolithic integration of a 400V GaN half-bridge converter with output voltage regulation[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71(10): 4591–4595. doi: 10.1109/TCSII.2024.3398783.
[10] MIRZAHOSSEINI R and IRAVANI R. Small time-step FPGA-based real-time simulation of power systems including multiple converters[J]. IEEE Transactions on Power Delivery, 2019, 34(6): 2089–2099. doi: 10.1109/TPWRD.2019.2933610.
[11] BIEBER L, WANG Liwei, JATSKEVICH J, et al. Universal equivalent model for real-time CPU/FPGA co-simulation of hybrid cascaded multilevel converters[J]. IEEE Access, 2023, 11: 4228–4241. doi: 10.1109/ACCESS.2023.3235272.
[12] DONG Zerui, GREGOIRE L A, VIPIN V N, et al. Real-time implementation of a dual-active-bridge based multi-level photovoltaic converter[C]. 2021 IEEE 12th International Symposium on Power Electronics for Distributed Generation Systems (PEDG), Chicago, USA, 2021: 1–6. doi: 10.1109/PEDG51384.2021.9494236.
[13] XU Zhenyu, YU Miaoxiang, CAI J, et al. A finite-difference time-domain (FDTD) solver with linearly scalable performance in an FPGA cluster[C]. 2023 IEEE International Conference on Cluster Computing (CLUSTER), Santa Fe, USA, 2023: 307–317. doi: 10.1109/CLUSTER52292.2023.00033.
[14] LIAO Haohao, ELMOHR M A, DONG Xuan, et al. TurboHE: Accelerating Fully Homomorphic Encryption using FPGA clusters[C]. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, USA, 2023: 788–797. doi: 10.1109/IPDPS54959.2023.00084.
[15] KOZAK J P, ZHANG Ruizhe, PORTER M, et al. Stability, reliability, and robustness of GaN power devices: A review[J]. IEEE Transactions on Power Electronics, 2023, 38(7): 8442–8471. doi: 10.1109/TPEL.2023.3266365.
[16] WANG Can, WANG Qinsheng, WENG Haowen, et al. A modified algorithm for the L/C-based switch model of power converters in real-time simulation based on FPGA[J]. IEEE Transactions on Industry Applications, 2024, 60(5): 7030–7037. doi: 10.1109/TIA.2024.3407031.
[17] LI Zirun, XU Jin, WANG Keyou, et al. An FPGA-based hierarchical parallel real-time simulation method for cascaded solid-state transformer[J]. IEEE Transactions on Industrial Electronics, 2023, 70(4): 3847–3856. doi: 10.1109/TIE.2022.3181408.
[18] BAI Hao, LIU Chen, RATHORE A K, et al. An FPGA-based IGBT behavioral model with high transient resolution for real-time simulation of power electronic circuits[J]. IEEE Transactions on Industrial Electronics, 2019, 66(8): 6581–6591. doi: 10.1109/TIE.2018.2870354.
[19] XU Jin, WANG Keyou, WU Pan, et al. FPGA-based submicrosecond-level real-time simulation of solid-state transformer with a switching frequency of 50 kHz[J]. IEEE Journal of Emerging and Selected Topics in Power Electronics, 2021, 9(4): 4212–4224. doi: 10.1109/JESTPE.2020.3037233.
[20] LI Zirun, XU Jin, WANG Keyou, et al. A discrete small-step synthesis real-time simulation method for power converters[J]. IEEE Transactions on Industrial Electronics, 2022, 69(4): 3667–3676. doi: 10.1109/TIE.2021.3076702.
[21] BLANCHETTE H F, OULD-BACHIR T, and DAVID J P. A state-space modeling approach for the FPGA-based real-time simulation of high switching frequency power converters[J]. IEEE Transactions on Industrial Electronics, 2012, 59(12): 4555–4567. doi: 10.1109/TIE.2011.2182021.
[22] GAO Shilin, CHEN Ying, SONG Yankan, et al. An efficient half-bridge MMC model for EMTP-type simulation based on hybrid numerical integration[J]. IEEE Transactions on Power Systems, 2024, 39(1): 1162–1177. doi: 10.1109/TPWRS.2023.3262584.
[23] CAO Yang, GU Wei, LIU Wei, et al. A parameterized fixed-admittance model of converters based on cross initialization[J]. Proceedings of the CSEE, 2021, 41(10): 3518–3527. doi: 10.13334/j.0258-8013.pcsee.201045.
[24] PANGO Microelectronics. PANGO Microelectronics Official Website[EB/OL]. https://www.pangomicro.com/, 2025.
[25] WANG Keyou, XU Jin, LI Guojie, et al. A generalized associated discrete circuit model of power converters in real-time simulation[J]. IEEE Transactions on Power Electronics, 2019, 34(3): 2220–2233. doi: 10.1109/TPEL.2018.2845658.
[26] FAN Xinran, LIU Sijia, JIANG Chengdong, et al. VSC converter real-time simulation modeling method research and FPGA implementation[C]. 2023 Panda Forum on Power and Energy (PandaFPE), Chengdu, China, 2023: 466–471. doi: 10.1109/PandaFPE57779.2023.10140437.
[27] ZHAO Fuhai, DU Jiang, DENG Yunkai, et al. An adaptive word-length selection method to optimize hardware resources for FPGA-based real-time simulation of power converters[J]. IEEE Access, 2023, 11: 122980–122990. doi: 10.1109/ACCESS.2023.3328919.