| Citation: | ZHANG Congwu, LIU Ao, ZHANG Ke, CHANG Yisong, BAO Yungang. A System-level Exploration and Evaluation Simulator for chiplet-based CPU[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4575-4588. doi: 10.11999/JEIT240299 | 
 
| [1] | MOORE G E. Cramming more components onto integrated circuits[J]. Electronics, 1965, 38(8): 114–117. |
| [2] | DENNARD R H, GAENSSLE F H, YU H N, et al. Design of ion-implanted MOSFET's with very small physical dimensions[J]. IEEE Journal of Solid-State Circuits, 1974, 9(5): 256–268. doi: 10.1109/JSSC.1974.1050511. |
| [3] | HAN Yinhe, XU Haobo, LU Meixuan, et al. The big chip: Challenge, model and architecture[J]. Fundamental Research, 2023, S2667325823003709. doi: 10.1016/j.fmre.2023.10.020. |
| [4] | CAI Jingwei, WU Zuotong, PENG Sen, et al. Gemini: Mapping and architecture co-exploration for large-scale DNN Chiplet accelerators[C]. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Edinburgh, United Kingdom, 2024: 156–171. doi: 10.1109/HPCA57654.2024.00022. |
| [5] | CHEN Yunji, CAI Yimao, WANG Yu, et al. Integrated circuit technology: Future development and key issues–review of the 347th "Shuangqing Forum (Youth)"[J]. Scientia Sinica Informationis, 2024, 54(1): 1–15. doi: 10.1360/SSI-2023-0356. |
| [6] | XIANG Shaolin, GUO Mao, PU Bo, et al. Overview of the development status of Chiplet technology[J]. Science & Technology Review, 2023, 41(19): 113–131. doi: 10.3981/j.issn.1000-7857.2023.19.013. |
| [7] | LI Jiayao, ZHANG Kun, and PAN Quan. Chiplet: Expanding the innovative boundaries of chip design[J]. Integrated Circuits and Embedded Systems, 2024, 24(2): 1–9. |
| [8] | MA Xiaohan, WANG Ying, WANG Yujie, et al. Survey on Chiplets: Interface, interconnect and integration methodology[J]. CCF Transactions on High Performance Computing, 2022, 4(1): 43–52. doi: 10.1007/s42514-022-00093-0. |
| [9] | SUGGS D, SUBRAMONY M, and BOUVIER D. The AMD “Zen 2” processor[J]. IEEE Micro, 2020, 40(2): 45–52. doi: 10.1109/MM.2020.2974217. |
| [10] | NAFFZIGER S, LEPAK K, PARASCHOU M, et al. 2.2 AMD Chiplet architecture for high-performance server and desktop products[C]. 2020 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, USA, 2020: 44–45. doi: 10.1109/ISSCC19947.2020.9063103. |
| [11] | EVERS M, BARNES L, and CLARK M. The AMD next-generation “Zen 3” Core[J]. IEEE Micro, 2022, 42(3): 7–12. doi: 10.1109/MM.2022.3152788. |
| [12] | MUNGER B, WILCOX K, SNIDERMAN J, et al. Zen 4: The AMD 5nm 5.7GHz x86-64 microprocessor core[C]. 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, USA, 2023: 38–39. doi: 10.1109/ISSCC42615.2023.10067540. |
| [13] | GIANOS C. Architecting for flexibility and value with next gen Intel® Xeon® processors[C]. 2023 IEEE Hot Chips 35 Symposium (HCS), Palo Alto, USA, 2023: 1–15. doi: 10.1109/HCS59251.2023.10254694. |
| [14] | ESPOSITO B. Intel Agilex® 9 direct RF-series FPGAs with integrated 64 Gsps data converters[C]. 2023 IEEE Hot Chips 35 Symposium (HCS), Palo Alto, USA, 2023: 1–35. doi: 10.1109/HCS59251.2023.10254707. |
| [15] | VENTANA MICRO. Veyron V1 data center-class RISC-V processor[C]. 2023 IEEE Hot Chips 35 Symposium (HCS), Palo Alto, USA, 2023: 1–16. doi: 10.1109/HCS59251.2023.10254710. |
| [16] | CHIRKOV G and WENTZLAFF D. Seizing the bandwidth scaling of on-package interconnect in a post-Moore’s law world[C]. Proceedings of the 37th International Conference on Supercomputing, Orlando, USA, 2023: 410–422. doi: 10.1145/3577193.3593702. |
| [17] | YANG Chongyi, ZHANG Zhendong, WANG Xiaohang, et al. Adaptive caching policies for Chiplet systems based on reinforcement learning[C]. 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, USA, 2023: 1–5. doi: 10.1109/ISCAS46773.2023.10181966. |
| [18] | GADE S H, SINHA M, KUMAR M, et al. Scalable hybrid cache coherence using emerging links for Chiplet architectures[C]. 2022 35th International Conference on VLSI Design and 2022 21st International Conference on Embedded Systems (VLSID), Bangalore, India, 2022: 92–97. doi: 10.1109/VLSID2022.2022.00029. |
| [19] | MEDINA R, KEIN J, ANSALONI G, et al. System-level exploration of in-package wireless communication for multi-Chiplet platforms[C]. Proceedings of the 28th Asia and South Pacific Design Automation Conference, Tokyo, Japan, 2023: 561–566. doi: 10.1145/3566097.3567952. |
| [20] | ZHU Mingcan, SHAHAB A, KATSARAKIS A, et al. Invalidate or update? Revisiting coherence for tomorrow's cache hierarchies[C]. 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), Atlanta, USA, 2021: 226–241. doi: 10.1109/PACT52795.2021.00024. |
| [21] | SHAHAB A, ZHU Mingcan, MARGARITOV A, et al. Farewell my shared LLC! A case for private die-stacked DRAM caches for servers[C]. 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, Japan, 2018: 559–572. doi: 10.1109/MICRO.2018.00052. |
| [22] | BHARADWAJ S, YIN Jieming, BECKMANN B, et al. Kite: A family of heterogeneous interposer topologies enabled via accurate interconnect modeling[C]. 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, USA, 2020: 1–6. doi: 10.1109/DAC18072.2020.9218539. |
| [23] | IFF P, BESTA M, CAVALCANTE M, et al. HexaMesh: Scaling to hundreds of Chiplets with an optimized Chiplet arrangement[C]. 2023 60th ACM/IEEE Design Automation Conference (DAC), San Francisco, USA, 2023: 1–6. doi: 10.1109/DAC56929.2023.10248006. |
| [24] | FU Yaosheng and WENTZLAFF D. PriME: A parallel and distributed simulator for thousand-core chips[C]. 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Monterey, USA, 2014: 116–125. doi: 10.1109/ISPASS.2014.6844467. |
| [25] | LOWE-POWER J, AHMAD A M, AKRAM A, et al. The gem5 simulator: Version 20.0+[EB/OL]. https://arxiv.org/abs/2007.03152, 2020. |
| [26] | UBAL R, JANG B, MISTRY P, et al. Multi2Sim: A simulation framework for CPU-GPU computing[C]. Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques, Minneapolis, USA, 2012: 335–344. doi: 10.1145/2370816.2370865. |
| [27] | QURESHI Y M, SIMON W A, ZAPATER M, et al. gem5-X: A many-core heterogeneous simulation platform for architectural exploration and optimization[J]. ACM Transactions on Architecture and Code Optimization (TACO), 2021, 18(4): 44. doi: 10.1145/3461662. |
| [28] | HARDAVELLAS N, SOMOGYI S, WENISCH T F, et al. SimFlex: A fast, accurate, flexible full-system simulation framework for performance evaluation of server architecture[J]. ACM SIGMETRICS Performance Evaluation Review, 2004, 31(4): 31–34. doi: 10.1145/1054907.1054914. |
| [29] | JIANG Nan, BECKER U D, MICHELOGIANNAKIS G, et al. A detailed and flexible cycle-accurate network-on-chip simulator[C]. 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, USA, 2013: 86–96. doi: 10.1109/ISPASS.2013.6557149. |
| [30] | BRKIĆ I R and JEFFREY M C. Disintegrating manycores: Which applications lose and why?[C]. Proceedings of the 16th International Workshop on Network on Chip Architectures, Toronto, Canada, 2023: 3–8. doi: 10.1145/3610396.3618090. |
| [31] | JEFFREY M C, SUBRAMANIAN S, YAN Cong, et al. A scalable architecture for ordered parallelism[C]. 2015 48th International Symposium on Microarchitecture (MICRO), Waikiki, USA, 2015: 228–241. doi: 10.1145/2830772.2830777. |
| [32] | ZHI Haocong, XU Xianuo, HAN Weijian, et al. A methodology for simulating multi-Chiplet systems using open-source simulators[C]. Proceedings of the Eighth Annual ACM International Conference on Nanoscale Computing and Communication, New York, NY, USA, 2021: 18. doi: 10.1145/3477206.3477459. |
| [33] | ORENES-VERA M, TURECI E, MARTONOSI M, et al. DCRA: A distributed Chiplet-based reconfigurable architecture for irregular applications[EB/OL]. https://arxiv.org/abs/2311.15443, 2024. |
| [34] | ORENES-VERA M, TURECI E, MARTONOSI M, et al. MuchiSim: A simulation framework for design exploration of multi-chip Manycore systems[C]. Proceedings of the 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Indianapolis, USA, 2024: 48–60. doi: 10.1109/ISPASS61541.2024.00015. |
| [35] | LI Xingyu. High-performance FPGA-accelerated Chiplet modeling[D]. [Master dissertation], University of California, Berkeley, 2022. |
| [36] | CHIRKOV G and WENTZLAFF D. SMAPPIC: Scalable multi-FPGA architecture prototype platform in the cloud[C]. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Vancouver, Canada, 2023: 733–746. doi: 10.1145/3575693.3575753. |
| [37] | ZHAN Xusheng, BAO Yungang, BIENIA C, et al. PARSEC3.0: A multicore benchmark suite with network stacks and SPLASH-2X[J]. ACM SIGARCH Computer Architecture News, 2017, 44(5): 1–16. doi: 10.1145/3053277.3053279. |
| [38] | HARDAVELLAS N, FERDMAN M, FALSAFI B, et al. Reactive NUCA: Near-optimal block placement and replication in distributed caches[J]. ACM SIGARCH Computer Architecture News, 2009, 37(3): 184–195. doi: 10.1145/1555815.1555779. |
| [39] | AWASTHI M, SUDAN K, BALASUBRAMONIAN R, et al. Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches[C]. 2009 IEEE 15th International Symposium on High Performance Computer Architecture, Raleigh, USA, 2009: 250–261. doi: 10.1109/HPCA.2009.4798260. |
| [40] | KIM C, BURGER D, and KECKLER S W. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches[C]. Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, USA, 2002: 211–222. doi: 10.1145/605397.605420. |
| [41] | LI Chengeng, JIANG Fan, CHEN Shixi, et al. Accelerating cache coherence in Manycore processor through silicon photonic Chiplet[C]. Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design (ICCAD'22), San Diego, USA, 2022: 43. doi: 10.1145/3508352.3549338. |
| [42] | CUBERO-CASCANTE J, ZURSTRAßEN N, NÖLLER J, et al. Parti-gem5: Gem5’s timing mode parallelised[C]. 23rd International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, Samos, Greece, 2023: 177–192. doi: 10.1007/978-3-031-46077-7_12. |
