Hardware Trojan Detection for Gate-level Netlists Based on Graph Neural Network
Abstract: The globalization of the Integrated Circuit (IC) supply chain has shifted most design, manufacturing, and testing processes from a single trusted entity to a variety of untrusted third-party entities around the world. Using untrusted Third-Party Intellectual Property (3PIP) exposes a design to a significant risk that adversaries will implant Hardware Trojans (HTs). These hardware Trojans may degrade the performance of the original design, leak information, or even cause irreversible physical damage, seriously jeopardizing consumer privacy and security as well as company reputation. The hardware Trojan detection approaches proposed in the existing literature have the following drawbacks: reliance on a golden reference circuit, a requirement for test vector coverage, and even the need for manual code review. Moreover, as integrated circuits grow in scale, hardware Trojans with low trigger rates become more difficult to detect. To address these problems, a graph neural network-based HT detection method is proposed that detects gate-level hardware Trojans without requiring a golden reference circuit or logic testing. Graph Sample and AGgrEgate (GraphSAGE) is used to learn the high-dimensional graph features of the gate-level netlist together with the corresponding node features, and supervised learning is employed to train the detection model. The detection capability of models with different aggregation methods and data balancing methods is explored. Evaluated on the Trust-Hub benchmark based on the Synopsys 90 nm generic library (SAED), the model achieves an average recall of 92.9% and an average F1 score of 86.2% (mean aggregation, weight balancing), an 8.4% improvement in F1 score over the state-of-the-art learning model. When applied to the larger dataset based on the 250 nm generic library (LEDA), the model achieves an average recall of 83.6% and an average F1 score of 70.8% for combinational-logic hardware Trojans, and an average recall of 95.0% and an average F1 score of 92.8% for sequential-logic hardware Trojans.
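The mean-aggregation step that GraphSAGE applies to each node can be illustrated with a small sketch. The code below is not the authors' implementation: the toy netlist, the 2-dimensional node features, the fixed weight matrix, and the treatment of wires as undirected edges are all assumptions made for illustration, and neighbor sampling and embedding normalization are omitted. It shows only how a gate's embedding is updated from its own features concatenated with the mean of its neighbors' features.

```python
from typing import Dict, List

Vector = List[float]


def mean_vector(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a node without neighbors gets a zero vector."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sage_mean_layer(
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    weight: List[Vector],
) -> Dict[str, Vector]:
    """One step: h_v' = ReLU(W . concat(h_v, mean(h_u for u in N(v))))."""
    dim = len(next(iter(features.values())))
    updated = {}
    for node, h_self in features.items():
        h_neigh = mean_vector([features[u] for u in neighbors.get(node, [])], dim)
        combined = h_self + h_neigh  # list concatenation, length 2 * dim
        updated[node] = [
            max(0.0, sum(w * x for w, x in zip(row, combined))) for row in weight
        ]
    return updated


# Toy netlist (hypothetical): two inputs feed an AND gate, which feeds an inverter.
# The 2-dimensional features are illustrative, not the 25 features used in the paper.
features = {
    "in_a": [1.0, 0.0],
    "in_b": [1.0, 0.0],
    "and1": [0.0, 1.0],
    "inv1": [0.0, 2.0],
}
# Assumption: wires are treated as undirected edges.
neighbors = {
    "in_a": ["and1"],
    "in_b": ["and1"],
    "and1": ["in_a", "in_b", "inv1"],
    "inv1": ["and1"],
}
# Hypothetical fixed weights (2 outputs x 4 inputs); a trained model learns these.
weight = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]

embeddings = sage_mean_layer(features, neighbors, weight)
print(embeddings["and1"])
```

Stacking several such layers lets each gate's embedding reflect gates several hops away, which is the property a node-level Trojan classifier relies on.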
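Table 1 Trust-Hub datasets

| Trust-Hub directory | Library | Number of netlists | Details |
|---|---|---|---|
| Abstraction Level/Gate/TRIT-TC, TRIT-TS | LEDA | 914 | 580 netlists with inserted combinational-logic hardware Trojans and 334 netlists with inserted sequential-logic hardware Trojans, based on 8 host circuits |
| Abstraction Level/Gate/other benchmarks | SAED | 21 | 14 netlists with inserted combinational-logic hardware Trojans and 7 netlists with inserted sequential-logic hardware Trojans, based on 6 host circuits |

Table 2 Evaluation metrics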
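
| Metric | Meaning | Formula |
|---|---|---|
| TN | Number of normal nodes classified as normal | Count |
| FN | Number of Trojan nodes classified as normal | Count |
| TP | Number of Trojan nodes classified as Trojan | Count |
| FP | Number of normal nodes classified as Trojan | Count |
| TNR | Probability that a normal node is classified as normal | $\mathrm{TNR} = \dfrac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}$ |
| Prec | Probability that a node identified as Trojan is truly a hardware Trojan node | $\mathrm{Prec} = \dfrac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$ |
| Recall/TPR | Probability that a Trojan node is classified as Trojan | $\mathrm{Recall} = \dfrac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$ |
| F1 | Harmonic mean of Recall and Prec | $\mathrm{F1} = \dfrac{2 \times \mathrm{Prec} \times \mathrm{Recall}}{\mathrm{Prec} + \mathrm{Recall}}$ |

Table 3 Graph neural network configuration and training parameters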
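
| Architecture parameter | Value | Training parameter | Value |
|---|---|---|---|
| Input layer | [n, 25] | Optimizer | Adam |
| Activation function | ReLU | Learning rate | 0.001 |
| Hidden layer size | 128 | Batch size | 72 |
| Classification function | Softmax | Epochs | 200 |
| MLP size | [128, 2] | Dropout | 0.1 |
| Number of GNN layers | 3 | Weight decay | 5e-4 |

Table 4 Gate-level Trojan circuits in the SAED dataset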
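
| Circuit | Trojan trigger | Gates in netlist | HT gates |
|---|---|---|---|
| RS232-T1000 | Combinational comparator | 215 | 13 |
| RS232-T1100 | Sequential comparator | 216 | 12 |
| RS232-T1200 | Sequential comparator | 216 | 14 |
| RS232-T1300 | Combinational comparator | 213 | 9 |
| RS232-T1400 | Sequential comparator | 215 | 13 |
| RS232-T1500 | Sequential comparator | 216 | 14 |
| RS232-T1600 | Sequential comparator | 214 | 12 |
| S35932-T100 | Comparator | 5441 | 15 |
| S35932-T200 | Comparator | 5438 | 16 |
| S35932-T300 | Comparator | 5462 | 36 |
| S38417-T100 | Comparator | 5341 | 12 |
| S38417-T200 | Comparator | 5334 | 15 |
| S38417-T300 | Comparator | 5329 | 44 |
| S38584-T100 | Comparator | 6417 | 9 |
| S38584-T200 | State machine | 6473 | 83 |
| S38584-T300 | State machine | 7204 | 730 |
| S15850-T100 | Combinational comparator | 2182 | 27 |
| EthernetMAC10GE-T700 | Sequential comparator | 102466 | 13 |
| EthernetMAC10GE-T710 | Sequential comparator | 102466 | 13 |
| EthernetMAC10GE-T720 | Sequential comparator | 102466 | 13 |
| EthernetMAC10GE-T730 | Sequential comparator | 102466 | 13 |
| Total | | 466005 | 1124 |

Table 5 Detection results on SAED with MEAN aggregation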

| Circuit | TPR (%) | TNR (%) | F1 (%) |
|---|---|---|---|
| RS232-T1000 | 100.0 | 100.0 | 100.0 |
| RS232-T1100 | 100.0 | 100.0 | 100.0 |
| RS232-T1200 | 100.0 | 100.0 | 100.0 |
| RS232-T1300 | 100.0 | 100.0 | 100.0 |
| RS232-T1400 | 100.0 | 100.0 | 100.0 |
| RS232-T1500 | 100.0 | 100.0 | 100.0 |
| RS232-T1600 | 100.0 | 100.0 | 100.0 |
| S35932-T100 | 93.3 | 100.0 | 96.6 |
| S35932-T200 | 75.0 | 100.0 | 85.7 |
| S35932-T300 | 94.4 | 100.0 | 97.1 |
| S38417-T100 | 100.0 | 99.2 | 36.4 |
| S38417-T200 | 100.0 | 99.5 | 54.5 |
| S38417-T300 | 86.4 | 99.9 | 88.4 |
| S38584-T100 | 44.4 | 99.6 | 17.1 |
| S38584-T200 | 96.4 | 100.0 | 95.2 |
| S38584-T300 | 99.3 | 99.6 | 97.8 |
| S15850-T100 | 77.8 | 98.1 | 48.7 |
| EthernetMAC10GE-T700 | 100.0 | 100.0 | 100.0 |
| EthernetMAC10GE-T710 | 92.3 | 100.0 | 96.0 |
| EthernetMAC10GE-T720 | 92.3 | 100.0 | 96.0 |
| EthernetMAC10GE-T730 | 100.0 | 100.0 | 100.0 |
| Average | 92.9 | 99.8 | 86.2 |

Table 6 Comparison of the results of this work with the literature (%)
Table 7 Results on the LEDA dataset
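
Dataset with inserted combinational-logic hardware Trojans

| Netlist | TP | FN | TN | FP | Netlist | TP | FN | TN | FP |
|---|---|---|---|---|---|---|---|---|---|
| c2670_T093 | 8 | 1 | 776 | 5 | c3540_T017 | 9 | 0 | 1132 | 2 |
| s15850_T003 | 6 | 1 | 2949 | 36 | s35932_T015 | 8 | 0 | 6839 | 0 |
| s15850_T012 | 5 | 3 | 2961 | 24 | c5315_T004 | 7 | 1 | 2031 | 6 |
| c6288_T041 | 8 | 1 | 2416 | 0 | s13207_T013 | 5 | 6 | 2298 | 12 |
| c2670_T016 | 6 | 1 | 774 | 2 | c5315_T047 | 8 | 0 | 2298 | 9 |
| c6288_T066 | 5 | 0 | 2416 | 0 | s35932_T006 | 7 | 0 | 6839 | 0 |
| c2670_T073 | 7 | 1 | 773 | 3 | c5315_T064 | 3 | 3 | 2300 | 7 |
| s1423_T008 | 7 | 0 | 477 | 3 | s13207_T014 | 4 | 2 | 2305 | 5 |
| c2670_T054 | 5 | 1 | 773 | 3 | c5315_T057 | 6 | 0 | 2300 | 7 |
| s1423_T003 | 5 | 2 | 479 | 1 | s35932_T005 | 7 | 0 | 6839 | 0 |
| c2670_T095 | 4 | 2 | 774 | 2 | s15850_T014 | 2 | 2 | 2982 | 3 |
| s15850_T009 | 4 | 4 | 2973 | 12 | s13207_T005 | 6 | 1 | 2308 | 2 |
| c3540_T087 | 6 | 4 | 1134 | 0 | c5315_T063 | 7 | 1 | 2300 | 7 |
| s1423_T011 | 4 | 2 | 479 | 1 | s35932_T018 | 9 | 0 | 6839 | 0 |
| c3540_T005 | 8 | 1 | 1131 | 3 | c6288_T049 | 6 | 0 | 2416 | 0 |
| s1423_T005 | 4 | 1 | 476 | 4 | s13207_T011 | 6 | 0 | 2310 | 0 |
| c3540_T015 | 6 | 2 | 1132 | 2 | c6288_T048 | 6 | 0 | 2416 | 0 |
| s1423_T014 | 5 | 0 | 478 | 2 | s35932_T016 | 6 | 0 | 6839 | 0 |
| c3540_T012 | 5 | 0 | 1126 | 8 | c6288_T082 | 5 | 0 | 2416 | 0 |
| s13207_T002 | 3 | 2 | 2294 | 16 | s15850_T002 | 5 | 2 | 2967 | 18 |

Average: TPR = 0.836, TNR = 0.997, F1 = 0.708

Dataset with inserted sequential-logic hardware Trojans

| Netlist | TP | FN | TN | FP | Netlist | TP | FN | TN | FP |
|---|---|---|---|---|---|---|---|---|---|
| s1423_T408 | 46 | 11 | 476 | 4 | s35932_T408 | 83 | 0 | 6839 | 0 |
| s15850_T417 | 22 | 2 | 2979 | 6 | s1423_T422 | 20 | 0 | 479 | 1 |
| s15850_T439 | 33 | 2 | 2979 | 6 | s13207_T473 | 11 | 1 | 2310 | 0 |
| s13207_T462 | 51 | 10 | 2308 | 2 | s1423_T413 | 18 | 0 | 478 | 2 |
| s15850_T450 | 33 | 0 | 2976 | 9 | s15850_T406 | 37 | 12 | 2979 | 6 |
| s35932_T414 | 82 | 1 | 6839 | 0 | s15850_T434 | 8 | 2 | 2982 | 3 |
| s1423_T405 | 104 | 3 | 476 | 4 | s35932_T430 | 21 | 0 | 6839 | 0 |
| s13207_T440 | 20 | 1 | 2310 | 0 | s13207_T425 | 39 | 2 | 2979 | 6 |
| s35932_T402 | 85 | 4 | 478 | 2 | s35932_T435 | 23 | 0 | 6839 | 0 |
| s1423_T418 | 72 | 1 | 6839 | 0 | s13207_T468 | 22 | 1 | 2310 | 0 |
| s13207_T449 | 64 | 2 | 479 | 1 | s15850_T429 | 81 | 20 | 6812 | 27 |
| s1423_T412 | 17 | 1 | 2305 | 5 | s15850_T475 | 20 | 3 | 2310 | 0 |
| s35932_T421 | 42 | 0 | 480 | 0 | s35932_T427 | 22 | 0 | 6839 | 0 |
| s15850_T468 | 32 | 0 | 6839 | 0 | s13207_T461 | 21 | 0 | 2310 | 0 |
| s13207_T484 | 20 | 0 | 2985 | 0 | s15850_T443 | 32 | 1 | 2973 | 12 |
| s1423_T407 | 12 | 0 | 2303 | 7 | s15850_T433 | 19 | 2 | 2976 | 9 |
| s35932_T413 | 17 | 0 | 479 | 1 | s35932_T411 | 22 | 0 | 6839 | 0 |
| s1423_T411 | 62 | 0 | 6839 | 0 | s13207_T450 | 87 | 13 | 2301 | 9 |
| s13207_T444 | 20 | 0 | 479 | 1 | s35932_T434 | 18 | 0 | 6839 | 0 |
| s1423_T421 | 16 | 1 | 2310 | 0 | s1423_T429 | 83 | 0 | 6839 | 0 |

Average: TPR = 0.950, TNR = 0.998, F1 = 0.928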
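As a worked example of how the node-level counts in Table 7 map onto the metrics defined in Table 2, the following sketch computes TPR, TNR, Prec, and F1 for a single netlist. The function name `node_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made here for illustration; how the per-netlist values are aggregated into the reported averages is not stated in this section.

```python
from typing import Dict


def node_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute recall (TPR), TNR, precision, and F1 from node-level counts.

    A ratio with a zero denominator is reported as 0.0 (an assumed convention).
    """
    recall = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * recall / (prec + recall) if prec + recall else 0.0
    return {"TPR": recall, "TNR": tnr, "Prec": prec, "F1": f1}


# Counts for netlist c2670_T093 as listed in Table 7.
m = node_metrics(tp=8, fn=1, tn=776, fp=5)
print({name: round(value, 3) for name, value in m.items()})
# prints {'TPR': 0.889, 'TNR': 0.994, 'Prec': 0.615, 'F1': 0.727}
```

With only 9 Trojan nodes among 790, five false positives are enough to pull precision down to about 0.615 while TNR stays above 0.99, which illustrates why F1 is a more informative indicator than TNR on such imbalanced netlists.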