Citation: GAO Wenchao, SUO Jianhua, ZHANG Ao. An Interpretable Vulnerability Detection Method Based on Graph and Code Slicing[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250363

An Interpretable Vulnerability Detection Method Based on Graph and Code Slicing

doi: 10.11999/JEIT250363 cstr: 32379.14.JEIT250363
  • Accepted Date: 2025-11-13
  • Rev Recd Date: 2025-11-13
  • Available Online: 2025-11-18
Objective  Deep learning has been widely applied to source code vulnerability detection, and mainstream methods fall into sequence-based and graph-based approaches. Sequence-based models convert structured code into a linear sequence, which discards the syntactic and structural information of the program and often leads to a high false-positive rate. Graph-based models capture structural features effectively but fail to model the execution order of the program, and their prediction granularity is usually coarse, limited to the function level. Both types of methods lack interpretability, which makes it difficult for developers to locate the root causes of vulnerabilities. Although Large Language Models (LLMs) have made progress in code understanding, they still suffer from high computational overhead, hallucination in the security domain, and insufficient understanding of complex program logic. To address these issues, this paper proposes an interpretable vulnerability detection method based on graphs and code slicing (GSVD), which integrates structural semantics with sequential features and provides fine-grained, line-level explanations for model decisions.

Methods  The proposed method consists of four main components: code graph feature extraction, code sequence feature extraction, feature fusion, and an interpreter module (Fig. 1). First, the source code is normalized and converted by the Joern static analysis tool into multiple code graphs, including the Abstract Syntax Tree (AST), Data Dependency Graph (DDG), and Control Dependency Graph (CDG), which together represent the syntactic structure, data flow, and control flow of the program. Node features are initialized by combining CodeBERT embeddings with one-hot encodings of node types, and a Gated Graph Convolutional Network (GGCN) with a self-attention pooling layer is applied over the adjacency matrix of each graph to extract deep structural semantic features. In parallel, a code slicing algorithm based on taint analysis (Algorithm 1) identifies taint sources and propagates taint along data and control dependencies, producing concise slices that are highly related to potential vulnerabilities; these slices remove irrelevant code noise and are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network to capture long-range sequential dependencies. The graph and sequence features are then fused through a gating mechanism: the two feature vectors are fed into a Gated Recurrent Unit (GRU), which learns the dependencies between structural and sequential information through its dynamic state updates. Finally, a VDExplainer tailored to the vulnerability detection task is designed. Inspired by the HITS algorithm, it iteratively computes the "authority" and "hub" values of nodes under the constraint of an edge mask to evaluate their importance, achieving node-level interpretability for vulnerability explanation.
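As a concrete illustration of the node initialization step, the sketch below combines a CodeBERT embedding of a node's code with a one-hot encoding of its node type, using Hugging Face Transformers and PyTorch. The node-type list, the truncation length, and the use of the [CLS] vector are illustrative assumptions; the paper's exact procedure is not specified in this abstract.

    # Minimal sketch: initialize graph-node features from a CodeBERT embedding
    # plus a one-hot encoding of the Joern node type. The node-type list and
    # the choice of the [CLS] vector are illustrative assumptions.
    import torch
    from transformers import AutoTokenizer, AutoModel

    NODE_TYPES = ["IDENTIFIER", "CALL", "LITERAL", "CONTROL_STRUCTURE", "RETURN"]  # assumed subset

    tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
    codebert = AutoModel.from_pretrained("microsoft/codebert-base")

    def node_feature(code_snippet: str, node_type: str) -> torch.Tensor:
        """Concatenate the CodeBERT [CLS] embedding of a node's code with a
        one-hot vector of its node type."""
        inputs = tokenizer(code_snippet, return_tensors="pt",
                           truncation=True, max_length=64)
        with torch.no_grad():
            cls_vec = codebert(**inputs).last_hidden_state[:, 0, :].squeeze(0)  # (768,)
        one_hot = torch.zeros(len(NODE_TYPES))
        if node_type in NODE_TYPES:
            one_hot[NODE_TYPES.index(node_type)] = 1.0
        return torch.cat([cls_vec, one_hot])  # (768 + |NODE_TYPES|,)

    # Example usage for a single AST/DDG/CDG node:
    feat = node_feature("memcpy(buf, src, len);", "CALL")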
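The gated graph convolution with self-attention pooling can be approximated as below. This is not the authors' implementation: it follows the common gated-graph recipe (adjacency-based message aggregation followed by a GRU-style node update) and a simple learned-score pooling, with dimensions chosen to match the assumed 773-dimensional node features above.

    import torch
    import torch.nn as nn

    class GatedGraphConv(nn.Module):
        """Sketch of a gated graph convolution: messages are aggregated through
        the adjacency matrix and node states are updated with a GRU cell,
        repeated for a fixed number of propagation steps."""
        def __init__(self, dim: int, steps: int = 3):
            super().__init__()
            self.msg = nn.Linear(dim, dim)
            self.gru = nn.GRUCell(dim, dim)
            self.steps = steps

        def forward(self, h, adj):            # h: (N, dim), adj: (N, N)
            for _ in range(self.steps):
                m = adj @ self.msg(h)         # aggregate messages from neighbors
                h = self.gru(m, h)            # gated state update per node
            return h

    class SelfAttentionPool(nn.Module):
        """Pool node embeddings into one graph vector with learned attention scores."""
        def __init__(self, dim: int):
            super().__init__()
            self.score = nn.Linear(dim, 1)

        def forward(self, h):                 # h: (N, dim)
            alpha = torch.softmax(self.score(h), dim=0)   # (N, 1) attention weights
            return (alpha * h).sum(dim=0)                 # (dim,) graph-level feature

    # Example: 10 nodes with 773-dim features (768 CodeBERT dims + 5 type dims, assumed)
    h0 = torch.randn(10, 773)
    adj = (torch.rand(10, 10) > 0.7).float()
    graph_vec = SelfAttentionPool(773)(GatedGraphConv(773)(h0, adj))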
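Algorithm 1 itself is not reproduced in this abstract; a minimal worklist version of taint-propagation slicing over DDG/CDG edges, assuming taint sources and per-line dependency edges are already given, might look like this:

    from collections import deque

    def taint_slice(sources, dd_edges, cd_edges):
        """Worklist sketch of taint-propagation slicing.

        sources  : iterable of line numbers holding taint sources (how sources
                   are identified is an assumption not detailed here)
        dd_edges : dict line -> lines data-dependent on it (DDG successors)
        cd_edges : dict line -> lines control-dependent on it (CDG successors)
        Returns the set of lines reachable from the sources along data and
        control dependencies, i.e. the code slice.
        """
        sliced, worklist = set(sources), deque(sources)
        while worklist:
            line = worklist.popleft()
            for nxt in dd_edges.get(line, []) + cd_edges.get(line, []):
                if nxt not in sliced:          # propagate taint once per line
                    sliced.add(nxt)
                    worklist.append(nxt)
        return sorted(sliced)

    # Toy example: line 2 reads external input; lines 3-5 depend on it.
    dd = {2: [3], 3: [5]}
    cd = {3: [4]}
    print(taint_slice([2], dd, cd))   # -> [2, 3, 4, 5]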
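One plausible reading of the BiLSTM encoding and GRU-based fusion is sketched next: the code slice is encoded with a bidirectional LSTM, and the graph and slice vectors are fed as a two-step sequence into a GRU whose final hidden state drives a binary classifier. The projection sizes and the classifier head are assumptions for illustration.

    import torch
    import torch.nn as nn

    class SliceEncoder(nn.Module):
        """Sketch: encode a token-embedded code slice with a BiLSTM and use the
        concatenated final forward/backward hidden states as the sequence feature."""
        def __init__(self, emb_dim=128, hidden=256):
            super().__init__()
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)

        def forward(self, x):                      # x: (B, T, emb_dim)
            _, (h_n, _) = self.lstm(x)             # h_n: (2, B, hidden)
            return torch.cat([h_n[0], h_n[1]], dim=-1)   # (B, 2*hidden)

    class GatedFusion(nn.Module):
        """Sketch: fuse graph and slice features by feeding them as a two-step
        sequence to a GRU and classifying from the final hidden state."""
        def __init__(self, graph_dim, seq_dim, hidden=256):
            super().__init__()
            self.proj_g = nn.Linear(graph_dim, hidden)   # align both modalities
            self.proj_s = nn.Linear(seq_dim, hidden)
            self.gru = nn.GRU(hidden, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 2)             # vulnerable / non-vulnerable logits

        def forward(self, g_vec, s_vec):                 # (B, graph_dim), (B, seq_dim)
            steps = torch.stack([self.proj_g(g_vec), self.proj_s(s_vec)], dim=1)
            _, h_n = self.gru(steps)                     # h_n: (1, B, hidden)
            return self.head(h_n.squeeze(0))             # (B, 2) class logits

    slice_feat = SliceEncoder()(torch.randn(4, 60, 128))          # 4 slices, 60 tokens each
    logits = GatedFusion(773, 512)(torch.randn(4, 773), slice_feat)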
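The HITS-inspired scoring in the VDExplainer can be illustrated as follows, treating the edge mask as a given weight matrix (how the mask is optimized against the model's prediction is not shown here). Nodes with high authority/hub scores would then be mapped back to source lines as the explanation.

    import numpy as np

    def hits_importance(adj, edge_mask, iters=50, eps=1e-12):
        """Sketch of HITS-style node scoring under an edge mask.

        adj       : (N, N) binary adjacency matrix of the code graph
        edge_mask : (N, N) weights in [0, 1] softly selecting edges (assumed to
                    come from the explainer's optimization, not shown here)
        Returns per-node authority and hub scores; higher scores mark nodes that
        matter more for the prediction and can be mapped back to code lines.
        """
        A = adj * edge_mask                       # masked adjacency
        auth = np.ones(A.shape[0])
        hub = np.ones(A.shape[0])
        for _ in range(iters):
            auth = A.T @ hub                      # authority <- incoming hub mass
            hub = A @ auth                        # hub       <- outgoing authority mass
            auth /= np.linalg.norm(auth) + eps    # normalize to keep values bounded
            hub /= np.linalg.norm(hub) + eps
        return auth, hub

    # Toy graph with 4 nodes; rank nodes by authority to pick suspicious lines.
    adj = np.array([[0, 1, 1, 0],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]], dtype=float)
    mask = np.full((4, 4), 0.8)
    auth, hub = hits_importance(adj, mask)
    top_nodes = np.argsort(-auth)                 # most "important" nodes first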
Results and Discussions  To evaluate the effectiveness of GSVD, comparative experiments (Table 2) are conducted on the Devign (FFmpeg + QEMU) dataset against several baseline models. GSVD achieves the highest accuracy and F1-score, at 64.57% and 61.89%, respectively, and its recall increases to 62.63%, indicating that the method performs the detection task effectively and reduces the number of missed vulnerabilities. To verify the effectiveness of the GRU-based fusion mechanism, three alternative fusion strategies (feature concatenation, weighted sum, and an attention mechanism) are compared (Table 3). GSVD again achieves the best overall performance, with accuracy, recall, and F1-score of 64.57%, 62.63%, and 61.89%, respectively; its precision of 61.17% is slightly lower than the 63.33% obtained by the weighted-sum method. Ablation experiments (Tables 4-5) further confirm the importance of the proposed slicing algorithm: the taint-propagation-based slicing reduces the average number of code lines from 51.98 to 17.30 (a 66.72% reduction) and lowers the data redundancy rate to 6.42%, compared with 19.58% for VulDeePecker and 22.10% for SySeVR. This noise suppression yields a 1.53% improvement in F1-score, demonstrating the method's ability to focus on key code segments. Finally, interpretability experiments (Table 6) on the Big-Vul dataset validate the effectiveness of the VDExplainer: it outperforms the standard GNNExplainer at all evaluation thresholds, and when 50% of the nodes are selected, localization accuracy improves by 7.65%, showing its advantage in node-level vulnerability localization. In summary, GSVD not only achieves superior detection performance but also significantly improves the interpretability of model decisions, providing practical support for vulnerability localization and remediation.

Conclusions  By deeply integrating graph structures with taint analysis-based code slices, the GSVD model addresses the limitations of single-modal approaches and achieves notable improvements in vulnerability detection accuracy and interpretability. In addition, the VDExplainer provides node-level and line-level vulnerability localization, enhancing the practical value of the model. Experimental results confirm the superiority of the proposed method in both detection performance and interpretability.
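The localization metric used in the interpretability comparison is not defined in this abstract; a common top-k% reading, sketched below, counts a function as correctly localized when at least one ground-truth vulnerable line appears among the top k% of lines ranked by the explainer. The function and data names are hypothetical.

    def localization_accuracy(samples, k=0.5):
        """Hedged sketch of a top-k% localization metric (the paper's exact
        definition is not given in this abstract).

        samples : list of (ranked_lines, true_vuln_lines) pairs, where
                  ranked_lines is sorted by explainer importance, highest first.
        Returns the fraction of samples whose top-k% ranked lines contain at
        least one ground-truth vulnerable line.
        """
        hits = 0
        for ranked_lines, true_lines in samples:
            top = set(ranked_lines[: max(1, int(len(ranked_lines) * k))])
            if top & set(true_lines):
                hits += 1
        return hits / len(samples) if samples else 0.0

    # Toy example: two functions, explainer ranks 6 lines each.
    data = [([12, 7, 3, 9, 4, 1], [7]),      # hit: line 7 is in the top 50%
            ([5, 8, 2, 6, 11, 14], [14])]    # miss: line 14 ranked last
    print(localization_accuracy(data, k=0.5))  # -> 0.5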
  • [1]
    GAO Qing, MA Sen, SHAO Sihao, et al. CoBOT: Static C/C++ bug detection in the presence of incomplete code[C]. Proceedings of the 26th Conference on Program Comprehension, Gothenburg, Sweden, 2018: 385–388. doi: 10.1145/3196321.3196367.
    [2]
    ZHANG Yu, HUO Wei, JIAN Kunpeng, et al. SRFuzzer: An automatic fuzzing framework for physical SOHO router devices to discover multi-type vulnerabilities[C]. The 35th Annual Computer Security Applications Conference, San Juan, USA, 2019: 544–556. doi: 10.1145/3359789.3359826.
    [3]
    LI Zhen, ZOU Deqing, XU Shouhuai, et al. VulDeePecker: A deep learning-based system for vulnerability detection[C]. The 25th Annual Network and Distributed Systems Security Symposium, San Diego, USA, 2018. doi: 10.14722/ndss.2018.23158.
    [4]
    ZOU Deqing, WANG Sujuan, XU Shouhuai, et al. μVulDeePecker: A deep learning-based system for multiclass vulnerability detection[J]. IEEE Transactions on Dependable and Secure Computing, 2021, 18(5): 2224–2236. doi: 10.1109/TDSC.2019.2942930.
    [5]
    LI Zhen, ZOU Deqing, XU Shouhuai, et al. SySeVR: A framework for using deep learning to detect software vulnerabilities[J]. IEEE Transactions on Dependable and Secure Computing, 2022, 19(4): 2244–2258. doi: 10.1109/TDSC.2021.3051525.
    [6]
    ZHOU Yaqin, LIU Shangqing, SIOW J, et al. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks[C]. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2019: 915. doi: 10.5555/3454287.3455202.
    [7]
    FENG Qi, FENG Chendong, and HONG Weijiang. Graph neural network-based vulnerability predication[C]. 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), Adelaide, Australia, 2020: 800–801. doi: 10.1109/ICSME46990.2020.00096.
    [8]
    NGUYEN V A, NGUYEN D Q, NGUYEN V, et al. ReGVD: Revisiting graph neural networks for vulnerability detection[C]. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, Pittsburgh, USA, 2022: 178–182. doi: 10.1145/3510454.3516865.
    [9]
    CHAKRABORTY S, KRISHNA R, DING Yangruibo, et al. Deep learning based vulnerability detection: Are we there yet?[J]. IEEE Transactions on Software Engineering, 2022, 48(9): 3280–3296. doi: 10.1109/TSE.2021.3087402.
    [10]
    ALI G M A and CHEN Hongsong. Contract-guardian: A bagging-based gradient boosting decision tree for detection vulnerability in smart contract[J]. Cluster Computing, 2025, 28(8): 528. doi: 10.1007/s10586-025-05230-2.
    [11]
    GUO Daya, ZHU Qihao, YANG Dejian, et al. DeepSeek-coder: When the large language model meets programming -- the rise of code intelligence[J]. arXiv preprint arXiv: 2401.14196, 2024.
    [12]
    DeepSeek-AI. DeepSeek-coder-V2: Breaking the barrier of closed-source models in code intelligence[J]. arXiv preprint arXiv: 2406.11931, 2024.
    [13]
    AGHAEI E, NIU Xi, SHADID W, et al. SecureBERT: A domain-specific language model for cybersecurity[C]. 18th International Conference on Security and Privacy in Communication Networks, Kansas, USA, 2022: 39–56. doi: 10.1007/978-3-031-25538-0_3.
    [14]
    SUN Yuqiang, WU Daoyuan, XUE Yue, et al. LLM4Vuln: A unified evaluation framework for decoupling and enhancing LLMs' vulnerability reasoning[J]. arXiv preprint arXiv: 2401.16185, 2024.
    [15]
    FAR S M T and FEYZI F. Large language models for software vulnerability detection: A guide for researchers on models, methods, techniques, datasets, and metrics[J]. International Journal of Information Security, 2025, 24(2): 78. doi: 10.1007/s10207-025-00992-7.
    [16]
    ZHOU Xin, CAO Sicong, SUN Xiaobing, et al. Large language model for vulnerability detection and repair: Literature review and the road ahead[J]. ACM Transactions on Software Engineering and Methodology, 2025, 34(5): 145. doi: 10.1145/3708522.
    [17]
    YING R, BOURGEOIS D, YOU Jiaxuan, et al. GNNExplainer: Generating explanations for graph neural networks[C]. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2019: 829. doi: 10.5555/3454287.3455116.
    [18]
    FAN Jiahao, LI Yi, WANG Shaohua, et al. A C/C++ code vulnerability dataset with code changes and CVE summaries[C]. Proceedings of the 17th International Conference on Mining Software Repositories, Seoul, Korea, 2020: 508–512. doi: 10.1145/3379597.3387501.
    [19]
    D'ABRUZZO PEREIRA J and VIEIRA M. On the use of open-source C/C++ static analysis tools in large projects[C]. 2020 16th European Dependable Computing Conference (EDCC), Munich, Germany, 2020: 97–102. doi: 10.1109/EDCC51268.2020.00025.
    [20]
    FERSCHKE O, GUREVYCH I, and RITTBERGER M. FlawFinder: A modular system for predicting quality flaws in wikipedia[C]. CLEF 2012 Evaluation Labs and Workshop, Online Working Notes, Rome, Italy, 2012: 1178.
    [21]
    GRAVES A. Long short-term memory[M]//GRAVES A. Supervised Sequence Labelling with Recurrent Neural Networks. Berlin: Springer, 2012: 37–45. doi: 10.1007/978-3-642-24797-2_4.
    [22]
    CHEN Yahui. Convolutional neural network for sentence classification[D]. [Master dissertation], University of Waterloo, 2015.
    [23]
    LIU Yinhan, OTT M, GOYAL N, et al. RoBERTa: A robustly optimized BERT pretraining approach[C]. International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [24]
    FENG Zhangyin, GUO Daya, TANG Duyu, et al. CodeBERT: A pre-trained model for programming and natural languages[C]. Findings of the Association for Computational Linguistics: EMNLP 2020, 2020: 1536–1547. doi: 10.18653/v1/2020.findings-emnlp.139.
    [25]
    ZENG Ciling, ZHOU Bo, DONG Huoyuan, et al. A general source code vulnerability detection method via ensemble of graph neural networks[C]. The 6th International Conference on Frontiers in Cyber Security, Chengdu, China, 2023: 560–574. doi: 10.1007/978-981-99-9331-4_37.