Integrating Representation Learning and Knowledge Graph Reasoning for Diabetes and Complications Prediction

WANG Yuao, HUANG Yeqi, LI Qingyuan, LIU Yun, JING Shenqi, SHAN Tao, GUO Yongan

Citation: WANG Yuao, HUANG Yeqi, LI Qingyuan, LIU Yun, JING Shenqi, SHAN Tao, GUO Yongan. Integrating Representation Learning and Knowledge Graph Reasoning for Diabetes and Complications Prediction[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250798

doi: 10.11999/JEIT250798 cstr: 32379.14.JEIT250798
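Funds: National Key Research and Development Program of China (2023YFC3605800), Frontier Leading Technology Basic Research Program of Jiangsu Province (BK20202001), Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX24_0285)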
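Author information:

    WANG Yuao: male, doctoral student; research interests: artificial intelligence and intelligent information processing

    HUANG Yeqi: female, master's student; research interest: medical artificial intelligence

    LI Qingyuan: male, master's student; research interests: artificial intelligence and medical information processing

    LIU Yun: female, professor; research interests: intelligent medicine, medical informatics, and clinical big data

    JING Shenqi: male, senior engineer; research interest: medical information big data

    SHAN Tao: male, senior engineer; research interest: medical information big data

    GUO Yongan: male, professor; research interest: intelligent information processing

    Corresponding author:

    GUO Yongan guo@njupt.edu.cn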

  • 1) https://physionet.org/ (the dataset is provided by the U.S. National Institutes of Health)
  • CLC number: TN912.34

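Abstract: Joint prediction of diabetes and its complications is important for reducing the harm of chronic disease and improving patient prognosis. Existing prediction methods, however, face three obstacles: heterogeneous and sparse data, complex entity relations, and high-order associations between diseases and medical concepts that are hard to capture accurately. These obstacles limit both prediction accuracy and the ability to identify multiple conditions at once. To address them, this paper proposes REKG-MDP, a model for predicting diabetes and its complications based on representation learning and knowledge graph reasoning. A medical knowledge graph is built by integrating electronic health records with supplementary medical knowledge. On the patient side it completes basic personal information, examination indicators, and history of present illness; on the disease side it adds comorbidity information, susceptible populations, common causes, and diagnostic criteria. This alleviates data sparsity and heterogeneity. The model accounts for four relation connectivity patterns (symmetric, antisymmetric, inversion, and composition) and uses a reasoning module that combines a hierarchical attention mechanism with a graph convolutional network. The module dynamically adjusts neighbor-node weights at both the global and local levels, aggregating multi-order neighbor information and capturing high-order semantic relations. Experiments on the MIMIC-IV dataset show that the proposed model clearly outperforms existing methods on the joint prediction of diabetes and its complications, with P@1 reaching 0.9655 (↑19.39% in Table 4) and comparable gains in multi-disease identification.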
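The abstract describes a reasoning module that reweights neighbor nodes and aggregates multi-order neighbor information. The sketch below illustrates only the general idea with a single-level dot-product attention and a sum combiner; it is not the paper's hierarchical (global and local) mechanism, and the node names, embedding values, and the functions `softmax`, `aggregate`, and `propagate` are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(node_vec, neighbor_vecs):
    """Return (updated vector, attention weights) for one node.

    Assumption: attention scores are plain dot products and the update is
    node + weighted neighbor sum. Real models learn these transformations.
    """
    if not neighbor_vecs:
        return list(node_vec), []
    scores = [sum(a * b for a, b in zip(node_vec, n)) for n in neighbor_vecs]
    weights = softmax(scores)
    pooled = [sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
              for i in range(len(node_vec))]
    return [x + p for x, p in zip(node_vec, pooled)], weights

def propagate(embeddings, adjacency, layers):
    """Apply `layers` synchronous aggregation steps; layer k reaches k-hop neighbors."""
    current = {node: list(vec) for node, vec in embeddings.items()}
    for _ in range(layers):
        current = {
            node: aggregate(current[node],
                            [current[n] for n in adjacency[node]])[0]
            for node in current
        }
    return current

# Toy chain: patient - abnormal_glucose - diabetes (illustrative values only).
embeddings = {
    "patient": [0.2, 0.1],
    "abnormal_glucose": [0.5, -0.3],
    "diabetes": [0.9, 0.4],
}
adjacency = {
    "patient": ["abnormal_glucose"],
    "abnormal_glucose": ["patient", "diabetes"],
    "diabetes": ["abnormal_glucose"],
}

one_hop = propagate(embeddings, adjacency, layers=1)
two_hop = propagate(embeddings, adjacency, layers=2)
print(one_hop["patient"], two_hop["patient"])
```

With one layer the patient vector depends only on its direct neighbor; with two layers it also reflects the disease node two hops away, which is the sense in which stacked layers gather higher-order information.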
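Figure 1  REKG-MDP model architecture

Figure 2  Knowledge graph construction workflow

Figure 3  Example of a medical-domain knowledge graph

Figure 4  Aggregation of node embedding vectors

Figure 5  Performance comparison of REKG-MDP and its three variants

Figure 6  Effect of embedding dimension on REKG-MDP performance

Figure 7  Effect of $\beta$ on REKG-MDP performance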

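Table 1  Examples of relation connectivity patterns in the medical knowledge graph

| Relation connectivity pattern | Explanation | Medical example |
| --- | --- | --- |
| Symmetric | The relation between two entities is mutual: if A has this relation with B, then B also has it with A | (diabetes, comorbidity, hyperlipidemia) |
| Antisymmetric | If A has this relation with B, then B does not have it with A | (patient, BMI, obesity) |
| Inversion | Under certain conditions the original relation is reversed: if $r_1(A,B)$ holds, then $r_2(B,A)$ holds | (hyperglycemia, causes, diabetes) → (diabetes, risk factor, hyperglycemia) |
| Composition | One entity can be connected to another through a chain of relations: if $r_1(A,B)$ and $r_2(B,C)$ hold, then $r_3(A,C)$ can be inferred | (patient, has, abnormal examination indicator) + (disease, diagnostic criterion, abnormal examination indicator) → (patient, suffers from, disease) |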
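The four patterns in Table 1 can be made concrete with a rotation-based embedding. The sketch below uses RotatE-style relations, where each relation rotates the head entity coordinate-wise in the complex plane. This is one known way to represent all four patterns; the text here does not state which scoring function REKG-MDP uses, so treat it as an illustration only. All embedding values, phase values, and the helper names `rotate` and `gap` are hypothetical.

```python
import cmath
import math

def rotate(entity, phases):
    """Apply a RotatE-style relation: rotate each complex coordinate by a phase."""
    return [z * cmath.exp(1j * p) for z, p in zip(entity, phases)]

def gap(a, b):
    """Sum of coordinate-wise moduli |a_i - b_i|; 0 means the triple fits exactly."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical 2-dimensional complex embedding (illustrative values only).
diabetes = [1 + 0j, 0.5 + 0.5j]

# Symmetric: every phase is 0 or pi, so applying the relation twice is the identity.
comorbidity = [math.pi, 0.0]
hyperlipidemia = rotate(diabetes, comorbidity)
symmetric_gap = gap(rotate(hyperlipidemia, comorbidity), diabetes)

# Antisymmetric: with other phases, applying the relation twice does not return.
generic = [0.7, -1.2]
antisymmetric_gap = gap(rotate(rotate(diabetes, generic), generic), diabetes)

# Inversion: the inverse relation uses the negated phases.
causes = [0.7, -1.2]
risk_factor = [-p for p in causes]
hyperglycemia = [0.2 - 0.9j, -1 + 0.3j]
diabetes_from_cause = rotate(hyperglycemia, causes)
inversion_gap = gap(rotate(diabetes_from_cause, risk_factor), hyperglycemia)

# Composition: r3 is the coordinate-wise sum of the phases of r1 and r2.
r1, r2 = [0.3, 1.1], [0.4, -0.6]
r3 = [a + b for a, b in zip(r1, r2)]
start = [1 + 1j, -0.5 + 2j]
composition_gap = gap(rotate(rotate(start, r1), r2), rotate(start, r3))

print(symmetric_gap, antisymmetric_gap, inversion_gap, composition_gap)
```

The symmetric, inversion, and composition gaps are zero up to floating-point error, while the antisymmetric gap is clearly positive, matching the definitions in the table.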

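Table 2  Knowledge graph statistics

| Data type | Dataset size |
| --- | --- |
| Training set | 2910 |
| Test set | 1942 |
| Number of diseases | 18 |
| Number of patients | 4852 |
| Number of examination indicators | 92 |
| Number of basic personal information types | 18 |
| Number of comorbidities / common causes / susceptible populations | 485 |
| Number of relation types | 18 |
| Number of triples in the knowledge graph | 163118 |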
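Statistics like those in Table 2 are typically derived from a list of (head, relation, tail) triples. The sketch below shows one way to count entities, relation types, and triples on a toy graph; the triples, the identifier `patient_001`, and the function `kg_statistics` are hypothetical and are not taken from the paper's data. The training and test sizes are the only values copied from the table.

```python
from collections import Counter

# Toy triples loosely modelled on the examples in Table 1 (illustrative only).
triples = [
    ("diabetes", "comorbidity", "hyperlipidemia"),
    ("hyperlipidemia", "comorbidity", "diabetes"),
    ("hyperglycemia", "causes", "diabetes"),
    ("diabetes", "risk_factor", "hyperglycemia"),
    ("patient_001", "has", "abnormal_glucose"),
    ("diabetes", "diagnostic_criterion", "abnormal_glucose"),
]

def kg_statistics(triples):
    """Count distinct entities, relation types, and triples."""
    entities = {h for h, _, _ in triples} | {t for _, _, t in triples}
    relations = Counter(r for _, r, _ in triples)
    return {
        "entities": len(entities),
        "relation_types": len(relations),
        "triples": len(triples),
        "per_relation": dict(relations),
    }

stats = kg_statistics(triples)
print(stats)

# Split sizes from Table 2: training and test patients add up to all patients.
train_size, test_size = 2910, 1942
total_patients = train_size + test_size
train_share = train_size / total_patients  # about 0.60
print(total_patients, round(train_share, 2))
```

The split sizes in the table sum to the reported 4852 patients, which corresponds to a roughly 60/40 division between training and test sets.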

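Table 3  Diseases and disease categories used in this paper

| Disease category | ICD-10 code | Disease name |
| --- | --- | --- |
| Metabolic diseases | E11 | Type 2 diabetes |
| Metabolic diseases | E78.5 | Hyperlipidemia |
| Metabolic diseases | E11.4 | Diabetic neuropathy |
| Metabolic diseases | E10.2 & E11.2 | Diabetic chronic kidney disease |
| Metabolic diseases | E10.65 & E11.65 | Hyperglycemia |
| Metabolic diseases | E10 | Type 1 diabetes |
| Metabolic diseases | E78.0 | Hypercholesterolemia |
| Metabolic diseases | E10.1 & E11.1 | Diabetic ketoacidosis |
| Cardiovascular and cerebrovascular diseases | I10 | Hypertension |
| Cardiovascular and cerebrovascular diseases | I50 | Heart failure |
| Cardiovascular and cerebrovascular diseases | I25.1 | Coronary atherosclerotic heart disease |
| Cardiovascular and cerebrovascular diseases | I21 | Myocardial infarction |
| Cardiovascular and cerebrovascular diseases | I63 | Ischemic stroke |
| Cardiovascular and cerebrovascular diseases | G45 | Transient ischemic attack |
| Cardiovascular and cerebrovascular diseases | I70 | Atherosclerosis |
| Kidney disease | N18 | Chronic kidney disease |
| Non-alcoholic fatty liver disease | K75.81 | Non-alcoholic steatohepatitis |
| Non-alcoholic fatty liver disease | K76.0 | Fatty liver |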

Table 4  Performance comparison of REKG-MDP and five baseline methods

| Model | P@1 | P@3 | P@5 | F1@1 | F1@3 | F1@5 | NDCG@1 | NDCG@3 | NDCG@5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| REKG-MDP | 0.9655 (↑19.39%) | 0.8879 (↑16.71%) | 0.8280 (↑22.01%) | 0.4200 (↑19.67%) | 0.7332 (↑21.83%) | 0.8121 (↑20.34%) | 0.9655 (↑19.39%) | 0.9151 (↑23.53%) | 0.8946 (↑20.88%) |
| DCKD-RF | 0.7199 | 0.4192 | 0.3455 | 0.3058 | 0.3569 | 0.3375 | 0.7199 | 0.4651 | 0.4329 |
| bSES-AC-RUN-FKNN | 0.7106 | 0.4670 | 0.3995 | 0.3086 | 0.3972 | 0.3902 | 0.7106 | 0.4855 | 0.4384 |
| KGRec | 0.8087 | 0.7608 | 0.6786 | 0.2910 | 0.4804 | 0.5057 | 0.8087 | 0.6544 | 0.6017 |
| PyRec | 0.7948 | 0.7018 | 0.6537 | 0.3510 | 0.6018 | 0.6748 | 0.7948 | 0.7408 | 0.7401 |
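The metrics in Table 4 are top-k ranking metrics. The sketch below computes P@k, F1@k, and NDCG@k for a single patient under the common binary-relevance definitions; the paper's exact formulas are not shown here, so these definitions are an assumption, and the ranked list and ground-truth set are made up. The ↑ percentages in the table are numerically consistent with the relative gain over the strongest baseline in each column, for example (0.9655 − 0.8087) / 0.8087 ≈ 19.39% for P@1, which the last function reproduces.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k predictions that are true diagnoses."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the true diagnoses found in the top-k predictions."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def f1_at_k(ranked, relevant, k):
    """Harmonic mean of P@k and recall@k."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 position discount."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def relative_gain_percent(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0

# Hypothetical ranking of ICD-10 codes for one patient and their true diagnoses.
ranked = ["E11", "I10", "E78.5", "N18", "I50"]
relevant = {"E11", "E78.5", "I50"}

for k in (1, 3, 5):
    print(k,
          round(precision_at_k(ranked, relevant, k), 4),
          round(f1_at_k(ranked, relevant, k), 4),
          round(ndcg_at_k(ranked, relevant, k), 4))

# P@1 of REKG-MDP against the best baseline value in that column of Table 4.
print(round(relative_gain_percent(0.9655, 0.8087), 2))
```

Under these definitions NDCG@1 equals P@1, which agrees with the identical P@1 and NDCG@1 columns in the table. F1@1 stays well below P@1 because a single prediction can recover only one of a patient's several diagnoses.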
Publication history
  • Received: 2025-08-26
  • Revised: 2025-10-27
  • Published online: 2025-11-04
