HE Qian, ZHU Lei, LI Gong, YOU Zhengpeng, YUAN Lei, JIA Fei. Research on Collaborative Reasoning Framework and Algorithms of Cloud-Edge Large Models for Intelligent Auxiliary Diagnosis Systems[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250828

Research on Collaborative Reasoning Framework and Algorithms of Cloud-Edge Large Models for Intelligent Auxiliary Diagnosis Systems

doi: 10.11999/JEIT250828 cstr: 32379.14.JEIT250828
Funds:  None
  • Received Date: 2025-09-01
  • Accepted Date: 2025-11-05
  • Revision Received Date: 2025-11-05
  • Available Online: 2025-11-13
  •   Objective  The deployment of large language models (LLMs) in intelligent auxiliary diagnosis faces two constraints: hospitals lack the computing power for local deployment, and transmitting and storing medical data in the cloud carries significant privacy risks. Compared with full-parameter cloud LLMs, low-parameter local LLMs are 20%-30% less accurate on medical knowledge Q&A and cover 15%-25% less medical knowledge, while cloud-based solutions face inherent data security and privacy protection issues. To address this dilemma, this study proposes a cloud-edge LLM collaborative reasoning framework and corresponding algorithms for intelligent auxiliary diagnosis systems. The core objective is to develop a cloud-edge collaborative reasoning agent with intelligent routing and dynamic semantic desensitization, enabling dynamic task allocation between the edge (hospital) side and the cloud (regional cloud) side. The framework seeks to balance diagnostic accuracy, data privacy and security, and resource utilization efficiency, providing a viable technical paradigm for medical artificial intelligence systems.

      Methods  The proposed framework adopts a layered design, with a four-tier progressive architecture on the edge side and a four-tier service-oriented architecture on the cloud side (Fig. 1). The edge side comprises resource, data, model, and application layers; the model layer hosts lightweight medical LLMs and the cloud-edge collaborative agent. The cloud side comprises AI IaaS, AI PaaS, AI MaaS, and AI SaaS layers and serves as a convergence center for computing power and advanced models. Collaborative reasoning follows a structured business workflow (Fig. 2): the agent parses the user input to extract key clinical features and then decides which node performs the reasoning. Two core technologies underpin the agent:

      1) Intelligent routing: This mechanism processes requests on the edge side by default and dynamically selects the reasoning path (edge or cloud) through a dual-driven weight update strategy. It combines semantic feature similarity (calculated via Chinese word segmentation and pre-trained medical language models) with historical decision data, and updates the feature libraries with an exponential moving average for adaptive optimization.

      2) Dynamic semantic desensitization: This technology uses a three-stage architecture (sensitive entity recognition, semantic correlation analysis, and hierarchical desensitization decision-making). It identifies sensitive entities with a domain-enhanced named entity recognition (NER) model, calculates each entity's sensitivity and desensitization priority, and enforces a semantic similarity constraint to avoid excessive desensitization. One of three strategies (complete deletion, general replacement, or partial masking) is applied according to entity sensitivity.

      Experimental validation used two open-source Chinese medical knowledge graphs (CMeKG and CPubMedKG) covering over 2.7 million medical entities. The experimental environment (Fig. 3) deployed a qwen3:1.7b model on the edge and the Jiutian LLM on the cloud, with a 5,000-sample evaluation dataset divided into entity-level, relation-level, and subgraph-level questions. Performance was assessed using three core metrics: answer accuracy, average token consumption, and average response time.

      Results and Discussions  In terms of answer accuracy, the intelligent routing mechanism yields an overall accuracy of 72.44% on CMeKG (Fig. 4) and 66.20% on CPubMedKG (Fig. 5), significantly higher than the edge-side LLM alone (60.73% and 54.18%) and nearly equal to the cloud LLM (72.68% and 66.49%). This confirms that the framework maintains diagnostic consistency with cloud-based solutions while leveraging edge-side capabilities. Regarding resource efficiency, the intelligent routing model reduces average token consumption to 61.27, accounting for only 45.63% of the cloud LLM’s token usage (131.68) (Fig. 6), which yields substantial cost savings. In terms of response time, the edge-side LLM exhibits a latency exceeding 6 s because of limited computing power, while the cloud LLM achieves a latency of 0.44 s over dedicated line access (about 8% of the 5.46 s latency over internet access). The intelligent routing model’s average latency falls between those of the edge and cloud LLMs under both access modes (Fig. 7), in line with the expected performance trade-off. The framework is applicable across typical medical scenarios (Table 1), including outpatient triage, chronic disease management, medical image analysis, intensive care, and health consultation, because it combines local real-time processing with cloud-based deep reasoning. Limitations remain in emergency rescue scenarios with poor network conditions (because of latency constraints) and in rare disease diagnosis (because of insufficient edge-side training samples and the potential loss of individual features during desensitization). Taken together, these results validate that the cloud-edge collaborative reasoning mechanism reduces computing resource overhead while keeping diagnostic results consistent.

      Conclusions  This study constructs a cloud-edge LLM collaborative reasoning framework for intelligent auxiliary diagnosis systems, addressing the key challenges of limited local computing power and cloud data privacy risks. By integrating intelligent routing, prompt engineering adaptation, and dynamic semantic desensitization, the framework balances diagnostic accuracy, data security, and resource economy. The experimental validation confirms that the framework's accuracy is comparable to that of cloud-only LLMs while its resource consumption is significantly lower, providing a new technical path for upgrading medical intelligence. Future research will focus on three directions: first, intelligent on-demand scheduling of computing and network resources to address latency caused by edge-side computing bottlenecks; second, collaborative deployment of localized LLMs with Retrieval-Augmented Generation (RAG) to raise edge-side standalone accuracy to over 90%; and third, expansion of medical diagnostic evaluation indicators into a three-dimensional "scenario-node-indicator" system that incorporates sensitivity, specificity, and AUC for clinically oriented validation.
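The intelligent routing step described in the Methods can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the scoring formula, so the weights `w_sim` and `w_hist`, the `margin` that keeps the edge as the default, the smoothing factor `alpha`, and the `Router` class itself are all hypothetical. Query embeddings are assumed to come from an upstream step (word segmentation plus a pre-trained medical language model) and are passed in as plain vectors.

```python
import math
from dataclasses import dataclass


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Router:
    """Hypothetical edge-first router; all parameters are illustrative assumptions."""
    library: dict          # route name -> feature vector of that route's feature library
    history: dict          # route name -> historical success rate in [0, 1]
    w_sim: float = 0.7     # weight of semantic similarity (assumed)
    w_hist: float = 0.3    # weight of historical decision data (assumed)
    alpha: float = 0.2     # exponential moving average smoothing factor (assumed)
    margin: float = 0.05   # cloud must beat edge by this much, so edge stays the default

    def score(self, route, query_vec):
        # Combine the two signals named in the abstract: similarity and history.
        sim = cosine(query_vec, self.library[route])
        return self.w_sim * sim + self.w_hist * self.history[route]

    def route(self, query_vec):
        edge = self.score("edge", query_vec)
        cloud = self.score("cloud", query_vec)
        return "cloud" if cloud > edge + self.margin else "edge"

    def update(self, route, query_vec, success):
        # EMA update of the feature library and of the success rate for the chosen route.
        a = self.alpha
        old = self.library[route]
        self.library[route] = [(1 - a) * o + a * q for o, q in zip(old, query_vec)]
        outcome = 1.0 if success else 0.0
        self.history[route] = (1 - a) * self.history[route] + a * outcome
```

A caller would invoke `route(query_vec)` for each parsed request and then `update(...)` once the answer has been judged, so that the feature libraries drift toward the kinds of questions each side handles well. Ties and near-ties go to the edge, which mirrors the edge-first default stated in the abstract.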
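The hierarchical desensitization decision described in the Methods can be sketched in a similar spirit. This is an illustration under stated assumptions, not the paper's implementation: entities and their sensitivity scores are assumed to be supplied by the NER and scoring stages, the thresholds 0.8 and 0.5 and the `min_sim` bound are invented for the example, and the character-bigram Jaccard overlap is only a toy stand-in for a model-based semantic similarity. The fallback rule (try a milder strategy when the similarity constraint fails, and keep the mildest one if none passes) is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str           # surface form found in the input
    label: str          # entity type, e.g. "NAME", "ID", "PHONE"
    sensitivity: float  # in [0, 1]; assumed output of the scoring stage


def bigrams(s):
    return {s[i:i + 2] for i in range(len(s) - 1)}


def similarity(a, b):
    """Toy stand-in for semantic similarity: Jaccard overlap of character bigrams."""
    sa, sb = bigrams(a), bigrams(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def delete(e):
    return ""                                        # complete deletion


def generalize(e):
    return f"[{e.label}]"                            # general replacement


def mask(e):
    return e.text[:1] + "*" * (len(e.text) - 1)      # partial masking


STRATEGIES = [delete, generalize, mask]  # ordered from strongest to mildest


def first_strategy(sensitivity):
    """Map sensitivity to the strongest allowed strategy (thresholds are assumed)."""
    if sensitivity >= 0.8:
        return 0
    if sensitivity >= 0.5:
        return 1
    return 2


def desensitize(text, entities, min_sim=0.6):
    original = text
    # Higher sensitivity means higher desensitization priority.
    for e in sorted(entities, key=lambda ent: ent.sensitivity, reverse=True):
        for strategy in STRATEGIES[first_strategy(e.sensitivity):]:
            candidate = text.replace(e.text, strategy(e))
            if similarity(original, candidate) >= min_sim:
                break  # constraint satisfied; keep this strategy
        # If no strategy met the constraint, the mildest candidate is kept.
        text = candidate
    return text
```

In this sketch the raw entity text never survives, because even the fallback masks it. The similarity bound only decides how aggressively each entity is rewritten, which is one way to read the abstract's constraint against excessive desensitization.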
    Figures(7)  / Tables(1)
