Lightweight Incremental Deployment for Computing-Network Converged AI Services
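Summary: In recent years, the rapid growth in the scale and complexity of Artificial Intelligence (AI) computing services has required computing resources to be flexibly accessible and efficiently used. As the main channel through which users reach and interact with computing resources, the network also needs improved capability and performance to meet the demands of AI computing services, such as low latency and high concurrency. However, the traditional Domain Name System (DNS) and IP-based scheduling mechanisms lack the adaptability and intelligence needed to meet these demands. Integrating computing and network resources (computing-network convergence) is therefore a key approach to these problems. This paper introduces a semantic AI-oriented Service IDentifier (AISID) for encoding services. AISID decouples service requests from resource locations and thereby supports more flexible and precise service scheduling. On this basis, a lightweight incremental deployment scheme for computing-network convergence is proposed, which combines intelligent routing with resource scheduling to optimize the routing of service requests and the allocation of resources. Lightweight incremental deployment on core devices optimizes network performance and improves system scalability with minimal changes to the existing network. Experimental results show that, under a load of 500 concurrent requests and compared with traditional DNS scheduling and network architecture, the AISID mechanism reduces request response time by 61.3%, and the lightweight deployment scheme reduces the variance of link bandwidth utilization and of computing power utilization by 32.8% and 12.3%, respectively. These results verify the effectiveness of the proposed method in improving AI computing service performance and resource utilization efficiency, and indicate that it offers an effective path toward computing-network convergence.
Abstract: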
Objective The rapid expansion of Artificial Intelligence (AI) computing services has heightened the demand for flexible access to and efficient utilization of computing resources. Traditional Domain Name System (DNS) and IP-based scheduling mechanisms struggle to meet the stringent requirements of low latency and high concurrency, highlighting the need for integrated computing-network resource management. To address these challenges, this study proposes a lightweight deployment framework that enhances network adaptability and resource scheduling efficiency for AI services.

Methods The AI-oriented Service IDentifier (AISID) is designed to encode service attributes in four dimensions: Object, Function, Method, and Performance. Service requests are decoupled from physical resource locations, enabling dynamic resource matching. AISID is embedded within IPv6 packets (Fig. 5) and consists of a 64-bit prefix for identification and a 64-bit service-specific suffix (Fig. 4). A lightweight incremental deployment scheme is implemented through hierarchical routing, in which stable wide-area routing is managed by ingress gateways and fine-grained local scheduling is handled by egress gateways (Fig. 6). Ingress and egress gateways are incrementally deployed under the coordination of an intelligent control system to optimize resource allocation. AISID-based paths are encapsulated at ingress gateways using Segment Routing over IPv6 (SRv6), whereas egress gateways select optimal service nodes according to real-time load data using a weighted least-connections strategy (Fig. 8). AISID lifecycle management includes registration, query, migration, and decommissioning phases (Table 2), with global synchronization maintained by the control system. Resource scheduling is dynamically adjusted according to real-time network topology and node utilization metrics (Fig. 7).

Results and Discussions Experimental results show marked improvements over traditional DNS/IP architectures. The AISID mechanism reduces service request initiation latency by 61.3% compared with DNS resolution (Fig. 9), as it eliminates the need for round-trip DNS queries. Under 500 concurrent requests, the variance of network bandwidth utilization decreases by 32.8% (Fig. 10), reflecting the ability of AISID-enabled scheduling to alleviate congestion hotspots. The variance of computing resource utilization decreases by 12.3% (Fig. 11), demonstrating a more balanced workload distribution across service nodes. These improvements arise from AISID's precise semantic matching in combination with the hierarchical routing strategy, which together enhance resource allocation efficiency while maintaining compatibility with existing IPv6/DNS infrastructure (Figs. 2–3). The incremental deployment approach further reduces disruption to legacy networks, supporting the framework's practicality for real-world deployment.

Conclusions This study establishes a computing-network convergence framework for AI services based on the semantic-driven AISID and lightweight deployment. The key innovations are AISID's semantic encoding, which enables dynamic resource scheduling and decoupled service access, and incremental gateway deployment, which optimizes routing without requiring major modifications to legacy networks. Experimental validation demonstrates lower request latency, more even bandwidth usage, and more balanced resource utilization. Future research will explore AISID's scalability across heterogeneous domains and its robustness under dynamic network conditions.
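The weighted least-connections strategy mentioned for the egress gateways can be illustrated with a minimal sketch. It is not the authors' implementation: the `ServiceNode` type, its fields, the tie-breaking rule, and the idea of using relative computing capacity as the weight are all assumptions made for illustration. The sketch picks the node with the smallest ratio of active connections to weight.

```python
from dataclasses import dataclass


@dataclass
class ServiceNode:
    """Hypothetical view of one service node as seen by an egress gateway."""
    name: str
    weight: float            # assumed: relative capacity, e.g. proportional to TFLOPS
    active_connections: int  # requests currently being served


def select_node(nodes):
    """Return the node with the smallest active_connections / weight ratio.

    Nodes with non-positive weight are treated as unavailable. Ties are
    broken in favour of the node with the larger weight (an assumption).
    """
    candidates = [n for n in nodes if n.weight > 0]
    if not candidates:
        raise ValueError("no available service node")
    return min(
        candidates,
        key=lambda n: (n.active_connections / n.weight, -n.weight),
    )


def dispatch(nodes, request_count):
    """Assign requests one by one and return the chosen node names in order."""
    chosen = []
    for _ in range(request_count):
        node = select_node(nodes)
        node.active_connections += 1  # the new request now occupies the node
        chosen.append(node.name)
    return chosen


if __name__ == "__main__":
    pool = [
        ServiceNode("node-a", weight=4.0, active_connections=0),
        ServiceNode("node-b", weight=2.0, active_connections=0),
        ServiceNode("node-c", weight=1.0, active_connections=0),
    ]
    print(dispatch(pool, 7))
    print([(n.name, n.active_connections) for n in pool])
```

With weights 4, 2, and 1, seven consecutive requests end up split 4, 2, and 1 across the three nodes, which is the proportional balance this kind of strategy aims for. A real gateway would also need to decrement the counters when requests finish and refresh the weights from the load data reported to the control system.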
Key words:
- Service identifier
- AI computing services
- Lightweight deployment
- Resource scheduling
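Table 1 Example of an IPv6 packet carrying an AISID

| Field | Description | Value/Example |
| --- | --- | --- |
| Version | IPv6 protocol version; 4-bit field fixed to 6 | 6 |
| Traffic Class | Traffic identification, usually 0; 8 bits | 0x00 |
| Flow Label | Identifies the traffic flow type; 20 bits | 0x00000 |
| Payload Length | Length of the packet payload; 16 bits | 40 (assuming a 40-byte payload) |
| Next Header | Type of the next header, here the transport protocol; 8 bits | 17 (UDP) |
| Hop Limit | Maximum number of hops (the IPv6 counterpart of the IPv4 TTL); 8 bits | 64 |
| Source IP address | IPv6 address of the device issuing the request; 128 bits | fd00:abcd:1234:5678:90ab:cdef:2345:6789 |
| AISID | Carries the AISID; 128 bits | 2001:0db8:abcd:0012:0001:0002:0001:1388 |

Table 2 Tasks and implementation mechanisms of AISID lifecycle management

| Phase | Main tasks | Key interfaces and mechanisms |
| --- | --- | --- |
| Service registration | Generate, verify, and register the AISID; the intelligent control system performs global registration and synchronized updates | Service registration interface; AISID uniqueness check |
| Service query | The user issues an AISID request; ingress/egress gateways perform routing and node location | AISID routing table; level-1/level-2 routing mechanism |
| Service migration | Update the mapping between the AISID and the node location; synchronize the change to the intelligent control system | Service migration notification interface; control system synchronization mechanism |
| Service decommissioning | Deregister the AISID, clear caches, and reclaim the identifier; notify affected users | Service decommissioning interface; AISID reclamation mechanism |

Table 3 Network and service request parameter settings

| Category | Parameter | Value |
| --- | --- | --- |
| Network | Number of network nodes | 42 |
| Network | Number of ingress gateways | 10 |
| Network | Number of egress gateways | 4 |
| Network | Number of links | 66 |
| Network | Link bandwidth | 100 Gbit/s |
| Network | Link propagation delay | 5 ms |
| Network | Total computing power of devices in the service provider's LAN | 10000 TFLOPS |
| Network | Number of service types offered by the service provider | 10 |
| Service request | Number of service types users can request | 20 |
| Service request | Bandwidth demand | 20–100 Mbit/s |
| Service request | Computing power demand | 1–10 TFLOPS |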
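The example AISID in Table 1 can be split into its 64-bit prefix and 64-bit service-specific suffix with a short sketch. The abstract states only the 64/64 split and the four dimensions (Object, Function, Method, Performance). The division of the suffix into four 16-bit fields in that order, and the names `Aisid`, `parse_aisid`, and `format_aisid`, are assumptions made here for illustration and are not confirmed by the source.

```python
import ipaddress
from typing import NamedTuple

FIELD_MASK = 0xFFFF           # assumed width of each suffix field: 16 bits
SUFFIX_MASK = (1 << 64) - 1   # lower 64 bits of the 128-bit identifier


class Aisid(NamedTuple):
    prefix: int       # upper 64 bits: identification prefix
    obj: int          # assumed 16-bit Object field
    function: int     # assumed 16-bit Function field
    method: int       # assumed 16-bit Method field
    performance: int  # assumed 16-bit Performance field


def parse_aisid(text: str) -> Aisid:
    """Split an AISID written in IPv6 notation into prefix and suffix fields."""
    value = int(ipaddress.IPv6Address(text))
    prefix = value >> 64
    suffix = value & SUFFIX_MASK
    fields = [(suffix >> shift) & FIELD_MASK for shift in (48, 32, 16, 0)]
    return Aisid(prefix, *fields)


def format_aisid(aisid: Aisid) -> str:
    """Rebuild the fully expanded IPv6-style text form of an AISID."""
    if not 0 <= aisid.prefix <= SUFFIX_MASK:
        raise ValueError("prefix must fit in 64 bits")
    suffix = 0
    for field in aisid[1:]:
        if not 0 <= field <= FIELD_MASK:
            raise ValueError("each suffix field must fit in 16 bits")
        suffix = (suffix << 16) | field
    return ipaddress.IPv6Address((aisid.prefix << 64) | suffix).exploded


if __name__ == "__main__":
    example = parse_aisid("2001:0db8:abcd:0012:0001:0002:0001:1388")
    print(hex(example.prefix), example[1:])
    print(format_aisid(example))
```

Under this assumed layout, the Table 1 example has the prefix 0x20010db8abcd0012 and the suffix fields 1, 2, 1, and 0x1388 (5000 in decimal). What those numbers mean for each dimension is defined by the paper's encoding scheme, which is not reproduced in this section.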