Virtual Network Function Placement Optimization Algorithm Based on Improved Deep Reinforcement Learning

Lun TANG, Lanqin HE, Qinyi LIAN, Qi TAN

Citation: Lun TANG, Lanqin HE, Qinyi LIAN, Qi TAN. Virtual Network Function Placement Optimization Algorithm Based on Improve Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2021, 43(6): 1724-1732. doi: 10.11999/JEIT200297


doi: 10.11999/JEIT200297
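Funds: The National Natural Science Foundation of China (62071078), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M201800601), The Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)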
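Details
    Author biographies:

    Lun TANG: male, born in 1973, professor, Ph.D. Research interests: next-generation wireless communication networks, heterogeneous cellular networks, and software-defined wireless networks.

    Lanqin HE: male, born in 1995, master's student. Research interests: 5G network slicing and machine learning algorithms.

    Qi TAN: female, born in 1995, master's student. Research interests: 5G network slicing, resource allocation, and stochastic optimization theory.

    Corresponding author:

    Lanqin HE 719097886@qq.com

  • CLC number: TN929.5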

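  • Abstract: To address the service function chain (SFC) placement optimization problem caused by the dynamic arrival of network service requests under the network function virtualization/software-defined networking (NFV/SDN) architecture, this paper proposes a virtual network function (VNF) placement optimization algorithm based on improved deep reinforcement learning. First, a stochastic optimization model based on a Markov decision process (MDP) is established to perform online SFC placement and dynamic resource allocation. The model jointly optimizes the SFC placement cost and delay cost, subject to the SFC delay constraint and physical resource constraints. Second, because VNF placement and resource allocation involve excessively large state and action spaces and unknown state transition probabilities, an intelligent VNF placement algorithm based on deep reinforcement learning is proposed to obtain near-optimal VNF placement and resource allocation policies. Finally, because the deep reinforcement learning agent explores and exploits actions through an ε-greedy policy, which slows convergence, an action exploration and exploitation method based on value-function differences is proposed, and a dual experience replay pool is further adopted to address the low utilization of experience samples. Simulation results show that the algorithm accelerates neural network convergence and simultaneously optimizes the SFC placement cost and the SFC end-to-end delay.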
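The abstract does not state the update rule of the value-difference-based exploration method or the layout of the dual experience replay pool. As a rough illustration only, the sketch below adapts ε in the style of value-difference-based exploration (VDBE): ε rises when the temporal-difference error is large and decays as the value estimates settle. It also keeps two replay pools that are sampled in a fixed ratio. The names `vdbe_epsilon`, `select_action`, and `DualReplayPool`, the parameters `sigma`, `delta`, and `high_ratio`, and the reward-threshold rule for routing transitions are all assumptions made for this example, not the authors' design.

```python
import math
import random
from collections import deque


def vdbe_epsilon(epsilon, td_error, sigma=1.0, delta=0.1):
    """One VDBE-style update of the exploration rate (hypothetical helper).

    A large |td_error| pushes epsilon toward 1 (explore more);
    a small one lets epsilon decay toward 0 (exploit more).
    """
    x = math.exp(-abs(td_error) / sigma)
    f = (1.0 - x) / (1.0 + x)  # value-difference term, in [0, 1]
    return delta * f + (1.0 - delta) * epsilon


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class DualReplayPool:
    """Two FIFO pools: ordinary transitions and 'high-value' ones.

    Assumption: a transition counts as high-value when its reward
    exceeds `reward_threshold`; the paper's criterion may differ.
    """

    def __init__(self, capacity, reward_threshold, high_ratio=0.5):
        self.normal = deque(maxlen=capacity)
        self.high = deque(maxlen=capacity)
        self.reward_threshold = reward_threshold
        self.high_ratio = high_ratio

    def add(self, state, action, reward, next_state):
        pool = self.high if reward > self.reward_threshold else self.normal
        pool.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw up to high_ratio of the batch from the high-value pool,
        # then fill the rest from the ordinary pool.
        n_high = min(int(batch_size * self.high_ratio), len(self.high))
        n_normal = min(batch_size - n_high, len(self.normal))
        return (rng.sample(list(self.high), n_high)
                + rng.sample(list(self.normal), n_normal))
```

In a training loop under these assumptions, the agent would call `vdbe_epsilon` with the latest temporal-difference error after each learning step, pass the result to `select_action`, and draw mini-batches with `DualReplayPool.sample`.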
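  • Figure 1  System model

    Figure 2  Framework of the improved deep reinforcement learning algorithm

    Figure 3  Comparison of loss functions

    Figure 4  Comparison of total system delay

    Figure 5  Comparison of placement cost

    Figure 6  Comparison of utility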

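    Table 1  Simulation parameters of the network scenario

    | Simulation parameter | Value |
    | --- | --- |
    | Packet arrival process | Poisson process, ${\lambda _i} = 2$ |
    | Packet size | 500 kByte/packet |
    | Total number of general-purpose servers $H$ | 6 |
    | Physical link bandwidth resource | 640 MB |
    | CPU resource capacity of general-purpose server $v$ | 8 cores |
    | Service rate of a single CPU $\beta $ | 25 MB/s |
    | Discount factor $\gamma $ | 0.99 |
    | Soft update factor $\tau $ | 0.01 |
    | Maximum number of iteration rounds | 2000 |
    | Learning rate $\alpha $ | $\left\{ {0.00001,0.0001} \right\}$ |
    | SFC length | Uniform[2,5] |
    | Maximum delay limit of an SFC ${D_i}$ | 30 ms |
    | Positive constant $\partial $ | 30 |
    | Positive constant $\varsigma $ | 20 |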
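To make two of the tabulated parameters concrete, the following minimal sketch shows a standard soft target-network update with $\tau = 0.01$ and a back-of-the-envelope per-packet processing time from the packet size and the CPU service rate $\beta$. The function names are hypothetical. The delay formula (packet size divided by the aggregate service rate of the allocated CPUs) and the decimal conversion 1 MB = 1000 kByte are assumptions of this example; the delay model actually used in the paper is not given on this page.

```python
def soft_update(target_params, online_params, tau=0.01):
    """Standard soft update: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]


def packet_processing_time_ms(packet_kbyte, cpu_rate_mb_s, n_cpus=1):
    """Processing time of one packet in milliseconds.

    Assumes the time is size / (n_cpus * rate) and 1 MB = 1000 kByte,
    so kByte divided by MB/s yields milliseconds directly.
    """
    return packet_kbyte / (cpu_rate_mb_s * n_cpus)


# Values from the table: 500 kByte packets, 25 MB/s per CPU.
one_cpu_ms = packet_processing_time_ms(500, 25, n_cpus=1)   # 20.0 ms
two_cpu_ms = packet_processing_time_ms(500, 25, n_cpus=2)   # 10.0 ms
```

Under these assumptions a single CPU already consumes 20 ms of the 30 ms delay limit ${D_i}$ per packet, which suggests why the number of CPUs allocated to each VNF matters for meeting the constraint.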
Publication history
  • Received: 2020-04-21
  • Revised: 2021-01-22
  • Published online: 2021-01-29
  • Issue date: 2021-06-18
