Volume 43 Issue 6
Jun.  2021
Lun TANG, Lanqin HE, Qinyi LIAN, Qi TAN. Virtual Network Function Placement Optimization Algorithm Based on Improve Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2021, 43(6): 1724-1732. doi: 10.11999/JEIT200297

Virtual Network Function Placement Optimization Algorithm Based on Improve Deep Reinforcement Learning

doi: 10.11999/JEIT200297
Funds:  The National Natural Science Foundation of China (62071078), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M201800601), The Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)
  • Received Date: 2020-04-21
  • Rev Recd Date: 2021-01-22
  • Available Online: 2021-01-29
  • Publish Date: 2021-06-18
  • To address the Service Function Chain (SFC) placement optimization problem caused by the dynamic arrival of network service requests under the Network Function Virtualization/Software Defined Network (NFV/SDN) architecture, a Virtual Network Function (VNF) placement optimization algorithm based on improved deep reinforcement learning is proposed. Firstly, a stochastic optimization model based on a Markov Decision Process (MDP) is established to jointly optimize SFC placement cost and delay cost, subject to constraints on SFC delay, the Central Processing Unit (CPU) resources of general-purpose servers, and physical link bandwidth. Secondly, because VNF placement and resource allocation involve a very large state space, a high-dimensional action space, and unknown state transition probabilities, a VNF intelligent placement algorithm based on deep reinforcement learning is proposed to obtain approximately optimal VNF placement and resource allocation strategies. Finally, because action exploration and exploitation by the deep reinforcement learning agent through the ε-greedy strategy leads to low learning efficiency and slow convergence, an action exploration and exploitation method based on the difference of the value function is proposed, and a dual experience replay pool is further adopted to address the low utilization of experience samples. Simulation results show that the algorithm converges quickly and optimizes both SFC placement cost and SFC end-to-end delay.
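
The abstract does not give the update rules, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The exploration rate is adapted from the magnitude of the temporal-difference error using a value-difference-based (VDBE-style) rule, and experience is stored in two replay pools split by a reward threshold. The function and class names, the parameters `sigma`, `reward_threshold`, and `high_fraction`, and the reward-based split itself are all hypothetical illustrations, not the authors' implementation.

```python
import math
import random
from collections import deque


def vdbe_epsilon(epsilon, td_error, sigma=1.0, n_actions=4):
    """Return an updated exploration rate from the value difference.

    Assumed rule (VDBE-style): a large |TD error| pushes epsilon up
    (explore more), a small one lets it decay (exploit more).
    """
    x = math.exp(-abs(td_error) / sigma)
    f = (1.0 - x) / (1.0 + x)      # in [0, 1), grows with |td_error|
    delta = 1.0 / n_actions        # weight given to the newest observation
    return delta * f + (1.0 - delta) * epsilon


class DualReplayPool:
    """Two replay pools: ordinary transitions and high-reward transitions.

    The split by reward threshold is an assumption made for illustration.
    """

    def __init__(self, capacity=1000, reward_threshold=0.0, high_fraction=0.5):
        self.normal = deque(maxlen=capacity)
        self.high = deque(maxlen=capacity)
        self.reward_threshold = reward_threshold
        self.high_fraction = high_fraction

    def add(self, state, action, reward, next_state):
        # Route the transition to one of the two pools by its reward.
        pool = self.high if reward > self.reward_threshold else self.normal
        pool.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Draw a fixed share from the high-reward pool, the rest from the other.
        n_high = min(len(self.high), int(batch_size * self.high_fraction))
        n_normal = min(len(self.normal), batch_size - n_high)
        return (random.sample(list(self.high), n_high)
                + random.sample(list(self.normal), n_normal))
```

In a training loop, `vdbe_epsilon` would be called after each value update with the current TD error, and `sample` would supply the mini-batch for the next gradient step. The batch may be smaller than `batch_size` while the pools are still filling.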
    Figures(6)  / Tables(1)
