Resource Allocation Algorithm of Service Function Chain Based on Asynchronous Advantage Actor-Critic Learning

Lun TANG; Xiaoyu HE; Xiao WANG; Qi TAN; Yanjuan HU; Qianbin CHEN

doi:10.11999/JEIT200287

Volume 43 Issue 6

Jun. 2021



Lun TANG, Xiaoyu HE, Xiao WANG, Qi TAN, Yanjuan HU, Qianbin CHEN. Resource allocation Algorithm of Service Function Chain Based on Asynchronous Advantage Actor-Critic Learning[J]. Journal of Electronics & Information Technology, 2021, 43(6): 1733-1741. doi: 10.11999/JEIT200287



doi: 10.11999/JEIT200287 cstr: 32379.14.JEIT200287

1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communication, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Funds: The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601), The Major Theme Special Projects of Chongqing (cstc2019jscx-zdztzxX0006)

Received Date: 2020-04-21
Rev Recd Date: 2020-09-28

Available Online: 2020-09-30

Publish Date: 2021-06-18

Abstract

In a radio access network slice, global network information is hard to obtain, and the mobility of User Equipment (UE) together with dynamic packet arrivals complicates the optimization of slice resource allocation. To address these problems, a Service Function Chain (SFC) resource allocation algorithm based on Asynchronous Advantage Actor-Critic (A3C) learning is proposed. First, a resource management mechanism based on blockchain technology is established, which credibly shares and updates the global network information and also supervises and records the SFC resource allocation process. Then, a delay minimization model based on the joint allocation of radio, computing, and bandwidth resources is built for moving UEs and time-varying packet arrivals, and is further transformed into a Markov Decision Process (MDP) problem. Finally, the A3C learning method is adopted to obtain the optimal resource allocation strategy for this MDP. Simulation results show that the proposed algorithm can utilize resources more efficiently to optimize the system delay while guaranteeing the requirements of each UE.
- Network slice
- Service Function Chain (SFC) resource allocation
- Markov Decision Process (MDP)
- Asynchronous Advantage Actor-Critic (A3C) learning
- Blockchain
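
The abstract names A3C as the learning method but does not give the state, action, or reward definitions. The following is a minimal sketch of the advantage computation that a single actor-critic worker performs on a short rollout, assuming (hypothetically) that the per-step reward is the negative of the observed delay. The function names, the rollout values, and the discount factor are illustrative assumptions and are not taken from the paper.

```python
import math


def n_step_returns(rewards, bootstrap_value, gamma):
    """Discounted returns R_t = r_t + gamma * R_{t+1}, seeded with V(s_n)."""
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def actor_critic_losses(log_probs, values, rewards, bootstrap_value, gamma=0.9):
    """Return (actor_loss, critic_loss, advantages) for one rollout.

    log_probs: log pi(a_t | s_t) of the actions actually taken
    values:    critic estimates V(s_t) for the visited states
    """
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    # Advantage: how much better the sampled return is than the critic's estimate.
    advantages = [r - v for r, v in zip(returns, values)]
    # Policy-gradient loss; advantages are treated as constants here.
    actor_loss = -sum(lp * a for lp, a in zip(log_probs, advantages))
    # Squared-error loss for the critic.
    critic_loss = 0.5 * sum(a * a for a in advantages)
    return actor_loss, critic_loss, advantages


# Hypothetical three-step rollout: reward = -delay (an assumption, not from the paper).
delays = [3.0, 2.0, 1.0]
rewards = [-d for d in delays]
values = [-5.0, -3.0, -1.0]
log_probs = [math.log(0.5), math.log(0.25), math.log(0.5)]

actor_loss, critic_loss, advantages = actor_critic_losses(
    log_probs, values, rewards, bootstrap_value=0.0, gamma=0.5
)
print(advantages, actor_loss, critic_loss)
```

In a full A3C setup, several workers would each compute gradients of such losses on a local copy of the networks and apply them asynchronously to shared parameters; that part is omitted from this sketch.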

