Online Learning-based Virtual Resource Allocation for Network Slicing in Virtualized Cloud Radio Access Network
-
Abstract: To address the lack of efficient dynamic resource allocation schemes for 5G Network Slicing (NS) in the Cloud Radio Access Network (C-RAN) scenario in existing research, a virtual resource allocation algorithm for NS in virtualized C-RAN is proposed. First, a stochastic optimization model for the virtualized C-RAN scenario is established based on Constrained Markov Decision Process (CMDP) theory; it takes maximizing the average sum rate of all slices as its objective, subject to an average delay constraint for each slice and a constraint on average backhaul link bandwidth consumption. Second, to overcome the difficulty of acquiring accurate system state transition probabilities in the CMDP optimization problem, the concept of the Post-Decision State (PDS) is introduced as an "intermediate state" that describes the system after the known dynamics have occurred but before the unknown dynamics occur, and that incorporates all known information about the system state transition. Finally, an online learning-based virtual resource allocation algorithm for NS in virtualized C-RAN is presented, which in each discrete resource scheduling slot allocates an appropriate number of Resource Blocks (RBs) and an appropriate amount of caching resource to each network slice according to the observed current system state. Simulation results show that the proposed algorithm effectively satisfies the Quality of Service (QoS) requirements of each network slice, relieves the bandwidth-consumption pressure on the backhaul link, and improves system throughput.
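To make the optimization structure concrete, the following is a minimal sketch of the CMDP formulation summarized above, using hypothetical notation not fixed by the abstract ($R_s$ for the instantaneous rate of slice $s$, $D_s$ for its queuing delay, $B$ for backhaul bandwidth consumption, $\pi$ for the allocation policy, $\mathcal{S}$ for the slice set):

$$
\begin{aligned}
\max_{\pi}\ & \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}^{\pi}\!\left[\sum_{s \in \mathcal{S}} R_s(c_t, a_t)\right] \\
\text{s.t.}\ & \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}^{\pi}\!\left[D_s(c_t, a_t)\right] \le D_s^{\max}, \quad \forall s \in \mathcal{S}, \\
& \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}^{\pi}\!\left[B(c_t, a_t)\right] \le B^{\max}.
\end{aligned}
$$

Dualizing the two averaged constraints with multipliers $\beta_i \ge 0$ yields the Lagrangian reward $g(c_t, \pi(c_t))$ used by the algorithm in Table 1.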
-
Table 1  Online learning-based virtual resource allocation algorithm for network slicing in virtualized C-RAN

Input: system state space $C$, action space $A$, Lagrangian reward function $g(c_t, \pi(c_t))$, finite channel state set $\mathrm{H}$.

Initialization: initialize the post-decision state value function $\tilde V_0(\tilde c) \in \mathbb{R}$, $\forall \tilde c \in C$; set $t \leftarrow 0$, $c_t \leftarrow c \in C$.

Learning phase:
(1) Solve
$$a_t = \mathop{\arg\min}\limits_{a \in A} \left\{ g(c_t, a) + \gamma \tilde V_t\big(S^{M,a}(c_t, a)\big) \right\} \tag{27}$$
(2) Observe the PDS $\tilde c_t$ and the next-slot state $c_{t+1}$: $\tilde c_t = S^{M,a}(c_t, a_t)$, $c_{t+1} = S^{M,W}(\tilde c_t, \mathrm{A}_t, \mathrm{H}_{t+1})$;
(3) Compute the state value function of $c_{t+1}$:
$$V_t(c_{t+1}) = \mathop{\min}\limits_{a \in A} \left\{ g(c_{t+1}, a) + \gamma \tilde V_t\big(S^{M,a}(c_{t+1}, a)\big) \right\} \tag{28}$$
(4) Update $\tilde V_{t+1}(\tilde c_t)$:
$$\tilde V_{t+1}(\tilde c_t) = (1 - \alpha_t)\, \tilde V_t(\tilde c_t) + \alpha_t V_t(c_{t+1}) \tag{29}$$
(5) Update the Lagrange multipliers $\beta$ ($\beta_i \ge 0$) using the stochastic subgradient method.

Output: the optimal policy $\pi^*_{\mathrm{PDS}}$.
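The learning loop in Table 1 can be sketched in code. The following is a minimal toy implementation under simplified assumptions: the state space, the cost function $g$, and the transition maps $S^{M,a}$ (known dynamics) and $S^{M,W}$ (unknown dynamics) are hypothetical stand-ins, not the paper's actual C-RAN model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 8        # |C|: discretized system states (toy)
N_ACTIONS = 4       # |A|: RB / caching allocation choices (toy)
GAMMA = 0.9         # discount factor gamma
BETA = 1.0          # Lagrange multiplier (held fixed in this sketch)

# Post-decision state value function \tilde V, initialized arbitrarily.
V_pds = np.zeros(N_STATES)

def g(c, a, beta=BETA):
    """Lagrangian cost: negative rate plus beta-weighted delay penalty (toy)."""
    rate = (a + 1) * 0.5
    delay_penalty = max(c - a, 0) * 0.1
    return -rate + beta * delay_penalty

def pds_transition(c, a):
    """Known dynamics S^{M,a}: deterministic effect of the allocation (toy)."""
    return max(c - a, 0) % N_STATES

def random_transition(c_tilde):
    """Unknown dynamics S^{M,W}: random arrivals A_t and channel H_{t+1} (toy)."""
    return (c_tilde + rng.integers(0, 3)) % N_STATES

c = rng.integers(N_STATES)
for t in range(1, 5001):
    alpha = 1.0 / t                                    # diminishing step size
    # (1) greedy action against the current PDS value function, Eq. (27)
    q = [g(c, a) + GAMMA * V_pds[pds_transition(c, a)] for a in range(N_ACTIONS)]
    a = int(np.argmin(q))
    # (2) observe the PDS and the next-slot state
    c_tilde = pds_transition(c, a)
    c_next = random_transition(c_tilde)
    # (3) state value of c_{t+1}, Eq. (28)
    v_next = min(g(c_next, b) + GAMMA * V_pds[pds_transition(c_next, b)]
                 for b in range(N_ACTIONS))
    # (4) stochastic update of the PDS value function, Eq. (29)
    V_pds[c_tilde] = (1 - alpha) * V_pds[c_tilde] + alpha * v_next
    # (5) a stochastic-subgradient step on beta would go here (omitted)
    c = c_next
```

Note that the expectation over the unknown dynamics is absorbed into the stochastic update of step (4), so no transition probabilities ever need to be specified, which is the point of the PDS reformulation.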
Table 2  Simulation parameters

Maximum transmit power of the Remote Radio Head (RRH): 20 dBm
Maximum queue length of each slice $Q_{s,\max}$: 20 packets
Noise power spectral density: –174 dBm/Hz
Packet size $L$: 4 kbit/packet
Path-loss model: $104.5 + 20\lg(d)$, $d$ in km
Slot length $\tau$: 1 ms
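For reference, the entries of Table 2 could be collected into a single simulation configuration; a possible sketch (all names are hypothetical, not from the paper):

```python
import math

# Simulation parameters from Table 2 (units noted in comments).
SIM_PARAMS = {
    "rrh_max_tx_power_dBm": 20,        # maximum RRH transmit power
    "max_queue_len_packets": 20,       # Q_{s,max} per slice
    "noise_psd_dBm_per_Hz": -174,      # noise power spectral density
    "packet_size_kbit": 4,             # packet size L, kbit/packet
    "slot_length_ms": 1,               # slot length tau
}

def path_loss_dB(d_km: float) -> float:
    """Path-loss model from Table 2: 104.5 + 20*lg(d), with d in km."""
    return 104.5 + 20 * math.log10(d_km)
```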