Energy Efficiency Optimization Algorithm Based on PD-NOMA under Heterogeneous Cloud Radio Access Networks
-
Abstract: To address the spectrum efficiency and energy efficiency of Heterogeneous Cloud Radio Access Networks (H-CRAN), an energy efficiency optimization algorithm based on Power-Domain Non-Orthogonal Multiple Access (PD-NOMA) is proposed. First, with queue stability and fronthaul link capacity as constraints, the algorithm jointly optimizes user association, power allocation and resource block allocation, and establishes a joint optimization model of network energy efficiency and user fairness. Second, because the state space and action space of the system are both high-dimensional and continuous, the problem is NP-hard in the continuous domain, so the Trust Region Policy Optimization (TRPO) algorithm is introduced to solve the continuous-domain problem efficiently. Finally, since the standard solution of the TRPO algorithm is computationally expensive, the Proximal Policy Optimization (PPO) algorithm is adopted instead; PPO preserves the reliability of TRPO while effectively reducing its computational complexity. Simulation results show that, under the user fairness constraint, the proposed algorithm further improves the network energy efficiency.
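For reference, the standard textbook forms of the two policy-optimization objectives mentioned in the abstract are given below; these are not the paper's own numbered equations, and $A_t$, $\epsilon$ and $\delta$ follow the usual conventions. TRPO maximizes a surrogate objective under a KL-divergence trust-region constraint, whereas PPO replaces the constraint with a clipped surrogate that can be optimized by ordinary stochastic gradient methods, which is the source of the complexity reduction claimed above:
$$\max_{\theta}\ {\mathbb{E}_t}\left[ {\frac{{{\pi _\theta }({a_t}|{s_t})}}{{{\pi _{{\theta ^{{\rm{old}}}}}}({a_t}|{s_t})}}{A_t}} \right]\quad {\rm{s}}{\rm{.t}}{\rm{.}}\quad {\mathbb{E}_t}\left[ {{\rm{KL}}\left( {{\pi _{{\theta ^{{\rm{old}}}}}}( \cdot |{s_t})\parallel {\pi _\theta }( \cdot |{s_t})} \right)} \right] \le \delta \quad ({\rm{TRPO}})$$
$$\max_{\theta}\ {\mathbb{E}_t}\left[ {\min \left( {{r_t}(\theta ){A_t},\ {\rm{clip}}\left( {{r_t}(\theta ),1 - \epsilon ,1 + \epsilon } \right){A_t}} \right)} \right],\quad {r_t}(\theta ) = \frac{{{\pi _\theta }({a_t}|{s_t})}}{{{\pi _{{\theta ^{{\rm{old}}}}}}({a_t}|{s_t})}}\quad ({\rm{PPO}})$$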
-
Table 1 Proximal Policy Optimization (PPO) algorithm for training the Actor network parameters
Algorithm 1 Proximal Policy Optimization (PPO) algorithm for training the Actor network parameters
(1) Initialize the Actor network parameters $\theta $ and the Critic network parameters ${\kappa _v}$
(2) For episode $G = 1,2,···,1000$ do
(3)   while the experience pool $D$ does not contain enough tuples do
(4)     Randomly select an initial state ${s_0}$
(5)     for step = 1, 2, ···, n do
(6)       Take the current state $s$ and select an action $a$ according to the policy $\pi (s|\theta )$
(7)       Execute action $a$ to interact with the wireless network environment, observe the next state $s'$ and compute the reward $r$
(8)       Compute the cumulative discounted reward ${R_i}$ by Eq. (15), and store the tuple $({s_i},{a_i},{s'_i},{R_i})$ in the experience pool $D$
(9)     end for
(10)   end while
(11)   ${\theta ^{{\rm{old}}}} \leftarrow \theta $
(12)   for each update epoch do
(13)     Randomly sample a mini-batch from the experience pool $D$
(14)     Critic network: update the Critic parameters ${\kappa _v}$ by minimizing the Critic loss function
(15)     Actor network:
(16)       Given state ${s_i}$, compute the advantage function ${A_i}$ by Eq. (17), and update the Actor parameters $\theta $ by maximizing the Actor network objective function
(17)   end for
(18) End For
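A minimal PyTorch-style sketch of the update phase of Algorithm 1 (steps (11)–(17)) for a continuous action space is given below. It is illustrative only: the state/action dimensions, network sizes, discount factor and clip ratio are assumed values rather than the paper's settings, a plain discounted return stands in for Eq. (15), and the one-step estimate ${R_i} - V({s_i})$ stands in for the advantage of Eq. (17).

# Illustrative PPO update loop mirroring Algorithm 1 (assumed hyperparameters, not the paper's).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 4          # assumed dimensions of the H-CRAN state/action vectors
GAMMA, CLIP_EPS, EPOCHS = 0.99, 0.2, 4

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(), nn.Linear(64, ACTION_DIM))
critic = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(), nn.Linear(64, 1))
log_std = nn.Parameter(torch.zeros(ACTION_DIM))        # Gaussian policy for continuous actions
actor_opt = torch.optim.Adam(list(actor.parameters()) + [log_std], lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def policy_dist(states):
    """pi(a|s; theta): diagonal Gaussian whose mean is produced by the Actor network."""
    return torch.distributions.Normal(actor(states), log_std.exp())

def ppo_update(states, actions, returns):
    """Steps (11)-(17): freeze theta_old, then update the Critic and the Actor on the mini-batch."""
    with torch.no_grad():
        old_log_prob = policy_dist(states).log_prob(actions).sum(-1)   # theta_old <- theta, step (11)
    for _ in range(EPOCHS):
        # Step (14): update Critic parameters by minimizing a squared-error loss against the returns.
        values = critic(states).squeeze(-1)
        critic_loss = (returns - values).pow(2).mean()
        critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

        # Step (16): one-step advantage estimate and clipped surrogate objective for the Actor.
        advantages = (returns - critic(states).squeeze(-1)).detach()
        ratio = (policy_dist(states).log_prob(actions).sum(-1) - old_log_prob).exp()
        clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS)
        actor_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

# Usage with dummy rollout data standing in for the tuples (s_i, a_i, s'_i, R_i) in the pool D.
states = torch.randn(32, STATE_DIM)
actions = torch.randn(32, ACTION_DIM)
rewards = torch.randn(32)
returns = torch.zeros(32)
running = 0.0
for t in reversed(range(32)):                           # discounted return, cf. Eq. (15)
    running = rewards[t] + GAMMA * running
    returns[t] = running
ppo_update(states, actions, returns)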