UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach

WANG Chaowei, DENG Danhao, WANG Weidong, JIANG Fan

Citation: WANG Chaowei, DENG Danhao, WANG Weidong, JIANG Fan. UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach[J]. Journal of Electronics & Information Technology, 2022, 44(3): 835-843. doi: 10.11999/JEIT211241


doi: 10.11999/JEIT211241
Funds: The National Key R&D Program of China (2020YFB1807204)
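Details
    About the authors:

    WANG Chaowei: male, born in 1982, associate professor. Research interests: next-generation mobile communication technology, wireless sensor and IoT technology.

    DENG Danhao: female, born in 1996, PhD candidate. Research interests: wireless resource management technology.

    WANG Weidong: male, born in 1967, professor. Research interests: satellite mobile communications, next-generation mobile communication technology, IoT technology.

    JIANG Fan: female, born in 1982, professor. Research interests: AI-based edge computing and caching, D2D communication, and radio resource management in 5G ultra-dense heterogeneous networks.

    Corresponding author:

    WANG Chaowei wangchaowei@bupt.edu.cn

  • CLC number: TN915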

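  • Abstract: In cell-free massive Multiple-Input Multiple-Output (MIMO) networks, distributed Access Points (APs) serve multiple users simultaneously, which enables high-capacity virtual-MIMO transmission over a large area, while Unmanned Aerial Vehicle (UAV) assisted communication can enhance coverage for hotspot or edge users in the target area. To reduce the load on the feedback link and improve the spectrum utilization of UAV-assisted communication, this paper studies joint scheduling based on AP power allocation, UAV service zone selection, and access user selection. First, AP power allocation and UAV service zone selection are jointly modeled as a Double-Action Markov Decision Process (DAMDP), and a Deep Reinforcement Learning (DRL) algorithm based on Q-learning and a Convolutional Neural Network (CNN) is proposed. Then, user scheduling is formulated as a 0-1 optimization problem and decomposed into sub-problems for solution. Simulation results show that, compared with existing schemes, the proposed DRL-based resource scheduling scheme effectively improves spectrum utilization in cell-free massive MIMO networks.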
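As a rough illustration of the Q-learning rule that the abstract builds on, the following minimal sketch treats each action as a pair (AP power level, UAV service zone) flattened into a single index. This is only one plausible reading of "double action"; the single-state toy environment, the reward function, and all function names and hyperparameters are assumptions made for illustration and are not taken from the paper.

```python
import random


def joint_index(power_idx, zone_idx, n_zones):
    """Flatten a (power level, service zone) pair into one action index."""
    return power_idx * n_zones + zone_idx


def split_index(action, n_zones):
    """Inverse of joint_index: recover (power level, service zone)."""
    return divmod(action, n_zones)


def greedy_action(q, state):
    """Index of the highest-valued action in the given state."""
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return greedy_action(q, state)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def train(n_power, n_zones, reward_fn, steps=3000, epsilon=0.2, seed=0):
    """Train on a hypothetical single-state toy problem (an assumption)."""
    rng = random.Random(seed)
    q = [[0.0] * (n_power * n_zones)]  # one state, n_power * n_zones actions
    state = 0
    for _ in range(steps):
        action = epsilon_greedy(q, state, epsilon, rng)
        power_idx, zone_idx = split_index(action, n_zones)
        reward = reward_fn(power_idx, zone_idx)
        q_update(q, state, action, reward, state)  # toy self-loop transition
    return q


if __name__ == "__main__":
    # Hypothetical reward: only power level 2 in zone 1 pays off.
    table = train(3, 4, lambda p, z: 1.0 if (p, z) == (2, 1) else 0.0)
    print(split_index(greedy_action(table, 0), 4))
```

Flattening the pair keeps the table two-dimensional (state by joint action). In the paper's setting a CNN approximates the Q-values instead of a table; that step is not shown here.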
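  • Figure 1  UAV-assisted communication in a cell-free massive MIMO network

    Figure 2  Convergence of different algorithms

    Figure 3  Complexity of different algorithms

    Figure 4  Error rate under different learning rates

    Figure 5  Error rate under different network architectures

    Figure 6  Spectrum utilization under different minimum distances (DQN and Q-learning schemes trained for 60000 steps)

    Figure 7  Spectrum utilization under different numbers of service zones (DQN and Q-learning schemes trained for 60000 steps)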

    Table 1  DQN algorithm framework

    Layer | Input | Kernel | Activation | Output
    Convolutional layer 1 | $\left(\sum M_l\right) \times N \times 1$ | 3×3, 8 | ReLU | $\left(\sum M_l\right) \times N \times 8$
    Convolutional layer 2 | $\left(\sum M_l\right) \times N \times 8$ | 3×3, 16 | ReLU | $\left(\sum M_l\right) \times N \times 16$
    Fully connected layer 1 | $\left(\sum M_l\right) \times N \times 16$ | NA | ReLU | 1024
    Fully connected layer 2 | 1024 | NA | Linear | $\prod \left(M_l \times N\right)$
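The layer sizes in Table 1 can be checked with a little arithmetic. The sketch below computes each layer's output shape and parameter count for given values of $M_l$ and $N$. It assumes stride 1 with "same" padding (inferred only from the fact that the table keeps the spatial size unchanged through both convolutions) and one bias per output unit. The meanings of $M_l$ and $N$ are not defined on this page, so they are treated as plain positive integers, and the function name is hypothetical.

```python
from math import prod


def dqn_layer_summary(m_list, n):
    """Return (name, output shape, parameter count) for each Table 1 layer.

    m_list holds the M_l values and n is N; both are treated as plain
    positive integers (assumption). Padding and stride are also assumed.
    """
    rows = sum(m_list)  # first input dimension: sum over l of M_l
    layers = []

    # Convolutional layer 1: 3x3 kernel, 1 -> 8 channels, plus 8 biases.
    layers.append(("conv1", (rows, n, 8), 3 * 3 * 1 * 8 + 8))

    # Convolutional layer 2: 3x3 kernel, 8 -> 16 channels, plus 16 biases.
    layers.append(("conv2", (rows, n, 16), 3 * 3 * 8 * 16 + 16))

    # Fully connected layer 1: flattened conv output -> 1024 units.
    flat = rows * n * 16
    layers.append(("fc1", (1024,), flat * 1024 + 1024))

    # Fully connected layer 2: 1024 -> prod_l (M_l * N) outputs.
    n_out = prod(m * n for m in m_list)
    layers.append(("fc2", (n_out,), 1024 * n_out + n_out))
    return layers


if __name__ == "__main__":
    for name, shape, params in dqn_layer_summary([2, 3], 4):
        print(name, shape, params)
```

Because the final layer has $\prod (M_l \times N)$ outputs, its size grows multiplicatively with the number of $M_l$ terms, whereas the first fully connected layer grows only with their sum.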
Publication history
  • Received:  2021-11-08
  • Revised:  2022-03-02
  • Accepted:  2022-03-02
  • Published online:  2022-03-04
  • Issue date:  2022-03-28
