UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach

WANG Chaowei; DENG Danhao; WANG Weidong; JIANG Fan

doi:10.11999/JEIT211241

Volume 44 Issue 3

Mar. 2022

Turn off MathJax

Article Contents

Article Navigation > Journal of Electronics & Information Technology > 2022 > 44(3): 835-843

WANG Chaowei, DENG Danhao, WANG Weidong, JIANG Fan. UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach[J]. Journal of Electronics & Information Technology, 2022, 44(3): 835-843. doi: 10.11999/JEIT211241

Citation:

WANG Chaowei, DENG Danhao, WANG Weidong, JIANG Fan. UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach[J]. Journal of Electronics & Information Technology, 2022, 44(3): 835-843. doi: 10.11999/JEIT211241

Citation:

WANG Chaowei, DENG Danhao, WANG Weidong, JIANG Fan. UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach[J]. Journal of Electronics & Information Technology, 2022, 44(3): 835-843. doi: 10.11999/JEIT211241

PDF( 1654 KB)

UAV Assisted Communication and Resource Scheduling in Cell-free Massive MIMO Based on Deep Reinforcement Learning Approach

doi: 10.11999/JEIT211241 cstr: 32379.14.JEIT211241

WANG Chaowei^{1, 2
,
,},
DENG Danhao^{1, 2},
WANG Weidong^{1, 2},
JIANG Fan³

1.
School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
2.
Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing 100876, China
3.
School of Communication and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710061, China

Funds: The National Key R&D Program of China (2020YFB1807204)

Received Date: 2021-11-08
Accepted Date: 2022-03-02
Rev Recd Date: 2022-03-02

Available Online: 2022-03-04

Publish Date: 2022-03-28

Abstract

Abstract

Distributed Access Points (AP) in the cell-free massive Multiple Input Multiple Output (MIMO) networks serve multiple users at the same time, which can achieve large-capacity transmission of virtual MIMO in a larger area. Unmanned Aerial Vehicle (UAV) assisted communication can provide coverage enhancement for hotspots or edge users in this area. In order to improve the spectrum efficiency and reduce the feedback overhead, a joint resource scheduling scheme that includes AP power allocation, UAV service zone selection and user scheduling is proposed in this paper. Firstly, the AP power allocation and the UAV service zone selection problems are jointly modeled as a Double-Action Markov Decision Process (DAMDP). Then, a Deep Reinforcement Learning (DRL) algorithm based on Q-learning and Convolutional Neural Networks (CNN) is proposed. Furthermore, the user scheduling problem is formulated as a 0-1 optimization problem and solved by dividing into sub-problems. Simulation results demonstrate that the proposed DRL-based resource scheduling scheme exhibits a higher spectrum efficiency than existing schemes.
- Cell-free massive MIMO,
- UAV assisted communication,
- Resource scheduling,
- Deep Reinforcement Learning (DRL)

FullText(HTML)

References(29)

References

[1]	NGO H Q, ASHIKHMIN A, YANG Hong, et al. Cell-free massive MIMO versus small cells[J]. IEEE Transactions on Wireless Communications, 2017, 16(3): 1834–1850. doi: 10.1109/TWC.2017.2655515
[2]	尤肖虎, 尹浩, 邬贺铨. 6G与广域物联网[J]. 物联网学报, 2020, 4(1): 3–11. doi: 10.11959/j.issn.2096-3750.2020.00158 YOU Xiaohu, YIN Hao, and WU Hequan. On 6G and wide-area IoT[J]. Chinese Journal on Internet of Things, 2020, 4(1): 3–11. doi: 10.11959/j.issn.2096-3750.2020.00158
[3]	GU Shushi, ZHANG Qinyu, and XIANG Wei. Coded storage-and-computation: A new paradigm to enhancing intelligent services in space-air-ground integrated networks[J]. IEEE Wireless Communications, 2020, 27(6): 44–51. doi: 10.1109/MWC.001.2000108
[4]	ZHAO Jianwei, GAO Feifei, WU Qihui, et al. Beam tracking for UAV mounted SatCom on-the-Move with massive antenna array[J]. IEEE Journal on Selected Areas in Communications, 2018, 36(2): 363–375. doi: 10.1109/JSAC.2018.2804239
[5]	JIANG Xu, YANG Zhutian, ZHAO Nan, et al. Resource allocation and trajectory optimization for UAV-enabled multi-user covert communications[J]. IEEE Transactions on Vehicular Technology, 2021, 70(2): 1989–1994. doi: 10.1109/TVT.2021.3053936
[6]	ZHAO Jianwei, GAO Feifei, KUANG Linling, et al. Channel tracking with flight control system for UAV mmWave MIMO communications[J]. IEEE Communications Letters, 2018, 22(6): 1224–1227. doi: 10.1109/LCOMM.2018.2824800
[7]	AMMAR H A, ADVE R, SHAHBAZPANAHI S, et al. Downlink resource allocation in multiuser cell-free MIMO networks with user-centric clustering[J]. IEEE Transactions on Wireless Communications, 2022, 21(3): 1482–1497.
[8]	AMMAR H A, ADVE R, SHAHBAZPANAHI S, et al. Distributed resource allocation optimization for user-centric cell-free MIMO networks[J]. IEEE Transactions on Wireless Communications, To be published.
[9]	XIA Xinjiang, ZHU Pengcheng, LI Jiamin, et al. Joint user selection and transceiver design for cell-free with network-assisted full duplexing[J]. IEEE Transactions on Wireless Communications, 2021, 20(12): 7856–7870. doi: 10.1109/TWC.2021.3088485
[10]	WANG Dongming, ZHANG Chuan, DU Yongqiang, et al. Implementation of a cloud-based cell-free distributed massive MIMO system[J]. IEEE Communications Magazine, 2020, 58(8): 61–67. doi: 10.1109/MCOM.001.2000106
[11]	WANG Dongming, WANG Menghan, ZHU Pengcheng, et al. Performance of network-assisted full-duplex for cell-free massive MIMO[J]. IEEE Transactions on Communications, 2020, 68(3): 1464–1478. doi: 10.1109/TCOMM.2019.2962158
[12]	ZHENG Jiakang, ZHANG Jiayi, and AI Bo. UAV communications with WPT-aided cell-free massive MIMO systems[J]. IEEE Journal on Selected Areas in Communications, 2021, 39(10): 3114–3128. doi: 10.1109/JSAC.2021.3088632
[13]	D’ANDREA C, GARCIA-RODRIGUEZ A, GERACI G, et al. Analysis of UAV communications in cell-free massive MIMO systems[J]. IEEE Open Journal of the Communications Society, 2020, 1: 133–147. doi: 10.1109/OJCOMS.2020.2964983
[14]	SAMIR M, EBRAHIMI D, ASSI C, et al. Leveraging UAVs for coverage in cell-free vehicular networks: A deep reinforcement learning approach[J]. IEEE Transactions on Mobile Computing, 2021, 20(9): 2835–2847. doi: 10.1109/TMC.2020.2991326
[15]	XI Xing, CAO Xianbin, YANG Peng, et al. Joint user association and UAV location optimization for UAV-aided communications[J]. IEEE Wireless Communications Letters, 2019, 8(6): 1688–1691. doi: 10.1109/LWC.2019.2937077
[16]	GUO Yijun, YIN Sixing, and HAO Jianjun. Resource allocation and 3-D trajectory design in wireless networks assisted by rechargeable UAV[J]. IEEE Wireless Communications Letters, 2019, 8(3): 781–784. doi: 10.1109/LWC.2019.2892721
[17]	ZHANG Guangchi, YAN Haiqiang, ZENG Yong, et al. Trajectory optimization and power allocation for multi-hop UAV relaying communications[J]. IEEE Access, 2018, 6: 48566–48576. doi: 10.1109/ACCESS.2018.2868117
[18]	ZHAO Chenxi, LIU Junyu, SHENG Min, et al. Multi-UAV trajectory planning for energy-efficient content coverage: A decentralized learning-based approach[J]. IEEE Journal on Selected Areas in Communications, 2021, 39(10): 3193–3207. doi: 10.1109/JSAC.2021.3088669
[19]	MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529–533. doi: 10.1038/nature14236
[20]	KOUSHIK A M, HU Fei, and KUMAR S. Deep Q-learning-based node positioning for throughput-optimal communications in dynamic UAV swarm network[J]. IEEE Transactions on Cognitive Communications and Networking, 2019, 5(3): 554–566. doi: 10.1109/TCCN.2019.2907520
[21]	CUI Jingjing, LIU Yuanwei, and NALLANATHAN A. Multi-agent reinforcement learning-based resource allocation for UAV networks[J]. IEEE Transactions on Wireless Communications, 2020, 19(2): 729–743. doi: 10.1109/TWC.2019.2935201
[22]	DING Ruijin, GAO Feifei, and SHEN X S. 3D UAV trajectory design and frequency band allocation for energy-efficient and fair communication: A deep reinforcement learning approach[J]. IEEE Transactions on Wireless Communications, 2020, 19(12): 7796–7809. doi: 10.1109/TWC.2020.3016024
[23]	DING Ruijin, XU Yadong, GAO Feifei, et al. Trajectory design and access control for air-ground coordinated communications system with multi-agent deep reinforcement learning[J]. IEEE Internet of Things Journal. To be published.
[24]	HE Ying, ZHANG Zheng, YU F R, et al. Deep-reinforcement-learning-based optimization for cache-enabled opportunistic interference alignment wireless networks[J]. IEEE Transactions on Vehicular Technology, 2017, 66(11): 10433–10445. doi: 10.1109/TVT.2017.2751641
[25]	HE Dawei, SUN Wei, and SHI Lei. The novel mobility models based on spiral line for aerial backbone networks[J]. IEEE Access, 2020, 8: 11297–11314. doi: 10.1109/ACCESS.2020.2965616
[26]	SUTTON R S and BARTO A G. Reinforcement Learning: An Introduction[M]. 2nd ed. Cambridge: MIT Press, 2018: 1–130.
[27]	ZHANG Ran, WANG Miao, CAI L X, et al. Learning to be proactive: Self-regulation of UAV based networks with UAV and user dynamics[J]. IEEE Transactions on Wireless Communications, 2021, 20(7): 4406–4419. doi: 10.1109/TWC.2021.3058533
[28]	WANG Xue, JIN Tao, HU Liangshuai, et al. Energy-efficient power allocation and Q-learning-based relay selection for relay-aided D2D communication[J]. IEEE Transactions on Vehicular Technology, 2020, 69(6): 6452–6462. doi: 10.1109/TVT.2020.2985873
[29]	WANG Xiuhong, QIAO Qingli, and WANG Zheng’ou. Chaotic neural network technique for “0–1” programming problems[J]. Journal of Systems Engineering and Electronics, 2003, 14(4): 99–105.