A Resource Allocation Algorithm for Space-Air-Ground Integrated Network Based on Deep Reinforcement Learning

LIU Xuefang; MAO Weihao; YANG Qinghai

doi:10.11999/JEIT231016

Volume 46 Issue 7

Jul. 2024

LIU Xuefang, MAO Weihao, YANG Qinghai. A Resource Allocation Algorithm for Space-Air-Ground Integrated Network Based on Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2024, 46(7): 2831-2841. doi: 10.11999/JEIT231016

cstr: 32379.14.JEIT231016

School of Telecommunications Engineering, Xidian University, Xi’an 710071, China

Funds: The National Key Research and Development Program of China (2020YFB1807700)

Received Date: 2023-09-18
Rev Recd Date: 2024-01-19

Available Online: 2024-01-31

Publish Date: 2024-07-29

Abstract

The Space-Air-Ground Integrated Network (SAGIN) can effectively meet the communication needs of various service types by improving the resource utilization of the ground network, but existing schemes neglect the adaptability and robustness of the system and the Quality of Service (QoS) requirements of different users. To address this problem, this paper proposes a Deep Reinforcement Learning (DRL) resource allocation algorithm for urban and suburban communications under the SAGIN architecture. Based on the Reference Signal Received Power (RSRP) defined in the 3rd Generation Partnership Project (3GPP) standard, and taking ground co-frequency interference into account, an optimization problem is formulated to maximize the downlink throughput of system users, with the time-frequency resources of base stations in different domains as constraints. To solve this problem with the Deep Q-Network (DQN) algorithm, a reward function is defined that jointly considers user QoS requirements, system adaptability, and system robustness. For a mix of unmanned vehicle services, immersive services, and ordinary mobile communication services, simulation results show that the reward function value, which represents system performance, is 39.1% higher than that of the greedy algorithm after 2 000 iterations. For unmanned vehicle services, the average packet loss rate of the DQN algorithm is 38.07% lower than that of the greedy algorithm, and the delay is 6.05% lower.
Keywords:
- Space-Air-Ground Integrated Network (SAGIN)
- Resource allocation
- Deep Reinforcement Learning (DRL)
- Deep Q-Network (DQN)
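
The abstract describes a downlink throughput objective built on RSRP with ground co-frequency interference. As a rough illustration only, the sketch below computes a Shannon-capacity rate for one user from a serving-cell received power, a list of interfering received powers, and a noise floor. The function names, the use of the Shannon formula, and all numeric values are assumptions made for this example; they are not taken from the paper's system model.

```python
import math


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def downlink_rate_bps(serving_dbm: float,
                      interferers_dbm: list[float],
                      noise_dbm: float,
                      bandwidth_hz: float) -> float:
    """Shannon rate of one user under co-frequency interference.

    Hypothetical helper: SINR = S / (sum of interferers + noise),
    rate = B * log2(1 + SINR). All powers are received powers in dBm.
    """
    signal = dbm_to_mw(serving_dbm)
    interference = sum(dbm_to_mw(p) for p in interferers_dbm)
    noise = dbm_to_mw(noise_dbm)
    sinr = signal / (interference + noise)
    return bandwidth_hz * math.log2(1 + sinr)


# Illustrative values only (assumed, not from the paper).
rate_clean = downlink_rate_bps(-90.0, [], -100.0, 1e6)
rate_interfered = downlink_rate_bps(-90.0, [-95.0], -100.0, 1e6)
print(f"no interference:   {rate_clean / 1e6:.2f} Mbit/s")
print(f"with interference: {rate_interfered / 1e6:.2f} Mbit/s")
```

Summing such per-user rates over all users, subject to each base station's time-frequency resource budget, gives the kind of objective the abstract refers to; the paper's actual formulation may differ in its channel and interference models.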
