Multi-Unmanned Aerial Vehicles Trajectory Optimization for Age of Information Minimization in Wireless Sensor Networks
-
Abstract: Due to the limited transmit power of sensors in a Wireless Sensor Network (WSN) and the potentially large distance between sensors and their associated Base Station (BS), sensor data may not be delivered in time, which lowers data freshness and degrades the quality of decisions for delay-sensitive services. Employing Unmanned Aerial Vehicles (UAVs) to assist in collecting sensor data has therefore become an effective means of improving the data freshness of WSNs. This paper evaluates the data freshness of a WSN with the Age of Information (AoI) metric and develops a UAV trajectory optimization algorithm based on the Multi-Agent Proximal Policy Optimization (MAPPO) method under a centralized-training, distributed-execution framework. By jointly optimizing the flight trajectories of all UAVs, the average weighted AoI of the ground nodes is minimized. Simulation results verify the effectiveness of the proposed multi-UAV trajectory planning algorithm in reducing the AoI of the WSN.
-
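For context, the AoI referred to in the abstract is, in its standard discrete-time form, the time elapsed since the generation of the most recently collected update of a ground node. A common slotted formulation consistent with the description above is sketched below; the symbols $ \Delta_k(t) $, $ w_k $ and the node index $ k $ are illustrative and not taken verbatim from the paper.

```latex
% Per-slot AoI evolution of ground node k: the age drops back to one slot
% length after a successful collection and otherwise grows by one slot.
\Delta_k(t+1)=
\begin{cases}
T_{\mathrm{c}}, & \text{if node }k\text{'s update is collected in slot }t,\\[2pt]
\Delta_k(t)+T_{\mathrm{c}}, & \text{otherwise,}
\end{cases}
\qquad
\text{objective:}\quad
\min_{\{\text{UAV trajectories}\}}\;
\frac{1}{LK}\sum_{t=1}^{L}\sum_{k=1}^{K} w_k\,\Delta_k(t).
```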
Algorithm 1 Multi-UAV trajectory planning algorithm based on MAPPO
Initialization: Actor network parameters $ \theta $ and Critic network parameters $ \omega $; maximum number of training episodes $ T $, maximum number of steps per episode $ L $, number of UAVs $ N $, batch size $ B $, and number of epochs $ K $
For episode = 1 : $ T $ do
    Initialize each UAV's state $ {s_i}(t) $ and the joint state $ S(t) $
    For step = 1 : $ L $ do
        Each UAV selects an action $ {a_i}(t) $ according to its own policy
        Execute the joint action $ A(t) = \left\{ {{a_1}(t),{a_2}(t),\cdots,{a_N}(t)} \right\} $, obtaining the reward $ R(t) $ and the next state $ S(t + 1) $
        For UAV = 1 : $ N $ do
            Store the sampled transition $ ({s_i}(t),{a_i}(t),r(t),{s_i}(t + 1)) $ in the experience buffer $ D $
            Compute the GAE from the sampled trajectories
            For epoch = 1 : $ K $ do
                Randomly draw a mini-batch of size $ B $ from the experience buffer $ D $ to train the networks
                Apply the Adam optimizer for gradient descent
                Update the Actor network parameters $ \theta = \theta ' $ according to Eq. (20)
                Update the Critic network parameters $ \omega = \omega ' $ according to Eq. (21)
            End for
        End for
        $ s(t) = s(t + 1),\;S(t) = S(t + 1) $
    End for
End for
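A minimal PyTorch sketch of the control flow in Algorithm 1 is given below, for illustration only: the environment transition, observation/action dimensions and reward are placeholders rather than the paper's WSN simulator, a single parameter-shared actor stands in for the per-UAV actors, and mini-batch sampling from the experience buffer is omitted (the whole rollout is reused in each epoch). Only the centralized critic, GAE computation, clipped PPO update and Adam optimization follow the pseudocode.

```python
# Structural sketch of Algorithm 1 (MAPPO with a centralized critic).
# Environment dynamics, dimensions and reward are placeholders, not the paper's
# WSN simulator; one parameter-shared actor stands in for the per-UAV actors.
import torch
import torch.nn as nn

N_UAV, OBS_DIM, ACT_DIM = 3, 8, 5           # illustrative sizes
GAMMA, LAM, CLIP_EPS = 0.95, 0.95, 0.2      # values from Table 2
T_EPISODES, L_STEPS, K_EPOCHS = 10, 100, 4

actor = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, ACT_DIM))
critic = nn.Sequential(nn.Linear(N_UAV * OBS_DIM, 64), nn.Tanh(), nn.Linear(64, 1))
opt_actor = torch.optim.Adam(actor.parameters(), lr=3e-4)    # delta_1
opt_critic = torch.optim.Adam(critic.parameters(), lr=3e-4)  # delta_2

def step_env(obs, acts):
    """Placeholder transition: random next observations and a random joint reward."""
    return torch.randn(N_UAV, OBS_DIM), torch.randn(())

def gae(rewards, values):
    """Generalized Advantage Estimation over one episode (values carries a bootstrap entry)."""
    adv, running = torch.zeros_like(rewards), 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + GAMMA * values[t + 1] - values[t]
        running = delta + GAMMA * LAM * running
        adv[t] = running
    return adv

for episode in range(T_EPISODES):
    obs = torch.randn(N_UAV, OBS_DIM)                    # initialize s_i(t) and S(t)
    obs_buf, act_buf, logp_buf, rew_buf, val_buf = [], [], [], [], []
    for step in range(L_STEPS):
        with torch.no_grad():
            dist = torch.distributions.Categorical(logits=actor(obs))
            acts = dist.sample()                         # each UAV picks a_i(t)
            logp = dist.log_prob(acts)
            value = critic(obs.reshape(1, -1)).squeeze() # centralized critic on S(t)
        next_obs, reward = step_env(obs, acts)           # execute A(t), observe R(t), S(t+1)
        obs_buf.append(obs); act_buf.append(acts); logp_buf.append(logp)
        rew_buf.append(reward); val_buf.append(value)
        obs = next_obs                                   # s(t) = s(t+1), S(t) = S(t+1)

    with torch.no_grad():                                # bootstrap value for GAE
        val_buf.append(critic(obs.reshape(1, -1)).squeeze())
    rewards, values = torch.stack(rew_buf), torch.stack(val_buf)
    adv = gae(rewards, values)
    returns = adv + values[:-1]
    obs_b, act_b = torch.stack(obs_buf), torch.stack(act_buf)
    logp_old = torch.stack(logp_buf)

    for epoch in range(K_EPOCHS):                        # PPO epochs over the rollout
        dist = torch.distributions.Categorical(logits=actor(obs_b))
        ratio = (dist.log_prob(act_b) - logp_old).exp()  # per-UAV importance ratios
        a = adv.unsqueeze(-1)                            # broadcast advantages to all UAVs
        actor_loss = -torch.min(ratio * a,
                                torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS) * a).mean()
        critic_loss = (critic(obs_b.reshape(L_STEPS, -1)).squeeze(-1) - returns).pow(2).mean()
        opt_actor.zero_grad(); actor_loss.backward(); opt_actor.step()     # cf. Eq. (20)
        opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()  # cf. Eq. (21)
```

Sharing one set of actor parameters across agents is a common simplification in MAPPO implementations; the per-UAV actors implied by Algorithm 1 can be recovered by instantiating one such network per agent.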
Table 1 Simulation environment parameters
Parameter | Value
UAV flight altitude $ {h_i} $ (m) | 120
BS height $ {h_{\text{b}}} $ (m) | 25
UAV flight speed $ v $ (m/s) | 10
Minimum collision-avoidance distance $ {d_{\min }} $ (m) | 5
UAV transmit power $ {P_{\text{u}}} $ (dBm) | 23
Sensor transmit power $ {P_{\text{s}}} $ (W) | 0.1
Noise power $ {\sigma ^2} $ (dBm) | –100
Communication bandwidth $ B $ (MHz) | 5
Time-slot length $ {T_{\text{c}}} $ (s) | 1
Packet size $ S $ (bits) | 1e6
NLoS attenuation factor $ \psi $ | 0.6
Path-loss exponent $ \mu $ | 2
Channel gain at a reference distance of 1 m $ {\beta _0} $ (dB) | –60
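As a rough consistency check of the parameters in Table 1, the sketch below computes the uplink rate of a sensor at a given horizontal distance from a hovering UAV, assuming a simple distance-dependent channel gain $ \beta_0 d^{-\mu} $ (scaled by $ \psi $ under NLoS) and the Shannon capacity; the paper's probabilistic LoS/NLoS air-to-ground model may differ in detail.

```python
# Back-of-the-envelope link-rate check using the Table 1 parameters.
# The channel model beta_0 * d**(-mu) (times psi under NLoS) and the Shannon
# rate are assumptions for illustration only.
import math

h_uav = 120.0                         # UAV altitude (m)
P_s = 0.1                             # sensor transmit power (W)
sigma2 = 10 ** (-100 / 10) / 1000     # noise power: -100 dBm -> W
B = 5e6                               # bandwidth (Hz)
beta0 = 10 ** (-60 / 10)              # channel gain at 1 m: -60 dB
mu = 2.0                              # path-loss exponent
psi = 0.6                             # NLoS attenuation factor
S_packet = 1e6                        # packet size (bits)
T_c = 1.0                             # slot length (s)

def uplink_rate(horizontal_dist, los=True):
    """Shannon rate (bit/s) of the sensor-to-UAV link at a given horizontal distance."""
    d = math.sqrt(horizontal_dist ** 2 + h_uav ** 2)   # 3D sensor-UAV distance
    gain = beta0 * d ** (-mu) * (1.0 if los else psi)
    snr = P_s * gain / sigma2
    return B * math.log2(1 + snr)

for dist in (0.0, 100.0, 500.0):
    r = uplink_rate(dist)
    print(f"d_xy={dist:5.0f} m  rate={r / 1e6:7.2f} Mbit/s  "
          f"packet fits in one slot: {r * T_c >= S_packet}")
```

Under this simplified model, even a sensor several hundred metres away horizontally from a UAV at 120 m altitude delivers well above the 1 Mbit/s needed to fit a 1e6-bit packet into one 1 s slot.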
Table 2 Reinforcement learning parameters
Parameter | Value
Number of training episodes $ {T_{\max }} $ | 2e4
Maximum number of steps per episode $ {l_{\max }} $ | 100
Experience buffer size $ D $ | 1e4
Sampling batch size $ B $ | 256
Discount factor $ \gamma $ | 0.95
Actor network learning rate $ {\delta _1} $ | 3e–4
Critic network learning rate $ {\delta _2} $ | 3e–4
Clipping hyperparameter $ \varepsilon $ | 0.2
GAE parameter $ \lambda $ | 0.95
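For reference, the quantities that the hyperparameters above enter are the GAE advantage estimator and the clipped PPO surrogate, given here in their standard form; the paper's Eq. (20) and Eq. (21) cited in Algorithm 1 are presumably of this form but are not reproduced in this excerpt.

```latex
% GAE advantage with discount gamma and smoothing parameter lambda
\hat{A}_t = \sum_{l=0}^{\infty} (\gamma\lambda)^{l}\,\delta_{t+l}, \qquad
\delta_t = r_t + \gamma V_\omega(s_{t+1}) - V_\omega(s_t)
% Clipped PPO actor objective with probability ratio rho_t(theta) and clip parameter epsilon
L^{\text{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\big(\rho_t(\theta)\hat{A}_t,\;
\operatorname{clip}\big(\rho_t(\theta),\,1-\varepsilon,\,1+\varepsilon\big)\hat{A}_t\big)\right],
\qquad \rho_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
% Critic loss minimized with Adam at learning rate delta_2
L^{V}(\omega) = \mathbb{E}_t\!\left[\big(V_\omega(s_t) - \hat{R}_t\big)^{2}\right]
```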