UAV Trajectory Planning Based on Deep Q-Network for Internet of Things

ZHANG Jianhang, KANG Kai, QIAN Hua, YANG Miao

Citation: ZHANG Jianhang, KANG Kai, QIAN Hua, YANG Miao. UAV Trajectory Planning Based on Deep Q-Network for Internet of Things[J]. Journal of Electronics & Information Technology, 2022, 44(11): 3850-3857. doi: 10.11999/JEIT210962

doi: 10.11999/JEIT210962
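Funds: The National Key Research and Development Program of China (2020YFB2205603), The National Natural Science Foundation of China (61971286), The Science and Technology Commission Foundation of Shanghai (19DZ1204300)
Details
    Author biographies:

    ZHANG Jianhang: male, Ph.D. candidate; research interest: UAV-assisted IoT communications

    KANG Kai: male, professor-level senior engineer; research interest: physical layer of wireless communications

    QIAN Hua: male, research professor; research interests: wireless communications, nonlinear signal processing, and big-data signal processing

    YANG Miao: male, Ph.D. candidate; research interests: edge intelligence and reinforcement learning

    Corresponding author:

    KANG Kai kangk@sari.ac.cn

  • CLC number: TP92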

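  • Abstract: With the widespread adoption of Unmanned Aerial Vehicle (UAV) technology, Internet of Things (IoT) architectures built on UAV-assisted data collection have broadened the range of IoT applications and are particularly suited to extreme scenarios such as military battlefields and disaster relief. For these scenarios, this paper proposes a UAV flight-path planning algorithm based on the Deep Q-Network (DQN) framework. The algorithm takes the average Age of Information (AoI) of the data collected within one UAV flight cycle as its optimization objective, so as to guarantee the timeliness of the collected data. Simulation results show that the proposed algorithm effectively reduces the average AoI of the data collected in a single flight cycle. Compared with the random algorithm, the maximum-AoI-based greedy algorithm, the shortest-path algorithm, and the AoI-based Trajectory Planning (ATP) algorithm, the average AoI is reduced by about 81%, 67%, 56%, and 39%, respectively. The study thus achieves efficient, low-latency data collection in UAV-assisted IoT systems.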
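
For readers new to the metric, the AoI named in the abstract can be written in its generic form. The paper's own equations are not reproduced on this page, so the notation below ($ N $, $ u_i(t) $, $ \Delta_i(t) $, $ T $) and the choice to average at the end of the flight cycle are assumptions made for illustration only.

```latex
% Generic AoI of ground node i at time t:
%   u_i(t) = generation time of the freshest sample from node i held by the receiver at time t
\Delta_i(t) = t - u_i(t)

% Assumed objective: mean AoI over N ground nodes when the UAV finishes its flight cycle at time T
\bar{\Delta} = \frac{1}{N} \sum_{i=1}^{N} \Delta_i(T)
```

Under this reading, a sample picked up early in the flight is older by the time the cycle ends, which is why the visiting order affects the average AoI.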
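  • Figure 1  Block diagram of the DQN algorithm

    Figure 2  Performance comparison of different algorithms in the two simulation scenarios

    Figure 3  Comparison of the average AoI of different ground nodes

    Figure 4  Effect of the number of nodes on the performance of the DQN algorithm

    Figure 5  Effect of UAV flight speed on the performance of the DQN algorithm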

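    Table 1  DQN-based UAV trajectory planning algorithm

     Input: learning rate $ \alpha $; discount factor $ \gamma $; random-action (exploration) parameters $ \varepsilon $ and $ \mu $; update step $ w $
     (1) Initialize the parameters of networks $ {Q_r} $ and $ {Q_t} $, and set $ {\theta _t} = {\theta _r} $
     (2) for each training episode do
     (3)   Initialize state $ {s_k} $
     (4)   while $ {T_k} < {T_{\max }} $ do
     (5)     With probability $ \varepsilon $ select a random action $ {a_k} $; otherwise select $ {a_k} = \arg {\max _a}{Q_r}({s_k},a;{\theta _r}) $
     (6)     Execute action $ {a_k} $, update the state according to Eqs. (2) and (5), compute the reward, and store the resulting $ \left( {{s_k},{a_k},{r_k},{s_{k + 1}}} \right) $ in the experience pool
     (7)     if the experience pool is full then
     (8)       Randomly draw $ {N_b} $ samples and train according to Eq. (11)
     (9)     end if
     (10)     if $ k\bmod w = 0 $ then
     (11)       $ {\theta _t} \leftarrow {\theta _r} $
     (12)     end if
     (13)     $ {s_k} \leftarrow {s_{k + 1}} $
     (14)     $ \varepsilon \leftarrow \varepsilon - \mu $
     (15)   end while
     (16) end for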
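
The control flow of Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The toy environment in `step`, its reward, the state encoding `sid`, and all default hyperparameters are hypothetical. A linear Q-function over one-hot states stands in for the neural network, and a plain temporal-difference update stands in for Eq. (11), which is not shown on this page. As in standard DQN, the target parameters are refreshed from the online parameters every `w` steps.

```python
import random

def train(num_nodes=4, episodes=300, alpha=0.1, gamma=0.9, eps=0.95,
          mu=1e-3, w=20, pool_size=200, batch=16, seed=0):
    """Follow steps (1)-(16) of Table 1 on a hypothetical toy task."""
    rng = random.Random(seed)
    n_states = num_nodes * (1 << num_nodes)        # state = (position, visited mask)
    # (1) Two identical parameter sets: online (theta_r) and target (theta_t).
    theta_r = [[0.0] * num_nodes for _ in range(n_states)]
    theta_t = [row[:] for row in theta_r]
    pool, k = [], 0

    def sid(pos, mask):
        # Hypothetical encoding of (position, visited mask) as one integer.
        return pos * (1 << num_nodes) + mask

    def step(pos, mask, a):
        # Hypothetical dynamics: flying to node a costs |a - pos| + 1 time units,
        # and revisiting an already collected node is penalised.
        reward = -(abs(a - pos) + 1) - (5.0 if mask >> a & 1 else 0.0)
        mask |= 1 << a
        return a, mask, reward, mask == (1 << num_nodes) - 1

    for _ in range(episodes):                      # (2)
        pos, mask, t, done = 0, 1, 0, False        # (3) start at node 0
        while not done and t < 4 * num_nodes:      # (4) stand-in for T_k < T_max
            s = sid(pos, mask)
            if rng.random() < eps:                 # (5) epsilon-greedy choice
                a = rng.randrange(num_nodes)
            else:
                a = max(range(num_nodes), key=lambda i: theta_r[s][i])
            pos, mask, r, done = step(pos, mask, a)    # (6), (13) state advances
            pool.append((s, a, r, sid(pos, mask), done))
            if len(pool) > pool_size:
                pool.pop(0)                        # drop the oldest transition
            if len(pool) == pool_size:             # (7) train once the pool is full
                for s0, a0, r0, s1, d in rng.sample(pool, batch):   # (8)
                    target = r0 + (0.0 if d else gamma * max(theta_t[s1]))
                    theta_r[s0][a0] += alpha * (target - theta_r[s0][a0])
            k += 1
            if k % w == 0:                         # (10)-(11) sync target parameters
                theta_t = [row[:] for row in theta_r]
            eps = max(0.0, eps - mu)               # (14) linear decay, floored at 0
            t += 1
    return theta_r, eps
```

Calling `train()` returns the learned online parameters and the final exploration probability. Replacing the one-hot table with a neural network and `step` with the paper's state and reward definitions would be the next step toward the actual method.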

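    Table 2  DQN algorithm parameters

    Parameter | Symbol | Value
    Learning rate | $ \alpha $ | 0.001
    Discount factor | $ \gamma $ | 0.9
    Random-action probability | $ \varepsilon $ | 0.95
    Decay factor | $ \mu $ | 0.0001
    Hyperparameter | $ \lambda $ | 10
    Experience pool size | $ |D| $ | 3000
    Training batch size | $ |B| $ | 128
    Update step | $ w $ | 100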

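    Table 3  Average AoI of the different algorithms (s)

    Algorithm | DQN | ATP | Shortest path | Greedy | Random
    Simulation scenario 1 | 7.7 | 12.7 | 17.8 | 23.3 | 42.1
    Simulation scenario 2 | 8.8 | 14.3 | 19.3 | 27.6 | 45.1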
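
The percentage reductions quoted in the abstract can be cross-checked against the Table 3 values with a short calculation. The sketch below only restates the table. The dictionary layout and the helper name `reduction_vs_baselines` are illustrative choices, not part of the paper.

```python
# Average AoI in seconds, copied from Table 3.
aoi = {
    "Scenario 1": {"DQN": 7.7, "ATP": 12.7, "Shortest path": 17.8,
                   "Greedy": 23.3, "Random": 42.1},
    "Scenario 2": {"DQN": 8.8, "ATP": 14.3, "Shortest path": 19.3,
                   "Greedy": 27.6, "Random": 45.1},
}

def reduction_vs_baselines(row):
    """Percentage by which DQN lowers the average AoI relative to each baseline."""
    dqn = row["DQN"]
    return {name: round(100 * (1 - dqn / value), 1)
            for name, value in row.items() if name != "DQN"}

for scenario, row in aoi.items():
    print(scenario, reduction_vs_baselines(row))
```

For scenario 1 this gives reductions of roughly 39.4% (ATP), 56.7% (shortest path), 67.0% (greedy), and 81.7% (random), in line with the approximate figures in the abstract. Scenario 2 gives about 38.5%, 54.4%, 68.1%, and 80.5%.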
Publication history
  • Received:  2021-09-09
  • Revised:  2021-11-05
  • Available online:  2022-04-14
  • Published:  2022-11-14
