Hierarchical Attention Mechanism-based Path Planning for Multi-UAV Inspection

FEI Bowen; XING Wenjie; LIU Daqian

doi:10.11999/JEIT260192

cstr: 32379.14.JEIT260192

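Software College (College of Artificial Intelligence), Liaoning Technical University, Huludao 125105, China

Funds: National Natural Science Foundation of China (62302509, 62303477); Liaoning Province Artificial Intelligence Innovation and Development Plan Project (2023JH26/10300027); Basic Research Project of Liaoning Provincial Department of Education (LJ212410147090)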

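Author biographies:
FEI Bowen: female, associate professor; research interests include distributed resource organization and optimization, and intelligent unmanned systems

XING Wenjie: male, master's student; research interest is multi-UAV path planning

LIU Daqian: male, professor; research interests include intelligent unmanned systems and cooperative optimization of multiple unmanned platforms

Corresponding author:
FEI Bowen, feibowen@lntu.edu.cn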

CLC number: TN929.5; TP391
Publication history
- Received: 2026-02-27
- Revised: 2026-05-20
- Accepted: 2026-05-29
- Published online: 2026-06-08

Hierarchical Attention Mechanism-based Path Planning for Multi-UAV Inspection

Software College (College of Artificial Intelligence), Liaoning Technical University, Huludao 125105, China

Funds: The National Natural Science Foundation of China (62302509, 62303477), Liaoning Province Artificial Intelligence Innovation and Development Plan Project (2023JH26/10300027), the Basic Research Project of Liaoning Provincial Department of Education (LJ212410147090)

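Abstract: Power inspection in complex grid systems is increasingly carried out with multiple Unmanned Aerial Vehicles (UAVs), but existing multi-UAV inspection methods generally suffer from insufficient cooperative scheduling and have difficulty accurately modeling the topological relationships among inspection nodes. To address these problems, this paper proposes Hierarchical Attention mechanism-based Path Planning for multi-UAV Inspection (HAPPI). The method adopts an encoder-decoder architecture and designs a multi-level attention mechanism that strengthens global information awareness and avoids short-sighted decisions. On this basis, by explicitly learning the topological relationships among charging stations and the spatial distribution of inspection device nodes, a UAV path selection strategy is proposed that minimizes the total flight distance of the UAV fleet under strict energy constraints. To validate the method, experiments are conducted on scenarios of three different scales. Across these scenarios, the proposed method improves performance by about 12% on average over the baseline methods. The results show that HAPPI delivers superior and stable performance under varying problem scales, node distributions, and numbers of charging stations, with good generalization ability and path planning effectiveness.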
Abstract:

Objective: In modern power inspection, the use of multiple Unmanned Aerial Vehicles (UAVs) for cooperative inspection is an efficient but challenging task. Existing multi-UAV path planning methods often have limited cooperative scheduling capability. They also fail to accurately capture the topological relationships among heterogeneous nodes, especially those between device nodes and charging stations under strict energy constraints. To address these limitations, this paper proposes Hierarchical Attention mechanism-based Path Planning for multi-UAV Inspection (HAPPI). The objective is to minimize the total flight distance of the UAV fleet while ensuring that all device nodes are inspected and all UAVs safely return to the base station under energy and visit-count constraints.

Methods: The multi-UAV power inspection problem is first formulated as a combinatorial optimization problem with energy constraints. It is then modeled within a Markov Decision Process (MDP) framework. To solve this problem, HAPPI adopts an encoder-decoder architecture with a customized hierarchical attention mechanism. The encoder uses a multi-level attention design to model three types of node relationships. Self-attention among device nodes is used to learn spatial proximity and visit-order preferences. Cross-attention between device nodes and charging stations is used to model energy supply-demand relationships. Self-attention among charging stations is used to explicitly capture the topological structure of the charging-station network. This hierarchical design enables the model to distinguish functional differences and dependencies among heterogeneous nodes. The decoder integrates the global graph embedding, the embedding of the last visited node, and the current remaining energy of the UAV to generate a context vector. A single-head attention mechanism is then used to compute compatibility scores for all candidate nodes. A masking strategy excludes infeasible nodes, including visited nodes, unreachable nodes, nodes that would prevent the UAV from reaching a charging station, and premature returns to the base station. The final node is selected from a probability distribution generated by softmax, which supports both greedy and sampling decoding strategies. The policy network is trained using reinforcement learning, and a baseline network is used to stabilize training. Policy-gradient optimization is used to minimize the expected total path length (Fig. 2).

Results and Discussions: Extensive simulations are conducted on three problem scales: T20C2, with 20 device nodes and 2 charging stations; T60C6, with 60 device nodes and 6 charging stations; and T100C10, with 100 device nodes and 10 charging stations. The training results show that HAPPI achieves faster convergence and a lower final cost than the baseline Attention Model (AM) and Heterogeneous Attention-based Deep Reinforcement Learning (HADRL) methods (Fig. 4). In the comprehensive performance comparison, HAPPI with sampling obtains the shortest total path lengths on T60C6 and T100C10, with values of 6.21 and 8.41, respectively. It outperforms five classical metaheuristic algorithms and two deep reinforcement learning baselines on these two larger-scale scenarios (Table 1). On T20C2, HADRL with sampling achieves the shortest path length, whereas HAPPI remains highly competitive. Overall, HAPPI reduces the total path length by approximately 12% on average compared with the baseline methods. The visualization results show that HAPPI generates routes with fewer route crossovers and a more balanced workload among UAVs, improving safety and efficiency (Fig. 6 and Fig. 7). The single-UAV path length distribution further confirms the superior load-balancing capability of HAPPI across all problem scales (Fig. 8).

Conclusions: This paper presents HAPPI, a hierarchical attention mechanism-based deep reinforcement learning method for cooperative path planning in multi-UAV power inspection scenarios with multiple charging stations. By explicitly modeling spatial relationships among device nodes, energy dependencies between device nodes and charging stations, and the internal topology of the charging-station network, HAPPI improves information aggregation and constraint satisfaction. Experimental results across different problem scales show that HAPPI achieves higher planning quality, greater computational efficiency, and stronger generalization than heuristic and learning-based comparison methods. Future work will extend this framework to multi-objective optimization that considers time, risk, and energy trade-offs, and will further validate the method using real-world inspection data.
Key words: Power inspection; Multiple Unmanned Aerial Vehicles (Multi-UAV); Path planning; Hierarchical attention mechanism; Explicit learning
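
The encoder and decoder steps described in the Methods part of the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: learned projection matrices, multiple heads, the policy-gradient training loop, and the rule against premature returns to the base station are all omitted. Every function name (`attend`, `hierarchical_encode`, `feasible_mask`, `select_node`), the toy data, and the choice to measure energy in distance units are assumptions made for illustration. The context vector here is simply the mean node embedding plus the last visited node's embedding, and the remaining energy enters only through the mask.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def add(*vecs):
    """Element-wise sum of equally long vectors."""
    return [sum(parts) for parts in zip(*vecs)]

def softmax(scores):
    """Softmax over a list; entries equal to -inf get probability 0."""
    finite = [s for s in scores if s != -math.inf]
    if not finite:
        raise ValueError("no feasible node left")
    top = max(finite)
    exps = [0.0 if s == -math.inf else math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Single-head scaled dot-product attention, no learned projections."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        out.append([dot(w, col) for col in zip(*values)])
    return out

def hierarchical_encode(dev, sta):
    """One residual layer combining the three relations named in the abstract."""
    dev_self = attend(dev, dev, dev)     # device-device self-attention
    dev_cross = attend(dev, sta, sta)    # device-to-station cross-attention
    sta_self = attend(sta, sta, sta)     # station-station self-attention
    new_dev = [add(x, s, c) for x, s, c in zip(dev, dev_self, dev_cross)]
    new_sta = [add(x, s) for x, s in zip(sta, sta_self)]
    return new_dev, new_sta

def feasible_mask(pos, coords, is_station, visited, energy):
    """A node is allowed only if it is unvisited, reachable with the remaining
    energy, and still leaves enough energy to reach some charging station."""
    stations = [c for c, s in zip(coords, is_station) if s]
    mask = []
    for c, s, v in zip(coords, is_station, visited):
        leg = math.dist(pos, c)
        to_station = 0.0 if s else min(math.dist(c, t) for t in stations)
        mask.append((not v) and leg + to_station <= energy)
    return mask

def select_node(context, nodes, feasible, rng=None):
    """Masked compatibility scores -> softmax -> greedy or sampled choice."""
    scale = math.sqrt(len(context))
    scores = [dot(context, n) / scale if ok else -math.inf
              for n, ok in zip(nodes, feasible)]
    probs = softmax(scores)
    if rng is None:  # greedy decoding: take the most probable node
        return max(range(len(probs)), key=probs.__getitem__), probs
    # sampling decoding: draw one node according to the probabilities
    return rng.choices(range(len(probs)), weights=probs, k=1)[0], probs

# Toy instance (hypothetical data): 5 device nodes followed by 2 charging stations.
rng = random.Random(0)
dim = 4
dev = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
sta = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]
dev_h, sta_h = hierarchical_encode(dev, sta)
nodes = dev_h + sta_h

coords = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9), (0.5, 0.5), (0.3, 0.2),
          (0.0, 0.0), (1.0, 1.0)]
is_station = [False] * 5 + [True] * 2
visited = [True] + [False] * 6   # the UAV currently sits on device 0
mask = feasible_mask(coords[0], coords, is_station, visited, energy=0.5)

graph = [sum(col) / len(nodes) for col in zip(*nodes)]  # global graph embedding
context = add(graph, nodes[0])                           # plus last visited node
greedy_idx, probs = select_node(context, nodes, mask)
sampled_idx, _ = select_node(context, nodes, mask, rng)
print(mask, greedy_idx, sampled_idx)
```

In this toy instance, device 4 lies within range of the UAV but is masked out, because flying there would leave too little energy to reach any charging station. Masked nodes receive probability exactly zero, so neither greedy nor sampling decoding can select them.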

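Fig. 1 Schematic of multi-UAV power inspection path planning

Fig. 2 Architecture of the hierarchical attention model for multi-UAV path planning

Fig. 3 Node distributions for the three problem scales

Fig. 4 Training process of the three algorithms on T20C2, T60C6, and T100C10

Fig. 5 Comparison of objective values and running times of the algorithms at different problem scales

Fig. 6 Path planning results of the three algorithms on T60C6

Fig. 7 Path planning results of the three algorithms on T100C10

Fig. 8 Single-UAV path lengths for the three problem scales

Fig. 9 Relationship between the number of charging stations and model performance on T60C6 and T100C10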

Table 1 Performance comparison of algorithms on T20C2, T60C6, and T100C10

Method	T20C2 objective	T20C2 gap (%)	T20C2 time (s)	T60C6 objective	T60C6 gap (%)	T60C6 time (s)	T100C10 objective	T100C10 gap (%)	T100C10 time (s)
ABC	6.36	58.6	2.38	17.77	186	11.17	32.40	285	58.78
VNS	6.45	60.8	0.58	14.02	125	3.71	24.89	195	9.48
SA	5.27	31.42	75.96	8.12	30.75	291.36	12.67	50.65	595.37
ACO	7.84	95.51	4.03	14.68	136	24.76	18.90	124	65.74
TS	7.72	92.5	1.06	16.84	171	4.48	27.70	229	10.91
AM (greedy)	4.32	5.2	0.02	6.60	6.28	0.04	8.67	3.1	0.09
AM (sampling)	4.27	4	0.02	6.58	5.9	0.04	8.63	2.6	0.08
HADRL (greedy)	4.19	4.5	0.01	6.47	4.2	0.03	8.51	1.2	0.08
HADRL (sampling)	4.01	<0.01	0.01	6.44	3.7	0.02	8.49	0.9	0.07
HAPPI (greedy)	4.09	1.9	0.01	6.39	2.9	0.04	8.46	0.6	0.08
HAPPI (sampling)	4.07	1.4	0.01	6.21	<0.01	0.03	8.41	<0.01	0.1
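
The gap column of Table 1 is not defined in this excerpt. Most of its entries are consistent, up to rounding or truncation, with the relative excess over the best objective value in the same scenario. The sketch below works under that inferred assumption; the function name `gap_percent` and the dictionary layout are illustrative, and only a subset of the T60C6 objective values from the table is used.

```python
def gap_percent(value, best):
    """Relative excess of `value` over the best objective, in percent.
    Assumed definition, inferred from the numbers in Table 1."""
    return (value - best) / best * 100.0

# Objective values copied from the T60C6 columns of Table 1.
t60c6 = {
    "SA": 8.12,
    "AM (greedy)": 6.60,
    "HADRL (sampling)": 6.44,
    "HAPPI (greedy)": 6.39,
    "HAPPI (sampling)": 6.21,
}

best = min(t60c6.values())
gaps = {name: round(gap_percent(v, best), 2) for name, v in t60c6.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f}%")
```

Under this definition the best method in a scenario has a gap of zero, which matches the `<0.01` entries in the table, and AM (greedy) on T60C6 comes out at about 6.28%.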

Table 2 Performance comparison of algorithms under Gaussian distribution

Scenario	Metric	AM (greedy)	AM (sampling)	HADRL (greedy)	HADRL (sampling)	HAPPI (greedy)	HAPPI (sampling)
T20C2(0.3)	Objective (km)	5.38	5.21	5.18	4.98	4.33	4.17
T20C2(0.3)	Running time (s)	0.02	0.02	0.02	<0.01	0.01	0.01
T60C6(0.3)	Objective (km)	10.27	10.14	9.87	9.82	7.33	7.11
T60C6(0.3)	Running time (s)	0.05	0.04	0.04	0.04	0.03	0.04
T100C10(0.3)	Objective (km)	13.28	12.94	11.83	11.51	10.37	10.19
T100C10(0.3)	Running time (s)	0.07	0.08	0.06	0.06	0.07	0.07
T20C2(0.6)	Objective (km)	4.61	4.41	4.34	4.11	4.17	4.13
T20C2(0.6)	Running time (s)	0.01	0.02	0.01	0.01	<0.01	0.02
T60C6(0.6)	Objective (km)	9.41	9.16	8.34	8.12	7.91	7.78
T60C6(0.6)	Running time (s)	0.05	0.06	0.05	0.05	0.06	0.04
T100C10(0.6)	Objective (km)	10.93	10.87	10.73	10.15	10.01	9.97
T100C10(0.6)	Running time (s)	0.05	0.05	0.07	0.06	0.05	0.04

Table 3 Performance comparison of algorithms under Rayleigh distribution

Scenario	Metric	AM (greedy)	AM (sampling)	HADRL (greedy)	HADRL (sampling)	HAPPI (greedy)	HAPPI (sampling)
T20C2(0.3)	Objective (km)	4.92	4.87	4.75	4.69	4.47	4.41
T20C2(0.3)	Running time (s)	0.02	0.02	0.01	0.03	<0.01	0.01
T60C6(0.3)	Objective (km)	8.86	8.74	8.21	8.17	7.81	7.93
T60C6(0.3)	Running time (s)	0.06	0.03	0.05	0.07	0.04	0.04
T100C10(0.3)	Objective (km)	12.34	11.98	11.02	10.97	10.75	10.42
T100C10(0.3)	Running time (s)	0.08	0.09	0.08	0.07	0.09	0.08
T20C2(0.6)	Objective (km)	4.70	4.67	4.58	4.49	4.36	4.31
T20C2(0.6)	Running time (s)	0.02	0.02	0.01	0.02	0.02	0.02
T60C6(0.6)	Objective (km)	9.63	9.59	8.83	8.65	7.92	7.78
T60C6(0.6)	Running time (s)	0.06	0.06	0.04	0.06	0.05	0.05
T100C10(0.6)	Objective (km)	14.91	14.66	13.78	13.52	12.63	12.31
T100C10(0.6)	Running time (s)	0.09	0.1	0.08	0.06	0.07	0.08
