The Method of Hierarchical Attention Mechanism-based Path Planning for Multi-UAV Inspection
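Summary: Existing multi-UAV methods for power inspection in complex grid systems commonly suffer from insufficient cooperative scheduling and have difficulty accurately constructing the topological relationships among inspection nodes. To address these problems, this paper proposes a Hierarchical Attention mechanism-based Path Planning method for multi-UAV Inspection (HAPPI). The method adopts an encoder-decoder architecture and designs a multi-level attention mechanism that strengthens global information awareness and avoids short-sighted decisions. On this basis, by explicitly learning the topological relationships among charging stations and the spatial distribution of inspection equipment points, a UAV path selection strategy is proposed that minimizes the total flight distance of the UAV fleet under strict energy constraints. To validate the method, experiments are conducted on scenarios of three different scales. Across these scenarios, the proposed method improves performance by about 12% on average over the baseline methods. The results show that HAPPI performs well and stably under varying problem scales, node distributions, and numbers of charging stations, with good generalization ability and path planning effectiveness.

Abstract: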
Objective: In modern power grid maintenance, deploying multiple Unmanned Aerial Vehicles (UAVs) for efficient, cooperative inspection has become a pivotal yet challenging task. Existing multi-UAV path planning methods often struggle with inadequate cooperative scheduling and fail to accurately capture the complex topological relationships among heterogeneous nodes, specifically between inspection equipment points and charging stations under strict energy constraints. To address these limitations, this paper proposes a novel Hierarchical Attention mechanism-based Path Planning method for multi-UAV power Inspection (HAPPI). The primary objective is to minimize the total flight distance of a UAV fleet while ensuring that all devices are inspected and all UAVs safely return to the base, despite dynamic energy consumption and partial environmental observability.

Methods: The multi-UAV power inspection problem is first formulated as a combinatorial optimization problem with energy constraints and modeled within a Markov Decision Process (MDP) framework. To solve this problem, HAPPI adopts an encoder-decoder architecture built on a customized hierarchical attention mechanism. The encoder employs a multi-level attention design comprising three dedicated layers: the first layer uses self-attention among all equipment nodes to learn their spatial proximity and visitation preferences; the second layer applies cross-attention between equipment nodes and charging stations to model their energy supply-demand relationships; and the third layer uses self-attention among charging stations to explicitly capture the topological structure of the charging network. This hierarchical design enables the model to discern functional differences and dependencies among heterogeneous nodes. The decoder integrates the global graph embedding, the embedding of the last visited node, and the UAV's current remaining energy to generate a context vector. Through a single-head attention mechanism, it computes compatibility scores for all candidate nodes, followed by a masking strategy that invalidates infeasible nodes (e.g., visited nodes, unreachable nodes, or nodes that would strand the UAV). The final node selection follows a probability distribution derived via softmax, supporting both greedy and sampling-based decoding strategies. The policy network is trained using a reinforcement learning framework with a baseline network to stabilize training, optimizing the parameters via policy gradient to minimize the expected total path length (Fig. 2).

Results and Discussions: Extensive simulations are conducted across three problem scales: T20C2 (20 devices, 2 stations), T60C6 (60 devices, 6 stations), and T100C10 (100 devices, 10 stations). The training process shows that HAPPI converges faster and reaches a lower final cost than the baseline Attention Model (AM) and Heterogeneous Attention-based Deep Reinforcement Learning (HADRL) methods (Fig. 4). In comprehensive performance comparisons, HAPPI (sampling) obtains the shortest total path lengths on T60C6 (6.21) and T100C10 (8.41), outperforming five classical meta-heuristic algorithms and the two deep reinforcement learning baselines (Table 3). Notably, HAPPI reduces total path length by approximately 12% on average compared to the baselines. Visualization of the planned paths shows that HAPPI generates routes with fewer crossings and a more balanced workload distribution among UAVs, enhancing both safety and efficiency (Fig. 6, Fig. 7). The single-UAV path length distribution further confirms HAPPI's superior load-balancing capability across all scales (Fig. 8).

Conclusions: This paper presents HAPPI, a novel hierarchical attention mechanism-based deep reinforcement learning method for cooperative path planning in multi-UAV power inspection scenarios with multiple charging stations. By explicitly modeling the spatial relationships among equipment points, the energy dependencies between equipment and charging stations, and the internal topology of the charging network, HAPPI effectively addresses the shortcomings of existing methods in information aggregation and constraint satisfaction. Experimental results across various scales show that HAPPI achieves superior planning quality, higher computational efficiency, and stronger generalization compared to state-of-the-art heuristic and learning-based algorithms. Future work will extend the framework to multi-objective optimization incorporating time, risk, and energy trade-offs, and validate the method with real-world inspection data.
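The decoding step described in the Methods (compatibility scores, a feasibility mask, softmax, then greedy or sampled selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the 2-D coordinates, and the simplified energy model in which one unit of distance costs one unit of energy. The scores are hand-picked stand-ins for the output of the single-head attention layer.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def feasibility_mask(current, energy, nodes, stations, visited):
    """Return True for each candidate node that may still be selected.

    Hypothetical rule: a node is feasible if it is unvisited and the UAV
    can fly to it and then still reach the nearest charging station.
    This covers both "unreachable" and "would strand the UAV".
    """
    mask = []
    for i, node in enumerate(nodes):
        if i in visited:
            mask.append(False)
            continue
        to_node = dist(current, node)
        to_nearest_station = min(dist(node, s) for s in stations)
        mask.append(to_node + to_nearest_station <= energy)
    return mask


def masked_softmax(scores, mask):
    """Softmax over feasible entries; infeasible entries get probability 0."""
    feasible = [s for s, ok in zip(scores, mask) if ok]
    if not feasible:
        raise ValueError("no feasible candidate node")
    peak = max(feasible)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def select_node(probs, greedy=True, rng=random):
    """Pick the most likely node (greedy) or sample from the distribution."""
    if greedy:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy instance: three equipment nodes, one charging station at the origin.
nodes = [(0.2, 0.0), (0.9, 0.0), (0.4, 0.0)]
stations = [(0.0, 0.0)]
mask = feasibility_mask((0.0, 0.0), 1.0, nodes, stations, visited={0})
probs = masked_softmax([2.0, 5.0, 1.0], mask)
choice = select_node(probs, greedy=True)
```

In this toy instance node 0 is masked because it was already visited, and node 1 is masked because flying there and back to the station would exceed the remaining energy, so node 2 is chosen even though its raw score is the lowest. Greedy decoding is deterministic, while sampling draws several candidate routes from the same distribution so that the best one can be kept.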
Key words:
- Power inspection
- Multi-UAV
- Path planning
- Hierarchical attention mechanism
- Explicit learning
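Table 1 Performance comparison of the algorithms on T20C2, T60C6, and T100C10

| Method | T20C2 objective | T20C2 gap (%) | T20C2 time (s) | T60C6 objective | T60C6 gap (%) | T60C6 time (s) | T100C10 objective | T100C10 gap (%) | T100C10 time (s) |
|---|---|---|---|---|---|---|---|---|---|
| ABC | 6.36 | 58.6 | 2.38 | 17.77 | 186 | 11.17 | 32.40 | 285 | 58.78 |
| VNS | 6.45 | 60.8 | 0.58 | 14.02 | 125 | 3.71 | 24.89 | 195 | 9.48 |
| SA | 5.27 | 31.42 | 75.96 | 8.12 | 30.75 | 291.36 | 12.67 | 50.65 | 595.37 |
| ACO | 7.84 | 95.51 | 4.03 | 14.68 | 136 | 24.76 | 18.90 | 124 | 65.74 |
| TS | 7.72 | 92.5 | 1.06 | 16.84 | 171 | 4.48 | 27.70 | 229 | 10.91 |
| AM (greedy) | 4.32 | 5.2 | 0.02 | 6.60 | 6.28 | 0.04 | 8.67 | 3.1 | 0.09 |
| AM (sampling) | 4.27 | 4 | 0.02 | 6.58 | 5.9 | 0.04 | 8.63 | 2.6 | 0.08 |
| HADRL (greedy) | 4.19 | 4.5 | 0.01 | 6.47 | 4.2 | 0.03 | 8.51 | 1.2 | 0.08 |
| HADRL (sampling) | 4.01 | <0.01 | 0.01 | 6.44 | 3.7 | 0.02 | 8.49 | 0.9 | 0.07 |
| HAPPI (greedy) | 4.09 | 1.9 | 0.01 | 6.39 | 2.9 | 0.04 | 8.46 | 0.6 | 0.08 |
| HAPPI (sampling) | 4.07 | 1.4 | 0.01 | 6.21 | <0.01 | 0.03 | 8.41 | <0.01 | 0.1 |

Table 2 Performance comparison of the algorithms under the Gaussian distribution

| Scenario | Metric | AM (greedy) | AM (sampling) | HADRL (greedy) | HADRL (sampling) | HAPPI (greedy) | HAPPI (sampling) |
|---|---|---|---|---|---|---|---|
| T20C2 (0.3) | Objective | 5.38 | 5.21 | 5.18 | 4.98 | 4.33 | 4.17 |
| T20C2 (0.3) | Time | 0.02 | 0.02 | 0.02 | <0.01 | 0.01 | 0.01 |
| T60C6 (0.3) | Objective | 10.27 | 10.14 | 9.87 | 9.82 | 7.33 | 7.11 |
| T60C6 (0.3) | Time | 0.05 | 0.04 | 0.04 | 0.04 | 0.03 | 0.04 |
| T100C10 (0.3) | Objective | 13.28 | 12.94 | 11.83 | 11.51 | 10.37 | 10.19 |
| T100C10 (0.3) | Time | 0.07 | 0.08 | 0.06 | 0.06 | 0.07 | 0.07 |
| T20C2 (0.6) | Objective | 4.61 | 4.41 | 4.34 | 4.11 | 4.17 | 4.13 |
| T20C2 (0.6) | Time | 0.01 | 0.02 | 0.01 | 0.01 | <0.01 | 0.02 |
| T60C6 (0.6) | Objective | 9.41 | 9.16 | 8.34 | 8.12 | 7.91 | 7.78 |
| T60C6 (0.6) | Time | 0.05 | 0.06 | 0.05 | 0.05 | 0.06 | 0.04 |
| T100C10 (0.6) | Objective | 10.93 | 10.87 | 10.73 | 10.15 | 10.01 | 9.97 |
| T100C10 (0.6) | Time | 0.05 | 0.05 | 0.07 | 0.06 | 0.05 | 0.04 |

Table 3 Performance comparison of the algorithms under the Rayleigh distribution

| Scenario | Metric | AM (greedy) | AM (sampling) | HADRL (greedy) | HADRL (sampling) | HAPPI (greedy) | HAPPI (sampling) |
|---|---|---|---|---|---|---|---|
| T20C2 (0.3) | Objective | 4.92 | 4.87 | 4.75 | 4.69 | 4.47 | 4.41 |
| T20C2 (0.3) | Time | 0.02 | 0.02 | 0.01 | 0.03 | <0.01 | 0.01 |
| T60C6 (0.3) | Objective | 8.86 | 8.74 | 8.21 | 8.17 | 7.81 | 7.93 |
| T60C6 (0.3) | Time | 0.06 | 0.03 | 0.05 | 0.07 | 0.04 | 0.04 |
| T100C10 (0.3) | Objective | 12.34 | 11.98 | 11.02 | 10.97 | 10.75 | 10.42 |
| T100C10 (0.3) | Time | 0.08 | 0.09 | 0.08 | 0.07 | 0.09 | 0.08 |
| T20C2 (0.6) | Objective | 4.70 | 4.67 | 4.58 | 4.49 | 4.36 | 4.31 |
| T20C2 (0.6) | Time | 0.02 | 0.02 | 0.01 | 0.02 | 0.02 | 0.02 |
| T60C6 (0.6) | Objective | 9.63 | 9.59 | 8.83 | 8.65 | 7.92 | 7.78 |
| T60C6 (0.6) | Time | 0.06 | 0.06 | 0.04 | 0.06 | 0.05 | 0.05 |
| T100C10 (0.6) | Objective | 14.91 | 14.66 | 13.78 | 13.52 | 12.63 | 12.31 |
| T100C10 (0.6) | Time | 0.09 | 0.1 | 0.08 | 0.06 | 0.07 | 0.08 |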
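The percentage-difference column of Table 1 appears to be the relative excess of each method's objective value over the best value in the same scenario. The sketch below checks that reading against a few T20C2 entries. The formula and the helper name `gap_percent` are assumptions, not something the page states. A few rows (for example the AM entries on T20C2) do not reproduce from the rounded objective values, so the published figures may have been computed from unrounded data or in a different way.

```python
def gap_percent(objective, best):
    """Relative excess of `objective` over `best`, in percent (assumed formula)."""
    if best <= 0:
        raise ValueError("best must be positive")
    return 100.0 * (objective - best) / best


# Objective values copied from the T20C2 columns of Table 1.
t20c2 = {
    "ABC": 6.36,
    "VNS": 6.45,
    "HADRL (greedy)": 4.19,
    "HADRL (sampling)": 4.01,
}

best = min(t20c2.values())  # 4.01, obtained by HADRL (sampling)
gaps = {name: round(gap_percent(value, best), 1) for name, value in t20c2.items()}
```

For these four rows the computed gaps agree with the table (58.6, 60.8, and 4.5, with the best method at zero, which the table reports as <0.01).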