Performance Optimization of UAV-RIS-assisted Communication Networks Under No-Fly Zone Constraints
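Summary: In practical deployments of Unmanned Aerial Vehicle (UAV)-assisted communication networks, No-Fly Zones (NFZs) shrink the feasible airspace and force the UAV to detour, which increases path loss and degrades communication performance. To restore and enhance coverage, this paper mounts a Reconfigurable Intelligent Surface (RIS) on the UAV platform and applies coordinated phase control to build a programmable reflection link. However, the directional gain of the RIS is highly sensitive to the UAV attitude, which in turn affects system performance. To this end, a new communication framework with a UAV-mounted RIS is proposed. For an environment with multiple NFZs, a communication-rate maximization problem is formulated by jointly optimizing the UAV trajectory, RIS phase shifts, UAV attitude, and base station beamforming. A path integral-based scheme for complete NFZ avoidance is also proposed, which strictly bypasses the NFZs while still serving users both inside and outside them. Because the optimization problem is highly complex, it is modeled as a Markov Decision Process (MDP) and solved with a deep reinforcement learning algorithm based on Soft Actor-Critic. Simulation results show that, while guaranteeing complete NFZ avoidance, the proposed method significantly improves the communication rate and outperforms baseline schemes in scalability and stability.

Abstract: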
Objective: Reconfigurable Intelligent Surfaces (RIS) mounted on Unmanned Aerial Vehicles (UAVs) are considered an effective approach to enhance wireless communication coverage and adaptability in complex or constrained environments. However, two major challenges remain in practical deployment. First, No-Fly Zones (NFZs), such as airports, government facilities, and high-rise areas, restrict the UAV flight trajectory and may result in communication blind spots. Second, the continuous attitude variation of UAVs during flight causes dynamic misalignment between the RIS and the desired reflection direction, which reduces signal strength and system throughput. To address these challenges, a UAV-RIS-assisted communication framework is proposed that simultaneously considers NFZ avoidance and UAV attitude adjustment. In this framework, a quadrotor UAV equipped with a bottom-mounted RIS operates in an environment containing multiple polygonal NFZs and a group of Ground Users (GUs). The aim is to jointly optimize the UAV trajectory, RIS phase shifts, UAV attitude (represented by Euler angles), and Base Station (BS) beamforming to maximize the system sum rate while ensuring complete obstacle avoidance and stable, high-quality service for GUs located both inside and outside NFZs.

Methods: To achieve this objective, a multi-variable coupled non-convex optimization problem is formulated, jointly capturing UAV trajectory, RIS configuration, UAV attitude, and BS beamforming under NFZ constraints. The RIS phase shifts are dynamically adjusted according to the UAV orientation to maintain beam alignment, and UAV motion follows quadrotor dynamics while avoiding polygonal NFZs. Because of the high dimensionality and non-convexity of the problem, conventional optimization approaches are computationally intensive and lack real-time adaptability.
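The attitude sensitivity mentioned above can be illustrated with a small geometric sketch. It is not the paper's channel model: it assumes a Z-up world frame, Z-Y-X (yaw, pitch, roll) Euler angles, and a RIS whose normal points straight down in the UAV body frame. The function names `rotation_zyx` and `ris_alignment` are hypothetical. The sketch returns the cosine of the angle between the rotated RIS normal and the direction to a target, which is one simple proxy for how well the surface faces that target.

```python
import math

def rotation_zyx(yaw, pitch, roll):
    """Body-to-world rotation matrix for Z-Y-X Euler angles (radians)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def ris_alignment(uav_pos, target_pos, yaw, pitch, roll):
    """Cosine of the angle between the RIS normal and the UAV-to-target direction.

    1.0 means the target lies exactly along the RIS normal;
    values <= 0 mean the target is behind the surface.
    """
    rot = rotation_zyx(yaw, pitch, roll)
    # Assumption: a bottom-mounted RIS faces straight down in the body frame.
    body_normal = (0.0, 0.0, -1.0)
    normal = [sum(rot[i][j] * body_normal[j] for j in range(3)) for i in range(3)]
    direction = [t - u for t, u in zip(target_pos, uav_pos)]
    dist = math.sqrt(sum(d * d for d in direction))
    return sum(n * d for n, d in zip(normal, direction)) / dist
```

With a level UAV hovering directly above a user the alignment is 1, and rolling the UAV by 30 degrees reduces it to cos(30°), which shows why attitude has to be optimized together with the RIS phase shifts.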
To address this issue, the problem is reformulated as a Markov Decision Process (MDP), which enables policy learning through deep reinforcement learning. The Soft Actor-Critic (SAC) algorithm is employed, leveraging entropy regularization to improve exploration efficiency and convergence stability. The UAV-RIS agent interacts iteratively with the environment, updating actor-critic networks to determine UAV position, RIS phase configuration, and BS beamforming. Through continuous learning, the proposed framework achieves higher throughput and reliable NFZ avoidance, outperforming existing benchmarks.

Results and Discussions: As shown in Fig. 3, the proposed SAC algorithm achieves higher communication rates than PPO, DDPG, and TD3 during training, benefiting from entropy-regularized exploration that prevents premature convergence. Although DDPG converges faster, it exhibits instability and inferior long-term performance. As illustrated in Fig. 4, the UAV trajectories under different conditions demonstrate the proposed algorithm’s capability to achieve complete obstacle avoidance while maintaining reliable communication. Regardless of variations in initial UAV positions, BS locations, or NFZ configurations, the UAV consistently avoids all NFZs and dynamically adjusts its trajectory to serve users located both inside and outside restricted zones, indicating strong adaptability and scalability of the proposed model. As shown in Fig. 5, increasing the number of BS antennas enhances system performance. The proposed framework significantly outperforms the fixed phase shift, random phase shift, and non-RIS schemes because of improved beamforming flexibility.

Conclusions: This paper investigates a UAV-RIS-assisted wireless communication system in which a quadrotor UAV carries a RIS to enhance signal reflection while avoiding NFZs.
Unlike conventional approaches that emphasize avoidance alone, a path integral-based method is proposed to generate obstacle-free trajectories while maintaining reliable service for GUs both inside and outside NFZs. To improve generality, NFZs are represented as prismatic obstacles with regular n-sided polygonal cross-sections. The system jointly optimizes UAV trajectory, RIS phase shifts, UAV attitude, and BS beamforming. A DRL framework based on the SAC algorithm is developed to enhance system efficiency. Simulation results demonstrate that the proposed approach achieves reliable NFZ avoidance and a maximized sum rate, and that it outperforms benchmarks in communication performance, scalability, and stability.
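The abstract does not spell out the path integral formulation, so the following is only one plausible reading, offered as a sketch: approximate the length of a straight flight segment that lies inside any NFZ cross-section, and treat a value of zero as complete avoidance. It works in the horizontal plane only, which assumes the UAV flies below the top of each prism. The helper names `regular_polygon`, `inside_convex`, and `length_inside_nfz`, as well as the sample count, are hypothetical.

```python
import math

def regular_polygon(center, radius, n_sides, rotation=0.0):
    """Counter-clockwise vertices of a regular n-sided polygon."""
    cx, cy = center
    return [
        (cx + radius * math.cos(rotation + 2.0 * math.pi * k / n_sides),
         cy + radius * math.sin(rotation + 2.0 * math.pi * k / n_sides))
        for k in range(n_sides)
    ]

def inside_convex(point, vertices):
    """True if point is inside or on a convex polygon with CCW vertices."""
    px, py = point
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        # A negative cross product puts the point to the right of this edge,
        # which is outside for counter-clockwise vertex order.
        if (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) < 0.0:
            return False
    return True

def length_inside_nfz(start, end, nfzs, samples=200):
    """Midpoint-rule estimate of the segment length lying inside any NFZ."""
    seg_len = math.dist(start, end)
    hits = 0
    for k in range(samples):
        t = (k + 0.5) / samples
        p = (start[0] + t * (end[0] - start[0]),
             start[1] + t * (end[1] - start[1]))
        if any(inside_convex(p, poly) for poly in nfzs):
            hits += 1
    return seg_len * hits / samples
```

Checking the whole segment between consecutive waypoints, rather than only the waypoints themselves, catches moves that jump across a narrow NFZ. In a reinforcement learning setting such a quantity could serve as a penalty term in the reward, although whether the paper does so is not stated here.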
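Algorithm 1: The proposed SAC algorithm
(1) Initialize the environment parameters
(2) Initialize the critic network parameters, policy network parameters, and target network parameters
(3) For each training episode do:
(4)   For each environment interaction step do:
(5)     Sample action $a_l$ from the policy distribution $\pi_\phi(a_l|s_l)$ given the current state $s_l$
(6)     Observe the reward $r(s_l, a_l)$ and the next state $s_{l+1}$
(7)     Store the transition tuple $(s_l, a_l, r(s_l, a_l), s_{l+1})$ in the experience replay buffer $\mathcal{D}$
(8)   End of the environment-step loop
(9)   For each gradient update step do:
(10)    Sample a mini-batch from $\mathcal{D}$
(11)    Compute the loss functions $L_Q(\omega)$ and $L_\pi(\phi)$
(12)    Update the temperature parameter by gradient descent: $\alpha \leftarrow \alpha - \lambda_\alpha \nabla_\alpha L(\alpha)$
(13)    Soft-update the target networks: $\hat{\omega}_i \leftarrow \tau_\pi \omega_i + (1 - \tau_\pi)\hat{\omega}_i$, $\forall i \in \{1, 2\}$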
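Steps (12) and (13) can be sketched in plain Python on lists of floats. This is a minimal illustration rather than the authors' implementation. The temperature loss $L(\alpha)$ is not defined in this excerpt, so the sketch assumes the form commonly used with SAC, $L(\alpha) = \mathbb{E}[-\alpha(\log \pi_\phi(a|s) + \bar{H})]$, where $\bar{H}$ is a target entropy. The function names and the numeric values in the usage note are hypothetical.

```python
def temperature_step(alpha, log_probs, target_entropy, lr):
    """One gradient-descent step on the temperature alpha (step 12).

    Assumes L(alpha) = mean(-alpha * (log_pi + target_entropy)), so the
    gradient with respect to alpha is -mean(log_pi + target_entropy).
    """
    grad = -sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    return alpha - lr * grad

def soft_update(target_params, online_params, tau):
    """Polyak averaging of target-network parameters (step 13)."""
    return [
        tau * w + (1.0 - tau) * w_hat
        for w, w_hat in zip(online_params, target_params)
    ]
```

For example, `temperature_step(0.2, [-1.0, -3.0], 1.0, 0.1)` lowers the temperature because the sampled actions already have more entropy than the target, and `soft_update` would be called once for each of the two critics, matching $i \in \{1, 2\}$. Many implementations optimize $\log \alpha$ instead of $\alpha$ to keep the temperature positive; the direct update here follows step (12) as written.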