Dual-Reconfigurable Intelligent Surface Phase Shift Optimization and Unmanned Aerial Vehicle Trajectory Control for Vehicle Communication
Abstract:
This study considers a scenario in which an Unmanned Aerial Vehicle (UAV) equipped with a Reconfigurable Intelligent Surface (RIS) cooperates with a fixed RIS to enhance communication with a mobile User Equipment (UE) vehicle. A joint optimization problem is formulated to maximize the UE's communication rate throughout its movement by controlling the UAV's flight trajectory and the phase shifts of both RISs. Given the system complexity and environmental dynamics, a solution is proposed that combines a Deep Deterministic Policy Gradient (DDPG) algorithm with phase-shift alignment to optimize the continuous UAV trajectory and the RIS phase shifts. Simulation results confirm that the proposed method reaches a fairly stable reward within 1,000 training episodes. Compared with benchmark approaches, the algorithm improves the communication rate by at least 3 dB over random trajectory and phase-shift strategies in dual-RIS deployments. The study further presents optimal UAV trajectories under different base station and RIS placements and analyzes the applicability of the algorithm at different vehicle speeds.

Objective:
This study investigates a vehicular communication scenario in which a UAV-mounted RIS cooperates with a fixed RIS to assist a mobile UE. A joint optimization framework is established to maximize the UE communication rate during movement by simultaneously optimizing the UAV trajectory and the phase shifts of both RISs. To address system complexity and environmental dynamics, a DDPG algorithm is employed for continuous trajectory control, while a low-complexity phase-shift alignment method configures the RISs. Simulation results show that the proposed algorithm achieves stable reward convergence within 1,000 training episodes and improves communication rates by at least 3 dB compared with randomized trajectory and phase-shift baselines. It also outperforms alternative reinforcement learning approaches, including Twin Delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC). Optimal UAV trajectories are derived for various base station and RIS deployment scenarios, and additional simulations examine performance across a range of vehicle speeds.

Methods:
This study establishes a Multiple-Input Single-Output (MISO) system in which a UAV-mounted RIS cooperates with a fixed RIS to support mobile vehicular communication, with the objective of maximizing the user information rate. To handle continuous trajectory control under dynamic environmental conditions, a DDPG-based algorithm is developed. The phase shifts of the RIS elements are optimized using a low-complexity alignment method. A reward function based on the achievable information rate of the vehicular user is designed to guide the agent's actions and facilitate policy learning. The proposed framework improves adaptability by dynamically optimizing the UAV trajectory and RIS configurations under time-varying channel conditions.

Results and Discussions:
(1) The convergence behavior of the DDPG algorithm is verified in Fig. 3, where the reward values progressively converge as the number of training episodes increases.
(2) Fig. 4 shows the effect of varying the number of RIS elements on system performance. Additional elements lead to a steady increase in reward values, confirming the channel gain enhancement provided by RIS deployment.
(3) As shown in Fig. 5, the DDPG algorithm outperforms the baseline methods and adapts better to the target scenario. Optimized RIS phase shifts also yield significantly higher rewards than random configurations, validating the proposed phase-alignment strategy.
(4) Figs. 6–7 show notable variations in UAV trajectories and system performance across different base station and RIS deployments, demonstrating the adaptability of the trajectory optimization strategy. Fig. 8 further compares performance across scenarios with optimized UAV trajectories.
(5) System performance under different UE mobility speeds is evaluated in Fig. 9. Performance declines at higher speeds, indicating that the method is effective in low-speed environments but less effective under high-speed conditions.
These results collectively illustrate the operational strengths and limitations of the proposed framework in dynamic vehicular communication systems.

Conclusions:
This paper investigates a vehicular communication scenario assisted by both a fixed RIS and a UAV-mounted mobile RIS, aiming to maximize the UE information rate under dynamic mobility conditions. A joint optimization framework is developed that combines dual-RIS phase-shift alignment based on channel state information with UAV trajectory planning using a DDPG algorithm. The proposed method features a low-complexity design that addresses both network architecture and RIS configuration challenges. Extensive simulations under varying vehicle speeds, RIS element counts, and base station deployments demonstrate the algorithm's superiority over SAC, TD3, and randomized phase-shift strategies. The results further highlight the framework's adaptability to heterogeneous base station–RIS topologies and reveal performance degradation at higher vehicle speeds, indicating the need for future research into real-time adaptive mechanisms.
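The phase-shift alignment idea mentioned above can be illustrated with a small sketch. It is not the paper's Eqs. (19) and (20), which are not reproduced in this section. It assumes a simplified single-antenna link through one RIS, with hypothetical channel vectors `h_in` (transmitter to RIS) and `h_out` (RIS to receiver), and shows only the general principle: each element's phase is chosen to cancel the phase of its cascaded path so that all reflected paths add coherently.

```python
import cmath
import math
import random


def align_phases(h_in, h_out):
    """Per-element phase shifts (radians) that co-phase the cascaded paths.

    Element n contributes h_out[n] * exp(j*theta[n]) * h_in[n]; choosing
    theta[n] = -(arg h_in[n] + arg h_out[n]) makes every term real and positive.
    """
    thetas = []
    for a, b in zip(h_in, h_out):
        theta = -(cmath.phase(a) + cmath.phase(b))
        thetas.append(theta % (2 * math.pi))  # wrap into [0, 2*pi)
    return thetas


def cascaded_gain(h_in, h_out, thetas):
    """Magnitude of the effective channel through the RIS."""
    total = sum(b * cmath.exp(1j * t) * a for a, b, t in zip(h_in, h_out, thetas))
    return abs(total)


def random_channel(n):
    """Hypothetical complex Gaussian channel, used only for illustration."""
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]


random.seed(0)
N = 40  # number of RIS elements, as in Table 1
h_in = random_channel(N)
h_out = random_channel(N)

theta_aligned = align_phases(h_in, h_out)
theta_random = [random.uniform(0, 2 * math.pi) for _ in range(N)]

gain_aligned = cascaded_gain(h_in, h_out, theta_aligned)
gain_random = cascaded_gain(h_in, h_out, theta_random)
print(f"aligned gain: {gain_aligned:.2f}, random gain: {gain_random:.2f}")
```

With aligned phases the gain equals the sum of the per-element path magnitudes, which is the largest value any phase configuration can reach in this simplified model. This is consistent with the reported advantage of optimized over random phase shifts.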
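Algorithm 1 DDPG-based joint optimization of UAV trajectory and RIS phase shifts
Initialize the Actor network $ \mu ( \cdot ) $ and the Critic network $ Q( \cdot ) $ with parameters $ {{\boldsymbol{\theta}} ^\mu } $ and $ {{\boldsymbol{\theta}} ^Q} $, together with the Target_Actor network $ \mu '( \cdot ) $ and the Target_Critic network $ Q'( \cdot ) $; initialize the experience replay pool Replaybuffer; set $ {N^{{\text{eps}}}} $ and $ T $
for episode = 1, 2, ···, $ {N^{{\text{eps}}}} $ do
    Initialize the environment and the phase-shift matrices of the two RISs; obtain the initial state $ {s_t} $
    for t = 1, 2, ···, T do
        Initialize the displacement matrix of the UAV action
        Select an action according to the current policy and noise: $ {a_t} = \mu ({s_t}|{{\boldsymbol{\theta}} ^\mu }) + {N_t} $
        Execute action $ {a_t} $
        Optimize the phase shifts of the two RISs according to Eqs. (19) and (20)
        Compute the vehicle's information rate according to Eq. (11)
        Compute the reward $ {r_{t'}} $ according to Eq. (13); the environment transitions to $ {s_{t + 1}} $
        Store $ ({s_t},{a_t},{r_{t'}},{s_{t + 1}},{\text{done}}) $ in Replaybuffer
        if training has started then
            Randomly draw 256 samples from Replaybuffer for training
            Update the Critic network according to Eq. (15)
            Update the Actor network according to Eq. (16)
            Update the Target networks according to Eq. (17)
        end if
    end for
end for

Table 1 System parameters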
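Parameter | Description | Value
$ {X^{\max }} $ | UAV movement range along the x-axis | 600 m
$ {Y^{\max }} $ | UAV movement range along the y-axis | 400 m
$ {Z^{\max }} $ | Maximum UAV altitude along the z-axis | 200 m
$ {Z^{\min }} $ | Minimum UAV altitude along the z-axis | 10 m
$ {x^{\max }},{y^{\max }},{z^{\max }} $ | Maximum UAV movement distance within a time slot | 10 m
$ B $ | Bandwidth | 1 MHz
$ \gamma $ | Discount factor | 0.99
$ F $ | Penalty for the UAV flying out of bounds | 100
$ {\alpha _{{\text{rb}}}},{\alpha _{{\text{ub}}}} $ | Path loss exponent | 2.2
$ {\alpha _{{\text{cr}}}}{\text{,}}{\alpha _{{\text{uc}}}} $ | Path loss exponent | 2.5
$ T $ | Number of time slots | 100
$ {N^{{\text{eps}}}} $ | Number of training episodes | 3000
$ {\sigma ^2} $ | Noise power | -110 dBm
M | Number of antennas | 4
N | Number of RIS elements | 40
v | Vehicle speed | 6 m/s
$ d{\text{r}} $ | Spacing between antenna elements | $ \lambda /2 $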
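The bookkeeping steps of Algorithm 1 can be sketched in plain Python using the Table 1 values. This is a minimal illustration under stated assumptions, not the authors' implementation. The flight region is assumed to start at the origin, the out-of-bounds rule (stay in place and incur penalty F) is an assumption about how Eq. (13) applies the penalty, the target update is assumed to be the standard DDPG soft update with a hypothetical rate `TAU`, and the rate term of the reward is omitted because Eq. (11) is not reproduced here. The neural networks themselves are left out; a flat list of weights stands in for them.

```python
import random
from collections import deque

# Values from Table 1 and Algorithm 1
X_MAX, Y_MAX = 600.0, 400.0      # horizontal flight region (m), assumed to start at 0
Z_MIN, Z_MAX = 10.0, 200.0       # altitude limits (m)
STEP_MAX = 10.0                  # maximum displacement per axis per time slot (m)
PENALTY_F = 100.0                # penalty for leaving the region
BATCH_SIZE = 256                 # samples drawn per training step
TAU = 0.005                      # ASSUMED soft-update rate; not given in the source


def noisy_action(policy_action, noise_std=1.0):
    """a_t = mu(s_t) + N_t, with each displacement clipped to the per-slot limit."""
    return [max(-STEP_MAX, min(STEP_MAX, a + random.gauss(0.0, noise_std)))
            for a in policy_action]


def move_uav(pos, action):
    """Apply a displacement; if it leaves the region, stay put and return penalty F."""
    new = [p + d for p, d in zip(pos, action)]
    inside = (0.0 <= new[0] <= X_MAX and 0.0 <= new[1] <= Y_MAX
              and Z_MIN <= new[2] <= Z_MAX)
    return (new, 0.0) if inside else (list(pos), PENALTY_F)


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity=100_000):
        self.data = deque(maxlen=capacity)

    def store(self, s, a, r, s_next, done):
        self.data.append((s, a, r, s_next, done))

    def ready(self):
        # "Training has started" is interpreted here as having one full batch.
        return len(self.data) >= BATCH_SIZE

    def sample(self):
        return random.sample(list(self.data), BATCH_SIZE)


def soft_update(target, online, tau=TAU):
    """Move target weights a small step toward the online weights."""
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


random.seed(0)
buffer = ReplayBuffer()
pos = [300.0, 200.0, 100.0]
for t in range(300):
    # A zero vector stands in for the Actor output mu(s_t).
    action = noisy_action([0.0, 0.0, 0.0], noise_std=5.0)
    new_pos, penalty = move_uav(pos, action)
    reward = -penalty  # the information-rate term of the reward is omitted
    buffer.store(tuple(pos), tuple(action), reward, tuple(new_pos), False)
    pos = new_pos

batch = buffer.sample() if buffer.ready() else []
print(len(batch), "transitions sampled")
```

Keeping the UAV in place on a boundary violation is only one possible design. Clipping the position to the boundary or ending the episode are alternatives, and the source does not state which is used.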