Dynamic Spectrum Access Algorithm for Evaluating Spectrum Stability in Cognitive Vehicular Networks.
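Summary: In cognitive vehicular networks with spectrum awareness, the high mobility of vehicle terminals and the complexity of the radio environment make spectrum stability difficult to evaluate. This paper proposes a dynamic spectrum access algorithm that evaluates spectrum stability. First, based on the signal-to-noise ratio, received signal strength, and bandwidth, a long short-term memory neural network predicts the values of each parameter at multiple instants within the next cycle, and the rate of change of each parameter over one cycle is calculated and used as the evaluation metric for spectrum stability. Second, the K-Means algorithm clusters the rate-of-change vectors to build the stability assessment model. Third, the state space and reward function are reconstructed according to the stability assessment results, and a reinforcement-learning-based dynamic spectrum access algorithm is proposed. Finally, experimental results show that the proposed algorithm meets the stability requirements of different vehicle terminal services, improves spectrum utilization, and reduces the collision probability during spectrum access.
Abstract: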
Objective: With the exponential growth of vehicle terminals and the widespread adoption of cognitive vehicular network applications, existing licensed spectrum resources are inadequate to meet the communication demands of Cognitive Vehicular Networks (CVN). The rapid development of CVN and the increasing complexity of vehicular communication scenarios have intensified spectrum resource scarcity. Dynamic Spectrum Access (DSA) technology has emerged as a key solution to alleviate this scarcity by enabling efficient use of underutilized spectrum bands. While current DSA algorithms ensure basic spectrum utilization, they struggle to comprehensively evaluate spectrum stability and meet the differentiated stability requirements of vehicular network applications. For example, safety-critical applications such as collision avoidance systems demand ultra-reliable, low-latency communication, while infotainment applications prioritize high throughput. This paper proposes a novel framework integrating spectrum stability assessment with deep reinforcement learning. The framework constructs a multi-dimensional parameter-based model for spectrum stability, designs a reinforcement learning architecture incorporating gated mechanisms and dueling neural networks, and establishes a dynamically adaptive reward function to enable intelligent spectrum resource allocation. This research offers a solution for vehicular network spectrum management that combines theoretical depth with practical engineering value, paving the way for more reliable and efficient vehicular communication systems.

Methods: This study employs an integrated approach to address the spectrum allocation challenges in CVN. A time-series prediction model is developed using Long Short-Term Memory (LSTM) neural networks, which use three-dimensional time-series data of Signal-to-Noise Ratio (SNR), Received Signal Strength (RSS), and bandwidth to make multi-step predictions for future cycles.
The rate of change for each parameter is calculated as a stability evaluation metric, providing a quantitative measure of spectrum stability. To ensure consistency in the evaluation process, the rate of change for each parameter is normalized using Min-Max normalization, and the normalized results are input into the K-Means algorithm for stability clustering of the rate-of-change vectors. By calculating the centroid coordinates of each cluster and their norms, a stability index is derived, forming the stability assessment model. Building upon the Deep Q-Network (DQN), a Gated Recurrent Unit (GRU) is introduced to create a temporal state encoder that captures the temporal dependencies in spectrum data. Additionally, a Dueling Network architecture is employed to decouple the state value and action advantage functions, enabling more accurate estimation of the long-term value of spectrum allocation decisions. The reward function incorporates trade-off coefficients to achieve a reasonable allocation of spectrum resources with different stability levels, ensuring a balance between spectrum utilization and collision probability while meeting the diverse stability requirements of vehicular network applications. The proposed framework is designed to be scalable and adaptable to various vehicular network scenarios, including urban, highway, and rural environments.

Results and Discussions: Simulation results show that the optimized stepwise prediction algorithm significantly improves performance. In both the training and test sets, the algorithm achieves a Root Mean Squared Error (RMSE) of less than 0.1, with no significant overfitting observed (Fig. 5, Fig. 6). This indicates that the algorithm generalizes well to unseen data, making it suitable for real-world deployment. Additionally, the loss function of the proposed algorithm decreases significantly as the number of iterations increases, converging around 150 iterations (Fig. 7).
The prediction accuracy also stabilizes around 150 iterations (Fig. 8), suggesting that the algorithm achieves consistent performance within a reasonable training period. These results demonstrate that the proposed prediction algorithm can deliver high-accuracy multi-step predictions for stability parameters across a sufficient number of channels, providing a solid foundation for spectrum stability assessment. Furthermore, the proposed access algorithm consistently outperforms comparative algorithms in terms of spectrum utilization over 20 iterations, while maintaining lower collision probabilities (Fig. 9, Fig. 10). As the number of iterations increases, the cumulative stability index and throughput of the proposed algorithm steadily improve, exceeding the performance of comparative algorithms at all stages. This demonstrates that the proposed algorithm can meet the diverse requirements of vehicle terminals for channel stability and throughput, while ensuring high spectrum utilization and low collision probability. As the number of vehicle terminals increases, the proposed algorithm converges faster than the other algorithms, confirming its robustness in large-scale scenarios. These findings highlight the potential of the proposed framework to meet the growing demands of next-generation vehicular networks.

Conclusions: This study proposes an integrated “evaluation-decision-optimization” spectrum management paradigm for CVN. By introducing a multi-dimensional time-series feature-based spectrum stability quantification framework and designing a hybrid deep reinforcement learning architecture incorporating gated mechanisms and dueling networks, the research addresses the critical challenge of balancing spectrum efficiency with stability in dynamic vehicular environments.
The development of an interpretable reward function enables intelligent spectrum allocation that adapts to diverse quality-of-service requirements, ensuring that both safety-critical and non-safety-critical applications receive the necessary resources. Experimental results show significant improvements in spectrum utilization, collision probability, and system throughput compared to traditional approaches, while maintaining robust performance in large-scale scenarios. These findings advance the theoretical understanding of spectrum management in CVN and provide a practical framework for implementing adaptive DSA solutions in next-generation intelligent transportation systems. Future research will extend the proposed framework to multi-agent scenarios, where multiple vehicles and infrastructure nodes collaboratively optimize spectrum allocation. Additionally, integrating edge computing and federated learning techniques could further enhance the scalability and efficiency of the framework. The proposed methodology offers a scalable and efficient approach to spectrum resource allocation, paving the way for more reliable and high-performance vehicular communication networks.
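The stability assessment described in the Methods (rate of change, Min-Max normalization, K-Means clustering, centroid norms) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the rate of change is taken here as the relative change between the first and last predicted value of a cycle, the K-Means initialization is a simple deterministic choice, clusters are ranked by centroid norm with the smallest norm treated as the most stable, and the channel names and predicted values are hypothetical.

```python
import math

FEATURES = ("snr", "rss", "bandwidth")

def rate_of_change(series):
    # Assumption: relative change between first and last predicted value in one cycle.
    first, last = series[0], series[-1]
    return abs(last - first) / max(abs(first), 1e-12)

def min_max_normalize(vectors):
    # Scale every feature column to [0, 1]; a constant column maps to 0.0.
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(vec, lows, highs)]
        for vec in vectors
    ]

def norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def kmeans(points, k, max_iters=50):
    # Lloyd's algorithm. Assumption: initial centroids are points spread evenly
    # over the norm-sorted data, which keeps the sketch deterministic.
    ordered = sorted(points, key=norm)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [list(ordered[round(i * step)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

def assess_stability(predicted, k=2):
    # Returns channel -> stability level, where level 0 is the most stable cluster.
    channels = list(predicted)
    rates = [[rate_of_change(predicted[ch][f]) for f in FEATURES] for ch in channels]
    points = min_max_normalize(rates)
    centroids, labels = kmeans(points, k)
    ranking = sorted(range(k), key=lambda c: norm(centroids[c]))
    level_of = {cluster: level for level, cluster in enumerate(ranking)}
    return {ch: level_of[lab] for ch, lab in zip(channels, labels)}

# Hypothetical multi-step predictions for one cycle (values are illustrative only).
predicted = {
    "ch0": {"snr": [20.0, 20.1, 20.2], "rss": [-70.0, -70.2, -70.1], "bandwidth": [10.0, 10.0, 10.0]},
    "ch1": {"snr": [18.0, 18.1, 17.9], "rss": [-72.0, -72.1, -72.3], "bandwidth": [10.0, 10.0, 9.9]},
    "ch2": {"snr": [15.0, 11.0, 8.0], "rss": [-75.0, -82.0, -90.0], "bandwidth": [10.0, 8.0, 6.0]},
    "ch3": {"snr": [16.0, 12.0, 9.0], "rss": [-74.0, -80.0, -88.0], "bandwidth": [10.0, 7.0, 5.0]},
}
levels = assess_stability(predicted, k=2)
print(levels)  # ch0 and ch1 fall in level 0 (most stable), ch2 and ch3 in level 1
```

In this sketch a smaller centroid norm means smaller normalized rates of change across all three parameters, which is why it is read as higher stability; the number of clusters `k` would in practice be chosen to match the number of stability classes the services require.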
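Table 1 Definitions of spectrum parameters

| Variable | Description |
| --- | --- |
| $ P_t^{i,j} $ | Transmit power of channel $ j $ detected by vehicle terminal $i$ at time $t$ |
| $ h_t^{i,j} $ | Channel gain of channel $j$ accessed by vehicle terminal $i$ at time $t$ |
| $ d_t^{i,j} $ | Distance at time $t$ between vehicle terminal $i$ and the base station to which channel $j$ belongs |
| $ {\mu _t} $ | Background noise with mean 0 and standard deviation $\sigma $ |
| ${\text{high}}(f_t^{i,j})$ | Highest frequency of channel $j$ accessed by vehicle terminal $i$ at time $t$ |
| ${\text{low}}(f_t^{i,j})$ | Lowest frequency of channel $j$ accessed by vehicle terminal $i$ at time $t$ |
| $ T $ | Prediction time step |
| $ B_T^{'i,j} $ | Rate of change of the bandwidth of channel $ j $ over time step $ T $ |
| $ \xi _T^{'i,j} $ | Rate of change of the received signal strength of channel $ j $ over time step $ T $ |
| $N$ | Set of vehicle terminals, $N = \{ 1,2,\cdots,n\} $ |
| $M$ | Set of channels, $M = \{ 1,2,\cdots,m\} $ |

Algorithm 1 Reinforcement-learning-based dynamic spectrum access algorithm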
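Input: learning rate $ \alpha $, discount factor $ \beta $, exploration probability $\varepsilon $, mini-batch length $ L $
Output: optimal Q-Network parameters ${\theta _t}$
(1) Initialize a Q-Network with random weights ${\theta _t}$ for each vehicle terminal
(2) for Iteration $ I $ = 1, 2, ···, $ i $ do
(3)   for Time slot $ T $ = 1, 2, ···, $ t $ do
(4)     for User $ N $ = 1, 2, ···, $ n $ do
(5)       Randomly sample $ L $ experiences from the experience replay pool as a mini-batch
(6)       Perform gradient descent on the loss function according to Eq. (11) using the experience tuples
(7)       Update the neural network parameters
(8)       The vehicle terminal updates its Q-value according to Eq. (12)
(9)       The vehicle terminal selects a channel to access according to Eq. (13)
(10)       The vehicle terminal obtains its reward according to Eq. (9)
(11)     end for
(12)     for User $ N $ = 1, 2, ···, $ n $ do
(13)       Obtain the next state-space vector ${\boldsymbol{s}}_{t + 1}^{N,M}$ and perform the next channel selection
(14)     end for
(15)   end for
(16) end for

Table 2 Simulation parameter settings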
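
| Reinforcement learning training parameter | Setting |
| --- | --- |
| Number of licensed channels | 2 |
| Vehicle terminals $n$ | 10 |
| Vehicle terminals carrying high-stability services | 5 |
| Vehicle terminals carrying low-stability services | 5 |
| Learning rate $\alpha $ | 0.001 |
| Discount factor $\beta $ | 0.95 |
| Exploration probability $\varepsilon $ | $1.0 \to 0.1$ |
| Activation function | ReLU |
| Optimizer | Adam |
| Mini-batch size | 4 experience tuples |
| Iterations per training run | 5000 |
| Total number of training runs | 20 |
| Vehicle speed | [10, 15] m/s |

Table 3 Prediction model parameter settings

| Prediction model parameter | Setting |
| --- | --- |
| Learning rate | $1.0 \to 0.001$ |
| Loss function | RMSE |
| Optimizer | Adam |
| Iterations per training run | 20 |
| Total number of training runs | 500 |
| Training set samples | 650 |
| Test set samples | 300 |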
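The channel-selection step of Algorithm 1, combined with the dueling decomposition mentioned in the abstract, can be illustrated with a short sketch. It is written under assumptions and does not reproduce the paper's Eqs. (11)-(13): the exploration probability is annealed linearly from 1.0 to 0.1 (the source gives only the two endpoints), the dueling aggregation uses the standard mean-advantage form $Q(s,a) = V(s) + A(s,a) - \frac{1}{|A|}\sum_{a'} A(s,a')$, and the function names and numeric values are hypothetical.

```python
import random

def epsilon_at(step, total_steps, eps_start=1.0, eps_end=0.1):
    # Assumption: linear annealing between the two endpoints listed in the settings.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)

def dueling_q(state_value, advantages):
    # Combine the state value V(s) and the advantages A(s, a) into Q(s, a),
    # subtracting the mean advantage so the decomposition is identifiable.
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

def select_channel(q_values, epsilon, rng):
    # Epsilon-greedy: explore with probability epsilon, otherwise take argmax Q.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

rng = random.Random(0)
# Two actions, matching the two licensed channels in the simulation settings;
# the value and advantage numbers are illustrative only.
q_values = dueling_q(state_value=1.0, advantages=[0.5, -0.5])
for step in (0, 2500, 5000):
    eps = epsilon_at(step, total_steps=5000)
    channel = select_channel(q_values, eps, rng)
    print(f"step={step} epsilon={eps:.2f} channel={channel}")
```

Early in training the high exploration probability makes the choice nearly random, which helps each vehicle terminal sample both channels; as the probability approaches 0.1 the terminal mostly follows the channel with the highest estimated Q-value.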