Digital Twin Sensing Information Synchronization Strategy Based on Intelligent Hierarchical Slicing Technique
Abstract: Unreliable and delayed transmission of sensing data in Radio Access Networks (RAN) makes the information synchronized to Digital Twins (DTs) inaccurate. To address this problem, a DT sensing information synchronization strategy based on intelligent hierarchical slicing is proposed. Operating on two time scales, the strategy jointly optimizes slice radio resource configuration and DT sensing information synchronization, with the goals of maximizing sensing information satisfaction and minimizing the costs of slice reconfiguration and DT synchronization. First, on the large time scale, network slicing provides isolation for DTs with different Quality of Service (QoS) requirements and resolves their deployment; on the small time scale, more flexible radio resource allocation improves the adaptability of DT sensing information synchronization tasks to dynamic environments, which further improves communication performance and yields DTs that more closely approximate their physical entities. Second, to solve the optimization problems on the two time scales, a two-layer Deep Reinforcement Learning (DRL) framework is proposed to enable efficient interaction of network resources, in which the lower-layer control algorithm uses the Prioritized Experience Replay (PER) mechanism to accelerate convergence. Finally, simulation results verify the effectiveness of the proposed strategy.
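Algorithm 1 PER-MADDPG-based lower-layer control algorithm
Input: learning rate $ \lambda $, mini-batch size $Z$, experience pool ${D_{\mathrm{L}}}$, parameter $ \nu $, parameter $ \beta $
Output: lower-layer control policy
(1) for ${\text{episode = }}1 \sim {E_{\mathrm{L}}}$ do
(2)   All agents observe the initial environment state ${\boldsymbol{s}}$
(3)   for $ {\text{step = }}1 \sim {T_{\mathrm{L}}} $ do
(4)     All agents take actions $ {\boldsymbol{a}} $ according to their policies, adding noise $ {N_t} $
(5)     Interact with the environment to obtain each agent's reward $ r $ and move to the next state $ {\boldsymbol{s}}' $; store the experience $ \left({\boldsymbol{s}},{\boldsymbol{a}},r,{\boldsymbol{s}}'\right) $ in ${D_{\mathrm{L}}}$
(6)     for agent $ m = 1 \sim M $ do
(7)       for $ z = 1 \sim Z $ do
(8)         Draw sample $w$ from the experience pool ${D_{\mathrm{L}}}$ with probability $P\left( k \right)$
(9)         Compute the TD-error ${\delta _w}$ from the actual reward and compute the weight ${\omega _w}$
(10)        Update the rank-based priority of sample $w$ according to the absolute TD-error $ \left| {{\delta _w}} \right| $
(11)      end for
(12)      Compute the global loss $ \mathcal{L}\left( {\theta _m^{Q}} \right) = \dfrac{1}{Z}\displaystyle\sum \limits_{z=1}^{Z} {\omega _w}\delta _w^2 $ and minimize $ \mathcal{L}\left( {\theta _m^{Q}} \right) $ to update the critic network
(13)      Compute the policy gradient $ {\nabla _{\theta _m^{\mathrm{E}}}}J $ and update the actor network
(14)    end for
(15)    Update the agents' target networks
(16)  end for
(17) end for

Algorithm 2 DDQN-based upper-layer control algorithm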
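Input: probability distribution $ \psi $, exploration probability $\varepsilon $, mini-batch size $B$, number of learning episodes for the sampled data
Output: upper-layer control policy
(1) Initialize the neural network parameters
(2) for ${\text{episode = }}1 \sim {E_{\mathrm{U}}}$ do
(3)   Observe the environment to obtain the initial observation ${\boldsymbol{s}}$
(4)   for $ {\text{step = }}1 \sim {T_{\mathrm{U}}} $ do
(5)     Select action ${\boldsymbol{a}}$ with the $\varepsilon $-greedy policy, i.e., choose either an exploratory action or the action with the largest $Q$ value
(6)     The controller interacts with the environment to obtain $r$ and move to the next state ${\boldsymbol{s}}'$, and stores the collected experience $\left( {{\boldsymbol{s}},{\boldsymbol{a}},r,{\boldsymbol{s}}'} \right)$ in the replay pool ${D_{\mathrm{U}}}$
(7)     Draw a batch of experiences from the replay pool ${D_{\mathrm{U}}}$
(8)     Compute the gradient $ {\nabla _\mu }\mathcal{L}(\mu ) $ and update the network parameters $\mu $ by backpropagation
(9)     Every $ G $ steps, copy the network parameters $ \mu $ to the target network parameters $ \mu \_ $
(10)  end for
(11) end for

Table 1 Simulation parameter settings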
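| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Number of base stations | 4 | Lower-layer critic/actor learning rate | 0.01/0.001 |
| IoT devices | 20 | Upper-/lower-layer discount factor | 0.9/0.95 |
| Bandwidth | 1.8 MHz | Upper-/lower-layer mini-batch size | 512/32 |
| Length of each LTI ($\tau $) | 100 ms | Unit DT migration/instantiation cost | 15/15 |
| Length of each STL ($\Delta T$) | 5 s | Slice 1/slice 2 rate threshold | 600/300 |
| Maximum transmit power | 40 mW | Upper-layer greedy rate | 0.1 |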
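The sampling and loss steps of Algorithm 1 (steps 8 to 12) and the target computation implied by Algorithm 2 can be illustrated with a minimal Python sketch. It rests on assumptions that the listings do not state: that $\nu$ is the priority exponent and $\beta$ the importance-sampling exponent of the standard rank-based PER formulation, that weights are normalized by the largest weight in the batch, and that DDQN denotes Double DQN. All function names are hypothetical, and plain lists stand in for the neural networks.

```python
import random


def rank_based_probabilities(td_errors, nu):
    """Sampling probability P(k) of each stored sample (rank-based PER).

    Assumption: priority p_k = 1 / rank(k), P(k) = p_k^nu / sum_i p_i^nu,
    where rank 1 is the sample with the largest |TD-error|.
    """
    order = sorted(range(len(td_errors)),
                   key=lambda k: abs(td_errors[k]), reverse=True)
    priority = [0.0] * len(td_errors)
    for rank, k in enumerate(order, start=1):
        priority[k] = (1.0 / rank) ** nu
    total = sum(priority)
    return [p / total for p in priority]


def importance_weights(probs, indices, beta):
    """Importance-sampling weights for the drawn samples.

    Assumption: w_k = (N * P(k))^(-beta), divided by the batch maximum
    so that the largest weight equals 1.
    """
    n = len(probs)
    raw = [(n * probs[k]) ** (-beta) for k in indices]
    largest = max(raw)
    return [w / largest for w in raw]


def weighted_critic_loss(weights, td_errors):
    """Weighted loss L = (1/Z) * sum_z w_z * delta_z^2 from step (12)."""
    return sum(w * d * d for w, d in zip(weights, td_errors)) / len(weights)


def double_dqn_target(reward, gamma, q_online_next, q_target_next):
    """Double DQN target (assumed meaning of DDQN).

    The online network selects the next action; the target network
    supplies the value of that action.
    """
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    rng = random.Random(0)
    stored_td = [0.5, -2.0, 0.1, 1.0]          # toy TD-errors in the pool
    probs = rank_based_probabilities(stored_td, nu=0.6)
    batch = rng.choices(range(len(stored_td)), weights=probs, k=2)
    weights = importance_weights(probs, batch, beta=0.4)
    print(weighted_critic_loss(weights, [stored_td[k] for k in batch]))
    print(double_dqn_target(1.0, 0.9, [0.2, 0.7], [0.5, 0.3]))
```

In this sketch, samples with large absolute TD-error are drawn more often, and the weights $\omega_w$ scale down their contribution to the critic loss to compensate for the non-uniform sampling. The values of `nu` and `beta` are illustrative only and are not taken from Table 1.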