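Joint Optimization of Edge Selection and Resource Allocation in Digital Twin-assisted Federated Learning

TANG Lun, WEN Mingyan, SHAN Zhenzhen, CHEN Qianbin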

Citation: TANG Lun, WEN Mingyan, SHAN Zhenzhen, CHEN Qianbin. Joint Optimization of Edge Selection and Resource Allocation in Digital Twin-assisted Federated Learning[J]. Journal of Electronics & Information Technology, 2024, 46(4): 1343-1352. doi: 10.11999/JEIT230421

doi: 10.11999/JEIT230421
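Author information
    About the authors:

    TANG Lun: male, professor and doctoral supervisor. Research interests include next-generation wireless communication networks, heterogeneous cellular networks, and software-defined wireless networks.

    WEN Mingyan: female, master's student. Research interests include mobile edge computing-assisted intelligent driving and federated learning efficiency optimization.

    SHAN Zhenzhen: female, master's student. Research interests include computing resource allocation for collaborative edge intelligence and collaborative resource optimization for federated learning.

    CHEN Qianbin: male, professor and doctoral supervisor. Research interests include personal communications, multimedia information processing and transmission, next-generation mobile communication networks, and heterogeneous cellular networks.

    Corresponding author:

    WEN Mingyan wenming155968@163.com

  • CLC number: TN929.5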

Funds: The National Natural Science Foundation of China (62071078), The Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M201800601), Sichuan Science and Technology Program (2021YFQ0053)
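  • Abstract: In federated learning-based intelligent driving, the limited resources of Intelligent Connected Vehicles (ICVs) and possible device failures reduce federated learning training accuracy and increase latency and energy consumption. To address these problems, this paper proposes a joint edge selection and resource allocation optimization scheme for digital twin-assisted federated learning. First, a digital twin-assisted federated learning mechanism is proposed that lets each ICV choose to participate in federated learning either locally or through its digital twin. Second, computation and communication models of digital twin-assisted federated learning are constructed, and a joint edge selection and resource allocation problem is formulated with the objective of minimizing cumulative training latency and energy consumption; the problem is then transformed into a partially observable Markov decision process. Finally, an edge selection and resource allocation algorithm based on Multi-agent Parameterized Deep Q-Networks (MPDQN) is proposed to learn a near-optimal edge selection and resource allocation policy that minimizes the cumulative latency and energy consumption of federated learning. Simulation results show that the proposed algorithm effectively reduces the cumulative training latency and energy consumption of federated learning while maintaining model accuracy.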
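    The trade-off described in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each round's cost is a weighted sum of latency and energy and that the cheaper of the two participation modes (local training or training through the digital twin) is picked greedily each round. The names `RoundCost`, `weighted_cost`, `choose_mode`, the weights, and all numbers are hypothetical; the paper's actual computation and communication models and its MPDQN policy are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class RoundCost:
    """Hypothetical per-round cost of one participation mode."""
    latency_s: float  # training plus transmission latency, in seconds
    energy_j: float   # energy consumed, in joules


def weighted_cost(cost, w_latency=0.5, w_energy=0.5):
    # Assumed scalarization: weighted sum of latency and energy.
    return w_latency * cost.latency_s + w_energy * cost.energy_j


def choose_mode(local, twin):
    """Return 'local' or 'twin', whichever has the lower weighted cost."""
    return "local" if weighted_cost(local) <= weighted_cost(twin) else "twin"


def cumulative_cost(rounds):
    """Sum the weighted cost of the greedily chosen mode over all rounds."""
    total = 0.0
    for local, twin in rounds:
        chosen = local if choose_mode(local, twin) == "local" else twin
        total += weighted_cost(chosen)
    return total


# Two illustrative rounds as (local cost, digital-twin cost) pairs.
rounds = [
    (RoundCost(2.0, 4.0), RoundCost(1.0, 3.0)),
    (RoundCost(1.0, 1.0), RoundCost(2.0, 2.0)),
]
modes = [choose_mode(local, twin) for local, twin in rounds]
total = cumulative_cost(rounds)
```

    A greedy per-round rule ignores how the current choice affects later rounds, which is one reason a learned policy over the whole training horizon is of interest.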
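  • Figure 1  Architecture of digital twin-assisted federated learning

    Figure 2  PDQN framework within a single ES

    Figure 3  Training loss comparison of different federated learning mechanisms

    Figure 4  Training accuracy comparison of different federated learning mechanisms

    Figure 5  Training accuracy comparison of different edge selection strategies

    Figure 6  Latency and energy consumption comparison of different edge selection strategies

    Figure 7  Training accuracy comparison of different resource allocation strategies

    Figure 8  Latency and energy consumption comparison of different resource allocation strategies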

    Algorithm 1 MPDQN-based edge selection and resource allocation algorithm
     Input: learning rates $ ({\lambda _{\text{d}}},{\lambda _{\text{c}}}) $, number of learning episodes $ {N_{{\text{max}}}} $, probability distribution $ \psi $, exploration probability $\varepsilon $, mini-batch size $B$, number of learning episodes used to collect sample data ${N_{{\text{sam}}}}$
     Output: edge selection and resource allocation policy
     (1) Initialize the network parameters $ ({\theta _{\text{d}}},{\theta _{\text{c}}}) $ and the experience replay buffers
     (2) for $i = 1,2,\cdots,{N_{{\text{max}}}}$ do
     (3)  Receive the initial state ${{\boldsymbol{s}}_1} = {\{ {{\boldsymbol{s}}_{m,1} }\} _{\forall m \in \mathcal{M} } }$
     (4)  for each global iteration $ k\in \mathcal{K} $ of digital twin-assisted federated learning do
     (5)   for each agent $ m \in \mathcal{M} $ do
     (6)    Compute the continuous action parameters ${{\boldsymbol{f}}_m}(k)$ according to Eq. (22)
     (7)    Select the action ${{\boldsymbol{a}}_{m,k} } = \{ {{\boldsymbol{\varphi}} _m}(k),{ {\boldsymbol{f} }_m}(k)\}$ with the $\varepsilon $-greedy policy:
     (8)     ${{\boldsymbol{a}}}_{m,k}=\begin{cases} \text{a sample from distribution } \psi , & \text{with probability } \varepsilon \\ ({{\boldsymbol{\varphi}} }_{m}(k),{\boldsymbol f}_{m}(k)),\ {{\boldsymbol{\varphi}} }_{m}(k)=\arg \max\limits_{{\boldsymbol{\varphi}} }Q({{\boldsymbol{s}}}_{m,k},{{\boldsymbol{\varphi}} }_{m}(k),{\boldsymbol f}_{m}^{*}(k)), & \text{with probability } 1-\varepsilon \end{cases}$
     (9)    Execute action ${{\boldsymbol{a}}_{m,k}}$ and obtain the instantaneous reward ${{\boldsymbol{r}}_{m,k}}$ and the next state $ {{\boldsymbol{s}}_{m,k + 1}} $
     (10)    Store the tuple $ [{{\boldsymbol{s}}_{m,k}},{{\boldsymbol{a}}_{m,k}},{{\boldsymbol{r}}_{m,k}},{{\boldsymbol{s}}_{m,k + 1}}] $ in the experience replay buffer ${\mathcal{D}_m}$
     (11)    Sample a mini-batch of $B$ data samples from the experience replay buffer ${\mathcal{D}_m}$
     (12)    Update the TQN target $ {y_m}(k) $ according to Eq. (19)
     (13)    Compute the loss functions $ L({\varpi _{m,{\text{d}}}}) $ and $ L({\varpi _{m,{\text{c}}}}) $ according to Eq. (20) and Eq. (21), respectively
     (14)    Update the network parameters $ {\varpi _{m,{\text{d}}}}(k + 1) $ and $ {\varpi _{m,{\text{c}}}}(k + 1) $ according to Eq. (22) and Eq. (23)
     (15)    if $(i > {N_{{\text{sam}}}})$ then
     (16)     Sample a mini-batch of $B$ data samples from the experience replay buffer $\mathcal{D}$
     (17)     Update the parameters ${\varpi _{ {\text{d,me} } } }(k + 1) \leftarrow {\lambda '_{ {\text{me} } } }{\nabla _{ {\varpi _{ {\text{d,me} } } } } }l{\text{(} }\varpi {\text{)} }$ and ${\varpi _{ {\text{c,me} } } }(k + 1) \leftarrow {\lambda '_{ {\text{me} } } }\nabla l{\text{(} }{\varpi _{\text{c} } }{\text{)} }$
     (18)     The fusion network distributes the latest parameters to all agents
     (19)    end if
     (20)   end for
     (21)  end for
     (22) end for
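
    The action-selection and experience-replay steps of Algorithm 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `ReplayBuffer`, `select_action`, `toy_param_fn`, and `toy_q_fn` are hypothetical names, the exploration distribution $\psi$ is replaced by a uniform distribution, the Q-network and the parameter network are replaced by toy functions, and one continuous parameter per discrete action is assumed.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity experience replay pool of (s, a, r, s_next) tuples."""

    def __init__(self, capacity):
        self.pool = deque(maxlen=capacity)  # oldest tuples are dropped first

    def store(self, s, a, r, s_next):
        self.pool.append((s, a, r, s_next))

    def sample(self, batch_size, rng):
        # Sample without replacement, never more than what is stored.
        return rng.sample(list(self.pool), min(batch_size, len(self.pool)))


def select_action(state, q_fn, param_fn, num_discrete, epsilon, rng):
    """Epsilon-greedy choice of a hybrid action (discrete index, parameters)."""
    if rng.random() < epsilon:
        # Exploration: uniform sampling stands in for distribution psi (assumption).
        phi = rng.randrange(num_discrete)
        f = [rng.uniform(0.0, 1.0) for _ in range(num_discrete)]
        return phi, f
    # Exploitation: compute continuous parameters, then take the discrete argmax of Q.
    f = param_fn(state)
    q_values = q_fn(state, f)
    phi = max(range(num_discrete), key=lambda j: q_values[j])
    return phi, f


def toy_param_fn(state):
    # Hypothetical stand-in for the continuous-parameter network.
    return [min(max(x, 0.0), 1.0) for x in state]


def toy_q_fn(state, f):
    # Hypothetical stand-in for the Q-network: prefers parameters near 0.5.
    return [-abs(x - 0.5) for x in f]


rng = random.Random(0)
state = [0.1, 0.45, 0.9]
phi, f = select_action(state, toy_q_fn, toy_param_fn, 3, epsilon=0.0, rng=rng)  # phi == 1

buffer = ReplayBuffer(capacity=2)
for k in range(3):
    buffer.store(state, (phi, f), -1.0 * k, state)
batch = buffer.sample(batch_size=5, rng=rng)
```

    Computing the continuous parameters before the discrete argmax mirrors the order of steps (6) and (7) in the algorithm; in a full multi-agent setup each agent $m$ would hold its own buffer ${\mathcal{D}_m}$.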
Publication history
  • Received: 2023-05-15
  • Revised: 2023-09-14
  • Published online: 2023-09-15
  • Issue date: 2024-04-24
