Joint Optimization of Edge Selection and Resource Allocation in Digital Twin-assisted Federated Learning
Abstract: In intelligent driving based on federated learning, the resource constraints of Intelligent Connected Vehicles (ICVs) and possible device failures reduce the training accuracy of federated learning and increase its delay and energy consumption. To address this, an optimization scheme for edge selection and resource allocation in digital twin-assisted federated learning is proposed. First, a digital twin-assisted federated learning mechanism is introduced, allowing each ICV to participate in federated learning either locally or through its digital twin. Second, by constructing the computation and communication models of digital twin-assisted federated learning, a joint edge selection and resource allocation optimization problem is formulated with the objective of minimizing cumulative training delay and energy consumption, and is transformed into a partially observable Markov decision process. Finally, an edge selection and resource allocation algorithm based on Multi-agent Parameterized Deep Q-Networks (MPDQN) is proposed to learn an approximately optimal edge selection and resource allocation policy that minimizes the cumulative delay and energy consumption of federated learning. Simulation results show that the proposed algorithm effectively reduces the cumulative training delay and energy consumption of federated learning while preserving model accuracy.
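The joint objective summarized above can be sketched in math; note that the weight $\omega$ and the per-round cost notation here are illustrative assumptions, not taken from the paper:

```latex
\min_{\{\boldsymbol{\varphi}_m(k),\,\boldsymbol{f}_m(k)\}}\;
\sum_{k \in \mathcal{K}} \sum_{m \in \mathcal{M}}
\bigl( \omega\, T_m(k) + (1-\omega)\, E_m(k) \bigr)
```

where $T_m(k)$ and $E_m(k)$ denote the training delay and energy consumption of ICV $m$ in global round $k$, ${\boldsymbol{\varphi}}_m(k)$ is its edge-selection decision (train locally or via its digital twin), ${\boldsymbol{f}}_m(k)$ is the allocated computing resource, and $\omega \in [0,1]$ trades off delay against energy.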
Key words:
- Intelligent driving /
- Federated learning /
- Digital twin /
- Deep reinforcement learning
Algorithm 1 MPDQN-based edge selection and resource allocation algorithm
Input: learning rates $ ({\lambda _{\text{d}}},{\lambda _{\text{c}}}) $, number of learning episodes $ {N_{{\text{max}}}} $, probability distribution $ \psi $, exploration probability $\varepsilon $, mini-batch size $B$, number of episodes before sampling ${N_{{\text{sam}}}}$
Output: edge selection and resource allocation policy
(1) Initialize the network parameters $ ({\theta _{\text{d}}},{\theta _{\text{c}}}) $ and the experience replay buffers
(2) for $i = 1,2,\cdots,{N_{{\text{max}}}}$ do
(3)  Receive the initial state ${{\boldsymbol{s}}_1} = {\{ {{\boldsymbol{s}}_{m,1} }\} _{\forall m \in \mathcal{M} } }$
(4)  for each global iteration $ k\in \mathcal{K} $ of digital twin-assisted federated learning do
(5)   for each agent $ m \in \mathcal{M} $ do
(6)    Compute the continuous action parameters ${{\boldsymbol{f}}_m}(k)$ according to Eq. (22)
(7)    Select the action ${{\boldsymbol{a}}_{m,k} } = \{ {{\boldsymbol{\varphi}} _m}(k),{ {\boldsymbol{f} }_m}(k)\}$ by the $\varepsilon $-greedy policy:
     ${{\boldsymbol{a}}}_{m,k}=\left\{\begin{aligned}& \text{a sample of distribution } \psi, & \varepsilon \\ & ({{\boldsymbol{\varphi}} }_{m}(k),{\boldsymbol f}_{m}(k)),\ {\varphi }_{m}(k)=\text{arg}\underset{{\boldsymbol{\varphi}} }{\text{max} }\,Q({{\boldsymbol{s}}}_{m,k},{{\boldsymbol{\varphi}} }_{m}(k),{\boldsymbol f}_{m}^{*}(k)), & 1-\varepsilon \end{aligned} \right.$
(8)    Execute the action ${{\boldsymbol{a}}_{m,k}}$, obtain the instantaneous reward ${{\boldsymbol{r}}_{m,k}}$ and the next state $ {{\boldsymbol{s}}_{m,k + 1}} $
(9)    Store the tuple $ [{{\boldsymbol{s}}_{m,k}},{{\boldsymbol{a}}_{m,k}},{{\boldsymbol{r}}_{m,k}},{{\boldsymbol{s}}_{m,k + 1}}] $ in the experience replay buffer ${\mathcal{D}_m}$
(10)    Sample a mini-batch of $B$ data samples from the experience replay buffer ${\mathcal{D}_m}$
(11)    Update the target value $ {y_m}(k) $ of the TQN according to Eq. (19)
(12)    Compute the loss functions $ L({\varpi _{m,{\text{d}}}}) $ and $ L({\varpi _{m,{\text{c}}}}) $ according to Eqs. (20) and (21), respectively
(13)    Update the network parameters $ {\varpi _{m,{\text{d}}}}(k + 1) $ and $ {\varpi _{m,{\text{c}}}}(k + 1) $ according to Eqs. (22) and (23)
(14)    if $(i > {N_{{\text{sam}}}})$ then
(15)     Sample a mini-batch of $B$ data samples from the experience replay buffer $\mathcal{D}$
(16)     Update the parameters ${\varpi _{ {\text{d,me} } } }(k + 1) \leftarrow {\lambda '_{ {\text{me} } } }{\nabla _{ {\varpi _{ {\text{d,me} } } } } }l{\text{(} }\varpi {\text{)} }$ and ${\varpi _{ {\text{c,me} } } }(k + 1) \leftarrow {\lambda '_{ {\text{me} } } }\nabla l{\text{(} }{\varpi _{\text{c} } }{\text{)} }$
(17)     The fusion network distributes the latest parameters to each agent
(18)    end if
(19)   end for
(20)  end for
(21) end for
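The hybrid discrete-continuous action selection in the inner loop of Algorithm 1 can be sketched as follows. This is a minimal illustration in the style of P-DQN, assuming toy stand-ins for the Q-network and the continuous-parameter network; the names `q_net`, `param_net`, and the uniform sampling used in place of the distribution $\psi$ are illustrative, not from the paper.

```python
import numpy as np

def select_hybrid_action(state, q_net, param_net, eps, n_edges, rng):
    """Epsilon-greedy selection of a discrete-continuous hybrid action:
    an edge choice phi (discrete) together with its resource-allocation
    vector f (continuous), as in steps (6)-(7) of Algorithm 1."""
    # Continuous parameters f for every discrete edge choice, shape (n_edges, dim_f)
    f_all = param_net(state)
    if rng.random() < eps:
        # Exploration: sample the whole hybrid action
        # (uniform sampling stands in for the distribution psi)
        phi = int(rng.integers(n_edges))
        f = rng.random(f_all.shape[1])
    else:
        # Exploitation: phi = argmax_phi Q(s, phi, f*(phi)), f = f*(phi)
        q_vals = np.array([q_net(state, k, f_all[k]) for k in range(n_edges)])
        phi = int(np.argmax(q_vals))
        f = f_all[phi]
    return phi, f

# Toy stand-ins: three edge choices, a 2-dimensional resource vector
rng = np.random.default_rng(0)
param_net = lambda s: np.full((3, 2), 0.5)
q_net = lambda s, k, f: float(k == 2)  # edge choice 2 always scores highest
phi, f = select_hybrid_action(np.zeros(4), q_net, param_net,
                              eps=0.0, n_edges=3, rng=rng)
# with eps=0 this deterministically picks phi = 2 and f = [0.5, 0.5]
```

In the full algorithm, `param_net` corresponds to the continuous-parameter network (parameters $\theta_{\text{c}}$) and `q_net` to the discrete Q-network (parameters $\theta_{\text{d}}$); both are trained from replayed transitions as in steps (10)-(13).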