A Satellite Edge Network Service Function Chain Deployment Method Based on Natural Gradient Actor-Critic Reinforcement Learning

GAO Yuan, FANG Hai, ZHAO Yang, YANG Xu

Citation: GAO Yuan, FANG Hai, ZHAO Yang, YANG Xu. A Satellite Edge Network Service Function Chain Deployment Method Based on Natural Gradient Actor-Critic Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2023, 45(2): 455-463. doi: 10.11999/JEIT211384

doi: 10.11999/JEIT211384
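Funds: The National Key Research and Development Program of China (2020YFB1808003)
    About the authors:

    GAO Yuan: female, M.S., engineer. Research interest: satellite edge computing

    FANG Hai: male, Ph.D., senior engineer. Research interests: satellite edge computing and software-defined payloads

    ZHAO Yang: male, Ph.D., senior engineer. Research interest: satellite communications

    YANG Xu: male, M.S., research fellow. Research interest: software-defined satellites

    Corresponding author:

    GAO Yuan gaoy199034@126.com

  • CLC number: TN927.2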

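  • Abstract: Given the high dynamics of Low Earth Orbit (LEO) satellite networks and the complexity of the space environment, providing a fast online Service Function Chain (SFC) deployment method is a pressing problem in LEO satellite edge networks. Taking into account constraints such as node and link capacity, as well as handover costs such as service migration, this paper proposes an online SFC deployment method based on a natural gradient Actor-Critic reinforcement learning architecture for LEO satellite networks equipped with Multi-access Edge Computing (MEC) servers. First, the real-time capacity constraints and the migration cost are modeled to reflect the highly dynamic environment of LEO satellite networks. Second, a Markov Decision Process (MDP) is introduced to describe the state transition process of the LEO satellite network, accounting for factors such as service migration and satellite coordinates. Finally, a natural-gradient-based reinforcement learning method for online SFC deployment is proposed. Unlike the standard gradient, the natural gradient method performs updates at the model level, which helps keep neural network training from getting trapped in local optima. Simulation results show that the proposed method approaches the global optimum and outperforms the standard-gradient reinforcement learning deployment method in end-to-end delay.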
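    The contrast the abstract draws between the standard gradient and the natural gradient can be illustrated with a small numerical sketch. This is not the paper's implementation. It assumes a hypothetical 3-action softmax policy with one logit per action, an invented reward vector, and a small damping term (needed because the Fisher matrix of a softmax policy is singular). It relies only on the textbook definition of the natural gradient as the standard gradient premultiplied by the inverse Fisher information matrix of the policy.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def standard_gradient(theta, rewards):
    """Exact gradient of J(theta) = sum_a pi(a) * r(a) for a softmax policy."""
    pi = softmax(theta)
    expected = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - expected) for p, r in zip(pi, rewards)]

def fisher_matrix(theta):
    """Fisher information F = E_a[g_a g_a^T] with g_a = grad log pi(a).

    For a softmax over logits this reduces to diag(pi) - pi pi^T.
    """
    pi = softmax(theta)
    n = len(pi)
    return [[(pi[i] if i == j else 0.0) - pi[i] * pi[j] for j in range(n)]
            for i in range(n)]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def natural_gradient(theta, rewards, damping=1e-3):
    """Natural gradient (F + damping * I)^-1 grad J.

    The damping value is an assumption of this sketch, not a paper setting.
    """
    g = standard_gradient(theta, rewards)
    fisher = fisher_matrix(theta)
    for i in range(len(g)):
        fisher[i][i] += damping
    return solve(fisher, g)

theta = [2.0, 0.0, -2.0]    # hypothetical logits: action 0 is heavily favoured
rewards = [0.0, 0.5, 1.0]   # hypothetical rewards: action 2 is actually the best
g = standard_gradient(theta, rewards)
ng = natural_gradient(theta, rewards)
print("standard gradient:", g)
print("natural gradient: ", ng)
```

    In this toy case the standard gradient puts its largest component on action 1, because the current policy almost never selects action 2 and so scales its contribution down. The natural gradient removes that dependence on the current action probabilities and ranks the three actions by reward. How the paper builds its Fisher matrix for a neural-network actor is not shown on this page.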
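  • Figure 1  Schematic of SFC deployment and migration

    Figure 2  Effect of learning rate and mini-batch size on the average reward

    Figure 3  Effect of the number of service requests on end-to-end delay

    Figure 4  Effect of the number of SFs per SFC on end-to-end delay

    Figure 5  Effect of the number of satellite nodes per polar orbit on end-to-end delay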

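    Algorithm 1  Natural-gradient-based Actor-Critic algorithm
     Input: current network state ${{\boldsymbol{s}}_t}$
     Output: online SFC deployment result ${{\boldsymbol{a}}_t}$
     (1) Initialize the neural network parameters $ {\boldsymbol{w}} $, $ {\boldsymbol{w}}' $, $ {\boldsymbol{\theta}} $, $ {\boldsymbol{\theta}} ' $ and the experience replay pool $\mathcal{D}$
     (2) for episode = 1, 2,···, E do
     (3) Reset the environment and set ${r_0} = 0$
     (4) for t = 0, 1,···, T–1 do
     (5)   Sample ${{\boldsymbol{a}}_t}$ at random from the Actor's policy ${\pi _{\boldsymbol{\theta}} }( \cdot |{{\boldsymbol{s}}_t})$
     (6)   Solve the relaxed problem of Eq. (15) to obtain the reward ${r_t}$, and observe the next state ${{\boldsymbol{s}}_{t + 1}}$ that the system transitions to
     (7)   Store the 4-tuple $({{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t},{r_t},{{\boldsymbol{s}}_{t + 1}})$ in the experience replay pool $\mathcal{D}$
     (8)   Randomly draw a mini-batch of $D$ samples from $\mathcal{D}$
     (9)   Update the parameters $ {\boldsymbol{w}} $ and $ {\boldsymbol{\theta}} $ in turn according to Eqs. (20) and (27)
     (10)   Update $ {\boldsymbol{w}}' $ and ${\boldsymbol{ \theta}} ' $ at the configured update frequency
     (11) end
     (12) end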
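
    The control flow of Algorithm 1 can be sketched in plain Python as follows. This is a loose illustration under stated assumptions, not the authors' code. Eqs. (15), (20) and (27) are not reproduced on this page, so each is replaced by a stand-in: `env_step` is a made-up toy environment instead of the relaxed problem of Eq. (15), the critic is a table updated by a temporal-difference rule instead of Eq. (20), and the actor step adds the advantage to the logits, a form often quoted for the natural gradient of a tabular softmax policy, instead of Eq. (27). All names, sizes and hyperparameters are hypothetical.

```python
import math
import random

N_STATES, N_ACTIONS = 4, 3   # hypothetical sizes, not taken from the paper
GAMMA = 0.9                  # hypothetical discount factor

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def env_step(state, action):
    """Toy stand-in for step (6). The paper solves the relaxed problem of
    Eq. (15) to get the reward; here the reward is a made-up negative delay."""
    reward = -abs(action - state % N_ACTIONS)
    next_state = (state + 1) % N_STATES   # the topology moves to the next slot
    return reward, next_state

def train(episodes=200, horizon=20, batch_size=16, lr_critic=0.2,
          lr_actor=0.1, target_period=10, seed=0):
    rng = random.Random(seed)
    # Step (1): critic w, actor theta, their target copies, and the replay pool.
    w = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    w_target = [row[:] for row in w]
    theta_target = [row[:] for row in theta]
    replay = []
    updates = 0
    for _ in range(episodes):                      # step (2)
        state = 0                                  # step (3): reset environment
        for _ in range(horizon):                   # step (4)
            probs = softmax(theta[state])
            action = rng.choices(range(N_ACTIONS), weights=probs)[0]  # step (5)
            reward, next_state = env_step(state, action)              # step (6)
            replay.append((state, action, reward, next_state))        # step (7)
            batch = rng.sample(replay, min(batch_size, len(replay)))  # step (8)
            # Step (9), critic first: TD update toward the target networks
            # (assumed stand-in for Eq. (20)).
            for s, a, r, s2 in batch:
                pi2 = softmax(theta_target[s2])
                target = r + GAMMA * sum(p * q for p, q in zip(pi2, w_target[s2]))
                w[s][a] += lr_critic * (target - w[s][a])
            # Step (9), then actor: add the advantage to the logits
            # (assumed stand-in for Eq. (27)).
            for s in {s for s, _, _, _ in batch}:
                pi = softmax(theta[s])
                value = sum(p * q for p, q in zip(pi, w[s]))
                for a in range(N_ACTIONS):
                    theta[s][a] += lr_actor * (w[s][a] - value)
            updates += 1
            if updates % target_period == 0:       # step (10)
                w_target = [row[:] for row in w]
                theta_target = [row[:] for row in theta]
            state = next_state
    return theta

theta = train()
greedy = [max(range(N_ACTIONS), key=lambda a: theta[s][a]) for s in range(N_STATES)]
print("greedy action per state:", greedy)
```

    In this toy environment the zero-delay action in each state is `state % N_ACTIONS`, so the printed greedy actions can be checked against that by eye. Using the target actor inside the critic target is a design choice of this sketch; the page does not say how the paper uses $ {\boldsymbol{\theta}} ' $.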
Publication history
  • Received:  2021-11-30
  • Revised:  2022-06-06
  • Accepted:  2022-06-22
  • Available online:  2022-06-28
  • Published:  2023-02-07