A SDN Routing Optimization Mechanism Based on Deep Reinforcement Learning

Julong LAN, Changhe YU, Yuxiang HU, Ziyong LI

Citation: Julong LAN, Changhe YU, Yuxiang HU, Ziyong LI. A SDN Routing Optimization Mechanism Based on Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2019, 41(11): 2669-2674. doi: 10.11999/JEIT180870

doi: 10.11999/JEIT180870
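Details
    About the authors:

    Julong LAN: male, born in 1962, professor and doctoral supervisor. His main research interests are novel network architectures and network security.

    Changhe YU: male, born in 1993, master's degree. His research interests are novel network architectures and network security.

    Corresponding author:

    Changhe YU yu_changhe@hotmail.com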

  • CLC number: TP393

A SDN Routing Optimization Mechanism Based on Deep Reinforcement Learning

Funds: The National Natural Science Foundation of China for Innovative Research Groups (61521003), The National Natural Science Foundation of China (61502530)
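  • Abstract: To optimize route selection in Software-Defined Networking (SDN), this paper introduces deep reinforcement learning into the SDN route selection process and proposes a routing optimization mechanism based on deep reinforcement learning. The mechanism aims to reduce network delay and increase throughput, to achieve black-box optimization in continuous time, and to lower network operation and maintenance costs. The proposed mechanism is evaluated experimentally. The results show that it converges well and is effective, and that, compared with traditional routing protocols, it provides better routing schemes and more stable performance.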
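  • Figure 1  SDN architecture with an added machine learning mechanism

    Figure 2  DDPG training and execution framework

    Figure 3  Framework design for DDPG-based optimization of SDN routing

    Figure 4  Network delay versus training steps under different traffic intensities

    Figure 5  Comparison of the DDPG agent with random routing

    Figure 6  Comparison of network delay between DDPG and OSPF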

Publication history
  • Received:  2018-09-06
  • Revised:  2019-05-12
  • Published online:  2019-05-27
  • Issue date:  2019-11-01
