Adaptive Streaming of Stereoscopic Panoramic Video Based on Reinforcement Learning

LAN Chengdong, RAO Yingjie, SONG Caixia, CHEN Jian

Citation: LAN Chengdong, RAO Yingjie, SONG Caixia, CHEN Jian. Adaptive Streaming of Stereoscopic Panoramic Video Based on Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2022, 44(4): 1461-1468. doi: 10.11999/JEIT200908

doi: 10.11999/JEIT200908
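Funding: National Natural Science Foundation of China (62001117); Natural Science Foundation of Fujian Province (2017J01757)
Details
    About the authors:

    LAN Chengdong: male, born in 1981, associate professor. Research interests: video coding and processing, artificial intelligence, multimedia network transmission

    RAO Yingjie: male, born in 1994, master's student. Research interests: multimedia network transmission, panoramic video coding and decoding, machine learning

    SONG Caixia: female, born in 1996, master's student. Research interests: image reconstruction, panoramic video coding and decoding, deep learning

    CHEN Jian: female, born in 1981, associate professor. Research interests: video coding and processing

    Corresponding author:

    CHEN Jian chenjian-fzu@163.com

  • CLC number: TN919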

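  • Abstract: Stereoscopic panoramic video transmission currently lacks an effective stream adaptation method, and applying traditional panoramic video stream adaptation strategies to binocular stereoscopic panoramic video doubles the transmitted data and demands very large bandwidth. To address these problems, this paper proposes an asymmetric-transmission adaptive streaming method for stereoscopic panoramic video based on multi-agent reinforcement learning, which responds to network bandwidth fluctuations in real time. First, because the human eye favors salient regions of a video, each tile in the left and right views contributes differently to the perceived quality of the stereoscopic video; a tile-based method for predicting the viewing probability of the left and right views is therefore proposed. Second, a multi-agent reinforcement learning framework based on Actor-Critic is designed to perform joint rate control over the left and right views. Finally, a reward function is designed according to the model structure and the principle of binocular suppression. Experimental results show that, compared with traditional adaptive streaming strategies, the proposed method is better suited to tile-based stereoscopic panoramic video transmission and improves the user's Quality of Experience (QoE) under limited bandwidth, providing a new approach to joint rate control for stereoscopic panoramic video.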
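    The abstract does not give the reward function itself, so the following Python sketch is only a hypothetical illustration of how viewing probabilities and binocular suppression could enter a per-segment reward. The names StepOutcome, perceived_quality, viewport_quality and reward, the dominant-view weight of 0.7, and the penalty coefficients are all assumptions made for this example, not values from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StepOutcome:
    """Result of downloading one video segment (hypothetical structure)."""
    left_quality: Sequence[float]   # per-tile quality level of the left view
    right_quality: Sequence[float]  # per-tile quality level of the right view
    view_prob: Sequence[float]      # per-tile predicted viewing probability
    rebuffer_s: float               # stall time caused by this segment, seconds


def perceived_quality(left: float, right: float, dominant_weight: float = 0.7) -> float:
    """Assumed binocular-suppression model: the better view dominates perception."""
    hi, lo = max(left, right), min(left, right)
    return dominant_weight * hi + (1.0 - dominant_weight) * lo


def viewport_quality(outcome: StepOutcome) -> float:
    """Probability-weighted average of the perceived quality over all tiles."""
    total_prob = sum(outcome.view_prob)
    if total_prob <= 0:
        raise ValueError("viewing probabilities must sum to a positive value")
    weighted = sum(
        p * perceived_quality(l, r)
        for p, l, r in zip(outcome.view_prob, outcome.left_quality, outcome.right_quality)
    )
    return weighted / total_prob


def reward(outcome: StepOutcome, prev_quality: float,
           rebuffer_penalty: float = 4.0, smooth_penalty: float = 1.0) -> float:
    """QoE-style reward: quality minus stall and quality-switch penalties (assumed form)."""
    quality = viewport_quality(outcome)
    return (quality
            - rebuffer_penalty * outcome.rebuffer_s
            - smooth_penalty * abs(quality - prev_quality))


# Asymmetric example: the likely-viewed tile gets a high-quality left view only.
example = StepOutcome(left_quality=[4.0, 2.0], right_quality=[2.0, 2.0],
                      view_prob=[0.75, 0.25], rebuffer_s=0.5)
```

    In a multi-agent Actor-Critic setup, a shared reward of this kind would be one way to let the left-view and right-view agents be scored jointly; whether the paper does so in this form is not stated in the abstract.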
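  • Fig. 1  System architecture of DASH-based stereoscopic panoramic video streaming

    Fig. 2  Tile-based viewpoint prediction probability model

    Fig. 3  Algorithm structure

    Fig. 4  4G and 5G bandwidth traces

    Fig. 5  Performance comparison of the algorithms

    Fig. 6  CDF comparison of the algorithms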

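    Table 1  Timing test and viewpoint prediction accuracy

    Method      Static saliency extraction  Dynamic saliency extraction  Disparity extraction  Total time  Prediction accuracy
    Plato       -                           -                            -                     67.4 ms     0.89
    This paper  4.2 ms                      10.3 ms                      23.7 ms               121.6 ms    0.91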
Publication history
  • Received:  2020-10-23
  • Revised:  2022-01-05
  • Accepted:  2022-01-14
  • Published online:  2022-02-02
  • Issue date:  2022-04-18
