Fast Partition Algorithm in Depth Map Intra-frame Coding Unit Based on Multi-branch Network

LIU Chang, JIA Kebin, LIU Pengyu

Citation: LIU Chang, JIA Kebin, LIU Pengyu. Fast Partition Algorithm in Depth Map Intra-frame Coding Unit Based on Multi-branch Network[J]. Journal of Electronics & Information Technology, 2022, 44(12): 4357-4366. doi: 10.11999/JEIT211010


doi: 10.11999/JEIT211010
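Funds: The National Key Research and Development Project of China (2018YFF01010100), Beijing Natural Science Foundation (4212001), The Basic Research Program of Qinghai Province (2020-ZJ-709, 2021-ZJ-704)
Details
    Author bios:

    LIU Chang: female, Ph.D. candidate; research interest: 3D video coding

    JIA Kebin: male, professor; research interest: multimedia information processing

    LIU Pengyu: female, associate professor; research interest: intelligent media information processing

    Corresponding author:

    JIA Kebin kebinj@bjut.edu.cn

  • CLC number: TN919.81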

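  • Abstract: The 3D High Efficiency Video Coding (3D-HEVC) standard is the latest three-dimensional (3D) video coding standard, but the depth map coding tools it introduces greatly increase coding complexity. In particular, the quadtree partition of depth map intra-frame Coding Units (CUs) accounts for more than 90% of the 3D-HEVC coding complexity. To address the high complexity of CU quadtree partitioning in the 3D-HEVC depth map intra coding mode, this paper proposes a deep-learning-based scheme for fast prediction of the CU partition structure. First, a dataset is constructed for learning the CU partition structure of depth maps. Second, a Multi-Branch Convolutional Neural Network (MB-CNN) model is built to predict the CU partition structure and is trained on the constructed dataset. Finally, the MB-CNN model is embedded into the 3D-HEVC test platform, where it reduces CU partition complexity by directly predicting the CU partition structure in the depth map intra coding mode. Compared with the standard algorithm, coding complexity is reduced by 37.4% on average. Experimental results show that the proposed algorithm effectively reduces the coding complexity of 3D-HEVC without affecting the quality of the synthesized views.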
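
To make the idea of "directly predicting the partition structure" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 64×64 CTU with a smallest CU of 8×8 (depths 0 to 3, as in Table 1) and a hypothetical `should_split` callback that stands in for the output of a trained model such as MB-CNN. A full search over all depths would consider up to 1 + 4 + 16 + 64 = 85 candidate CUs per CTU, whereas a predictor only visits the nodes it decides to split.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64    # largest CU (depth 0)
MIN_CU_SIZE = 8  # smallest CU (depth 3)

CU = Tuple[int, int, int]  # (x, y, size) of a leaf coding unit


def exhaustive_cu_count(ctu_size: int = CTU_SIZE, min_size: int = MIN_CU_SIZE) -> int:
    """Count the candidate CUs a full quadtree search would consider."""
    count, size, per_level = 0, ctu_size, 1
    while size >= min_size:
        count += per_level  # all CUs at this depth
        per_level *= 4      # each CU has four children
        size //= 2
    return count


def partition_ctu(should_split: Callable[[int, int, int], bool],
                  x: int = 0, y: int = 0, size: int = CTU_SIZE) -> List[CU]:
    """Return the leaf CUs selected by a hypothetical split predictor."""
    if size > MIN_CU_SIZE and should_split(x, y, size):
        half = size // 2
        leaves: List[CU] = []
        for dy in (0, half):          # top row first, then bottom row
            for dx in (0, half):      # left to right within a row
                leaves.extend(partition_ctu(should_split, x + dx, y + dy, half))
        return leaves
    return [(x, y, size)]


if __name__ == "__main__":
    # Hypothetical predictor: split the CTU and its top-left 32x32 CU only.
    predictor = lambda x, y, s: s == 64 or (s == 32 and x == 0 and y == 0)
    print(exhaustive_cu_count())
    print(partition_ctu(predictor))
```

The callback signature is an illustrative choice; in a real encoder the decision would come from the embedded model rather than a lambda.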
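  • Figure 1  3D-HEVC coding structure

    Figure 2  Coding time statistics for six standard test sequences

    Figure 3  Quadtree partition process of a CTU in a depth map

    Figure 4  Relationship between CU texture complexity and CU depth

    Figure 5  Architecture of the MB-CNN model

    Figure 6  Flowchart of the fast partition of depth map intra-frame CUs

    Figure 7  Schematic of the synthesized-view PSNR calculation

    Figure 8  Prediction accuracy for CUs of different sizes at different numbers of iterations

    Figure 9  Subjective quality comparison of the Poznan_Hall2 sequence at synthesized view 0.25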

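    Table 1  Relationship between CU depth and QP (%)

    | | Depth = 0 (size = 64×64) | Depth = 1 (size = 32×32) | Depth = 2 (size = 16×16) | Depth = 3 (size = 8×8) |
    |---|---|---|---|---|
    | QP = 22, share of each CU depth | 29.29 | 3.43 | 10.75 | 56.10 |
    | QP = 39, share of each CU depth | 70.72 | 10.25 | 8.87 | 10.17 |
    | Average share | 50.01 | 6.84 | 9.81 | 33.13 |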
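
As a small worked check of Table 1, the sketch below maps each quadtree depth to its CU size and recomputes the "average share" row as the element-wise mean of the QP = 22 and QP = 39 rows. The function names are illustrative only; the percentages are copied from the table.

```python
# Percentages of CUs at depths 0..3, copied from Table 1.
SHARE_QP22 = [29.29, 3.43, 10.75, 56.10]
SHARE_QP39 = [70.72, 10.25, 8.87, 10.17]


def cu_size(depth: int, ctu_size: int = 64) -> int:
    """Side length of a CU at the given quadtree depth (halves per level)."""
    return ctu_size >> depth


def mean_share(row_a, row_b):
    """Element-wise mean of two rows of percentages."""
    return [(a + b) / 2 for a, b in zip(row_a, row_b)]


if __name__ == "__main__":
    for depth, share in enumerate(mean_share(SHARE_QP22, SHARE_QP39)):
        print(f"depth {depth}: {cu_size(depth)}x{cu_size(depth)}, mean share {share:.3f}%")
```

The recomputed means agree with the table's last row to within rounding.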

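    Table 2  Dataset constructed in this paper

    | Dataset type | Sequence | Resolution | Frame range | Number of samples |
    |---|---|---|---|---|
    | Training set | Kendo | 1024×768 | 0~299 | 57600 |
    | Training set | GT_Fly | 1920×1088 | 0~249 | 127500 |
    | Validation set | Balloons | 1024×768 | 290~299 | 1920 |
    | Validation set | Poznan_Hall2 | 1920×1088 | 210~219 | 5100 |
    | Test set | Newspaper | 1024×768 | 280~299 | 3840 |
    | Test set | Undo_Dancer | 1920×1088 | 230~249 | 10200 |
    | Total samples | | | | 206160 |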
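
The sample counts in Table 2 are numerically consistent with one sample per 64×64 CTU per frame. The page does not state this explicitly, so the sketch below treats it as an inference from the numbers rather than a documented fact; the helper names are hypothetical.

```python
# (sequence, width, height, first_frame, last_frame, samples) from Table 2.
DATASET = [
    ("Kendo",        1024,  768,   0, 299,  57600),
    ("GT_Fly",       1920, 1088,   0, 249, 127500),
    ("Balloons",     1024,  768, 290, 299,   1920),
    ("Poznan_Hall2", 1920, 1088, 210, 219,   5100),
    ("Newspaper",    1024,  768, 280, 299,   3840),
    ("Undo_Dancer",  1920, 1088, 230, 249,  10200),
]


def ctus_per_frame(width: int, height: int, ctu: int = 64) -> int:
    """Number of whole 64x64 CTUs that tile one frame."""
    return (width // ctu) * (height // ctu)


def sample_count(width: int, height: int, first: int, last: int) -> int:
    """Assumed rule: one sample per CTU for every frame in [first, last]."""
    return (last - first + 1) * ctus_per_frame(width, height)


if __name__ == "__main__":
    for name, w, h, first, last, reported in DATASET:
        print(name, sample_count(w, h, first, last), reported)
    print("total:", sum(row[-1] for row in DATASET))
```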

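    Table 3  Composition of a training sample

    | Depth | Split: 0; not split: 1 |
    |---|---|
    | 0 | 1 |
    | 1 | 1011 |
    | 2 | 0000, 0000, 1010, 0010 |
    | 3 | The smallest CU is 8×8; no further splitting |
    | Composition | 1, 1011, 0000, 0000, 1010, 0010 |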
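
The composition row of Table 3 packs 1 + 4 + 16 = 21 flags into one label: one flag for the 64×64 CTU, four for the 32×32 CUs, and four groups of four for the 16×16 CUs. The sketch below only separates such a label into its per-depth parts. It assumes the flags appear in the order listed in the table and does not interpret what 0 and 1 mean; the function names are illustrative.

```python
from typing import List, Tuple


def parse_flags(text: str) -> List[int]:
    """Extract the 0/1 flags from a string such as '1, 1011, 0000, ...'."""
    return [int(ch) for ch in text if ch in "01"]


def split_label(flags: List[int]) -> Tuple[int, List[int], List[List[int]]]:
    """Separate a 21-flag label into depth-0, depth-1 and depth-2 parts."""
    if len(flags) != 21:
        raise ValueError("expected 1 + 4 + 16 = 21 flags")
    depth0 = flags[0]        # one flag for the 64x64 CTU
    depth1 = flags[1:5]      # four flags, one per 32x32 CU
    # Four groups of four flags, one group per 32x32 CU.
    depth2 = [flags[5 + 4 * i: 9 + 4 * i] for i in range(4)]
    return depth0, depth1, depth2


if __name__ == "__main__":
    label = parse_flags("1, 1011, 0000, 0000, 1010, 0010")
    print(split_label(label))
```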

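    Table 4  Experimental environment

    | Category | Name | Model / version |
    |---|---|---|
    | Hardware | Processor | Intel(R) Xeon(R) CPU E31230 @ 3.20 GHz |
    | Hardware | Memory | 8.00 GB RAM |
    | Hardware | Graphics adapter | NVIDIA Quadro K2000 |
    | Software | Operating system | Windows 10 |
    | Software | Python | 3.5 |
    | Software | TensorFlow | 1.4.0 |
    | Software | CUDA | 8.0 |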

    Table 5  Coding parameter configuration

    | Coding parameter | Value |
    |---|---|
    | Max CU Width | 64 |
    | Max CU Height | 64 |
    | Max Partition Depth | 4 |
    | GOPSize | 1 |
    | QP (texture, depth) | {(25, 34), (30, 39), (35, 42), (40, 45)} |

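    Table 6  Standard test sequences and their parameters

    | Sequence | Resolution | Frame rate (fps) | Views |
    |---|---|---|---|
    | Balloons | 1024×768 | 30 | 3 1 5 |
    | Newspaper | 1024×768 | 30 | 4 2 6 |
    | Poznan_Hall2 | 1920×1088 | 25 | 6 7 5 |
    | Poznan_Street | 1920×1088 | 25 | 4 5 3 |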

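    Table 7  Time savings of the proposed algorithm and the reference algorithms relative to HTM16.0 (%)

    | Sequence | Ref. [10], $\Delta {T_2}$ | Ref. [12], $\Delta {T_3}$ | Ref. [16], $\Delta {T_4}$ | Proposed, $\Delta {T_1}$ |
    |---|---|---|---|---|
    | Balloons | 25.9 | 20.2 | 31.9 | 33.1 |
    | Newspaper | 26.3 | 14.7 | 35.5 | 45.3 |
    | Poznan_Hall2 | 25.9 | 40.6 | 35.9 | 36.7 |
    | Poznan_Street | 24.0 | 25.4 | 36.7 | 34.7 |
    | Average (resolution: 1024×768) | 26.1 | 17.5 | 33.7 | 39.2 |
    | Average (resolution: 1920×1088) | 25.0 | 33.0 | 36.3 | 35.6 |
    | Average | 25.5 | 25.3 | 35.0 | 37.4 |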
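
This page does not define $\Delta T$. The sketch below assumes the definition commonly used in fast-encoding work, namely the encoding time saved relative to the reference encoder, expressed as a percentage. It also recomputes the overall average of the proposed algorithm's column from the four per-sequence values in Table 7. The function name and the timings in the usage lines are hypothetical.

```python
def time_saving(t_reference: float, t_fast: float) -> float:
    """Assumed definition: percentage of reference encoding time saved."""
    if t_reference <= 0:
        raise ValueError("reference time must be positive")
    return (t_reference - t_fast) / t_reference * 100.0


# Per-sequence time savings of the proposed algorithm, copied from Table 7.
PROPOSED = {
    "Balloons": 33.1,
    "Newspaper": 45.3,
    "Poznan_Hall2": 36.7,
    "Poznan_Street": 34.7,
}


def mean(values) -> float:
    """Arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical timings: 100 s with the reference encoder, 62.6 s with a fast method.
    print(time_saving(100.0, 62.6))
    print(mean(PROPOSED.values()))
```

Averages recomputed from the rounded table entries can differ from the reported averages by about 0.1, presumably because the paper averaged unrounded values.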

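    Table 8  Rate-distortion performance of the proposed algorithm relative to HTM16.0 (%)

    | Sequence | Texture video 0 | Texture video 1 | Texture video 2 | Texture video PSNR / texture video bitrate | Texture video PSNR / total bitrate | Synthesized view PSNR / total bitrate |
    |---|---|---|---|---|---|---|
    | Balloons | 0 | 0 | 0 | 0 | 0.4 | 7.7 |
    | Newspaper | 0 | 0 | 0 | 0 | 0.3 | 4.4 |
    | Poznan_Hall2 | 0 | 0 | 0 | 0 | 0 | 6.2 |
    | Poznan_Street | 0 | 0 | 0 | 0 | –0.1 | 5.4 |
    | 1024×768 | 0 | 0 | 0 | 0 | 0.4 | 6.0 |
    | 1920×1088 | 0 | 0 | 0 | –0.4 | –0.1 | 5.8 |
    | Average | 0 | 0 | 0 | 0 | 0.2 | 5.9 |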
  • [1] LIU Shan, LIU Lu, YANG Hua, et al. Research on 5G technology based on Internet of things[C]. 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 2020: 1821–1823.
    [2] KUFA J and KRATOCHVIL T. Visual quality assessment considering ultra HD, Full HD resolution and viewing distance[C]. The 29th International Conference Radioelektronika, Pardubice, Czech Republic, 2019: 1–4.
    [3] LI Tiansong, YU Li, WANG Hongkui, et al. A bit allocation method based on inter-view dependency and spatio-temporal correlation for multi-view texture video coding[J]. IEEE Transactions on Broadcasting, 2021, 67(1): 159–173. doi: 10.1109/TBC.2020.3028340
    [4] WANG Li, CAO Yifan, DU Gaoming, et al. A low-latency depth modelling mode-1 encoder in 3D-high efficiency video coding standard[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1625–1632. doi: 10.11999/JEIT180798
    [5] CHEN Ying, HANNUKSELA M M, SUZUKI T, et al. Overview of the MVC + D 3D video coding standard[J]. Journal of Visual Communication and Image Representation, 2014, 25(4): 679–688. doi: 10.1016/j.jvcir.2013.03.013
    [6] TIAN Shishun, ZHANG Lu, ZOU Wenbin, et al. Quality assessment of DIBR-synthesized views: An overview[J]. Neurocomputing, 2021, 423: 158–178. doi: 10.1016/j.neucom.2020.09.062
    [7] QI Meibin, CHEN Xiuli, YANG Yanfang, et al. Fast coding unit splitting algorithm for high efficiency video coding intra prediction[J]. Journal of Electronics & Information Technology, 2014, 36(7): 1699–1705. doi: 10.3724/SP.J.1146.2013.01148
    [8] ZUO Jiabao, CHEN Jing, ZENG Huanqiang, et al. Bi-layer texture discriminant fast depth intra coding for 3D-HEVC[J]. IEEE Access, 2019, 7: 34265–34274. doi: 10.1109/ACCESS.2019.2897161
    [9] LI Tiansong, WANG Hongkui, CHEN Yamei, et al. Fast depth intra coding based on spatial correlation and rate distortion cost in 3D-HEVC[J]. Signal Processing: Image Communication, 2020, 80: 115668. doi: 10.1016/j.image.2019.115668
    [10] LI Tiansong, YU Li, WANG Shengwei, et al. Simplified depth intra coding based on texture feature and spatial correlation in 3D-HEVC[C]. 2018 Data Compression Conference, Snowbird, USA, 2018: 421.
    [11] SALDANHA M, SANCHEZ G, MARCON C, et al. Fast 3D-HEVC depth map encoding using machine learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(3): 850–861. doi: 10.1109/TCSVT.2019.2898122
    [12] FU Changhong, CHEN Hao, CHAN Y L, et al. Fast depth intra coding based on decision tree in 3D-HEVC[J]. IEEE Access, 2019, 7: 173138–173147. doi: 10.1109/ACCESS.2019.2956994
    [13] SALDANHA M, SANCHEZ G, MARCON C, et al. Fast 3D-HEVC depth maps intra-frame prediction using data mining[C]. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018: 1738–1742.
    [14] XU Mai, LI Tianyi, WANG Zulin, et al. Reducing complexity of HEVC: A deep learning approach[J]. IEEE Transactions on Image Processing, 2018, 27(10): 5044–5059. doi: 10.1109/TIP.2018.2847035
    [15] TANG Genwei, JING Minge, ZENG Xiaoyang, et al. Adaptive CU split decision with pooling-variable CNN for VVC intra encoding[C]. 2019 IEEE Visual Communications and Image Processing (VCIP), Sydney, Australia, 2019: 1–4.
    [16] LI Yating and YANG Jing. Fast intra coding algorithm for depth map in 3D-HEVC[J]. Journal of Optoelectronics Laser, 2020, 31(2): 222–228. doi: 10.16136/j.joel.2020.02.0344
    [17] XIE Saining and TU Zhuowen. Holistically-nested edge detection[J]. International Journal of Computer Vision, 2017, 125(1/3): 3–18. doi: 10.1007/s11263-017-1004-z
    [18] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, USA, 2015: 1–14.
    [19] Tanimoto Lab. Nagoya University multi-view sequences download list[EB/OL]. https://www.fujii.nuee.nagoya-u.ac.jp/multiview-data/, 2017.
    [20] FENG Zeqi, LIU Pengyu, JIA Kebin, et al. Fast intra CTU depth decision for HEVC[J]. IEEE Access, 2018, 6: 45262–45269. doi: 10.1109/ACCESS.2018.2864881
    [21] JCT-3V. 3D-HEVC reference software[EB/OL]. https://mpeg.chiariglione.org/standards/mpeg-h/hevc-reference-software.
    [22] BJONTEGAARD G. Calculation of average PSNR differences between RD curves[C]. The 13th Video Coding Experts Group Meeting, Austin, USA, 2001: VCEG-M33.
Publication history
  • Received:  2021-09-23
  • Revised:  2021-12-01
  • Accepted:  2021-12-06
  • Published online:  2021-12-11
  • Issue published:  2022-12-16
