Fast Prediction Algorithm in High Efficiency Video Coding Intra-mode Based on Deep Feature Learning
-
Abstract: Compared with the H.264/AVC standard, High Efficiency Video Coding (HEVC) improves compression efficiency, but the quad-tree partition structure of its coding units also greatly increases encoding complexity. To address this, a Multi-Layer Feature Transfer Convolutional Neural Network (MLFT-CNN) is proposed that predicts the Coding Unit (CU) partition representation vector in HEVC intra coding mode and thereby greatly reduces encoding complexity. First, a reduced-resolution feature extraction module that incorporates CU partition structure information is proposed. Second, the channel attention mechanism is improved to strengthen the texture expressiveness of the features. Third, a feature transfer mechanism is designed so that the partition features of high-depth CUs guide the partitioning of low-depth CUs. Finally, an objective loss function with a segmented feature representation is established, and an end-to-end network for predicting the CU partition representation vector is trained. Experimental results show that the proposed algorithm effectively reduces the encoding complexity of HEVC without affecting video coding quality: compared with the standard method, encoding complexity on the standard test sequences is reduced by 70.96% on average.
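To make the source of the complexity concrete, the following minimal sketch counts how many CUs an exhaustive quad-tree search visits inside one coding tree unit (CTU), and how many split decisions a predictor would have to output instead. It assumes a 64×64 CTU and an 8×8 minimum CU size, which are common HEVC settings but are not stated in the abstract; the function names are hypothetical, and the layout of the paper's actual representation vector may differ.

```python
def cu_sizes(ctu_size=64, min_cu_size=8):
    """Return the CU edge length at each quad-tree depth, largest first."""
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2  # each split halves the edge length
    return sizes


def candidate_cus_per_ctu(ctu_size=64, min_cu_size=8):
    """Count every CU an exhaustive search evaluates within one CTU."""
    return sum((ctu_size // s) ** 2 for s in cu_sizes(ctu_size, min_cu_size))


def split_flags_per_ctu(ctu_size=64, min_cu_size=8):
    """Count split/no-split decisions: one per CU that can still be split."""
    splittable = cu_sizes(ctu_size, min_cu_size)[:-1]  # smallest CU cannot split
    return sum((ctu_size // s) ** 2 for s in splittable)


if __name__ == "__main__":
    print("CU sizes by depth:", cu_sizes())
    print("CUs visited by exhaustive search:", candidate_cus_per_ctu())
    print("Split decisions to predict:", split_flags_per_ctu())
```

Under these assumptions the exhaustive search visits 1 + 4 + 16 + 64 = 85 CUs per CTU, whereas a predictor only needs 1 + 4 + 16 = 21 split decisions, which is where a learned partition predictor saves time.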
-
Key words:
- High Efficiency Video Coding (HEVC)
- Complexity reduction
- Deep learning
- Intra coding
-
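Table 1. Results on the image test sequences

| Dataset | Resolution | Method | BD-BR (%) | BD-PSNR (dB) | $\Delta T$ (%), QP=22 | $\Delta T$ (%), QP=27 | $\Delta T$ (%), QP=32 | $\Delta T$ (%), QP=37 |
|---|---|---|---|---|---|---|---|---|
| CPH-Intra | 768×512 | Ref. [9] | 5.113 | -0.343 | -59.43 | -54.70 | -48.74 | -44.83 |
| CPH-Intra | 768×512 | Ref. [14] | 2.885 | -0.210 | -54.97 | -58.78 | -61.78 | -64.41 |
| CPH-Intra | 768×512 | Proposed | 1.71 | -0.116 | -65.18 | -72.01 | -72.07 | -74.83 |
| CPH-Intra | 1536×1024 | Ref. [9] | 6.002 | -0.374 | -58.94 | -54.85 | -50.57 | -50.95 |
| CPH-Intra | 1536×1024 | Ref. [14] | 3.134 | -0.208 | -55.84 | -59.46 | -62.43 | -64.17 |
| CPH-Intra | 1536×1024 | Proposed | 1.63 | -0.113 | -66.98 | -72.21 | -71.10 | -74.18 |
| CPH-Intra | 2880×1920 | Ref. [9] | 4.035 | -0.207 | -57.03 | -52.79 | -52.31 | -59.51 |
| CPH-Intra | 2880×1920 | Ref. [14] | 2.130 | -0.115 | -59.95 | -63.14 | -68.07 | -69.46 |
| CPH-Intra | 2880×1920 | Proposed | 1.3278 | -0.075 | -70.47 | -74.67 | -75.82 | -77.59 |
| CPH-Intra | 4928×3264 | Ref. [9] | 4.630 | -0.209 | -58.02 | -62.74 | -65.30 | -67.46 |
| CPH-Intra | 4928×3264 | Ref. [14] | 1.863 | -0.086 | -61.43 | -65.27 | -68.70 | -71.00 |
| CPH-Intra | 4928×3264 | Proposed | 2.085 | 0.080 | -71.42 | -74.89 | -78.11 | -79.63 |
| Standard deviation | | Ref. [9] | 0.831 | 0.088 | 1.06 | 4.41 | 7.52 | 9.89 |
| Standard deviation | | Ref. [14] | 0.604 | 0.064 | 3.13 | 3.07 | 3.64 | 3.49 |
| Standard deviation | | Proposed | 0.312 | 0.093 | 2.93 | 1.55 | 3.27 | 2.53 |
| Best value | | Ref. [9] | 4.035 | -0.207 | -59.43 | -62.74 | -65.30 | -67.46 |
| Best value | | Ref. [14] | 1.863 | -0.086 | -61.43 | -65.27 | -68.70 | -71.00 |
| Best value | | Proposed | 1.328 | -0.075 | -71.42 | -74.89 | -78.11 | -79.63 |
| Mean | | Ref. [9] | 4.945 | -0.284 | -58.36 | -56.27 | -54.23 | -55.69 |
| Mean | | Ref. [14] | 2.353 | -0.155 | -58.05 | -61.66 | -65.25 | -67.26 |
| Mean | | Proposed | 1.688 | -0.096 | -68.51 | -73.45 | -74.28 | -76.56 |

Table 2. Results on the HEVC standard test sequences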
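
| Class | Sequence | Method | BD-BR (%) | BD-PSNR (dB) | $\Delta T$ (%), QP=22 | $\Delta T$ (%), QP=27 | $\Delta T$ (%), QP=32 | $\Delta T$ (%), QP=37 |
|---|---|---|---|---|---|---|---|---|
| A | PeopleOnStreet | Ref. [9] | 9.627 | -0.492 | -52.12 | -50.63 | -37.79 | -34.81 |
| A | PeopleOnStreet | Ref. [14] | 3.969 | -0.209 | -50.79 | -53.87 | -56.58 | -61.15 |
| A | PeopleOnStreet | Proposed | 3.679 | -0.216 | -63.91 | -67.38 | -68.78 | -70.86 |
| A | Traffic | Ref. [9] | 6.411 | -0.304 | -37.11 | -25.36 | -19.63 | -33.38 |
| A | Traffic | Ref. [14] | 4.945 | -0.240 | -53.86 | -59.08 | -63.54 | -66.88 |
| A | Traffic | Proposed | 3.225 | -0.178 | -75.45 | -77.96 | -79.7 | -81.12 |
| B | Cactus | Ref. [9] | 7.533 | -0.248 | -38.37 | -40.83 | -43.61 | -51.23 |
| B | Cactus | Ref. [14] | 6.021 | -0.208 | -58.18 | -61.01 | -64.94 | -67.78 |
| B | Cactus | Proposed | 3.634 | -0.141 | -69.24 | -74.67 | -74.12 | -73.69 |
| B | ParkScene | Ref. [9] | 3.630 | -0.149 | -41.69 | -44.79 | -59.98 | -64.92 |
| B | ParkScene | Ref. [14] | 3.417 | -0.135 | -60.27 | -65.10 | -68.57 | -70.16 |
| B | ParkScene | Proposed | 2.561 | -0.113 | -65.03 | -70.62 | -70.45 | -71.46 |
| C | BQMall | Ref. [9] | 9.646 | -0.486 | -52.62 | -42.97 | -35.52 | -37.12 |
| C | BQMall | Ref. [14] | 8.077 | -0.468 | -47.08 | -51.15 | -53.26 | -57.05 |
| C | BQMall | Proposed | 6.14 | -0.395 | -62.09 | -65.89 | -65.86 | -69.1 |
| C | RaceHorses | Ref. [9] | 7.220 | -0.379 | -46.46 | -40.13 | -41.49 | -50.28 |
| C | RaceHorses | Ref. [14] | 4.422 | -0.264 | -50.52 | -59.30 | -59.81 | -63.15 |
| C | RaceHorses | Proposed | 3.228 | -0.217 | -64.44 | -71.22 | -70.17 | -72.47 |
| D | BasketballPass | Ref. [9] | 10.054 | -0.546 | -43.69 | -41.03 | -37.46 | -36.69 |
| D | BasketballPass | Ref. [14] | 8.401 | -0.457 | -60.24 | -62.89 | -64.31 | -66.67 |
| D | BasketballPass | Proposed | 4.489 | -0.264 | -74.99 | -77.29 | -77.81 | -79.36 |
| D | BlowingBubbles | Ref. [9] | 6.178 | -0.373 | -57.15 | -42.45 | -25.73 | -22.81 |
| D | BlowingBubbles | Ref. [14] | 8.328 | -0.463 | -54.62 | -60.45 | -62.55 | -65.48 |
| D | BlowingBubbles | Proposed | 5.217 | -0.315 | -61.68 | -65.97 | -62.99 | -66.43 |
| E | FourPeople | Ref. [9] | 9.077 | -0.480 | -53.52 | -40.88 | -26.12 | -24.34 |
| E | FourPeople | Ref. [14] | 8.002 | -0.439 | -54.79 | -59.79 | -64.39 | -67.17 |
| E | FourPeople | Proposed | 4.298 | -0.258 | -65.21 | -69.51 | -70.94 | -71.98 |
| E | Johnny | Ref. [9] | 12.182 | -0.474 | -58.29 | -60.21 | -63.98 | -70.70 |
| E | Johnny | Ref. [14] | 7.956 | -0.307 | -62.92 | -65.51 | -67.71 | -70.05 |
| E | Johnny | Proposed | 4.162 | -0.176 | -72.02 | -74.84 | -75.35 | -76.12 |
| Standard deviation | | Ref. [9] | 2.444 | 0.127 | 7.68 | 8.76 | 14.24 | 16.18 |
| Standard deviation | | Ref. [14] | 2.013 | 0.127 | 5.04 | 4.51 | 4.78 | 4.08 |
| Standard deviation | | Proposed | 1.048 | 0.084 | 5.16 | 4.48 | 5.20 | 4.50 |
| Best value | | Ref. [9] | 3.63 | -0.149 | -58.29 | -60.21 | -63.98 | -70.70 |
| Best value | | Ref. [14] | 3.417 | -0.135 | -62.92 | -65.51 | -68.57 | -70.16 |
| Best value | | Proposed | 2.561 | -0.113 | -75.45 | -77.96 | -79.7 | -81.12 |
| Mean | | Ref. [9] | 8.156 | -0.393 | -48.10 | -42.93 | -39.13 | -42.63 |
| Mean | | Ref. [14] | 6.354 | -0.319 | -55.33 | -59.82 | -62.57 | -65.56 |
| Mean | | Proposed | 4.063 | -0.227 | -67.41 | -71.54 | -71.62 | -73.26 |

-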
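The summary rows of the two tables can be recomputed from the per-row entries. The sketch below is a sanity check using only Python's standard library, with values copied from the proposed method's rows; the variable and function names are hypothetical. It assumes the dispersion rows are sample standard deviations (divisor n − 1), which is what reproduces the reported BD-BR figures of 0.312 (Table 1) and 1.048 (Table 2).

```python
from statistics import mean, stdev

# BD-BR (%) of the proposed method, copied from Table 1 (one value per resolution).
bd_br_table1 = [1.71, 1.63, 1.3278, 2.085]

# BD-BR (%) of the proposed method, copied from Table 2 (one value per sequence).
bd_br_table2 = [3.679, 3.225, 3.634, 2.561, 6.14,
                3.228, 4.489, 5.217, 4.298, 4.162]

# Mean Delta T (%) of the proposed method in Table 2 at QP = 22, 27, 32, 37.
mean_delta_t_table2 = [-67.41, -71.54, -71.62, -73.26]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of table entries."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    for name, values in [("Table 1 BD-BR", bd_br_table1),
                         ("Table 2 BD-BR", bd_br_table2)]:
        m, s = summarize(values)
        print(f"{name}: mean = {m:.3f}, sample std = {s:.3f}")
    # Averaging the four per-QP means gives the overall time change.
    print(f"Overall mean Delta T = {mean(mean_delta_t_table2):.4f} %")
```

Averaging the four per-QP means in Table 2 gives roughly -70.96%, which agrees with the 70.96% average complexity reduction quoted in the abstract.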
[1] WANG Li, CAO Yifan, DU Gaoming, et al. A low-latency depth modelling mode-1 encoder in 3D-high efficiency video coding standard[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1625–1632. doi: 10.11999/JEIT180798
[2] WIEGAND T, SULLIVAN G J, BJONTEGAARD G, et al. Overview of the H.264/AVC video coding standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2003, 13(7): 560–576. doi: 10.1109/tcsvt.2003.815165
[3] KIM I K, MIN J, LEE T, et al. Block partitioning structure in the HEVC standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1697–1706. doi: 10.1109/TCSVT.2012.2223011
[4] JCT-VC. HM Software[EB/OL]. https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/, 2014.
[5] CORREA G, ASSUNCAO P, AGOSTINI L, et al. Performance and computational complexity assessment of high-efficiency video encoders[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1899–1909. doi: 10.1109/TCSVT.2012.2223411
[6] QI Meibin, CHEN Xiuli, YANG Yanfang, et al. Fast coding unit splitting algorithm for high efficiency video coding intra prediction[J]. Journal of Electronics & Information Technology, 2014, 36(7): 1699–1705. doi: 10.3724/SP.J.1146.2013.01148
[7] TANG Jin and PENG Yong. Fast coding unit partition algorithm for HEVC based on temporal-spatial correlation and texture property[J]. Computer and Digital Engineering, 2019, 47(7): 1753–1756, 1782. doi: 10.3969/j.issn.1672-9722.2019.07.038
[8] BOUAAFIA S, KHEMIRI R, SAYADI F E, et al. Fast CU partition-based machine learning approach for reducing HEVC complexity[J]. Journal of Real-Time Image Processing, 2020, 17(1): 185–196. doi: 10.1007/s11554-019-00936-0
[9] LIU Deyuan, LIU Xingang, and LI Yayong. Fast CU size decisions for HEVC intra frame coding based on support vector machines[C]. 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress, Auckland, New Zealand, 2016: 594–597.
[10] LIU Xingang, LI Yayong, LIU Deyuan, et al. An adaptive CU size decision algorithm for HEVC intra prediction based on complexity classification using machine learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(1): 144–155. doi: 10.1109/TCSVT.2017.2777903
[11] FENG Zeqi, LIU Pengyu, JIA Kebin, et al. Fast intra CTU depth decision for HEVC[J]. IEEE Access, 2018, 6: 45262–45269. doi: 10.1109/ACCESS.2018.2864881
[12] LIU Zhenyu, YU Xianyu, CHEN Shaolin, et al. CNN oriented fast HEVC intra CU mode decision[C]. 2016 IEEE International Symposium on Circuits and Systems, Montreal, Canada, 2016: 2270–2273.
[13] LI Xin and GONG Na. Run-time deep learning enhanced fast coding unit decision for high efficiency video coding[J]. Journal of Circuits, Systems and Computers, 2020, 29(3): 2050046. doi: 10.1142/S0218126620500462
[14] LIU Zhenyu, YU Xianyu, GAO Yuan, et al. CU partition mode decision for HEVC hardwired intra encoder using convolution neural network[J]. IEEE Transactions on Image Processing, 2016, 25(11): 5088–5103. doi: 10.1109/tip.2016.2601264
[15] HU Jie, SHEN Li, SUN Gang, et al. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
[16] XU Mai, DENG Xin, LI Shengxi, et al. Region-of-interest based conversational HEVC coding with hierarchical perception model of face[J]. IEEE Journal of Selected Topics in Signal Processing, 2014, 8(3): 475–489. doi: 10.1109/jstsp.2014.2314864
[17] XU Mai, LI Tianyi, WANG Zulin, et al. Reducing complexity of HEVC: A deep learning approach[J]. IEEE Transactions on Image Processing, 2018, 27(10): 5044–5059. doi: 10.1109/TIP.2018.2847035
[18] OHM J R, SULLIVAN G J, SCHWARZ H, et al. Comparison of the coding efficiency of video coding standards—including high efficiency video coding (HEVC)[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1669–1684. doi: 10.1109/tcsvt.2012.2221192
[19] BJONTEGAARD G. Calculation of average PSNR differences between RD-curves[C]. The 13th Video Coding Experts Group Meeting, Austin, USA, 2001: VCEG-M33.
[20] KINGMA D P and BA J. Adam: A method for stochastic optimization[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015: 1–15.