Fast Prediction Algorithm in High Efficiency Video Coding Intra-mode Based on Deep Feature Learning

JIA Kebin; CUI Tenghe; LIU Pengyu; LIU Chang

doi:10.11999/JEIT200414

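1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
2. Beijing Laboratory of Advanced Information Networks, Beijing 100124, China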

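Funds: National Natural Science Foundation of China (61672064), National Key Research and Development Program of China (2018YFF01010100), Basic Research Program of Qinghai Province (2020-ZJ-709)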

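Author biographies:
JIA Kebin: male, born in 1962, professor. Research interests: multimedia information systems and pattern recognition.

CUI Tenghe: male, born in 1996, master's student. Research interest: video coding.

LIU Pengyu: female, born in 1979, associate professor. Research interest: multimedia information systems.

LIU Chang: female, born in 1994, Ph.D. candidate. Research interest: 3D video coding.

Corresponding author:
JIA Kebin　kebinj@bjut.edu.cn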

CLC number: TN919.81
Publication history
- Received: 2020-05-26
- Revised: 2020-12-15
- Published online: 2021-01-05
- Issue date: 2021-07-10

Abstract

Abstract: Compared with the H.264/AVC standard, High Efficiency Video Coding (HEVC) improves compression efficiency, but the quad-tree partition structure it introduces for coding units greatly increases encoding complexity. To address this, a Multi-Layer Feature Transfer Convolutional Neural Network (MLFT-CNN) is proposed to predict the Coding Unit (CU) partition representation vector in HEVC intra coding, which greatly reduces encoding complexity. First, a reduced-resolution feature extraction module that incorporates CU partition structure information is proposed. Second, the channel attention mechanism is improved to strengthen the texture expressiveness of the features. Third, a feature transfer mechanism is designed so that the partition features of high-depth CUs guide the partition of low-depth CUs. Finally, an objective loss function based on a segmented feature representation is established to train the end-to-end network that predicts the CU partition representation vector. Experimental results show that the proposed algorithm effectively reduces HEVC encoding complexity without affecting video coding quality: compared with the standard method, encoding complexity on the standard test sequences is reduced by 70.96% on average.
Keywords:
- High Efficiency Video Coding (HEVC)
- Complexity reduction
- Deep learning
- Intra coding
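
The abstract does not say how the channel attention mechanism is modified. For orientation only, the baseline squeeze-and-excitation form from Ref. [15] can be sketched as below. This is not the authors' improved variant; the function name, the nested-list layout, and the toy weights are all assumptions made for illustration.

```python
import math

def channel_attention(feature_maps, w1, w2):
    """Baseline squeeze-and-excitation over C channels (illustrative only).

    feature_maps: list of C channels, each a list of rows of floats.
    w1: (C/r) x C weight matrix, w2: C x (C/r) weight matrix (hypothetical values).
    Returns the rescaled feature maps and the per-channel scale factors.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    scale = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Rescale: multiply every value of a channel by that channel's gate.
    out = [[[s * v for v in row] for row in ch] for ch, s in zip(feature_maps, scale)]
    return out, scale

# Toy input: two 2x2 channels and a reduction ratio of 2 (one hidden unit).
maps = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
rescaled, gates = channel_attention(maps, w1=[[0.5, 0.5]], w2=[[0.0], [1.0]])
print(gates)
```

The gate for each channel lies in (0, 1), so channels are attenuated according to a globally pooled summary instead of being treated equally.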

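Figure 1 Schematic of the CTU partition structure

Figure 2 Calculation and comparison of rate-distortion cost between a parent CU and its sub-CUs

Figure 3 Positions corresponding to the CU partition representation vector

Figure 4 Overall network model

Figure 5 Coding performance comparison on the standard test sequences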
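
The parent/sub-CU comparison named in the captions of Figures 1 and 2 can be sketched as a recursive search. This is a minimal illustration and not the HM implementation: it assumes a 64×64 CTU, an 8×8 minimum CU, and a hypothetical `toy_rd_cost` function in place of the encoder's real rate-distortion cost. It also ignores early termination and the bits spent on split flags.

```python
from typing import Callable, List, Tuple

Flag = Tuple[int, int, int, bool]  # (x, y, CU size, split decision)

def exhaustive_cu_search(x: int, y: int, size: int,
                         rd_cost: Callable[[int, int, int], float],
                         flags: List[Flag], min_size: int = 8) -> float:
    """Return the best RD cost of the CU at (x, y) and record split decisions."""
    cost_unsplit = rd_cost(x, y, size)      # cost of coding the parent CU whole
    if size == min_size:                    # smallest CU: nothing left to split
        return cost_unsplit
    half = size // 2
    cost_split = sum(                       # summed best cost of the four sub-CUs
        exhaustive_cu_search(x + dx, y + dy, half, rd_cost, flags, min_size)
        for dy in (0, half) for dx in (0, half)
    )
    split = cost_split < cost_unsplit       # split only if the sub-CUs are cheaper
    flags.append((x, y, size, split))
    return cost_split if split else cost_unsplit

calls = 0

def toy_rd_cost(x: int, y: int, size: int) -> float:
    """Hypothetical stand-in for the encoder's RD cost; counts evaluations."""
    global calls
    calls += 1
    return float(size * size + 100)

flags: List[Flag] = []
best = exhaustive_cu_search(0, 0, 64, toy_rd_cost, flags)
print(calls, len(flags), best)
```

Under these assumptions the search evaluates 1 + 4 + 16 + 64 = 85 CUs and makes 1 + 4 + 16 = 21 split decisions per CTU. Predicting those decisions directly is what lets a learned model skip most of the cost evaluations. The exact layout of the representation vector in Figure 3 is not given in this text.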

Table 1 Results on image test sequences

Dataset	Resolution	Method	BD-BR (%)	BD-PSNR (dB)	$\Delta T$ (%), QP=22	$\Delta T$ (%), QP=27	$\Delta T$ (%), QP=32	$\Delta T$ (%), QP=37
CPH-Intra	768×512	Ref. [9]	5.113	-0.343	-59.43	-54.70	-48.74	-44.83
		Ref. [14]	2.885	-0.210	-54.97	-58.78	-61.78	-64.41
		Proposed	1.71	-0.116	-65.18	-72.01	-72.07	-74.83
	1536×1024	Ref. [9]	6.002	-0.374	-58.94	-54.85	-50.57	-50.95
		Ref. [14]	3.134	-0.208	-55.84	-59.46	-62.43	-64.17
		Proposed	1.63	-0.113	-66.98	-72.21	-71.10	-74.18
	2880×1920	Ref. [9]	4.035	-0.207	-57.03	-52.79	-52.31	-59.51
		Ref. [14]	2.130	-0.115	-59.95	-63.14	-68.07	-69.46
		Proposed	1.3278	-0.075	-70.47	-74.67	-75.82	-77.59
	4928×3264	Ref. [9]	4.630	-0.209	-58.02	-62.74	-65.30	-67.46
		Ref. [14]	1.863	-0.086	-61.43	-65.27	-68.70	-71.00
		Proposed	2.085	-0.080	-71.42	-74.89	-78.11	-79.63
Standard deviation		Ref. [9]	0.831	0.088	1.06	4.41	7.52	9.89
		Ref. [14]	0.604	0.064	3.13	3.07	3.64	3.49
		Proposed	0.312	0.093	2.93	1.55	3.27	2.53
Best		Ref. [9]	4.035	-0.207	-59.43	-62.74	-65.30	-67.46
		Ref. [14]	1.863	-0.086	-61.43	-65.27	-68.70	-71.00
		Proposed	1.328	-0.075	-71.42	-74.89	-78.11	-79.63
Average		Ref. [9]	4.945	-0.284	-58.36	-56.27	-54.23	-55.69
		Ref. [14]	2.353	-0.155	-58.05	-61.66	-65.25	-67.26
		Proposed	1.688	-0.096	-68.51	-73.45	-74.28	-76.56
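
The summary rows of Table 1 can be cross-checked from the per-resolution rows. The sketch below recomputes the average and the spread of the proposed method's BD-BR column with Python's standard `statistics` module. The variable names are illustrative, and the comparison with the table relies only on the four values printed there.

```python
from statistics import mean, pstdev, stdev

# BD-BR (%) of the proposed method at the four resolutions listed in Table 1.
bd_br_proposed = [1.71, 1.63, 1.3278, 2.085]

avg = mean(bd_br_proposed)              # arithmetic mean over the four resolutions
sample_sd = stdev(bd_br_proposed)       # sample standard deviation (divides by n - 1)
population_sd = pstdev(bd_br_proposed)  # population standard deviation (divides by n)

print(f"mean={avg:.3f} sample_sd={sample_sd:.3f} population_sd={population_sd:.3f}")
```

The mean agrees with the table's average of 1.688. Of the two spreads, the sample standard deviation is the one that agrees with the table's 0.312, which suggests the table uses the n - 1 form.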

Table 2 Results on HEVC standard test sequences

Class	Sequence	Method	BD-BR (%)	BD-PSNR (dB)	$\Delta T$ (%), QP=22	$\Delta T$ (%), QP=27	$\Delta T$ (%), QP=32	$\Delta T$ (%), QP=37
A	PeopleOnStreet	Ref. [9]	9.627	-0.492	-52.12	-50.63	-37.79	-34.81
		Ref. [14]	3.969	-0.209	-50.79	-53.87	-56.58	-61.15
		Proposed	3.679	-0.216	-63.91	-67.38	-68.78	-70.86
	Traffic	Ref. [9]	6.411	-0.304	-37.11	-25.36	-19.63	-33.38
		Ref. [14]	4.945	-0.240	-53.86	-59.08	-63.54	-66.88
		Proposed	3.225	-0.178	-75.45	-77.96	-79.7	-81.12
B	Cactus	Ref. [9]	7.533	-0.248	-38.37	-40.83	-43.61	-51.23
		Ref. [14]	6.021	-0.208	-58.18	-61.01	-64.94	-67.78
		Proposed	3.634	-0.141	-69.24	-74.67	-74.12	-73.69
	ParkScene	Ref. [9]	3.630	-0.149	-41.69	-44.79	-59.98	-64.92
		Ref. [14]	3.417	-0.135	-60.27	-65.10	-68.57	-70.16
		Proposed	2.561	-0.113	-65.03	-70.62	-70.45	-71.46
C	BQMall	Ref. [9]	9.646	-0.486	-52.62	-42.97	-35.52	-37.12
		Ref. [14]	8.077	-0.468	-47.08	-51.15	-53.26	-57.05
		Proposed	6.14	-0.395	-62.09	-65.89	-65.86	-69.1
	RaceHorses	Ref. [9]	7.220	-0.379	-46.46	-40.13	-41.49	-50.28
		Ref. [14]	4.422	-0.264	-50.52	-59.30	-59.81	-63.15
		Proposed	3.228	-0.217	-64.44	-71.22	-70.17	-72.47
D	BasketballPass	Ref. [9]	10.054	-0.546	-43.69	-41.03	-37.46	-36.69
		Ref. [14]	8.401	-0.457	-60.24	-62.89	-64.31	-66.67
		Proposed	4.489	-0.264	-74.99	-77.29	-77.81	-79.36
	BlowingBubbles	Ref. [9]	6.178	-0.373	-57.15	-42.45	-25.73	-22.81
		Ref. [14]	8.328	-0.463	-54.62	-60.45	-62.55	-65.48
		Proposed	5.217	-0.315	-61.68	-65.97	-62.99	-66.43
E	FourPeople	Ref. [9]	9.077	-0.480	-53.52	-40.88	-26.12	-24.34
		Ref. [14]	8.002	-0.439	-54.79	-59.79	-64.39	-67.17
		Proposed	4.298	-0.258	-65.21	-69.51	-70.94	-71.98
	Johnny	Ref. [9]	12.182	-0.474	-58.29	-60.21	-63.98	-70.70
		Ref. [14]	7.956	-0.307	-62.92	-65.51	-67.71	-70.05
		Proposed	4.162	-0.176	-72.02	-74.84	-75.35	-76.12
Standard deviation		Ref. [9]	2.444	0.127	7.68	8.76	14.24	16.18
		Ref. [14]	2.013	0.127	5.04	4.51	4.78	4.08
		Proposed	1.048	0.084	5.16	4.48	5.20	4.50
Best		Ref. [9]	3.63	-0.149	-58.29	-60.21	-63.98	-70.70
		Ref. [14]	3.417	-0.135	-62.92	-65.51	-68.57	-70.16
		Proposed	2.561	-0.113	-75.45	-77.96	-79.7	-81.12
Average		Ref. [9]	8.156	-0.393	-48.10	-42.93	-39.13	-42.63
		Ref. [14]	6.354	-0.319	-55.33	-59.82	-62.57	-65.56
		Proposed	4.063	-0.227	-67.41	-71.54	-71.62	-73.26
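
The 70.96% average complexity reduction quoted in the abstract can be traced back to the last row of Table 2. The sketch below averages the proposed method's mean $\Delta T$ over the four QP settings. It assumes the usual convention that $\Delta T$ is the relative change in encoding time against the reference encoder, so a negative value is a saving. The dictionary and function names are illustrative.

```python
from statistics import mean

# Average Delta-T (%) of the proposed method per QP: last row of each table.
delta_t_table1 = {22: -68.51, 27: -73.45, 32: -74.28, 37: -76.56}  # image test set
delta_t_table2 = {22: -67.41, 27: -71.54, 32: -71.62, 37: -73.26}  # HEVC standard sequences

def overall_reduction(delta_t_by_qp):
    """Mean complexity reduction (%) across QPs, reported as a positive saving."""
    return -mean(delta_t_by_qp.values())

print(f"Table 1: {overall_reduction(delta_t_table1):.2f}%")
print(f"Table 2: {overall_reduction(delta_t_table2):.2f}%")
```

For Table 2 the four values sum to 283.83, giving 70.9575, which rounds to the 70.96% stated in the abstract. The same calculation on Table 1 gives about 73.20% for the image test set.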

References (20)

[1]	WANG Li, CAO Yifan, DU Gaoming, et al. A low-latency depth modelling mode-1 encoder in 3D-high efficiency video coding standard[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1625–1632. doi: 10.11999/JEIT180798
[2]	WIEGAND T, SULLIVAN G J, BJONTEGAARD G, et al. Overview of the H.264/AVC video coding standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2003, 13(7): 560–576. doi: 10.1109/tcsvt.2003.815165
[3]	KIM I K, MIN J, LEE T, et al. Block partitioning structure in the HEVC standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1697–1706. doi: 10.1109/TCSVT.2012.2223011
[4]	JCT-VC. HM Software[EB/OL]. https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.5/, 2014.
[5]	CORREA G, ASSUNCAO P, AGOSTINI L, et al. Performance and computational complexity assessment of high-efficiency video encoders[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1899–1909. doi: 10.1109/TCSVT.2012.2223411
[6]	QI Meibin, CHEN Xiuli, YANG Yanfang, et al. Fast coding unit splitting algorithm for high efficiency video coding intra prediction[J]. Journal of Electronics & Information Technology, 2014, 36(7): 1699–1705. doi: 10.3724/SP.J.1146.2013.01148
[7]	TANG Jin and PENG Yong. Fast coding unit partition algorithm for HEVC based on temporal-spatial correlation and texture property[J]. Computer and Digital Engineering, 2019, 47(7): 1753–1756, 1782. doi: 10.3969/j.issn.1672-9722.2019.07.038
[8]	BOUAAFIA S, KHEMIRI R, SAYADI F E, et al. Fast CU partition-based machine learning approach for reducing HEVC complexity[J]. Journal of Real-Time Image Processing, 2020, 17(1): 185–196. doi: 10.1007/s11554-019-00936-0
[9]	LIU Deyuan, LIU Xingang, and LI Yayong. Fast CU size decisions for HEVC intra frame coding based on support vector machines[C]. 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress, Auckland, New Zealand, 2016: 594–597.
[10]	LIU Xingang, LI Yayong, LIU Deyuan, et al. An adaptive CU size decision algorithm for HEVC intra prediction based on complexity classification using machine learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(1): 144–155. doi: 10.1109/TCSVT.2017.2777903
[11]	FENG Zeqi, LIU Pengyu, JIA Kebin, et al. Fast intra CTU depth decision for HEVC[J]. IEEE Access, 2018, 6: 45262–45269. doi: 10.1109/ACCESS.2018.2864881
[12]	LIU Zhenyu, YU Xianyu, CHEN Shaolin, et al. CNN oriented fast HEVC intra CU mode decision[C]. 2016 IEEE International Symposium on Circuits and Systems, Montreal, Canada, 2016: 2270–2273.
[13]	LI Xin and GONG Na. Run-time deep learning enhanced fast coding unit decision for high efficiency video coding[J]. Journal of Circuits, Systems and Computers, 2020, 29(3): 2050046. doi: 10.1142/S0218126620500462
[14]	LIU Zhenyu, YU Xianyu, GAO Yuan, et al. CU partition mode decision for HEVC hardwired intra encoder using convolution neural network[J]. IEEE Transactions on Image Processing, 2016, 25(11): 5088–5103. doi: 10.1109/tip.2016.2601264
[15]	HU Jie, SHEN Li, SUN Gang, et al. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
[16]	XU Mai, DENG Xin, LI Shengxi, et al. Region-of-interest based conversational HEVC coding with hierarchical perception model of face[J]. IEEE Journal of Selected Topics in Signal Processing, 2014, 8(3): 475–489. doi: 10.1109/jstsp.2014.2314864
[17]	XU Mai, LI Tianyi, WANG Zulin, et al. Reducing complexity of HEVC: A deep learning approach[J]. IEEE Transactions on Image Processing, 2018, 27(10): 5044–5059. doi: 10.1109/TIP.2018.2847035
[18]	OHM J R, SULLIVAN G J, SCHWARZ H, et al. Comparison of the coding efficiency of video coding standards—including high efficiency video coding (HEVC)[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1669–1684. doi: 10.1109/tcsvt.2012.2221192
[19]	BJONTEGAARD G. Calculation of average PSNR differences between RD-curves[C]. The 13th Video Coding Experts Group Meeting, Austin, USA, 2001: VCEG-M33.
[20]	KINGMA D P and BA J. Adam: A method for stochastic optimization[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015: 1–15.