A Multi-scale-multi-input Complementation Classification Network for Fast Coding Tree Unit Partition
-
Abstract: Deep Neural Networks (DNNs) have been widely applied to Coding Tree Unit (CTU) depth partition in intra-mode High Efficiency Video Coding (HEVC), where they significantly reduce encoding complexity. However, existing DNN-based CTU partition methods neglect the correlation of features between Coding Units (CUs) at different scales and suffer from the accumulation of classification errors. To address these shortcomings, this paper proposes a Multi-scale-multi-input Complementation Classification Network (MCCN) for faster and more accurate intra-mode CTU depth partition. First, a Multi-scale Multi-input Convolutional Neural Network (MMCNN) is proposed, which establishes the correlation between CUs at different scales by fusing multi-scale CU features and thereby strengthens the representation ability of the network. Second, a Complementary Classification Strategy (CCS) is proposed, in which the final depth of each CU in a CTU is determined by combining the results of multi-class classification with those of binary classification and three-class classification through a voting mechanism. The CCS avoids the error accumulation present in existing methods and yields a more accurate CTU partition. Extensive experiments show that MCCN reduces HEVC encoding complexity further while partitioning CTUs more accurately: it lowers the average encoding complexity by 71.49% at the cost of only a 3.18% increase in average Bjøntegaard Delta Bit-Rate (BD-BR). In addition, the depth prediction accuracy for 32×32 CUs and 16×16 CUs improves by 0.65%~0.93% and 2.14%~9.27%, respectively.
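The voting step of the CCS can be illustrated with a minimal sketch. The abstract does not specify how each classifier's output is mapped to a candidate depth or how ties are resolved, so the function name `vote_depth`, the list of candidate depths, and the `fallback` tie-break rule below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Sequence


def vote_depth(candidates: Sequence[int], fallback: int) -> int:
    """Return the depth most classifiers agree on (hypothetical helper).

    `candidates` holds one candidate depth per classifier for a single CU.
    `fallback` is an assumed tie-break value; the abstract does not state
    how ties are handled.
    """
    if not candidates:
        raise ValueError("at least one candidate depth is required")
    counts = Counter(candidates).most_common()
    best_depth, best_votes = counts[0]
    # A tie exists when the runner-up has as many votes as the leader.
    if len(counts) > 1 and counts[1][1] == best_votes:
        return fallback
    return best_depth


# Assumed example: three classifiers propose depths 2, 2 and 1 for one CU.
final_depth = vote_depth([2, 2, 1], fallback=2)
```

Because the final depth comes from agreement among several classifiers rather than from a chain of sequential decisions, one misclassification need not propagate to the final result, which is the intuition behind avoiding error accumulation.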
-
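Table 1 Ablation study on the effectiveness of MMCNN (%)
Model            Average BD-BR   Average ΔT
MCCN             3.18            71.49
MCCN-NoQP        9.80            58.60
MCCN-OneScale    11.50           64.30

Table 2 Ablation study: average depth partition accuracy for CUs of different sizes (%)
CU size          MCCN            MCCN-SBCS
64×64 CU         90.30           88.05
32×32 CU         87.55           86.51
16×16 CU         89.69           85.71

Table 3 Average BD-BR and average ΔT of MCCN and MCCN-SBCS (%)
Model            Average BD-BR   Average ΔT
MCCN             3.18            71.49
MCCN-SBCS        7.86            67.00

Table 4 Comparison of average accuracy, with the best performance marked in bold (%)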