Citation: JIANG Wei, MA Wei, LU Jinghui, ZHANG Yue, ZHANG Yundong. A Cross-Precision Motion Compensation Technique for Security Surveillance Video Coding[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251301

A Cross-Precision Motion Compensation Technique for Security Surveillance Video Coding

doi: 10.11999/JEIT251301 cstr: 32379.14.JEIT251301
Funds:  Item1, Item2, Item3
  • Accepted Date: 2026-03-27
  • Rev Recd Date: 2026-03-27
  • Available Online: 2026-04-21
Objective  In modern security surveillance, high-altitude dome cameras are often deployed at critical locations such as bridges and tower tops that are exposed to external disturbance, so the captured video suffers from jitter and blur, which poses serious challenges for video coding. In video compression, high-precision motion compensation is key to coding efficiency. The existing Ultimate Motion Vector Expression (UMVE) technique offers insufficient precision and lacks flexible adaptive adjustment. High-precision coding tools such as the Registration-Based Coding Mode (RCM) and Affine Motion Compensation Prediction (AFFINE) improve compensation accuracy, but their high computational complexity and hardware cost make it difficult to meet the combined requirements of coding efficiency, power consumption and real-time performance in high-altitude surveillance. Designing an optimized UMVE scheme that combines high-precision motion compensation, low computational complexity and scene adaptability is therefore of both academic and practical value.  Methods  This study proposes an Ultimate Motion Vector Expression technique supporting Cross-Precision Motion Compensation (UMVE_CPMC). Its core is an extended Up-Precision Motion Vector (UPMV), expressed as UPMV = BaseMV + MMV(p, angle), where BaseMV is the base motion vector obtained by the existing UMVE method and MMV is a fine-tuning motion vector determined by a specific precision p and direction angle; incremental candidates are provided only at the 1/8-pel precision level to balance computational complexity against compression efficiency.
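The UPMV construction above can be sketched in a few lines. This is an illustrative model, not the paper's implementation: motion vectors are assumed to be stored in 1/16-pel integer units (a common internal precision in modern codecs), and the function and constant names are invented for this example.

```python
import math
from fractions import Fraction

MV_UNIT = 16  # assumed internal storage: 16 units == 1 full pel

def mmv(precision_frac, angle_deg):
    """Fine-tuning offset MMV(p, angle) at the given pel precision and direction."""
    step = precision_frac * MV_UNIT               # step length in 1/16-pel units
    dx = round(step * math.cos(math.radians(angle_deg)))
    dy = round(step * math.sin(math.radians(angle_deg)))
    return (dx, dy)

def upmv(base_mv, precision_frac, angle_deg):
    """UPMV = BaseMV + MMV(p, angle), as in the paper's expression."""
    ox, oy = mmv(precision_frac, angle_deg)
    return (base_mv[0] + ox, base_mv[1] + oy)

# Incremental candidates only at the 1/8-pel level, matching the design choice
# described above (four axis-aligned directions shown for illustration).
candidates = [upmv((32, -48), Fraction(1, 8), a) for a in (0, 90, 180, 270)]
```

With a 1/8-pel step this adds an offset of 2 internal units per direction, so a base vector (32, -48) yields candidates such as (34, -48) and (32, -46).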
For adaptive step-size adjustment, an improved scheme with six modes is proposed, covering enhanced UMVE, conventional UMVE and four precision-improved modes, allowing the encoder to switch flexibly according to scene characteristics. The average image gradient is adopted as an objective evaluation index; test scenes are divided into Class A (high-definition motion scenes) and Class B (low-definition scenes), and different coding configurations, sequences and parameters are used to compare coding gain and computational efficiency across modes.  Results and Discussions  Experiments show that UMVE_CPMC delivers consistent improvements across scenes and modes. In Class A high-definition motion scenes, with the adaptive strategy and RCM both disabled, the average gains of the Y, U and V components in Fusion Mode 1 reach -2.912%, -1.656% and -1.654% respectively, with average coding time reduced to 94.55% of the baseline; the average Y-component gain in Independent Mode 1 reaches -2.925%, with coding time reduced to 91.91% of the baseline. Compared with conventional UMVE, when CPMC Independent Mode 1 is enabled with RCM and the other tools active, the gain improves from -0.276% to -1.310%, a markedly better cost-performance trade-off. In Class B low-definition scenes, enabling adaptive adjustment reduces the average gain losses of Fusion Mode 1 and Mode 0 to 0.071% and 0.108% respectively, preserving the original coding gain. In multi-scene tests with RCM and AFFINE disabled, 9 of 10 test sequences in adaptive Fusion Mode 1 show positive gains, including Y-component gains of -10.691% for the yuxuedaolu sequence and -11.400% for the BQTerrace sequence.
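The average image gradient used to separate Class A from Class B scenes can be computed as in the following sketch. The exact definition in the paper is not given here, so this assumes a common formulation: the mean magnitude of horizontal and vertical first differences over a row-major luma plane.

```python
# Illustrative average-image-gradient metric (assumed formulation, not the
# paper's exact definition): mean of |horizontal diff| + |vertical diff|
# over the interior pixels of a row-major luma plane.

def average_gradient(luma, width, height):
    """Mean first-difference magnitude over a width x height luma plane."""
    total = 0
    for y in range(height - 1):
        row = y * width
        for x in range(width - 1):
            p = luma[row + x]
            gx = abs(luma[row + x + 1] - p)       # horizontal difference
            gy = abs(luma[row + width + x] - p)   # vertical difference
            total += gx + gy
    return total / ((width - 1) * (height - 1))

# A flat (low-definition) patch scores ~0; a checkerboard (sharp) patch
# scores high, matching the Class B / Class A split described above.
flat  = [128] * 16                                          # 4x4 constant block
sharp = [255 if (x + y) % 2 == 0 else 0
         for y in range(4) for x in range(4)]               # 4x4 checkerboard
```

A flat block yields a gradient of 0.0 and the checkerboard yields 510.0, illustrating the large separation between soft and sharp content that the classification relies on.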
When all existing coding tools are enabled, the Y-component gains of the dianjing, yuxuedaolu and BQTerrace sequences reach -1.29%, -2.05% and -1.21% respectively, with coding time reduced to 94%–96% of the baseline. Correlation analysis between average image gradient and gain reveals a significant positive relationship: images with high average gradient (high definition) benefit most from UMVE_CPMC, while those with low average gradient (low definition) hardly benefit. The underlying reason is that pixel values in low-definition images vary gently, so high-precision interpolation fails to generate useful new pixel values and the compensation effect is negligible. Performance differences among modes match their computational complexity: the fusion mode balances gain and stability, while the independent mode further reduces computation; the six adaptive step-size modes cover the real-time and precision requirements of different scenes.  Conclusions  By integrating cross-precision motion compensation with the UMVE algorithm, the proposed UMVE_CPMC technique resolves the core problems of insufficient precision in traditional UMVE and the high computational complexity of high-precision coding tools, achieving a favorable balance among coding efficiency, computational complexity and scene adaptability. It delivers substantial coding gains in Class A high-definition motion scenes, exceeding 10% for some sequences when no other high-precision compensation tools are active and 1%–2% when cooperating with them. In Class B low-definition scenes, the original coding gain is maintained through frame-level adaptive adjustment interfaces. The fusion mode adds no hardware complexity, and the independent mode significantly reduces coding time, making it suitable for encoder designs with limited resources or simplified requirements.
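The frame-level adaptive behaviour implied by this gradient–gain correlation can be sketched as a simple per-frame switch. The mode names, the six-mode list, and the threshold value are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch of a frame-level adaptive switch: sharp frames (high average
# gradient) get a cross-precision CPMC mode, soft frames fall back to
# conventional UMVE so Class B sequences keep their original gain.
# All names and the threshold are illustrative, not the paper's values.

MODES = ["enhanced_umve", "conventional_umve",
         "cpmc_prec_1", "cpmc_prec_2", "cpmc_prec_3", "cpmc_prec_4"]

def select_mode(avg_gradient, threshold=20.0):
    """Pick a step-size mode for a frame from its average image gradient."""
    if avg_gradient >= threshold:
        return "cpmc_prec_1"       # sharp frame: enable cross-precision candidates
    return "conventional_umve"     # soft frame: keep baseline behaviour
```

In an encoder this decision would be made once per frame before mode search, so the low-definition fallback costs essentially nothing at runtime.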
UMVE_CPMC provides an effective new approach to the low coding efficiency caused by jitter and blur in high-altitude dome-camera video, enriches the video coding toolset, and offers practical guidance for optimizing video coding in the security surveillance field. Future work may further refine the adaptive strategy, explore integration with other advanced coding technologies, develop tailored coding schemes, and improve performance in complex scenarios.

    Figures(2)  / Tables(17)
