A Cross-Precision Motion Compensation Technique for Security Surveillance Video Coding

JIANG Wei; MA Wei; LU Jinghui; ZHANG Yue; ZHANG Yundong

doi: 10.11999/JEIT251301 cstr: 32379.14.JEIT251301

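JIANG Wei^{1, 2, 3},
MA Wei^{2, 3},
LU Jinghui^{2, 3},
ZHANG Yue^{1, 3},
ZHANG Yundong^{1, 2, 3}

1. Beihang University, Beijing 100191, China
2. Beijing Vimicro Artificial Intelligence Chip Technology Co., Ltd, Beijing 100191, China
3. National Key Laboratory of Digital Perception Chip Technology, Beijing 100191, China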

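Author biographies:
JIANG Wei: male, PhD candidate. Research interests: video coding algorithms and architecture optimization.

MA Wei: male. Research interests: video analysis algorithms and video coding algorithms.

LU Jinghui: male, engineer. Research interests: digital signal processing and audio/video coding.

ZHANG Yue: male, professor. Research interests: spintronics, spin-based compute-in-memory devices, ultra-low-power integrated circuit design, and novel computing logic systems.

ZHANG Yundong: male, professor-level senior engineer. Research interests: digital perception chip technology.

Corresponding author:
ZHANG Yundong, raymond@vimicro.com

CLC number: TN919.81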
Publication history
- Revised: 2026-03-27
- Accepted: 2026-03-27
- Published online: 2026-04-21

A Cross-Precision Motion Compensation Technique for Security Surveillance Video Coding

JIANG Wei^{1, 2, 3},
MA Wei^{2, 3},
LU Jinghui^{2, 3},
ZHANG Yue^{1, 3},
ZHANG Yundong^{1, 2, 3}

1. School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
2. Beijing Vimicro Artificial Intelligence Chip Technology Co., Ltd, Beijing 100191, China
3. National Key Laboratory of Digital Perception Chip Technology, Beijing 100191, China


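Abstract

Abstract: In modern security surveillance, high-altitude dome cameras are deployed in positions exposed to external disturbance, so the captured video suffers from jitter and blur, which seriously degrades monitoring quality and the accuracy of later analysis. In video compression, high-precision motion compensation is essential to coding efficiency. The current Ultimate Motion Vector Expression (UMVE) technique has insufficient precision and limited adaptive adjustment, while tools such as the Registration-Based Coding Mode (RCM) achieve high-precision motion compensation only at excessive computational cost. To address these problems, this study proposes a UMVE technique that supports cross-precision motion compensation (UMVE_CPMC). It combines a base motion vector (BaseMV) with a fine-tuning motion vector (MMV) to construct an extended up-precision motion vector (UPMV) that raises motion compensation precision, and it provides incremental candidates only at the 1/8 precision level, balancing computational complexity against compression efficiency. For step-size adaptation, an improved scheme with six modes is proposed, between which the encoder can switch flexibly according to the scene to suit different application requirements. Experiments show that UMVE_CPMC yields significant coding gains in Class A high-definition motion scenes: with other high-precision motion compensation tools enabled, some sequences gain 1%–2%, and without them some sequences gain more than 10%. In Class B low-definition scenes, the original gain is maintained through a frame-level adaptive adjustment interface. The technique also strikes a good balance between computational efficiency and resource usage, offering a new and effective approach to video coding for high-altitude dome cameras.
- Video coding and decoding
- Cross-precision motion compensation
- Motion vector expression
- High-altitude dome camera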
Abstract:

Objective: In modern security surveillance, high-altitude dome cameras are often deployed at exposed locations such as bridges and tower tops, where external interference causes jitter and blurring in the captured video and makes it hard to encode efficiently. High-precision motion compensation is key to coding efficiency in video compression, but the existing Ultimate Motion Vector Expression (UMVE) technique offers insufficient precision and little flexibility in adaptive adjustment. High-precision coding tools such as the Registration-Based Coding Mode (RCM) and Affine Motion Compensation Prediction (AFFINE) improve compensation accuracy, but their computational complexity and hardware cost are high, so they struggle to meet the combined requirements of coding efficiency, power consumption and real-time performance in high-altitude surveillance. This work therefore designs an optimized UMVE scheme that combines high-precision motion compensation, low computational complexity and scene adaptability, in order to improve coding efficiency while balancing resource consumption.

Methods: This study proposes an Ultimate Motion Vector Expression technique supporting Cross-Precision Motion Compensation (UMVE_CPMC). Its core idea is to improve motion compensation accuracy by constructing an extended Up-Precision Motion Vector (UPMV), defined as UPMV = BaseMV + MMV(p, angle), where BaseMV is the base motion vector obtained by the existing UMVE method and MMV is the fine-tuning motion vector determined by a precision p and an angle. Incremental candidates are provided only at the 1/8 precision level to balance computational complexity and compression efficiency. For step-size adaptive adjustment, an improved scheme with six modes is proposed, covering enhanced UMVE, conventional UMVE and four precision-improved modes, so that the encoder can switch flexibly according to scene characteristics. The average image gradient is adopted as an objective evaluation index. Test scenes are divided into Class A (high-definition motion scenes) and Class B (low-definition scenes), and different coding configurations, sequences and parameters are used to compare coding gain and computational efficiency across modes.

Results and Discussions: Experiments show that UMVE_CPMC improves performance across the tested scenes and modes. In Class A high-definition motion scenes, with the adaptive strategy and RCM both disabled, the average gains of the Y, U and V components in Fusion Mode 1 reach -2.912%, -1.656% and -1.654% respectively, and the average coding time falls to 94.55% of the baseline; in Independent Mode 1 the average Y-component gain reaches -2.925%, with coding time reduced to 91.91% of the baseline. With RCM enabled and the other tools working together, enabling CPMC Independent Mode 1 improves the average Y-component gain from -0.276% for traditional UMVE to -1.310%, which indicates considerably better cost-effectiveness. In Class B low-definition scenes, enabling adaptive adjustment significantly reduces the gain losses of Fusion Mode 1 and Fusion Mode 0, keeping the average losses at 0.071% and 0.108% respectively and thus maintaining the original coding gain. In multi-scene comprehensive tests with RCM and AFFINE disabled, 9 out of 10 test sequences show a coding gain in adaptive Fusion Mode 1, including Y-component gains of -10.691% for the yuxuedaolu sequence and -11.400% for the BQTerrace sequence. When all existing coding tools are enabled, the Y-component gains of the dianjing, yuxuedaolu and BQTerrace sequences reach -1.29%, -2.05% and -1.21% respectively, with coding time reduced to 94%–96% of the baseline. In addition, correlation analysis between average image gradient and gain reveals a significant positive correlation: images with a high average gradient (high definition) gain more from UMVE_CPMC, while those with a low average gradient (low definition) hardly benefit. Principle analysis indicates that pixel values change gently in low-definition images, so high-precision interpolation fails to generate effectively different pixel values and the compensation effect is insignificant. Performance differences among modes match their computational complexity: the fusion modes balance gain and stability, while the independent modes further reduce computation. The six step-size adaptive modes can meet the real-time and precision requirements of different scenes.

Conclusions: By integrating cross-precision motion compensation with the UMVE algorithm, the proposed UMVE_CPMC technique addresses both the insufficient precision of traditional UMVE and the high computational complexity of high-precision coding tools, achieving a favorable balance among coding efficiency, computational complexity and scene adaptability. It delivers clear coding gains in Class A high-definition motion scenes, exceeding 10% for some sequences when no other high-precision compensation tool is enabled and reaching 1%–2% when cooperating with other tools. In Class B low-definition scenes, the original coding gain can be maintained through frame-level adaptive adjustment interfaces. The fusion modes do not increase hardware complexity, and the independent modes significantly reduce coding time, which suits encoder designs with limited resources or simplified requirements. UMVE_CPMC provides a new and effective approach to the low coding efficiency caused by jitter and blurring in high-altitude dome camera video, enriches the video coding toolset, and offers practical guidance for optimizing video coding in security surveillance. Future work can further optimize the adaptive strategy, explore integration with other advanced coding technologies, develop personalized coding schemes, and improve performance in complex scenarios.
- Video Coding and Decoding
- Cross-Precision Motion Compensation
- Motion Vector Expression
- High-Altitude Dome Camera
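
The relation UPMV = BaseMV + MMV(p, angle) stated in the abstract can be illustrated with a minimal sketch. Everything beyond that formula is an assumption made for illustration: the function names, the four-entry direction table, and the choice to store BaseMV in 1/4-sample integer units and UPMV in 1/8-sample integer units are hypothetical and are not taken from the paper.

```python
# Hypothetical direction table (assumption): angle in degrees -> unit offset (dx, dy).
DIRECTIONS = {0: (1, 0), 90: (0, 1), 180: (-1, 0), 270: (0, -1)}


def mmv_eighth(step_eighths, angle):
    """Fine-tuning vector MMV(p, angle), expressed in 1/8-sample units."""
    dx, dy = DIRECTIONS[angle]
    return (dx * step_eighths, dy * step_eighths)


def upmv_eighth(base_mv_quarter, step_eighths, angle):
    """UPMV = BaseMV + MMV(p, angle).

    base_mv_quarter is assumed to be stored in 1/4-sample units, so it is
    doubled to express it in 1/8-sample units before the refinement is added.
    """
    bx, by = base_mv_quarter
    mx, my = mmv_eighth(step_eighths, angle)
    return (2 * bx + mx, 2 * by + my)


# BaseMV = (1.25, -0.5) samples, refined by one 1/8-sample step to the right.
print(upmv_eighth((5, -2), 1, 0))
```

Keeping the vectors as integers in the finer unit avoids floating-point rounding; a real codec would also have to clip the result to the legal motion vector range, which this sketch omits.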

Figure 1 Core principle of UMVE_CPMC

Figure 2 Example luma data block corresponding to a local 16×16 macroblock

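Table 1 Improved scheme for mode-based step-size adaptive adjustment

Mode	Existing scheme: step range	Existing scheme: total candidates	Improved scheme: step range	Improved scheme: total candidates
Enhanced UMVE	[1/4, 32]	64	[1/4, 32]	64
Normal UMVE	[1/4, 4]	40	[1/4, 4]	40
Precision-improved mode (Fusion Mode 1)	Not supported	Not supported	[1/8, 4]	40
Precision-improved mode (Fusion Mode 0)	Not supported	Not supported	[1/8, 2]	40
Precision-improved mode (Independent Mode 1)	Not supported	Not supported	[1/8, 1/8]	16
Precision-improved mode (Independent Mode 0)	Not supported	Not supported	[1/8, 1/8]	8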
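
Several of the candidate totals in Table 1 are consistent with a simple product of base vectors, directions and power-of-two step sizes. The sketch below checks that arithmetic. The decomposition itself (2 base motion vectors, 4 directions, steps doubling from the lower to the upper bound) is an assumption that happens to match the counts for Enhanced UMVE, Normal UMVE, Fusion Mode 0 and Independent Mode 0; the table does not state it, and the helper names are hypothetical.

```python
from fractions import Fraction


def step_sizes(lo, hi):
    """Power-of-two step sizes from lo to hi inclusive, in luma samples (assumed layout)."""
    steps = []
    s = Fraction(lo)
    while s <= hi:
        steps.append(s)
        s *= 2
    return steps


def candidate_count(lo, hi, num_base=2, num_dirs=4):
    """Total candidates = base vectors x directions x step sizes (assumption)."""
    return num_base * num_dirs * len(step_sizes(lo, hi))


rows = {
    "Enhanced UMVE": (Fraction(1, 4), 32),
    "Normal UMVE": (Fraction(1, 4), 4),
    "Fusion Mode 0": (Fraction(1, 8), 2),
    "Independent Mode 0": (Fraction(1, 8), Fraction(1, 8)),
}
for name, (lo, hi) in rows.items():
    print(name, candidate_count(lo, hi))
```

Fusion Mode 1 and Independent Mode 1 do not follow from these default assumptions (six power-of-two steps over [1/8, 4] would give 48 rather than 40), so their candidate layout must differ in a way this page does not describe; they are left out of the sketch.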

Table 2 Effectiveness of average gradient as an image sharpness metric

Sequence	Input avg. gradient	Rec. avg. gradient (q33)	Rec. avg. gradient (q40)	Rec. avg. gradient (q47)	Rec. avg. gradient (q52)	Rec. avg. gradient (mean)	CPMC gain (%)
dianjing	13.6	11.6	10.7	9.7	8.8	10.2	–1.29
yuxuedaolu	23.8	20.2	18.3	15.7	13.4	16.9	–2.05
BQTerrace	13.1	8.1	7.5	7.0	6.6	7.3	–1.21
qiaoxialuduan1	18.3	14.6	13.0	11.2	9.9	12.2	–0.55
qiaoxialuduan2	10.1	6.8	6.3	5.6	5.0	5.9	–0.49
tingchechang	10.7	7.7	7.1	6.6	6.2	6.9	–0.23
beihaihumian	7.8	4.4	3.6	2.6	1.9	3.1	0.05
MarketPlace	6.2	3.3	2.9	2.4	2.1	2.7	0.13
Cactus	10.4	5.7	5.2	4.6	4.1	4.9	–0.09
huochezhan	9.3	5.4	4.7	4.2	3.8	4.5	0.00
lijiaoqiao	8.3	5.4	4.7	4.2	3.8	4.5	0.00
DaylightRoad	8.8	3.1	2.9	2.7	2.4	2.8	0.00
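
The relationship between average gradient and CPMC gain reported in Table 2 can be checked with a plain Pearson correlation over the input-gradient and gain columns. This is only a sketch of one way to quantify it: the paper does not say which correlation measure it used, and the sketch assumes that negative gain values denote bitrate savings, so a strongly negative coefficient on the signed values corresponds to "higher gradient, larger gain".

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Columns copied from Table 2, in row order.
input_gradient = [13.6, 23.8, 13.1, 18.3, 10.1, 10.7, 7.8, 6.2, 10.4, 9.3, 8.3, 8.8]
cpmc_gain = [-1.29, -2.05, -1.21, -0.55, -0.49, -0.23, 0.05, 0.13, -0.09, 0.00, 0.00, 0.00]

r = pearson(input_gradient, cpmc_gain)
print(f"Pearson r between input gradient and signed gain: {r:.2f}")
```

With only twelve sequences the coefficient is indicative rather than conclusive, and repeating the calculation with the reconstructed-image mean gradient column is a natural variation.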

Table 3 Gain from adding CPMC to SD5.0 (Fusion Mode 1, RCM off) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–1.622	–1.571	–3.208	97.59
tingchechang	–1.823	–1.759	–2.229	93.75
dianjing	–2.502	–1.318	–1.710	93.67
yuxuedaolu	–4.910	–2.113	–0.847	92.70
BQTerrace	–3.700	–1.520	–0.275	95.05
Average	–2.912	–1.656	–1.654	94.55

Table 4 Gain from adding CPMC to SD5.0 (Fusion Mode 0, RCM off) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–1.427	–1.714	–2.705	95.17
tingchechang	–1.504	–1.358	–1.870	111.30
dianjing	–2.368	–1.327	–1.819	86.07
yuxuedaolu	–3.822	–1.735	–0.350	94.47
BQTerrace	–2.715	–2.199	–0.194	98.97
Average	–2.367	–1.667	–1.388	97.20

Table 5 Gain from adding CPMC to SD5.0 (Independent Mode 1, RCM off) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–1.778	–1.661	–3.277	88.84
tingchechang	–1.768	–2.058	–2.170	98.69
dianjing	–2.346	–1.359	–1.431	84.40
yuxuedaolu	–4.894	–2.258	–1.639	91.38
BQTerrace	–3.840	–0.645	0.007	96.26
Average	–2.925	–1.596	–1.702	91.91

Table 6 Gain from adding CPMC to SD5.0 (Independent Mode 1, RCM on) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–0.479	–0.998	–1.307	94.778
tingchechang	–0.440	–0.229	0.421	93.640
dianjing	–1.463	–0.618	–0.907	94.938
yuxuedaolu	–2.484	–1.140	–1.673	90.148
BQTerrace	–1.683	–1.065	1.613	90.759
Average	–1.310	–0.810	–0.371	92.853

Table 7 Gain from adding CPMC to SD5.0 (Independent Mode 0, RCM off) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–1.495	–1.103	–2.421	84.68
tingchechang	–1.532	–1.659	–2.288	101.56
dianjing	–2.309	–1.789	–1.610	85.84
yuxuedaolu	–3.774	–1.767	–2.438	75.24
BQTerrace	–2.806	–1.156	–1.116	111.89
Average	–2.383	–1.495	–1.975	91.84

Table 8 Gain from adding CPMC to SD5.0 (Independent Mode 0, RCM on) (%)

Sequence	Y	U	V	Enc_time
Qiaoxia1	–0.434	–0.909	–0.378	96.226
tingchechang	–0.200	0.150	–0.004	95.931
dianjing	–1.463	–0.618	–0.907	94.938
yuxuedaolu	–1.887	–1.192	–1.341	93.063
BQTerrace	–1.398	–2.266	–0.984	93.708
Average	–1.076	–0.967	–0.723	94.773

Table 9 Gain from adding UMVE to SD5.0 (RCM on) (%)

Sequence	Y	U	V
Qiaoxia1	–0.234	0.103	0.527
tingchechang	–0.639	–1.211	–0.669
dianjing	–0.315	–0.909	–0.708
yuxuedaolu	–0.073	0.024	–0.538
BQTerrace	–0.119	1.029	–0.680
Average	–0.276	–0.193	–0.414

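Table 10 CPMC gain with adaptive adjustment disabled (%)

Sequence	Non-adaptive Independent Mode 1: Y	Non-adaptive Independent Mode 1: U	Non-adaptive Independent Mode 1: V	Non-adaptive Independent Mode 0: Y	Non-adaptive Independent Mode 0: U	Non-adaptive Independent Mode 0: V
beihaihumian	0.605	0.868	0.861	0.427	0.163	0.686
market	0.938	0.131	1.146	0.795	0.264	0.462
Average	0.772	0.499	1.004	0.611	0.214	0.574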

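Table 11 CPMC gain with adaptive adjustment enabled (%)

Sequence	Adaptive Fusion Mode 1: Y	Adaptive Fusion Mode 1: U	Adaptive Fusion Mode 1: V	Adaptive Fusion Mode 0: Y	Adaptive Fusion Mode 0: U	Adaptive Fusion Mode 0: V
beihaihumian	–0.018	0.354	1.232	0.079	–0.043	0.481
market	0.160	0.705	0.438	0.136	–0.060	–0.361
Average	0.071	0.529	0.835	0.108	–0.051	0.060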

Table 12 Gain of adaptive Fusion Mode 1 with RCM and AFFINE disabled (%)

Sequence	Y	U	V
Beihaihumian	–0.422	0.718	1.017
Qiaoxia1	–2.162	–2.187	–0.884
Qiaoxia2	–0.757	–1.703	–0.543
tingchechang	–2.737	–3.025	–2.741
Cactus	–0.221	–0.304	0.087
market	0.271	–0.215	1.309
NightTraffic3	–0.654	–1.273	–0.868
dianjing	–4.131	–3.834	3.751
yuxuedaolu	–10.691	–12.265	6.169
BQTerrace	–11.400	–12.248	–11.008
Average	–3.290	–3.634	–0.371

Table 13 Multi-scene comprehensive test results (%)

Class	Sequence	V1.1 Y	V1.1 U	V1.1 V	V1.1 Enc_time	V1.2 Y	V1.2 U	V1.2 V	V1.2 Enc_time
4K	huochezhan	0.25	–0.56	–0.03	-	0.00	0.00	0.00	-
4K	lijiaoqiao	0.28	0.41	0.72	-	0.00	0.00	0.00	-
4K	DaylightRoad	–0.07	–0.38	–0.53	-	0.00	0.00	0.00	-
1080P	beihaihumian	0.01	–0.11	–0.91	-	0.05	–0.18	–0.85	100
1080P	qiaoxialuduan1	–0.54	–0.62	0.79	-	–0.55	–0.98	0.11	95
1080P	qiaoxialuduan2	–0.54	–0.46	–0.41	-	–0.49	–0.37	–0.72	95
1080P	tingchechang	–0.26	–0.47	–0.19	-	–0.23	–0.54	0.08	96
1080P	Cactus	–0.08	0.25	–0.24	-	–0.09	0.48	–0.37	98
1080P	MarketPlace	0.01	0.33	0.54	-	0.13	0.30	0.17	98
1080P	NightTraffic3	–0.21	–0.09	–0.24	-	–0.43	–0.28	–0.53	94
Supplementary	dianjing	–1.29	–1.14	2.84	93	–1.29	–1.30	–0.76	95
Supplementary	yuxuedaolu	–2.04	–0.40	9.22	94	–2.05	–0.35	–1.07	93
Supplementary	BQTerrace	–1.19	–1.01	–1.57	94	–1.21	–0.56	–1.87	95
	Average	–0.44	–0.33	0.77	94	–0.47	–0.29	–0.45	96

Listing 1 Example SATD computation routine
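
The code of Listing 1 is not reproduced on this page. As general background only, the sum of absolute transformed differences (SATD) can be sketched as follows. This is a generic 4×4 Hadamard version written for illustration, not the paper's routine; the function names are hypothetical, and no normalization factor is applied (encoders often scale the result, which is omitted here as an assumption).

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(orig, pred):
    """SATD of a 4x4 block: sum of |H * (orig - pred) * H|, unnormalized."""
    diff = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    coeffs = matmul4(matmul4(H4, diff), H4)
    return sum(abs(c) for row in coeffs for c in row)


# A uniform difference of 1 concentrates all energy in the DC coefficient.
ones = [[1] * 4 for _ in range(4)]
zeros = [[0] * 4 for _ in range(4)]
print(satd4x4(ones, zeros))
```

Compared with a plain sum of absolute differences, SATD penalizes residuals that spread across many transform coefficients, which is why it is commonly used as a cost when comparing fractional-precision motion vector candidates.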

Listing 2 Example call of com_mc_cu_UMVE_CPMC

Listing 3 Implementation logic of com_mc_cu_UMVE_CPMC

Listing 4 Implementation logic of the encode_umve_idx function
