Study on the Effect of Video Coding Parameters on Object Recognition Performance

Wu Zemin; Liu Tao; Jiang Qingzhu; Hu Lei

doi:10.11999/JEIT141613

cstr: 32379.14.JEIT141613

Funding:

Aeronautical Science Foundation (18265)

Publication history
- Received: 2014-12-18
- Revised: 2015-01-22
- Published: 2015-08-19

Video Coding Parameters Effect on Object Recognition

Abstract: Image object classification and recognition and video coding and transmission have each been studied extensively, but no published work quantifies how video coding parameters affect object recognition performance. To address this gap, this paper takes the Deformable Part Model (DPM), a typical object recognition algorithm, and H.264/AVC, the most widely used video coding method, as test subjects. Encoding and detection experiments are designed to study how bit rate and resolution affect video object recognition performance, and a function describing how recognition performance varies with bit rate and resolution is fitted. Selecting an appropriate bit rate and resolution for the encoder yields a trade-off between channel bandwidth and video object recognition performance, which provides a basis for designing coding optimization objective functions for different video applications.
- Computer vision /
- Object recognition /
- Video coding /
- Bit rate /
- Resolution
