Research on Faster RCNN Object Detection Based on Hard Example Mining

Ye ZHANG, Ting XU, Dingzhong FENG, Meixian JIANG, Guanghua WU

Citation: Ye ZHANG, Ting XU, Dingzhong FENG, Meixian JIANG, Guanghua WU. Research on Faster RCNN Object Detection Based on Hard Example Mining[J]. Journal of Electronics & Information Technology, 2019, 41(6): 1496-1502. doi: 10.11999/JEIT180702

doi: 10.11999/JEIT180702
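Funds: The National Natural Science Foundation of China (51605442), Public Welfare Project of the Science Technology Department of Zhejiang Province (LGN18G010002)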
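Details
    About the authors:

    Ye ZHANG: male, born in 1973, associate professor and master's supervisor. His research interests include the Internet of Things, deep learning, and the design and simulation of wireless sensor networks.

    Ting XU: male, born in 1993, master's student. His research interests include computer vision, deep learning, and Internet of Things technology.

    Dingzhong FENG: male, born in 1963, professor and doctoral supervisor. His research interests include intelligent enterprise logistics and industrial engineering technology and its applications.

    Meixian JIANG: female, born in 1973, associate professor and master's supervisor. Her research interests include enterprise logistics and systems engineering.

    Guanghua WU: male, born in 1983, lecturer, Ph.D. His research interests include intelligent logistics and Internet of Things technology.

    Corresponding author:

    Meixian JIANG 1056294025@qq.com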

  • CLC number: TP391.41

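  • Abstract: The classic Faster Region-based Convolutional Neural Network (Faster RCNN) suffers from too many hard-to-train examples and a low recall rate during training. To address these problems, this paper combines Online Hard Example Mining (OHEM) with Hard Negative Example Mining (HNEM): the hard examples with the largest loss values, selected in real time during training, are used for error back-propagation, which improves the model's low detection rate on hard examples and raises training efficiency. To further improve recall and generalization, the paper modifies the Non-Maximum Suppression (NMS) algorithm by introducing a penalty function with a confidence threshold, and adopts training strategies such as multi-scale training and data augmentation. A comparison of results before and after the improvements, together with a sensitivity analysis, shows that the algorithm performs well on the VOC2007 dataset, raising the mean Average Precision (mAP) from 69.9% to 74.40%, and from 70.4% to 79.3% on VOC2012, which verifies the superiority of the algorithm.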
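The selection step described in the abstract, back-propagating only through the examples with the largest loss, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `select_hard_examples` and `hard_example_loss`, the per-RoI loss list, and the batch size used in the usage lines are all hypothetical.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` RoIs with the largest loss.

    `losses` holds one loss value per candidate RoI, computed in a
    forward-only pass (an assumption of this sketch).
    """
    if batch_size <= 0:
        return []
    # Rank RoI indices by loss, largest first; ties keep the lower index first.
    ranked = sorted(range(len(losses)), key=lambda i: -losses[i])
    return sorted(ranked[:batch_size])


def hard_example_loss(losses: Sequence[float], keep: Sequence[int]) -> float:
    """Average the loss over the selected RoIs only.

    RoIs outside `keep` are left out of the average, so they would
    contribute no gradient in a real training loop.
    """
    if not keep:
        return 0.0
    return sum(losses[i] for i in keep) / len(keep)


# Hypothetical per-RoI losses for five candidate regions.
roi_losses = [0.25, 2.5, 0.125, 1.5, 1.0]
hard = select_hard_examples(roi_losses, batch_size=2)
batch_loss = hard_example_loss(roi_losses, hard)
```

In this toy case the two RoIs with the largest losses are selected and only they enter the averaged loss; a real detector would apply the same mask to the classification and box-regression losses together.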
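  • Fig. 1  The added OHEM module

    Fig. 2  Problems with classic non-maximum suppression

    Fig. 3  Loss curves and recall before and after the improvement

    Fig. 4  Sensitivity analysis experiments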

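    Table 1  Parameter settings for hard negative example mining

    | Parameter | Meaning | Value |
    |---|---|---|
    | FG_THRESH | IoU threshold for positive samples | [0.7, 1.0] |
    | BG_THRESH_LO | IoU threshold for negative samples | [0, 0.5) |
    | HNEM_NMS_THRESH | Non-maximum suppression threshold | 0.7 |
    | HNEM_BATCHSIZE | Object batch size per image | 64 |
    | RPN_FG_FRACTION | Fraction of positive samples | 0.25 |
    | RPN_BG_FRACTION | Fraction of negative samples | 0.75 |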
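To make the Table 1 settings concrete, the sketch below labels candidate regions by their best IoU with any ground-truth box and then draws a batch of 64 with a 1:3 positive-to-negative split. It is only an illustration of how such thresholds are commonly applied: the function `sample_rois`, the `FG_RANGE`/`BG_RANGE` constants, the use of random sampling, and the rule that negatives fill the batch when positives are scarce are assumptions, not details confirmed by the paper.

```python
import random
from typing import List, Optional, Sequence, Tuple

FG_RANGE = (0.7, 1.0)   # FG_THRESH: positives have IoU in [0.7, 1.0]
BG_RANGE = (0.0, 0.5)   # BG_THRESH_LO: negatives have IoU in [0, 0.5)
BATCH_SIZE = 64         # HNEM_BATCHSIZE
FG_FRACTION = 0.25      # RPN_FG_FRACTION (RPN_BG_FRACTION is the remaining 0.75)


def sample_rois(
    max_ious: Sequence[float],
    batch_size: int = BATCH_SIZE,
    fg_fraction: float = FG_FRACTION,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Split RoIs into positives/negatives by IoU and sample one batch.

    `max_ious[i]` is the highest IoU between RoI i and any ground-truth box.
    RoIs with IoU in [0.5, 0.7) match neither range and are ignored.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    fg = [i for i, v in enumerate(max_ious) if FG_RANGE[0] <= v <= FG_RANGE[1]]
    bg = [i for i, v in enumerate(max_ious) if BG_RANGE[0] <= v < BG_RANGE[1]]
    # Take at most 25% positives; negatives fill whatever is left of the batch.
    n_fg = min(int(round(batch_size * fg_fraction)), len(fg))
    n_bg = min(batch_size - n_fg, len(bg))
    return rng.sample(fg, n_fg), rng.sample(bg, n_bg)


# Hypothetical image: 30 positives, 100 negatives, 10 ignored RoIs.
ious = [0.8] * 30 + [0.2] * 100 + [0.6] * 10
pos_idx, neg_idx = sample_rois(ious)
```

With the default settings this draws 16 positives and 48 negatives; if an image has fewer than 16 positives, the sketch tops the batch up with negatives.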

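    Table 2  Parameter settings for online hard example mining

    | Parameter | Meaning | Value |
    |---|---|---|
    | ITERS | Number per iteration | 1 |
    | OHEM_ROI_POOL5 | RoI pooling for online samples | 7×7 |
    | OHEM_FC6 | Fully connected layer for online samples | 4096 |
    | OHEM_RELU6 | Activation for online samples | |
    | OHEM_FC7 | Fully connected layer for online samples | 4096 |
    | OHEM_RELU7 | Activation for online samples | |
    | OHEM_CLS_SCORE | Number of classes for online samples | 21 |
    | OHEM_CLS_PRED | Bounding-box matrix for online samples | 84 |
    | OHEM | Online sample processing module | OHEMData |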

    Table 3  Improved non-maximum suppression algorithm

     Input: candidate box set $B = \{b_1, b_2, \cdots, b_N\}$, confidence set $S = \{s_1, s_2, \cdots, s_N\}$, IoU threshold $N_{\rm t}$
     Loop:
     Best boxes $D \leftarrow \{\}$
     While $B \ne \emptyset$ do
      $m \leftarrow \arg\max(S)$
      $M \leftarrow b_m$
      $D \leftarrow D \cup M;\ B \leftarrow B - M$
      for $b_i$ in $B$ do
       If ${\rm IoU}(M, b_i) \ge N_{\rm t}$ then
         ${\rm weight} = {\rm Method}(1\text{-}3)$ (one of weighting methods 1 to 3)
         $s_i \leftarrow s_i * {\rm weight}$
         If $s_i \le {\rm threshold}$ then
           $B \leftarrow B - b_i$
         End
       End
      End
     End
     Output: $D$, $S$
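The loop in Table 3 can be sketched in plain Python as below. The control flow follows the table: overlapping boxes have their scores decayed rather than being discarded outright, and a box is dropped only once its score falls to the confidence threshold. The three weight formulas are assumptions, since this page does not define methods 1 to 3: the linear and Gaussian forms are borrowed from the commonly used Soft-NMS penalties, and the exponential form is purely hypothetical. The name `improved_nms`, the defaults for `iou_thresh`, `score_thresh` and `sigma`, and the example boxes are likewise illustrative.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decay_weight(overlap: float, method: str, sigma: float = 0.5) -> float:
    """Score penalty for a box overlapping the current best box (assumed forms)."""
    if method == "linear":
        return 1.0 - overlap
    if method == "gaussian":
        return math.exp(-(overlap ** 2) / sigma)
    if method == "exponential":  # hypothetical form, not taken from the paper
        return math.exp(-overlap)
    raise ValueError(f"unknown method: {method}")


def improved_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_thresh: float = 0.3,      # N_t in Table 3 (value assumed)
    score_thresh: float = 0.001,  # "threshold" in Table 3 (value assumed)
    method: str = "linear",
) -> Tuple[List[int], List[float]]:
    """Return kept box indices (set D, in selection order) and updated scores S."""
    s = list(scores)
    remaining = list(range(len(boxes)))  # set B, stored as indices
    keep: List[int] = []
    while remaining:
        m = max(remaining, key=lambda i: s[i])  # highest remaining score
        keep.append(m)
        remaining.remove(m)
        survivors = []
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            if overlap >= iou_thresh:
                s[i] *= decay_weight(overlap, method)
                if s[i] <= score_thresh:
                    continue  # score decayed below the threshold: drop the box
            survivors.append(i)
        remaining = survivors
    return keep, s


# Two heavily overlapping boxes and one separate box (made-up coordinates).
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept, new_scores = improved_nms(example_boxes, example_scores)
```

With the small default threshold all three boxes survive, but the overlapping box is re-ranked below the separate one; raising `score_thresh` makes the behaviour approach hard NMS.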

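    Table 4  mAP results (%) for the online hard example mining and related experiments

    | Method | bird | boat | bottle | bus | car | chair | cow | table | dog | horse | person | plant | sheep | sofa | train | mAP |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | FRCNN | 68.5 | 54.7 | 50.6 | 78.1 | 80.2 | 50.7 | 74.6 | 65.5 | 81.3 | 83.7 | 75.7 | 38.3 | 70.6 | 67.1 | 80.7 | 69.9 |
    | ohem_fc | 69.2 | 57.9 | 46.5 | 81.8 | 79.1 | 47.9 | 76.2 | 68.9 | 83.2 | 80.8 | 72.7 | 39.9 | 67.5 | 66.2 | 75.6 | 69.9 |
    | ohem 1:1 | 71.1 | 54.6 | 52.3 | 79.7 | 81.3 | 50.3 | 74.3 | 66.8 | 80.7 | 83.7 | 76.7 | 40.9 | 70.0 | 68.2 | 77.6 | 70.4 |
    | ohem 1:10 | 71.8 | 58.5 | 53.2 | 79.3 | 82.9 | 52.2 | 81.2 | 70.0 | 81.4 | 83.2 | 77.9 | 43.7 | 71.9 | 67.1 | 75.0 | 71.7 |
    | ohem 1:3 | 72.2 | 57.8 | 56.6 | 80.8 | 84.0 | 53.8 | 77.5 | 68.0 | 82.2 | 84.0 | 77.6 | 43.2 | 70.9 | 68.4 | 79.4 | 72.1 |
    | Data augmentation | 69.8 | 62.0 | 55.2 | 80.2 | 83.6 | 54.5 | 80.3 | 67.2 | 80.7 | 85.0 | 78.0 | 44.6 | 70.8 | 69.4 | 79.0 | 72.5 |
    | NMS-linear | 74.5 | 64.4 | 57.8 | 80.0 | 84.3 | 57.4 | 80.8 | 70.1 | 83.2 | 83.7 | 81.3 | 48.3 | 71.9 | 68.4 | 79.4 | 74.1 |
    | NMS-Gaussian | 74.7 | 64.0 | 58.5 | 80.5 | 84.5 | 56.9 | 81.5 | 70.1 | 83.8 | 84.2 | 81.5 | 47.8 | 71.5 | 69.1 | 79.6 | 74.3 |
    | NMS-exponential | 73.7 | 63.7 | 56.9 | 79.6 | 83.9 | 56.5 | 80.7 | 69.4 | 82.8 | 82.7 | 80.8 | 48.0 | 70.5 | 66.8 | 79.2 | 73.3 |
    | Lr adjustment | 75.8 | 63.3 | 57.6 | 81.1 | 84.7 | 56.5 | 83.1 | 70.6 | 84.8 | 85.2 | 81.2 | 47.8 | 71.6 | 68.6 | 79.1 | 74.4 |
    | 12+ohem | 76.8 | 64.8 | 61.4 | 85.0 | 84.1 | 59.9 | 82.6 | 61.9 | 88.5 | 85.2 | 86.9 | 56.7 | 79.5 | 67.5 | 85.4 | 77.5 |
    | 12+ohem* | 78.1 | 65.0 | 55.4 | 84.9 | 84.0 | 62.1 | 83.6 | 67.3 | 91.3 | 88.9 | 85.6 | 54.7 | 83.8 | 77.3 | 88.3 | 79.3 |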
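As a quick reading aid for Table 4, the snippet below computes each configuration's mAP gain over the FRCNN baseline row (69.9). The mAP values are copied from the table; the dictionary layout and the helper name `gain_over_baseline` are illustrative. The two "12+ohem" rows are left out because this page does not state which baseline they should be compared against within the table.

```python
BASELINE_MAP = 69.9  # FRCNN row of Table 4

# mAP column of Table 4 for the rows compared against the FRCNN baseline.
reported_map = {
    "ohem_fc": 69.9,
    "ohem 1:1": 70.4,
    "ohem 1:10": 71.7,
    "ohem 1:3": 72.1,
    "Data augmentation": 72.5,
    "NMS-linear": 74.1,
    "NMS-Gaussian": 74.3,
    "NMS-exponential": 73.3,
    "Lr adjustment": 74.4,
}


def gain_over_baseline(map_value: float, baseline: float = BASELINE_MAP) -> float:
    """Absolute mAP gain in percentage points, rounded to one decimal."""
    return round(map_value - baseline, 1)


gains = {name: gain_over_baseline(value) for name, value in reported_map.items()}
best = max(gains, key=gains.get)  # configuration with the largest gain
```

The largest gain among these rows belongs to the final "Lr adjustment" row, whose mAP matches the 74.40% figure quoted in the abstract for VOC2007.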
Publication history
  • Received:  2018-07-13
  • Revised:  2019-01-28
  • Published online:  2019-02-18
  • Issue date:  2019-06-01
