
Anchor-free Object Detection Algorithm Based on Double Branch Feature Fusion

HOU Zhiqiang, GUO Hao, MA Sugang, CHENG Huanhuan, BAI Yu, FAN Jiulun

Citation: HOU Zhiqiang, GUO Hao, MA Sugang, CHENG Huanhuan, BAI Yu, FAN Jiulun. Anchor-free Object Detection Algorithm Based on Double Branch Feature Fusion[J]. Journal of Electronics & Information Technology, 2022, 44(6): 2175-2183. doi: 10.11999/JEIT210344


doi: 10.11999/JEIT210344
Funds: The National Natural Science Foundation of China (62072370)
Details
    Author biographies:

    HOU Zhiqiang: male, born in 1973, professor and Ph.D. supervisor; his research interests include image processing and computer vision

    GUO Hao: male, born in 1997, master's student; his research interests include computer vision, object detection, and deep learning

    MA Sugang: male, born in 1982, Ph.D.; his research interests include computer vision and machine learning

    Corresponding author:

    GUO Hao, HaoGuo@stu.xupt.edu.cn

  • CLC number: TN911.73; TP391.4

  • Abstract: To address the limited exploitation of object features and the insufficiently accurate detection results of the anchor-free object detection algorithm CenterNet, this paper proposes an improved algorithm based on double-branch feature fusion. In the proposed algorithm, one branch contains a Feature Pyramid Enhancement Module (FPEM) and a Feature Fusion Module (FFM), which fuse the multi-level features output by the backbone network. Meanwhile, to exploit more high-level semantic information, the other branch upsamples only the last-level feature of the backbone. In addition, a frequency-based channel attention mechanism is added to the backbone to strengthen its feature extraction ability. Finally, the features of the two branches are fused by concatenation and convolution. Experimental results show that the detection accuracy reaches 82.3% on the public PASCAL VOC dataset, 3.6% higher than CenterNet, and leads CenterNet by 6% on the KITTI dataset, while the detection speed meets real-time requirements in both cases. The proposed double-branch feature fusion method processes features from different levels, makes better use of the spatial information in shallow features and the semantic information in deep features, and thus improves the detection performance of the algorithm.
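    As a reading aid, the following is a minimal PyTorch-style sketch of the double-branch idea summarized in the abstract: one branch upsamples only the deepest backbone feature, the other aggregates all backbone levels (a simple stand-in for the FPEM + FFM processing), and the two branches are fused by concatenation and convolution. All module names, channel widths, and the bilinear resize-and-sum aggregation are illustrative assumptions, not the paper's actual implementation.

    ```python
    # Minimal sketch of the dual-branch fusion head described in the abstract.
    # Module names, channel sizes, and bilinear upsampling are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class DualBranchFusionHead(nn.Module):
        """Fuses an upsampling branch (deepest feature only) with a
        multi-feature fusion branch (all backbone levels) by concat + conv."""

        def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=64):
            super().__init__()
            # Branch 1: upsample only the last (deepest) backbone feature.
            self.reduce_c5 = nn.Conv2d(in_channels[-1], out_channels, kernel_size=1)
            # Branch 2: reduce every level to a common width, then resize and sum
            # (stand-in for the FPEM + FFM processing named in the abstract).
            self.reduce_all = nn.ModuleList(
                nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
            )
            # Final fusion: concatenate the two branches and mix with a 3x3 conv.
            self.fuse = nn.Sequential(
                nn.Conv2d(2 * out_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )

        def forward(self, feats):
            # feats: backbone features [c2, c3, c4, c5], shallow -> deep.
            target_size = feats[0].shape[-2:]   # fuse at the shallowest resolution
            # Branch 1: deepest feature, upsampled to the target resolution.
            up = F.interpolate(self.reduce_c5(feats[-1]), size=target_size,
                               mode="bilinear", align_corners=False)
            # Branch 2: resize every reduced level to the target resolution and sum.
            fused = sum(
                F.interpolate(reduce(f), size=target_size, mode="bilinear",
                              align_corners=False)
                for reduce, f in zip(self.reduce_all, feats)
            )
            # Concatenate the two branches and fuse with a convolution.
            return self.fuse(torch.cat([up, fused], dim=1))


    if __name__ == "__main__":
        # Shapes roughly matching ResNet-101 outputs for a 512x512 input.
        feats = [torch.randn(1, c, s, s) for c, s in
                 zip((256, 512, 1024, 2048), (128, 64, 32, 16))]
        print(DualBranchFusionHead()(feats).shape)  # torch.Size([1, 64, 128, 128])
    ```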
  • Figure 1  Structure of the CenterNet algorithm

    Figure 2  Feature Pyramid Enhancement Module (FPEM)

    Figure 3  Feature Fusion Module (FFM)

    Figure 4  Channel attention mechanism based on the DCT frequency domain (see the sketch after this figure list)

    Figure 5  Structure of the channel reduction module (RCM)

    Figure 6  Structure of the DB-CenterNet algorithm

    Figure 7  Detection performance of different encoder networks

    Figure 8  Experimental comparison results on PASCAL VOC

    Figure 9  Experimental comparison results on KITTI
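    Figure 4 refers to a channel attention computed in the DCT frequency domain, in the spirit of FcaNet [10]: instead of global average pooling, each channel group is pooled with a fixed 2D DCT basis before an SE-style excitation. The sketch below illustrates that idea; the chosen frequency pairs, group split, and reduction ratio are assumptions for illustration, not the configuration used in the paper.

    ```python
    # Hedged sketch of a DCT-based (frequency-domain) channel attention block.
    # Frequency pairs, group split, and reduction ratio are illustrative assumptions.
    import math
    import torch
    import torch.nn as nn


    def dct_filter(h, w, u, v):
        """2D DCT basis function for frequency (u, v) on an h x w grid."""
        rows = torch.cos(math.pi * u * (torch.arange(h) + 0.5) / h)
        cols = torch.cos(math.pi * v * (torch.arange(w) + 0.5) / w)
        return rows[:, None] * cols[None, :]          # shape (h, w)


    class FrequencyChannelAttention(nn.Module):
        def __init__(self, channels, h, w,
                     freqs=((0, 0), (0, 1), (1, 0), (1, 1)), reduction=16):
            super().__init__()
            assert channels % len(freqs) == 0
            # One fixed DCT basis per channel group, replacing global average pooling.
            weight = torch.stack([dct_filter(h, w, u, v) for u, v in freqs])
            self.register_buffer("dct_weight", weight)  # (groups, h, w)
            self.groups = len(freqs)
            # SE-style excitation on the frequency-pooled descriptor.
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):
            b, c, h, w = x.shape
            xg = x.view(b, self.groups, c // self.groups, h, w)
            # Frequency pooling: weight each group by its DCT basis, sum over space.
            pooled = (xg * self.dct_weight[None, :, None]).sum(dim=(-1, -2)).view(b, c)
            scale = self.fc(pooled).view(b, c, 1, 1)
            return x * scale


    if __name__ == "__main__":
        x = torch.randn(2, 64, 16, 16)
        print(FrequencyChannelAttention(64, 16, 16)(x).shape)  # torch.Size([2, 64, 16, 16])
    ```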

    Table 1  Ablation results for different numbers of feature enhancement operations (%)

    Enhancement count   PASCAL VOC                       KITTI
                        ResNet-101   Res101-FcaNet       ResNet-101   Res101-FcaNet
    1                   81.57        82.02               73.97        76.52
    2                   81.63        81.93               73.95        76.48
    3                   81.66        82.28               74.09        76.45
    4                   81.59        82.08               74.03        76.45
    5                   81.87        82.31               74.04        76.53
    6                   81.66        82.22               74.07        76.49
    7                   81.67        82.27               74.03        76.51

    Table 2  Ablation results on the PASCAL VOC2007 and KITTI datasets (%)

    ResNet-101   Upsampling branch   Multi-feature fusion branch   FcaNet   mAP (VOC)   mAP (KITTI)
    √            √                                                          78.7        70.5
    √                                √                                      80.7        73.7
    √            √                   √                                      81.8        74.0
    √            √                   √                             √        82.3        76.5

    Table 3  Test results on the PASCAL VOC2007 dataset

    Algorithm                 Network         Resolution (pixels)   mAP (%)   fps
    Faster-RCNN (2015)        ResNet-101      600×1000              76.4      5
    SSD (2016)                VGG-16          512×512               76.8      19
    R-FCN (2016)              ResNet-101      600×1000              80.5      9
    DSSD (2017)               ResNet-101      513×513               81.5      5.5
    Yolov3 (2018)             DarkNet-53      544×544               79.3      26
    ExtremeNet (2019)         Hourglass-104   512×512               79.5      3
    FCOS (2019)               ResNet-101      800×800               80.2      16
    CenterNet (2019)          ResNet-18       512×512               75.7      100
    CenterNet (2019)          ResNet-101      512×512               78.7      30
    CenterNet (2019)          DLA-34          512×512               80.7      33
    CenterNet (2019)          Hourglass-104   512×512               80.9      6
    CenterNet-DHRNet (2020)   DHRNet          512×512               81.9      18
    Ours                      Res101-FcaNet   512×512               82.3      27.6

    Table 4  Per-class detection results of the proposed algorithm and other algorithms on PASCAL VOC2007 (%)

    Class    Ours   CenterNet-DLA   Faster R-CNN   Mask R-CNN   R-FCN   SSD512
    aero     88.7   85.0            76.5           73.7         74.5    70.2
    bike     87.8   86.0            79.0           84.4         87.2    84.7
    bird     85.0   81.4            70.9           78.5         81.5    78.4
    boat     73.8   72.8            65.5           70.8         72.0    73.8
    bottle   73.9   68.4            52.1           68.5         69.8    53.2
    bus      88.5   86.0            83.1           88.0         86.8    86.2
    car      88.4   88.4            84.7           85.9         88.5    87.5
    cat      88.5   86.5            86.4           87.8         89.8    86.0
    chair    66.7   65.0            52.0           60.3         67.0    57.8
    cow      87.1   86.3            81.9           85.2         88.1    83.1
    table    75.0   77.6            65.7           73.7         74.5    70.2
    dog      88.1   85.2            84.8           87.2         89.8    84.9
    horse    89.4   87.0            84.6           86.5         90.6    85.2
    mbike    85.7   86.1            77.5           85.0         79.9    83.9
    person   83.8   85.0            76.7           76.4         81.2    79.7
    plant    58.2   58.1            38.8           48.5         53.7    50.3
    sheep    88.3   83.4            73.6           76.3         81.8    77.9
    sofa     76.5   79.6            73.9           75.5         81.5    73.9
    train    88.0   85.0            83.0           85.0         85.9    82.5
    tv       84.8   80.3            72.6           81.0         79.9    75.3

    Table 5  Overall detection results on the KITTI dataset

    Algorithm           Network         Resolution (pixels)   Pedestrian (%)   Car (%)   Cyclist (%)   mAP (%)   fps
    SSD (2016)          VGG-16          512×512               48.0             85.1      50.6          61.2      28.9
    RFBNet (2018)       VGG-16          512×512               61.7             86.4      72.2          73.4      39
    SqueezeDet (2017)   ResNet-50       1242×375              61.5             86.7      80.0          76.1      22.5
    Yolov3 (2018)       DarkNet-53      544×544               65.8             88.7      73.1          75.8      26
    FCOS (2019)         ResNet-101      800×800               69.5             89.3      70.1          76.3      19
    CenterNet (2019)    ResNet-18       512×512               50.6             80.8      59.5          63.6      100
    CenterNet (2019)    ResNet-101      512×512               60.5             81.3      69.7          70.5      36
    CenterNet (2019)    DLA-34          512×512               62.3             85.6      73.8          73.9      33
    CenterNet (2019)    Hourglass-104   512×512               65.5             86.1      73.2          74.9      6
    Ours                Res101-FcaNet   512×512               62.0             88.9      78.7          76.5      27.6
  • [1] SUN Yifeng, WU Jiang, HUANG Yanyan, et al. A small moving object detection algorithm based on track in video surveillance[J]. Journal of Electronics & Information Technology, 2019, 41(11): 2744–2751. doi: 10.11999/JEIT181110
    [2] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. doi: 10.1109/TPAMI.2016.2577031
    [3] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37.
    [4] LAW H and DENG Jia. CornerNet: Detecting objects as paired keypoints[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 765–781.
    [5] ZHOU Xingyi, ZHUO Jiacheng, and KRÄHENBÜHL P. Bottom-up object detection by grouping extreme and center points[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 850–859.
    [6] TIAN Zhi, SHEN Chunhua, CHEN Hao, et al. FCOS: Fully convolutional one-stage object detection[C]. The IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 9626–9635.
    [7] ZHOU Xingyi, WANG Dequan, and KRÄHENBÜHL P. Objects as points[EB/OL]. https://arxiv.org/abs/1904.07850, 2019.
    [8] WANG Wenhai, XIE Enze, SONG Xiaoge, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network[C]. The IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 8439–8448.
    [9] HOWARD A G, ZHU Menglong, CHEN Bo, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[EB/OL]. https://arxiv.org/abs/1704.04861, 2021.
    [10] QIN Zequn, ZHANG Pengyi, WU Fei, et al. FcaNet: Frequency channel attention networks[EB/OL]. https://arxiv.org/abs/2012.11879v4, 2020.
    [11] WANG Xin, LI Zhe, and ZHANG Hongli. High-resolution network Anchor-free object detection method based on iterative aggregation[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(12): 2533–2541. doi: 10.13700/j.bh.1001-5965.2020.0484
    [12] LIU Songtao, HUANG Di, and WANG Yunhong. Receptive field block net for accurate and fast object detection[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 404–419.
    [13] WU Bichen, WAN A, IANDOLA F, et al. SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving[C]. The IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, USA, 2017: 446–454.
Publication history
  • Received:  2021-04-23
  • Revised:  2021-12-19
  • Accepted:  2022-01-12
  • Available online:  2022-02-02
  • Published:  2022-06-21
