Design of Rotation Invariant Model Based on Image Offset Angle and Multibranch Convolutional Neural Networks

ZHANG Meng, LI Xiang, ZHANG Jingwei

Citation: ZHANG Meng, LI Xiang, ZHANG Jingwei. Design of Rotation Invariant Model Based on Image Offset Angle and Multibranch Convolutional Neural Networks[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT240417

doi: 10.11999/JEIT240417
Funds: The Key-Area Research and Development Program of Guangdong Province (2021B1101270006)
Article information
    Author biographies:

    ZHANG Meng: Male, Professor, Ph.D. supervisor. His research interests include the co-design of artificial intelligence algorithms and hardware accelerators, and FPGA system design and applications

    LI Xiang: Male, Master's student. His research interests include artificial intelligence and image processing algorithms

    ZHANG Jingwei: Male, Ph.D. candidate. His research interests include FPGA-based intelligent computing, high-level synthesis design, and artificial intelligence compilers

    Corresponding author:

    LI Xiang, 220220938961@lzu.edu.cn

  • CLC number: TN911.73; TP391

  • Abstract: Convolutional Neural Networks (CNNs) are translation-invariant but lack rotation invariance. In recent years, encoding rotation into CNNs has become the mainstream remedy for this limitation, but it demands large numbers of parameters and considerable computing resources. Since images are the primary focus of computer vision, this paper proposes a model named Offset Angle and Multibranch CNN (OAMC) to achieve rotation invariance. The model first detects the offset angle of the input image and counter-rotates the image by that angle; the rotated image is then fed into a multibranch CNN with no rotation encoding, and an optimized response module outputs the best branch as the final prediction. On the rotated handwritten-digit dataset, the OAMC model achieves the best classification accuracy of 96.98% with as few as 8k parameters. Compared with existing work on remote sensing datasets, the model improves accuracy by up to 8% while using only one third of the parameters of previous models.
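As a concrete illustration of the pipeline summarized in the abstract, the sketch below counter-rotates the input by a detected offset angle, runs several branches that use no rotation encoding, and lets a response module pick one branch as the prediction. It is a minimal PyTorch sketch under assumed settings: the branch architecture, the branch count, and the peak-response selection rule are illustrative choices, not the paper's exact design.

```python
# Minimal sketch (not the authors' exact design) of the OAMC inference flow:
# counter-rotate by the detected offset angle, run plain CNN branches, pick one.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF


class OAMCSketch(nn.Module):
    def __init__(self, num_branches: int = 4, num_classes: int = 10):
        super().__init__()
        # Identical lightweight branches without rotation encoding; sizes are illustrative.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, num_classes),
            )
            for _ in range(num_branches)
        ])

    def forward(self, x: torch.Tensor, offset_angle_deg: float) -> torch.Tensor:
        # 1) Counter-rotate the image by the detected offset angle
        #    (a single angle is used here for brevity; per-image in practice).
        x = TF.rotate(x, -offset_angle_deg)
        # 2) Run every branch on the aligned image: (batch, branches, classes).
        logits = torch.stack([branch(x) for branch in self.branches], dim=1)
        # 3) Response module (illustrative rule): keep the branch whose peak
        #    class response is strongest and return its logits.
        best = logits.max(dim=2).values.argmax(dim=1)
        return logits[torch.arange(x.size(0)), best]


# Example: two 28x28 grayscale digits with a detected offset of 15 degrees.
# out = OAMCSketch()(torch.randn(2, 1, 28, 28), 15.0)   # shape (2, 10)
```

The max-response rule in step 3 is only one possible way to read "output the best branch"; swapping it for another aggregation of the branch outputs is a one-line change.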
  • Figure 1  Overall flowchart of the offset angle detection and rotation module

    Figure 2  Schematic of constructing the Cartesian coordinate system

    Figure 3  Overall structure of the OAMC-B model

    Figure 4  Test accuracy curves of the 36 rotated subsets
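Figures 1 and 2 cover the offset-angle detection step: a Cartesian coordinate system is built on the input image and its offset angle is measured before the counter-rotation. As one hypothetical way to recover such an angle, the sketch below estimates the orientation of the image's principal axis from second-order moments; the function name and the moment-based rule are assumptions for illustration, not the paper's actual detector.

```python
# Hypothetical offset-angle detector based on second-order image moments
# (principal-axis orientation). Purely illustrative: the paper's coordinate-
# system construction in Figures 1-2 may compute the angle differently.
import numpy as np


def principal_axis_angle(img: np.ndarray) -> float:
    """Orientation (degrees, w.r.t. the x-axis) of a grayscale image's principal axis."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    m = img.astype(np.float64)
    total = m.sum() + 1e-12                      # avoid division by zero on blank images
    cx = (xs * m).sum() / total                  # intensity centroid
    cy = (ys * m).sum() / total
    mu20 = (((xs - cx) ** 2) * m).sum() / total  # second-order central moments
    mu02 = (((ys - cy) ** 2) * m).sum() / total
    mu11 = ((xs - cx) * (ys - cy) * m).sum() / total
    return float(np.degrees(0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)))
```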

    Table 1  Test accuracy on the rotated MNIST dataset

    Model                    Params (k)    Accuracy (%)
    ORN-8 (Align)[10]        969           83.76
    ORN-8 (ORPooling)[10]    397           83.33
    RotEqNet[5]              100           80.10
    Spherical CNN[15]        68            94.00
    E(2)-CNN[16]             2068          94.37
    RIC-CNN[1]               289           95.52
    OAMC-1 (ours)            8             63.18
    OAMC-2 (ours)            8             85.06
    OAMC-4 (ours)            8             96.98
    OAMC-8 (ours)            8             93.70

    Table 2  Test accuracy on the remote sensing datasets

    Model           Params (k)    NWPU-10 (%)    MTARSI (%)    AID (%)
    VGG16[20]       3372          82.33          60.15         54.59
    RIC-VGG16[1]    3372          91.65          72.21         66.22
    OAMC-4          981           92.91          75.69         74.31
  • [1] MO Hanlin and ZHAO Guoying. RIC-CNN: Rotation-invariant coordinate convolutional neural network[J]. Pattern Recognition, 2024, 146: 109994. doi: 10.1016/j.patcog.2023.109994.
    [2] ZHU Tianyu, FERENCZI B, PURKAIT P, et al. Knowledge combination to learn rotated detection without rotated annotation[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 15518–15527. doi: 10.1109/CVPR52729.2023.01489.
    [3] HAN Jiaming, DING Jian, XUE Nan, et al. ReDet: A rotation-equivariant detector for aerial object detection[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 2785–2794. doi: 10.1109/CVPR46437.2021.00281.
    [4] LI Feiran, FUJIWARA K, OKURA F, et al. A closer look at rotation-invariant deep point cloud analysis[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 16198–16207. doi: 10.1109/ICCV48922.2021.01591.
    [5] MARCOS D, VOLPI M, KOMODAKIS N, et al. Rotation equivariant vector field networks[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 5058–5067. doi: 10.1109/ICCV.2017.540.
    [6] EDIXHOVEN T, LENGYEL A, and VAN GEMERT J C. Using and abusing equivariance[C]. Proceedings of 2023 IEEE/CVF International Conference on Computer Vision Workshops, Paris, France, 2023: 119–128. doi: 10.1109/ICCVW60793.2023.00019.
    [7] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278–2324. doi: 10.1109/5.726791.
    [8] JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[C]. The 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 2015: 2017–2025.
    [9] LAPTEV D, SAVINOV N, BUHMANN J M, et al. TI-POOLING: Transformation-invariant pooling for feature learning in convolutional neural networks[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 289–297. doi: 10.1109/CVPR.2016.38.
    [10] ZHOU Yanzhao, YE Qixiang, QIU Qiang, et al. Oriented response networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4961–4970. doi: 10.1109/CVPR.2017.527.
    [11] WORRALL D E, GARBIN S J, TURMUKHAMBETOV D, et al. Harmonic networks: Deep translation and rotation equivariance[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 7168–7177. doi: 10.1109/CVPR.2017.758.
    [12] WEILER M, HAMPRECHT F A, and STORATH M. Learning steerable filters for rotation equivariant CNNs[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 849–858. doi: 10.1109/CVPR.2018.00095.
    [13] FIRAT H. Classification of microscopic peripheral blood cell images using multibranch lightweight CNN-based model[J]. Neural Computing and Applications, 2024, 36(4): 1599–1620. doi: 10.1007/s00521-023-09158-9.
    [14] WEI Xuan, SU Shixiang, WEI Yun, et al. Rotational convolution: Rethinking convolution for downside fisheye images[J]. IEEE Transactions on Image Processing, 2023, 32: 4355–4364. doi: 10.1109/TIP.2023.3298475.
    [15] COHEN T S, GEIGER M, KOEHLER J, et al. Spherical CNNs[C]. The Sixth International Conference on Learning Representations, Vancouver, Canada, 2018.
    [16] WEILER M and CESA G. General E(2)-equivariant steerable CNNs[J]. Advances in Neural Information Processing Systems, 2019, 32.
    [17] CHENG Gong, HAN Junwei, ZHOU Peicheng, et al. Multi-class geospatial object detection and geographic image classification based on collection of part detectors[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2014, 98: 119–132. doi: 10.1016/j.isprsjprs.2014.10.002.
    [18] WU Zhize, WAN Shouhong, WANG Xiaofeng, et al. A benchmark data set for aircraft type recognition from remote sensing images[J]. Applied Soft Computing, 2020, 89: 106132. doi: 10.1016/j.asoc.2020.106132.
    [19] XIA Guisong, HU Jingwen, HU Fan, et al. AID: A benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965–3981. doi: 10.1109/TGRS.2017.2685945.
    [20] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015.
Publication history
  • Received: 2024-05-29
  • Revised: 2024-11-08
  • Published online: 2024-11-18
