Few-shot Image Classification Based on Task-Aware Relation Network

GUO Lihua, WANG Guangfei

Citation: GUO Lihua, WANG Guangfei. Few-shot Image Classification Based on Task-Aware Relation Network[J]. Journal of Electronics & Information Technology, 2024, 46(3): 977-985. doi: 10.11999/JEIT230162


doi: 10.11999/JEIT230162
Funds: Guangdong Basic and Applied Basic Research Foundation (2022A1515011549, 2023A1515011104)
Details
    Author biographies:

    GUO Lihua: Male, Associate Professor. His research interests include image recognition, machine learning, and image quality assessment

    WANG Guangfei: Male, Master's student. His research interest is few-shot image classification

    Corresponding author:

    GUO Lihua, guolihua@scut.edu.cn

  • CLC number: TN911.73; TP391.41

  • Abstract: To address the Relation Network (RN) model's inability to perceive information that is globally relevant to the whole classification task, this paper proposes a Few-Shot Learning (FSL) algorithm based on a Task-Aware Relation Network (TARN). Fuzzy C-Means (FCM) clustering is introduced to generate class prototypes from the global distribution of the task, and a Task-Correlated Attention (TCA) mechanism is designed to improve the one-to-one metric of RN, so that local features aggregate task-global information when they are compared with the class prototypes. Compared with RN, classification accuracy improves by 8.15% and 7.0% in the 5-way 1-shot and 5-way 5-shot settings on the Mini-ImageNet dataset, and by 7.81% and 6.7% in the same settings on the Tiered-ImageNet dataset. Compared with the Position-Aware Relation Network (PARN), accuracy in the 5-way 1-shot setting on Mini-ImageNet improves by 1.24%. Compared with other few-shot image classification algorithms, TARN achieves the best recognition accuracy on both datasets. Combining task-related information with a metric-network model thus effectively improves few-shot image classification accuracy.
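As a concrete illustration of the FCM step the abstract describes, below is a minimal NumPy sketch of standard fuzzy C-means used to generate task-dependent class prototypes. It assumes the embedded features of the whole episode are stacked into one array; the function name fcm_prototypes and the hyper-parameters (fuzzifier m=2.0, fixed iteration count, random initialization) are illustrative assumptions, not details taken from the paper, which may well initialize memberships from the support labels instead.

    import numpy as np

    def fcm_prototypes(features, n_classes, m=2.0, n_iter=10, seed=0):
        # features: (N, D) embeddings of all task samples (support + query).
        # Returns (n_classes, D) prototypes reflecting the task's global
        # feature distribution rather than per-class support means alone.
        rng = np.random.default_rng(seed)
        n = features.shape[0]
        u = rng.random((n, n_classes))          # soft memberships
        u /= u.sum(axis=1, keepdims=True)       # each row sums to 1
        for _ in range(n_iter):
            um = u ** m
            # Fuzzy cluster centers: membership-weighted means.
            centers = um.T @ features / um.sum(axis=0)[:, None]
            # Squared Euclidean distance of every sample to every center.
            d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            d = np.maximum(d, 1e-12)
            # Standard FCM membership update (d holds *squared* distances,
            # hence the exponent 1/(m-1) instead of 2/(m-1)).
            u = d ** (-1.0 / (m - 1))
            u /= u.sum(axis=1, keepdims=True)
        return centers

In a 5-way episode, a call such as fcm_prototypes(embeddings, n_classes=5) would yield the five task-related prototypes against which query features are compared.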
  • Figure 1  Overall framework of the TARN model

    Figure 2  Illustration of the FCM module computing task-related class prototypes

    Figure 3  Flow of the TCA algorithm

    Figure 4  Grad-CAM heat maps of TARN compared with RN and PARN

    Figure 5  t-SNE visualization of query samples and class prototypes in the 5-way 1-shot setting
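The TCA computation itself is not spelled out on this page, so the following PyTorch module is only a hypothetical sketch of the idea stated in the abstract and pictured in Figure 3: each local descriptor of a query image attends over all class prototypes to absorb task-global context before the RN-style pairwise comparison. The class name TaskAttentionRelation, the linear projections, and the two-layer relation head are all assumptions, not the published architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TaskAttentionRelation(nn.Module):
        def __init__(self, dim):
            super().__init__()
            # Query/key/value projections for attention over prototypes.
            self.q = nn.Linear(dim, dim)
            self.k = nn.Linear(dim, dim)
            self.v = nn.Linear(dim, dim)
            # RN-style relation head scoring [query feature, prototype] pairs.
            self.relation = nn.Sequential(
                nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

        def forward(self, query_locals, prototypes):
            # query_locals: (HW, D) local descriptors of one query image.
            # prototypes:   (C, D) task-dependent class prototypes.
            scale = prototypes.shape[-1] ** 0.5
            attn = F.softmax(
                self.q(query_locals) @ self.k(prototypes).T / scale, dim=-1)
            context = attn @ self.v(prototypes)               # (HW, D)
            # Local features now carry task-global information.
            enriched = (query_locals + context).mean(dim=0)   # (D,)
            pairs = torch.cat(
                [enriched.expand_as(prototypes), prototypes], dim=-1)
            return self.relation(pairs).squeeze(-1)           # (C,) scores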

    Table 1  Few-shot classification accuracy on the Mini-ImageNet dataset (%)

    Model          Feature extraction network   5-way 1-shot   5-way 5-shot
    Reptile        Conv4                        49.97          65.99
    RN             Conv4                        50.44          65.32
    BOIL           Conv4                        49.61          66.45
    SNAIL          Conv4                        55.71          68.88
    OVE            Conv4                        50.02          64.58
    FEAT           Conv4                        55.15          71.61
    PARN           Conv4                        55.22          71.55
    TARN (ours)    Conv4                        56.46          71.77
    FEAT           ResNet12                     62.96          78.49
    RN             ResNet12                     56.67          73.73
    TADAM          ResNet12                     58.50          76.70
    DSN            ResNet12                     64.60          79.51
    NCA            ResNet12                     62.55          78.27
    Meta-Baseline  ResNet12                     63.17          79.26
    PSST           ResNet12                     64.05          80.24
    P-Transfer     ResNet12                     64.21          80.38
    TARN (ours)    ResNet12                     64.82          80.73

    Table 2  Few-shot classification accuracy on the Tiered-ImageNet dataset (%)

    Model            Feature extraction network   5-way 1-shot   5-way 5-shot
    ProtoNets        Conv4                        53.31          72.69
    RN               Conv4                        53.18          69.65
    BOIL             Conv4                        49.35          69.37
    MELR             Conv4                        56.30          73.22
    TARN (ours)      Conv4                        57.95          74.68
    FEAT             ResNet12                     70.80          84.79
    DSN              ResNet12                     66.22          82.79
    RN               ResNet12                     66.18          80.15
    Meta-Baseline    ResNet12                     68.62          83.74
    NCA              ResNet12                     68.35          83.20
    UniSiam          ResNet12                     67.01          84.47
    MCL              ResNet12                     72.01          86.02
    BaseTransformer  ResNet12                     72.46          84.96
    MELR             ResNet12                     72.14          87.01
    TARN (ours)      ResNet12                     73.99          86.85

    Table 3  Training and test time of the three models

    Model   Feature extraction network   Training time (min)   Test time (ms)
    RN      Conv4                        215.2                 44.95
    PARN    Conv4                        250.2                 63.57
    TARN    Conv4                        251.1                 63.93
    RN      ResNet12                     485.8                 150.85
    PARN    ResNet12                     857.9                 280.57
    TARN    ResNet12                     861.3                 281.71

    Table 4  Ablation study on the Mini-ImageNet dataset (%)

    RN    HCM    FCM    TCA    5-way 1-shot    5-way 5-shot
                               51.21           65.97
                               52.39           67.12
                               54.00           67.90
                               55.22           70.62
                               56.46           71.77
    (The check marks indicating which components each row enables are not recoverable from the extracted text; the last row matches the full TARN result in Table 1.)
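All accuracies in Tables 1-4 are averaged over many randomly drawn evaluation episodes. For readers unfamiliar with the protocol, here is a minimal sketch of how a 5-way 1-shot or 5-way 5-shot episode is typically sampled; the function name sample_episode and the 15-query-per-class default are common conventions assumed here, not values quoted from the paper.

    import random

    def sample_episode(labels_to_indices, n_way=5, k_shot=1, n_query=15,
                       rng=random):
        # labels_to_indices: dict mapping class name -> list of sample indices.
        # Returns (support, query) lists of (sample_index, episode_label)
        # pairs; episode labels 0..n_way-1 are re-assigned per episode.
        classes = rng.sample(sorted(labels_to_indices), n_way)
        support, query = [], []
        for ep_label, cls in enumerate(classes):
            # K support + n_query query samples per class, without overlap.
            picks = rng.sample(labels_to_indices[cls], k_shot + n_query)
            support += [(i, ep_label) for i in picks[:k_shot]]
            query += [(i, ep_label) for i in picks[k_shot:]]
        return support, query

Calling sample_episode(class_index, k_shot=5) would give the support/query split for one 5-way 5-shot episode; accuracy is then the fraction of query samples whose highest relation score picks the correct class.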
  • [1] SUNG F, YANG Yongxin, ZHANG Li, et al. Learning to compare: Relation network for few-shot learning[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1199–1208.
    [2] WU Ziyang, LI Yuwei, GUO Lihua, et al. PARN: Position-aware relation networks for few-shot learning[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 6658–6666.
    [3] ORESHKIN B N, RODRIGUEZ P, and LACOSTE A. TADAM: Task dependent adaptive metric for improved few-shot learning[C]. The 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 719–729.
    [4] MANIPARAMBIL M, MCGUINNESS K, and O'CONNOR N E. BaseTransformers: Attention over base data-points for one shot learning[C]. The 33rd British Machine Vision Conference, London, UK, 2022: 482. doi: 10.48550/arXiv.2210.02476.
    [5] LIU Yang, ZHANG Weifeng, XIANG Chao, et al. Learning to affiliate: Mutual centralized learning for few-shot classification[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 14391–14400.
    [6] FINN C, ABBEEL P, and LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]. The 34th International Conference on Machine Learning, Sydney, Australia, 2017: 1126–1135. doi: 10.5555/3305381.3305498.
    [7] NICHOL A, ACHIAM J, and SCHULMAN J. On first-order meta-learning algorithms[EB/OL]. https://arxiv.org/abs/1803.02999, 2018.
    [8] OH J, YOO H, KIM C, et al. BOIL: Towards representation change for few-shot learning[C]. The 9th International Conference on Learning Representations, Vienna, Austria, 2021: 1–24. doi: 10.48550/arXiv.2008.08882.
    [9] CHEN Yinbo, LIU Zhuang, XU Huijuan, et al. Meta-baseline: Exploring simple meta-learning for few-shot learning[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 9042–9051. doi: 10.1109/ICCV48922.2021.00893.
    [10] SHEN Zhiqiang, LIU Zechun, QIN Jie, et al. Partial is better than all: Revisiting fine-tuning strategy for few-shot learning[C]. The 35th AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2021: 9594–9602.
    [11] SNELL J and ZEMEL R. Bayesian few-shot classification with one-vs-each Pólya-Gamma augmented Gaussian processes[C]. The 9th International Conference on Learning Representations, Vienna, Austria, 2021: 1–34. doi: 10.48550/arXiv.2007.10417.
    [12] DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 248–255.
    [13] REN Mengye, TRIANTAFILLOU E, RAVI S, et al. Meta-learning for semi-supervised few-shot classification[EB/OL]. https://arxiv.org/abs/1803.00676, 2018.
    [14] MISHRA N, ROHANINEJAD M, CHEN Xi, et al. A simple neural attentive meta-learner[C]. The 6th International Conference on Learning Representations, Vancouver, Canada, 2018: 1–17. doi: 10.48550/arXiv.1707.03141.
    [15] YE Hanjia, HU Hexiang, ZHAN Dechuan, et al. Few-shot learning via embedding adaptation with set-to-set functions[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 8805–8814.
    [16] FEI Nanyi, LU Zhiwu, XIANG Tao, et al. MELR: Meta-learning via modeling episode-level relationships for few-shot learning[C]. The 9th International Conference on Learning Representations, Vienna, Austria, 2021: 1–20.
    [17] SIMON C, KONIUSZ P, NOCK R, et al. Adaptive subspaces for few-shot learning[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 4135–4144.
    [18] LAENEN S and BERTINETTO L. On episodes, prototypical networks, and few-shot learning[C]. The 35th International Conference on Neural Information Processing Systems, 2021: 24581–24592. doi: 10.48550/arXiv.2012.09831.
    [19] LU Yuning, WEN Liangjian, LIU Jianzhuang, et al. Self-supervision can be a good few-shot learner[C]. The 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 740–758.
    [20] CHEN Zhengyu, GE Jixie, ZHAN Heshen, et al. Pareto self-supervised training for few-shot learning[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 13658–13667.
    [21] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 618–626.
Publication history
  • Received: 2023-03-16
  • Revised: 2023-08-17
  • Available online: 2023-08-21
  • Issue published: 2024-03-27
