A Double Knowledge Distillation Model for Remote Sensing Image Scene Classification

LI Daxiang, NAN Yixuan, LIU Ying

Citation: LI Daxiang, NAN Yixuan, LIU Ying. A Double Knowledge Distillation Model for Remote Sensing Image Scene Classification[J]. Journal of Electronics & Information Technology, 2023, 45(10): 3558-3567. doi: 10.11999/JEIT221017

doi: 10.11999/JEIT221017
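Funds: The National Natural Science Foundation of China (62071379), The Natural Science Foundation of Shaanxi Province (2017KW-013), The Innovation Foundation of Xi’an University of Posts and Telecommunications (CXJJYL2021055, YJGJ201902)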
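Details
    About the authors:

    LI Daxiang: male, Ph.D., associate professor. Research interests include remote sensing image classification, object detection and tracking, medical image recognition, multiple-instance learning, and deep learning.

    NAN Yixuan: female, master's student. Research interests include remote sensing image classification, image segmentation, image retrieval, machine learning, and pattern recognition.

    LIU Ying: female, Ph.D., senior engineer. Research interests include image recognition and machine learning.

    Corresponding author:

    NAN Yixuan 1010367243@qq.com

  • CLC number: TN911.73; TP751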

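  • Abstract: To improve the accuracy of lightweight convolutional neural networks (CNNs) in remote sensing image (RSI) scene classification, this paper designs a double knowledge distillation (DKD) model that fuses dual attention (DA) with spatial structure (SS). First, a new DA module is constructed and embedded into ResNet101 and into a purpose-designed lightweight CNN, which serve as the teacher and student networks, respectively. Then, a DA distillation loss function is constructed to transfer the DA knowledge of the teacher network to the student network, strengthening the student's ability to extract local features from RSIs. Finally, an SS distillation loss function is constructed to transfer the teacher network's semantic extraction ability to the student network in the form of spatial structure, strengthening the student's high-level semantic representation of RSIs. Comparative experiments on two standard datasets, AID and NWPU-45, show that with a training ratio of 20% the performance of the student network after knowledge distillation improves by 7.69% and 7.39%, respectively, and that the model outperforms other methods while using fewer parameters.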
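The idea of transferring knowledge "in the form of spatial structure" can be illustrated with a small sketch. This page does not reproduce the paper's SS loss, so everything below is an assumption: it is a distance-wise relational loss in the spirit of relational knowledge distillation [14], in which the student is encouraged to reproduce the teacher's mean-normalized pairwise distances within a batch. The function names, the Huber penalty, and the normalization are illustrative choices, not the authors' formulation.

```python
import math
from itertools import combinations


def pairwise_distances(features):
    """Euclidean distance for every unordered pair of feature vectors."""
    return [math.dist(a, b) for a, b in combinations(features, 2)]


def mean_normalized(distances):
    """Divide by the mean distance so teacher and student scales are comparable."""
    mean = sum(distances) / len(distances)
    return [d / mean for d in distances] if mean > 0 else list(distances)


def huber(x, y, delta=1.0):
    """Huber penalty between two scalars."""
    diff = abs(x - y)
    return 0.5 * diff * diff if diff <= delta else delta * (diff - 0.5 * delta)


def structure_distillation_loss(teacher_feats, student_feats):
    """Hypothetical structure loss: mismatch between the two distance structures.

    Teacher and student features may have different dimensions, because only
    the distances between samples of the same batch are compared.
    """
    t = mean_normalized(pairwise_distances(teacher_feats))
    s = mean_normalized(pairwise_distances(student_feats))
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)


# A batch of three samples: the student embeds them with the same relative
# geometry as the teacher (in a lower dimension), so the loss vanishes.
teacher_batch = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
aligned_student = [[0.0], [1.0], [2.0]]
distorted_student = [[0.0], [1.0], [5.0]]
aligned_loss = structure_distillation_loss(teacher_batch, aligned_student)
distorted_loss = structure_distillation_loss(teacher_batch, distorted_student)
```

Under these assumptions the loss depends only on the relative arrangement of samples, which is why a lightweight student with narrower features can still be matched against a ResNet101-sized teacher.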
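  • Figure 1  Framework of the designed DKD model

    Figure 2  Architecture of the dual attention (DA) module

    Figure 3  Triplet Siamese framework for training the teacher network

    Figure 4  SS knowledge distillation

    Figure 5  Confusion matrix on the AID dataset with a training ratio of 20%

    Figure 6  Confusion matrix on the NWPU-45 dataset with a training ratio of 20%

    Figure 7  Visual comparison using Grad-CAM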

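    Table 1  Parameter design of the student network

    | Layer name | Output size | Operation |
    |---|---|---|
    | Conv1 | 112×112 | 7×7, 64, stride=2 |
    | DA | 112×112 | DA module |
    | Conv2_x | 56×56 | 3×3 max pool, stride=2; [3×3, 64; 3×3, 64] |
    | Conv3_x | 28×28 | [3×3, 128; 3×3, 64] |
    | Conv4_x | 14×14 | [3×3, 256; 3×3, 64] |
    | Conv5_x | 7×7 | [3×3, 512; 3×3, 64] |
    |  | 1×1 | average pool, 45-d fc, softmax |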
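The output sizes in Table 1 follow from halving the spatial resolution at each strided stage. The sketch below traces them under two assumptions that the table does not state explicitly: a 224×224 input image, and a stride of 2 in Conv3_x through Conv5_x (inferred from the listed output sizes). The stage list and function name are illustrative.

```python
def trace_output_sizes(input_size=224):
    """Trace the spatial side length after each stage of the student network.

    Strides for Conv1 and the Conv2_x max pool come from Table 1; the strides
    for Conv3_x to Conv5_x are assumptions inferred from the output sizes.
    """
    stages = [
        ("Conv1", 2),    # 7x7 convolution, stride 2
        ("DA", 1),       # attention module keeps the resolution
        ("Conv2_x", 2),  # 3x3 max pool, stride 2
        ("Conv3_x", 2),  # assumed stride 2
        ("Conv4_x", 2),  # assumed stride 2
        ("Conv5_x", 2),  # assumed stride 2
    ]
    size = input_size
    sizes = {}
    for name, stride in stages:
        size //= stride
        sizes[name] = size
    sizes["AvgPool"] = 1  # global average pooling collapses the map to 1x1
    return sizes


sizes = trace_output_sizes()
for layer, side in sizes.items():
    print(f"{layer}: {side}x{side}")
```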
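    Algorithm 1  Training and testing of the double knowledge distillation (DKD) student network
     Input: training images $D = \{(\text{IMG}_n, y_n) : n = 1, 2, \cdots, N\}$, network hyperparameters
        (Epoches, BS, and lr), and test images
        $\text{Tst} = \{(\text{IMG}_m, y_m) : m = 1, 2, \cdots, M\}$
     Output: student network parameters $\varOmega_{\text{S}}$ and the classification accuracy on the test images
     Preparation: group the training images in D into triplets and train the teacher network
        $\varOmega^{\text{TE}}$ with the Siamese framework shown in Figure 3;
     For epoch in Epoches:
       (1) Split the training images in D into batches of size BS;
       (2) Feed each batch into the teacher network $\varOmega^{\text{TE}}$ to obtain the high-level semantic features
         $\text{Tb} = \{t_i \mid i = 1, 2, \cdots, \text{BS}\}$;
       (3) Feed each batch into the student network $\varOmega_{\text{S}}$ to obtain the high-level semantic features
         $\text{Sb} = \{s_i \mid i = 1, 2, \cdots, \text{BS}\}$ and the predicted labels $\{\tilde y_i\}_{i=1}^{\text{BS}}$;
       (4) Compute $L_{\text{HTL}}$ with Eq. (15); the optimizer updates the student network
         parameters $\varOmega_{\text{S}}$ by backpropagation;
       (5) Update the learning rate lr with a cosine decay schedule.
     End for
       (6) For each $\text{IMG}_m \in \text{Tst}$, feed $\text{IMG}_m$ into the student network $\varOmega_{\text{S}}$ to obtain its
         predicted class $\tilde y_m$;
       (7) Compute and output the classification accuracy from $\{(\tilde y_m, y_m) : m = 1, 2, \cdots, M\}$.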
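Steps (5) and (7) of Algorithm 1 can be sketched in a few lines. The listing only names "cosine decay", so the formula below is the standard cosine annealing rule; the base learning rate of 0.01, the minimum of 0, and the 100 epochs are hypothetical values chosen for illustration, as are the function names.

```python
import math


def cosine_decay_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Learning rate at a 0-based epoch under standard cosine decay."""
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def overall_accuracy(predicted, actual):
    """Percentage of test images whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical schedule: 100 epochs starting from lr = 0.01.
schedule = [cosine_decay_lr(0.01, epoch, 100) for epoch in range(100)]

# Toy evaluation: three of four predictions are correct.
accuracy = overall_accuracy([1, 2, 3, 4], [1, 2, 0, 4])
```

The schedule starts at the base rate, falls slowly at first, fastest around the midpoint, and flattens out again near the end of training.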

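    Table 2  OA values (%) of the ablation experiments under different training ratios

    | Method | AID (20%) | AID (50%) | NWPU-45 (10%) | NWPU-45 (20%) |
    |---|---|---|---|---|
    | Baseline | 87.52 | 89.43 | 86.27 | 88.48 |
    | +DA | 93.08 | 94.36 | 91.68 | 93.65 |
    | +SS | 93.92 | 94.63 | 92.91 | 94.12 |
    | +DKD | 95.21 | 97.04 | 93.88 | 95.87 |
    | Teacher | 95.93 | 97.63 | 94.47 | 96.52 |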
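The improvements of 7.69% and 7.39% quoted in the abstract correspond to the difference between the +DKD and baseline rows of Table 2 at the 20% training ratio. The short check below recomputes those differences from the table values; the dictionary keys are labels chosen here for illustration.

```python
# OA values (%) copied from Table 2.
baseline = {"AID-20": 87.52, "AID-50": 89.43, "NWPU45-10": 86.27, "NWPU45-20": 88.48}
dkd = {"AID-20": 95.21, "AID-50": 97.04, "NWPU45-10": 93.88, "NWPU45-20": 95.87}

# Gain of the distilled student over the undistilled baseline, in OA points.
gains = {setting: round(dkd[setting] - baseline[setting], 2) for setting in baseline}

for setting, gain in gains.items():
    print(f"{setting}: +{gain:.2f}")
```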

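    Table 3  Performance comparison of the teacher and student networks (AID dataset, 50% training ratio)

    | Model | Parameters (MB) | AvgTime (ms) | Accuracy (%) |
    |---|---|---|---|
    | Teacher | 42.56 | 14.8 | 97.63 |
    | Student (DKD) | 4.92 | 4.37 | 97.04 |
    | ResNet50[9] | 25.56 | 8.53 | 95.49 |
    | VGG-16[17] | 138.36 | 16.43 | 92.63 |
    | SCViT[18] | 85.61 | 20.1 | 89.23 |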
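The trade-off in Table 3 is easier to read as ratios. The sketch below derives the parameter reduction, the speed-up, and the accuracy gap between teacher and student from the table values; the variable names are illustrative, and the figures are simple arithmetic on the reported numbers rather than additional measurements.

```python
# Values copied from the Teacher and Student (DKD) rows of Table 3.
teacher = {"params_mb": 42.56, "avg_time_ms": 14.8, "accuracy": 97.63}
student = {"params_mb": 4.92, "avg_time_ms": 4.37, "accuracy": 97.04}

# How many times smaller and faster the student is than the teacher.
compression_ratio = teacher["params_mb"] / student["params_mb"]
speedup = teacher["avg_time_ms"] / student["avg_time_ms"]

# Accuracy given up in exchange, in percentage points.
accuracy_gap = teacher["accuracy"] - student["accuracy"]

print(f"parameters: {compression_ratio:.2f}x smaller")
print(f"inference:  {speedup:.2f}x faster")
print(f"accuracy:   {accuracy_gap:.2f} points lower")
```

On these numbers the student is roughly 8.65 times smaller and 3.39 times faster than the teacher, at a cost of 0.59 accuracy points.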

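    Table 4  Comprehensive comparison results (%) on the AID and NWPU-45 datasets

    | Method | AID (20%) | AID (50%) | NWPU-45 (10%) | NWPU-45 (20%) |
    |---|---|---|---|---|
    | VGG16+MSCP[22] | 91.52 | 94.42 | 88.32 | 91.56 |
    | ARCNet-VGG[19] | 88.75 | 93.10 | 85.60 | 90.87 |
    | CNN-CapsNet[23] | 93.79 | 96.32 | 89.03 | 89.03 |
    | SCCov[24] | 93.12 | 96.10 | 89.30 | 92.10 |
    | GBNet[25] | 92.20 | 95.48 | 90.03 | 92.35 |
    | MF2Net[26] | 93.82 | 95.93 | 90.17 | 92.73 |
    | MobileNet[20] | 88.53 | 90.91 | 80.32 | 83.26 |
    | ViT-B-16[21] | 93.81 | 95.90 | 90.96 | 93.36 |
    | XU et al.[27] | 94.17 | 96.19 | 90.23 | 93.25 |
    | DKD (this paper) | 95.21 | 97.04 | 93.88 | 95.87 |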
  • [1] MA Shaopeng, LIANG Lu, and TENG Shaohua. A lightweight hyperspectral remote sensing image classification method[J]. Journal of Guangdong University of Technology, 2021, 38(3): 29–35. doi: 10.12052/gdutxb.200153
    [2] PAN Deng, ZHANG Meng, and ZHANG Bo. A generic FCN-based approach for the road-network extraction from VHR remote sensing images–using OpenStreetMap as benchmarks[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 2662–2673. doi: 10.1109/JSTARS.2021.3058347
    [3] JIANG Yanan, ZHANG Xin, ZHANG Chunlei, et al. Classification of remote sensing images based on multi-scale feature fusion using local binary patterns[J]. Remote Sensing for Natural Resources, 2021, 33(3): 36–44. doi: 10.6046/zrzyyg.2020303
    [4] CHAIB S, GU Yanfeng, and YAO Hongxun. An informative feature selection method based on sparse PCA for VHR scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(2): 147–151. doi: 10.1109/LGRS.2015.2501383
    [5] LI Yanfu, FAN Xijian, YANG Xubing, et al. Remote sensing image classification framework based on self-attention convolutional neural network[J]. Journal of Beijing Forestry University, 2021, 43(10): 81–88. doi: 10.12171/j.1000-1522.20210196
    [6] XU Kejie, HUANG Hong, DENG Peifang, et al. Deep feature aggregation framework driven by graph convolutional network for scene classification in remote sensing[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(10): 5751–5765. doi: 10.1109/TNNLS.2021.3071369
    [7] CHEN Sibao, WEI Qingsong, WANG Wenzhong, et al. Remote sensing scene classification via multi-branch local attention network[J]. IEEE Transactions on Image Processing, 2021, 31: 99–109. doi: 10.1109/TIP.2021.3127851
    [8] CHEN Xi, XING Zhiqiang, and CHENG Yuyang. Introduction to model compression knowledge distillation[C]. 2021 6th International Conference on Intelligent Computing and Signal Processing, Xi'an, China, 2021: 1464–1467.
    [9] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
    [10] LUO Yana and WANG Zhongsheng. An improved ResNet algorithm based on CBAM[C]. 2021 International Conference on Computer Network, Electronic and Automation, Xi'an, China, 2021: 121–125.
    [11] KE Xiao, ZHANG Xiaoling, ZHANG Tianwen, et al. SAR ship detection based on an improved faster R-CNN using deformable convolution[C]. 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 2021: 3565–3568.
    [12] WANG Qilong, WU Banggu, ZHU Pengfei, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 11531–11539.
    [13] ZENG Weiyu, WANG Tianlei, CAO Jiuwen, et al. Clustering-guided pairwise metric triplet loss for person reidentification[J]. IEEE Internet of Things Journal, 2022, 9(16): 15150–15160. doi: 10.1109/JIOT.2022.3147950
    [14] PARK W, KIM D, LU Yan, et al. Relational knowledge distillation[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 3962–3971.
    [15] XIA Guisong, HU Jingwen, HU Fan, et al. AID: A benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965–3981. doi: 10.1109/TGRS.2017.2685945
    [16] CHENG Gong, HAN Junwei, and LU Xiaoqiang. Remote sensing image scene classification: Benchmark and State of the Art[J]. Proceedings of the IEEE, 2017, 105(10): 1865–1883. doi: 10.1109/JPROC.2017.2675998
    [17] TUN N L, GAVRILOV A, TUN N M, et al. Remote sensing data classification using A hybrid pre-trained VGG16 CNN-SVM classifier[C]. 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering, St. Petersburg, Russia, 2021: 2171–2175.
    [18] LV Pengyuan, WU Wenjun, ZHONG Yanfei, et al. SCViT: A spatial-channel feature preserving vision transformer for remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 4409512. doi: 10.1109/TGRS.2022.3157671
    [19] WANG Qi, LIU Shaoteng, CHANUSSOT J, et al. Scene classification with recurrent attention of VHR remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(2): 1155–1167. doi: 10.1109/TGRS.2018.2864987
    [20] PAN Haihong, PANG Zaijun, WANG Yaowei, et al. A new image recognition and classification method combining transfer learning algorithm and mobilenet model for welding defects[J]. IEEE Access, 2020, 8: 119951–119960. doi: 10.1109/ACCESS.2020.3005450
    [21] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[C/OL]. The 9th International Conference on Learning Representations, 2021.
    [22] HE Nanjun, FANG Leyuan, LI Shutao, et al. Remote sensing scene classification using multilayer stacked covariance pooling[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(12): 6899–6910. doi: 10.1109/TGRS.2018.2845668
    [23] ZHANG Wei, TANG Ping, and ZHAO Lijun. Remote sensing image scene classification using CNN-CapsNet[J]. Remote Sensing, 2019, 11(5): 494. doi: 10.3390/rs11050494
    [24] HE Nanjun, FANG Leyuan, LI Shutao, et al. Skip-connected covariance network for remote sensing scene classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(5): 1461–1474. doi: 10.1109/TNNLS.2019.2920374
    [25] SUN Hao, LI Siyuan, ZHENG Xiangtao, et al. Remote sensing scene classification by gated bidirectional network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(1): 82–96. doi: 10.1109/TGRS.2019.2931801
    [26] XU Kejie, HUANG Hong, LI Yuan, et al. Multilayer feature fusion network for scene classification in remote sensing[J]. IEEE Geoscience and Remote Sensing Letters, 2020, 17(11): 1894–1898. doi: 10.1109/LGRS.2019.2960026
    [27] XU Chengjun, ZHU Guobin, and SHU Jingqian. A lightweight intrinsic mean for remote sensing classification with lie group kernel function[J]. IEEE Geoscience and Remote Sensing Letters, 2020, 18(10): 1741–1745. doi: 10.1109/LGRS.2020.3007775
    [28] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]. The 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 618–626.
Publication history
  • Received:  2022-08-03
  • Revised:  2023-01-15
  • Published online:  2023-02-22
  • Issue date:  2023-10-31
