Construction and Scene Classification Research of Entropy-Driven Adaptive Fusion Networks for High-Resolution Remote Sensing Images

SONG Wanying, LIU Yuchen, WANG Jie, WANG Anyi

Citation: SONG Wanying, LIU Yuchen, WANG Jie, WANG Anyi. Construction and Scene Classification Research of Entropy-Driven Adaptive Fusion Networks for High-Resolution Remote Sensing Images[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251147

doi: 10.11999/JEIT251147  cstr: 32379.14.JEIT251147
Funds: National Natural Science Foundation of China (61901358); Natural Science Basic Research Plan in Shaanxi Province of China (2025JC-YBMS-701); Outstanding Youth Science and Technology Fund of Xi'an University of Science and Technology (2020YQ3-09); China Postdoctoral Science Foundation (2020M673347)

Article information
    About the authors:

    SONG Wanying: female; Ph.D.; research interests include remote sensing image analysis and intelligent interpretation, feature mining and fusion, and classification and recognition

    LIU Yuchen: male; M.S. candidate; research interest is scene classification of high-resolution remote sensing images

    WANG Jie: male; M.S. candidate; research interests include knowledge distillation and deep learning

    WANG Anyi: male; professor; research interests include broadband digital mobile communication, intelligent information processing, and intelligent coal mining

    Corresponding author:

    SONG Wanying, wanyingsong@hotmail.com

  • CLC number: TP751.2

  • Abstract: Scene classification of high-resolution remote sensing images faces significant challenges from complex backgrounds, diverse imaging conditions, and large intra-class variation. Conventional convolutional neural network (CNN) methods are limited in modeling global context, and the Swin Transformer still falls short in cross-window feature interaction, fine-grained local feature extraction, and adaptive fusion of multi-level features. To address these problems, this paper proposes an entropy-driven adaptive fusion network for high-resolution remote sensing scene classification. The main contributions are as follows. (1) An attention-guided region selection and feature optimization (ASO) module is designed: cross-window sparse attention strengthens global modeling and selects key regions, and recursive refinement sharpens local feature representations, improving both cross-window interaction and the discriminability of fine-grained local features. (2) An entropy-driven gated fusion (EGF) module is constructed, which uses an entropy-guided gating mechanism to adaptively fuse the Swin features, the global context, and the refined local features, overcoming the redundancy that naive fusion of multi-level features tends to introduce. (3) Experiments on the public AID and NWPU-RESISC45 datasets show that the proposed method outperforms a range of existing state-of-the-art methods in classification accuracy and exhibits good robustness and generalization.
  • Figure 1. Overall architecture of the entropy-driven adaptive fusion network

    Figure 2. Structure of the core modules

    Figure 3. Confusion matrix of E-AF-ST scene classification results on the AID dataset

    Figure 4. Confusion matrix of E-AF-ST scene classification results on the NWPU-RESISC45 dataset

    Figure 5. Grad-CAM visualization comparison

    Algorithm 1. Forward propagation of the E-AF-ST network

     Input: high-resolution remote sensing image $I \in \mathbb{R}^{H \times W \times C}$
     Parameters: ASO recursion count $T$; EGF recursion count $K$
     Output: scene classification probability distribution $P$
     1: // Stage 1: multi-level feature extraction
     2: $F_{\mathrm{swin}} \leftarrow \mathrm{SwinTiny}(I)$
     3: // Stage 2: region selection and feature optimization (ASO)
     4: $F_{\mathrm{global}} \leftarrow \mathrm{SparseAttention}(F_{\mathrm{swin}})$
     5: $\alpha \leftarrow \mathrm{Sigmoid}(\mathrm{MLP}(\mathrm{Entropy}(F_{\mathrm{global}})))$
     6: $M \leftarrow \mathrm{TopK}(\alpha)$
     7: $F_{\mathrm{sel}} \leftarrow F_{\mathrm{global}} \odot M$
     8: $S_{0} \leftarrow F_{\mathrm{sel}}$
     9: For $t = 1$ to $T$ do
     10:   $S_{t} \leftarrow \mathrm{LayerNorm}(S_{t-1}) + \mathrm{MHSA}(S_{t-1})$
     11: End For
     12: $F_{\mathrm{local}} \leftarrow \mathrm{Reshape}(S_{T})$
     13: // Stage 3: entropy-driven gated fusion (EGF)
     14: $P_{\mathrm{swin}}, P_{\mathrm{global}}, P_{\mathrm{local}} \leftarrow \mathrm{EnergyNorm}(F_{\mathrm{swin}}, F_{\mathrm{global}}, F_{\mathrm{local}})$
     15: $H_{s}, H_{g}, H_{l} \leftarrow \mathrm{CalcEntropy}(P_{\mathrm{swin}}, P_{\mathrm{global}}, P_{\mathrm{local}})$
     16: $w \leftarrow \mathrm{Softmax}(\alpha \cdot [1-H_{s}, 1-H_{g}, 1-H_{l}])$
     17: $F_{\mathrm{mix}} \leftarrow w_{1} F_{\mathrm{swin}} + w_{2} F_{\mathrm{global}} + w_{3} F_{\mathrm{local}}$
     18: $F_{\mathrm{ref}}^{0} \leftarrow F_{\mathrm{mix}}$
     19: For $k = 1$ to $K$ do
     20:   $F_{\mathrm{ref}}^{k} \leftarrow \mathrm{LayerNorm}(\mathrm{Conv}(\mathrm{GELU}(F_{\mathrm{ref}}^{k-1}))) + F_{\mathrm{ref}}^{k-1}$
     21: End For
     22: $F_{\mathrm{fused}} \leftarrow F_{\mathrm{ref}}^{K} + F_{\mathrm{mix}}$
     23: // Stage 4: classification output
     24: $y \leftarrow \mathrm{GlobalAvgPool}(F_{\mathrm{fused}})$
     25: $P \leftarrow \mathrm{Softmax}(\mathrm{Classifier}(y))$
     26: Return $P$
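    To make the gating in steps 15–16 explicit, one plausible reading (an assumption on our part; the paper's EnergyNorm and CalcEntropy definitions may differ in detail) is Shannon entropy over the energy-normalized channel distribution $p_{j}$ of each branch, scaled by $\log C$ so that $H_{j} \in [0, 1]$, with the score $\alpha$ from step 5 acting as a learned temperature:

    $H_{j} = -\dfrac{1}{\log C}\displaystyle\sum_{i=1}^{C} p_{j,i}\log p_{j,i}, \quad j \in \{s, g, l\}, \qquad w = \mathrm{Softmax}\big(\alpha \cdot [\,1-H_{s},\ 1-H_{g},\ 1-H_{l}\,]\big)$

    Under this reading, a branch whose normalized response is more concentrated (lower entropy) contributes more to $F_{\mathrm{mix}}$.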
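    A minimal PyTorch sketch of how steps 4–8 and 14–22 could be realized is given below. All names here (calc_entropy, RegionSelect, EntropyGatedFusion, and the keep_ratio and alpha hyperparameters) are illustrative assumptions rather than the authors' released code; the Swin-Tiny backbone and the cross-window sparse attention are stubbed out as plain tensors, and GroupNorm(1, C) stands in for LayerNorm over channel maps.

```python
# Illustrative sketch of ASO region selection (steps 4-8) and EGF
# entropy-gated fusion (steps 14-22) from Algorithm 1. Not official code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def calc_entropy(feat: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the energy-normalized channel distribution,
    scaled into [0, 1] by log(C). feat: (B, C, H, W) -> (B,)."""
    c = feat.shape[1]
    energy = feat.pow(2).flatten(2).mean(-1)          # (B, C) per-channel energy
    p = F.softmax(energy, dim=1)                      # "EnergyNorm": a distribution over channels
    h = -(p * (p + 1e-12).log()).sum(dim=1)           # Shannon entropy
    return h / torch.log(torch.tensor(float(c)))      # normalized entropy


class RegionSelect(nn.Module):
    """Steps 5-7: score tokens from a per-token entropy cue, keep the top-k."""
    def __init__(self, channels: int, keep_ratio: float = 0.5):
        super().__init__()
        self.keep_ratio = keep_ratio
        self.mlp = nn.Sequential(nn.Linear(1, channels // 4), nn.GELU(),
                                 nn.Linear(channels // 4, 1))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C); per-token entropy over channels as the saliency cue
        p = F.softmax(tokens, dim=-1)
        h = -(p * (p + 1e-12).log()).sum(-1, keepdim=True)    # (B, N, 1)
        alpha = torch.sigmoid(self.mlp(h)).squeeze(-1)        # (B, N) scores
        k = max(1, int(self.keep_ratio * tokens.shape[1]))
        idx = alpha.topk(k, dim=1).indices                    # M <- TopK(alpha)
        mask = torch.zeros_like(alpha).scatter_(1, idx, 1.0)  # binary mask
        return tokens * mask.unsqueeze(-1)                    # F_sel = F_global ⊙ M


class EntropyGatedFusion(nn.Module):
    """Steps 14-22: entropy-gated mixing of three same-shape feature maps,
    then K recursive Conv-GELU-Norm refinement steps with residuals."""
    def __init__(self, channels: int, k_steps: int = 2, alpha: float = 5.0):
        super().__init__()
        self.k_steps, self.alpha = k_steps, alpha
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm = nn.GroupNorm(1, channels)  # LayerNorm-like over channel maps

    def forward(self, f_swin, f_global, f_local):
        # Lower-entropy (more concentrated) branches get larger weights.
        h = torch.stack([calc_entropy(f) for f in (f_swin, f_global, f_local)], 1)
        w = F.softmax(self.alpha * (1.0 - h), dim=1).view(-1, 3, 1, 1, 1)
        f_mix = (w * torch.stack([f_swin, f_global, f_local], 1)).sum(1)
        f_ref = f_mix
        for _ in range(self.k_steps):                  # recursive refinement
            f_ref = self.norm(self.conv(F.gelu(f_ref))) + f_ref
        return f_ref + f_mix                           # F_fused


if __name__ == "__main__":
    b, c, s = 2, 768, 7
    sel = RegionSelect(c)(torch.randn(b, s * s, c))        # token selection
    feats = [torch.randn(b, c, s, s) for _ in range(3)]    # stand-in branches
    fused = EntropyGatedFusion(c)(*feats)
    logits = nn.Linear(c, 30)(fused.mean(dim=(2, 3)))      # GAP + classifier head
    print(sel.shape, logits.shape)                         # (2, 49, 768) (2, 30)
```

    The design point the sketch illustrates is that the fusion weights are data-dependent yet add no per-pixel parameters: each branch is summarized by a single scalar entropy, so the gate stays cheap relative to the backbone.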

    Table 1. Overall accuracy (OA, %) of the ablation study under different training ratios

    Method     AID (20% train)   AID (50% train)   NWPU-RESISC45 (10% train)   NWPU-RESISC45 (20% train)
    Baseline   94.56             96.92             90.84                       93.18
    +ASO       95.12             97.05             91.52                       93.97
    +EGF       94.78             96.89             91.20                       93.65
    E-AF-ST    95.56             97.21             92.45                       94.59

    Table 2. Classification accuracy (%) of different methods on the AID and NWPU-RESISC45 datasets

    Method              AID (20% train)   AID (50% train)   NWPU (10% train)   NWPU (20% train)   Params (M)   FLOPs (G)
    Swin-Tiny[14]       94.56±0.14        96.92±0.12        90.84±0.09         93.18±0.15         29           4.5
    ResNet101+EAM[20]   94.26±0.11        97.06±0.19        91.91±0.22         94.29±0.09         -            -
    MGS-Net[21]         95.46±0.21        97.18±0.16        92.40±0.16         94.57±0.12         -            -
    SAGN[22]            95.17±0.12        96.77±0.18        91.73±0.18         93.49±0.10         -            -
    CSCA-Net[6]         94.67±0.20        96.83±0.14        91.27±0.11         93.72±0.10         -            -
    MBAF-Net[7]         93.98±0.15        96.93±0.16        91.61±0.14         94.01±0.08         24.48        4.51
    EMTCAL[23]          94.69±0.14        96.41±0.23        91.63±0.19         93.65±0.12         -            -
    AC-Net[24]          93.33±0.29        95.38±0.29        91.09±0.13         92.42±0.16         -            -
    E-AF-ST (ours)      95.56±0.19        97.21±0.16        92.45±0.15         94.59±0.11         30.45        4.72
  • [1] LI Daxiang, NAN Yixuan, and LIU Ying. A double knowledge distillation model for remote sensing image scene classification[J]. Journal of Electronics & Information Technology, 2023, 45(10): 3558–3567. doi: 10.11999/JEIT221017.
    [2] WU Qianqian, NI Kang, and ZHENG Zhizhong. Remote sensing image scene classification on the basis of a two-stage high-order Transformer[J]. National Remote Sensing Bulletin, 2025, 29(3): 792–807. doi: 10.11834/jrs.20233332.
    [3] CHEN Jianlai, XIONG Rongqi, YU Hanwen, et al. Microwave photonic synthetic aperture radar: Systems, experiments, and imaging processing[J]. IEEE Geoscience and Remote Sensing Magazine, 2025, 13(2): 314–328. doi: 10.1109/MGRS.2024.3444777.
    [4] YIN Wenxin, YU Haichen, DIAO Wenhui, et al. Parameter efficient fine-tuning of vision transformers for remote sensing scene understanding[J]. Journal of Electronics & Information Technology, 2024, 46(9): 3731–3738. doi: 10.11999/JEIT240218.
    [5] CHENG Gong, HAN Junwei, and LU Xiaoqiang. Remote sensing image scene classification: Benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10): 1865–1883. doi: 10.1109/JPROC.2017.2675998.
    [6] HOU Yan’e, YANG Kang, DANG Lanxue, et al. Contextual spatial-channel attention network for remote sensing scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2023, 20: 6008805. doi: 10.1109/LGRS.2023.3304645.
    [7] SHI Jiacheng, LIU Wei, SHAN Haoyu, et al. Remote sensing scene classification based on multibranch fusion attention network[J]. IEEE Geoscience and Remote Sensing Letters, 2023, 20: 3001505. doi: 10.1109/LGRS.2023.3262407.
    [8] PAN Wenwen, SUN Xiaofei, WANG Yilun, et al. Enhanced photovoltaic panel defect detection via adaptive complementary fusion in YOLO-ACF[J]. Scientific Reports, 2024, 14(1): 26425. doi: 10.1038/s41598-024-75772-9.
    [9] XU Congan, LÜ Yafei, ZHANG Xiaohan, et al. A discriminative feature representation method based on dual attention mechanism for remote sensing image scene classification[J]. Journal of Electronics & Information Technology, 2021, 43(3): 683–691. doi: 10.11999/JEIT200568.
    [10] SONG Jiayin, FAN Yiming, SONG Wenlong, et al. SwinHCST: A deep learning network architecture for scene classification of remote sensing images based on improved CNN and transformer[J]. International Journal of Remote Sensing, 2023, 44(23): 7439–7463. doi: 10.1080/01431161.2023.2285739.
    [11] HUANG Xinyan, LIU Fang, CUI Yuanhao, et al. Faster and better: A lightweight transformer network for remote sensing scene classification[J]. Remote Sensing, 2023, 15(14): 3645. doi: 10.3390/rs15143645.
    [12] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. Proceedings of the 9th International Conference on Learning Representations (ICLR), Virtual Event, 2021: 1–21.
    [13] LIU Ze, LIN Yutong, CAO Yue, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]. Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021: 10012–10022. doi: 10.1109/ICCV48922.2021.00986.
    [14] JANNAT F E and WILLIS A R. Improving classification of remotely sensed images with the Swin transformer[C]. SoutheastCon 2022, Mobile, USA, 2022: 611–618. doi: 10.1109/SoutheastCon48659.2022.9764016.
    [15] CHANG Jing, HE Xiaohui, SONG Dingjun, et al. A multi-scale attention network for building extraction from high-resolution remote sensing images[J]. Scientific Reports, 2025, 15(1): 24938. doi: 10.1038/s41598-025-09086-9.
    [16] YE Zhipin, LIU Yingqian, JING Teng, et al. A high-resolution network with strip attention for retinal vessel segmentation[J]. Sensors, 2023, 23(21): 8899. doi: 10.3390/s23218899.
    [17] YU Shihai, ZHANG Xu, and SONG Huihui. Sparse mix-attention transformer for multispectral image and hyperspectral image fusion[J]. Remote Sensing, 2024, 16(1): 144. doi: 10.3390/rs16010144.
    [18] XIA Guisong, HU Jingwen, HU Fan, et al. AID: A benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965–3981. doi: 10.1109/TGRS.2017.2685945.
    [19] CHEN Jianlai, XIONG Rongqi, JIANG Nan, et al. High phase-preserving autofocus imaging for squinted airborne synthetic aperture radar[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 5215315. doi: 10.1109/TGRS.2025.3587539.
    [20] ZHAO Zhicheng, LI Jiaqi, LUO Ze, et al. Remote sensing image scene classification based on an enhanced attention module[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(11): 1926–1930. doi: 10.1109/LGRS.2020.3011405.
    [21] WANG Junjie, LI Wei, ZHANG Mengmeng, et al. Remote-sensing scene classification via multistage self-guided separation network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5615312. doi: 10.1109/TGRS.2023.3295797.
    [22] YANG Yuqun, TANG Xu, CHEUNG Y M, et al. SAGN: Semantic-aware graph network for remote sensing scene classification[J]. IEEE Transactions on Image Processing, 2023, 32: 1011–1025. doi: 10.1109/TIP.2023.3238310.
    [23] TANG Xu, LI Mingteng, MA Jingjing, et al. EMTCAL: Efficient multiscale transformer and cross-level attention learning for remote sensing scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5626915. doi: 10.1109/TGRS.2022.3194505.
    [24] TANG Xu, MA Qinshuo, ZHANG Xiangrong, et al. Attention consistent network for remote sensing scene classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 2030–2045. doi: 10.1109/JSTARS.2021.3051569.
Publication history
  • Revised: 2026-03-03
  • Accepted: 2026-03-03
  • Available online: 2026-03-15
