Remote Sensing Land-Cover Classification Combining Multi-Modal and Multi-Scale Fusion with Mamba

XIE Wen, ZHU Chaotao, WANG Jin, MA Xiaomeng

Citation: XIE Wen, ZHU Chaotao, WANG Jin, MA Xiaomeng. Remote Sensing Land-Cover Classification Combining Multi-Modal and Multi-Scale Fusion with Mamba[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251303


doi: 10.11999/JEIT251303 cstr: 32379.14.JEIT251303
Funds: The National Natural Science Foundation of China (61901365, 62071379), the Natural Science Basic Research Plan in Shaanxi Province of China (Program No. 2025JC-YBQN-936), the Scientific Research Program Funded by the Education Department of Shaanxi Provincial Government (Program No. 25JP175), the Youth Innovation Team of Shaanxi Universities, and the New Star Team of Xi'an University of Posts and Telecommunications (xyt2016-01)
Details
    About the authors:

    XIE Wen: female, Associate Professor; research interests include remote sensing image processing, deep learning, and machine learning

    ZHU Chaotao: male, Master's student; research interests include deep learning and remote sensing image classification

    WANG Jin: female, Associate Research Fellow; research interests include satellite navigation signal processing, integrated navigation-communication positioning, edge computing, and high-precision time synchronization

    MA Xiaomeng: female, Senior Engineer; research interests include radar countermeasure system design and signal processing

    Corresponding author:

    XIE Wen, xiewen@xupt.edu.cn

  • CLC number: TN911.73; TP751.1

  • Abstract: The rapid development of remote sensing imaging technology has brought massive and diverse data to land-cover classification, and how to exploit the complementarity of multi-modal data to improve classification performance has become a research hotspot. In recent years, the Mamba model has been applied successfully in image processing owing to its distinctive architecture and strong global modeling capability; in particular, multi-scale visual Mamba models cope well with complex spatial distributions, matching the large scale differences and varied orientations of remote sensing land covers. To fully exploit the advantages of the Mamba model in extracting and fusing remote sensing features, this paper proposes a Mamba-based multi-modal and multi-scale fusion model for remote sensing land-cover classification (M3RS). First, the model employs a multi-scale spatial encoder to extract features from Light Detection and Ranging (LiDAR) images and Synthetic Aperture Radar (SAR) images, and, in view of the distinctive data structure of HyperSpectral Images (HSI), a multi-scale spatial-spectral encoder is proposed to extract their complex spatial-spectral features. Next, a multi-modal feature fusion module combining a cross-Mamba and a channel-concatenation Mamba is proposed: the cross-Mamba fuses multi-modal spatial features efficiently by exchanging state-space parameters, while the channel-concatenation Mamba fuses multi-modal features thoroughly by constructing four channel scanning schemes. Finally, the model adopts an improved multi-scale feature fusion module to fuse multi-scale features layer by layer and to extract highly discriminative evidence for classification, effectively improving the accuracy of remote sensing land-cover classification. Classification experiments on the Muufl, Houston2013, and Augsburg datasets verify the effectiveness of the proposed M3RS model.
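As background to the abstract, the selective state-space scan that Mamba-style blocks build on (reference [26]) can be sketched in a few lines. The NumPy snippet below is an illustrative simplification of the discretized recurrence with input-dependent step sizes and projections, not the M3RS implementation; the function and array names are assumptions for this sketch.

```python
import numpy as np

def selective_scan(x, dt, A, B, C):
    """Sequential selective SSM scan over a length-L sequence.

    x  : (L, D) input sequence with D channels
    dt : (L, D) input-dependent step sizes (the "selective" part)
    A  : (D, N) diagonal state matrix per channel, N hidden states
    B  : (L, N) input-dependent input projection
    C  : (L, N) input-dependent output projection
    """
    L, D = x.shape
    h = np.zeros((D, A.shape[1]))                  # hidden state per channel
    y = np.empty((L, D))
    for t in range(L):
        A_bar = np.exp(dt[t][:, None] * A)         # zero-order-hold: exp(dt*A)
        B_bar = dt[t][:, None] * B[t][None, :]     # Euler step for the input term
        h = A_bar * h + B_bar * x[t][:, None]      # linear recurrence in t
        y[t] = (h * C[t][None, :]).sum(axis=1)     # per-channel readout
    return y

# Toy run: with A = 0 the state simply accumulates the inputs,
# so each output grows linearly with t.
L, D, N = 3, 1, 2
y = selective_scan(np.ones((L, D)), np.ones((L, D)),
                   np.zeros((D, N)), np.ones((L, N)), np.ones((L, N)))
# y = [[2.], [4.], [6.]]
```

In practice Mamba evaluates this recurrence with a hardware-efficient parallel scan and learns dt, B, and C from the input; the loop above only exposes the recurrence itself.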
  • Figure 1  The Mamba model

    Figure 2  Flowchart of M3RS

    Figure 3  Spatial Mamba

    Figure 4  Spectral Mamba

    Figure 5  Cross SS2D

    Figure 6  Channel-concatenation SS1D

    Figure 7  Multi-scale feature fusion module

    Figure 8  Visualization of classification results on the Muufl dataset

    Figure 9  Comparison of Mamba and Transformer architectures on the Houston2013 dataset

    Table 1  Comparison of classification results on the Muufl dataset (%)

| Class (train/test) | CCRNet | MFT | ExVit | HCT | M2FNet | Cross-HL | MSFMamba | M3RS |
|---|---|---|---|---|---|---|---|---|
| 1: Trees (150/23096) | 89.97 | 87.90 | 89.59 | 92.20 | 90.92 | 91.39 | 88.51 | 92.02 |
| 2: Grass (150/4120) | 79.30 | 75.27 | 79.42 | 76.07 | 72.84 | 85.70 | 82.14 | 87.26 |
| 3: Mixed ground surface (150/6732) | 80.50 | 76.00 | 79.63 | 84.51 | 82.13 | 84.21 | 80.21 | 85.01 |
| 4: Dirt and sand (150/1676) | 94.09 | 94.45 | 95.29 | 96.78 | 96.30 | 96.96 | 94.57 | 97.85 |
| 5: Road (150/6537) | 87.33 | 78.75 | 76.89 | 89.02 | 85.07 | 90.12 | 87.30 | 93.21 |
| 6: Water (150/316) | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| 7: Building shadow (150/2083) | 87.47 | 91.55 | 89.34 | 88.57 | 92.89 | 91.21 | 90.64 | 92.17 |
| 8: Buildings (150/6090) | 96.31 | 95.14 | 93.35 | 97.11 | 94.98 | 95.53 | 95.50 | 97.34 |
| 9: Sidewalk (150/1235) | 77.17 | 60.32 | 66.64 | 76.44 | 73.28 | 82.02 | 72.06 | 85.26 |
| 10: Yellow curb (150/33) | 96.97 | 93.94 | 100.00 | 90.91 | 100.00 | 100.00 | 93.94 | 96.97 |
| 11: Cloth panels (150/119) | 97.48 | 99.16 | 99.16 | 99.16 | 97.48 | 99.16 | 99.16 | 99.16 |
| OA | 88.12 | 84.86 | 86.06 | 89.79 | 88.00 | 90.36 | 87.59 | 91.61 |
| AA | 89.69 | 86.59 | 88.12 | 90.07 | 89.63 | 92.39 | 89.46 | 93.30 |
| Kappa | 84.47 | 80.34 | 81.82 | 86.57 | 84.27 | 87.33 | 83.81 | 88.96 |

    Table 2  Comparison of classification results on the Houston2013 dataset (%)

| Class (train/test) | CCRNet | MFT | ExVit | HCT | M2FNet | Cross-HL | MSFMamba | M3RS |
|---|---|---|---|---|---|---|---|---|
| 1: Healthy grass (198/1053) | 72.74 | 76.54 | 79.39 | 75.50 | 82.28 | 76.54 | 80.06 | 76.92 |
| 2: Stressed grass (190/1064) | 83.93 | 93.33 | 77.91 | 95.96 | 87.41 | 85.15 | 98.59 | 96.15 |
| 3: Synthetic grass (192/505) | 91.88 | 98.02 | 97.82 | 98.22 | 95.25 | 97.82 | 96.63 | 88.32 |
| 4: Trees (188/1056) | 89.30 | 89.96 | 87.97 | 81.72 | 92.90 | 88.92 | 93.28 | 89.68 |
| 5: Soil (186/1056) | 100.00 | 99.91 | 99.62 | 100.00 | 98.01 | 100.00 | 100.00 | 100.00 |
| 6: Water (182/143) | 95.80 | 95.80 | 95.80 | 95.80 | 95.80 | 95.80 | 100.00 | 100.00 |
| 7: Residential (196/1072) | 72.48 | 82.65 | 87.03 | 71.83 | 78.92 | 76.77 | 85.63 | 73.13 |
| 8: Commercial (191/1053) | 84.43 | 77.40 | 96.68 | 93.54 | 92.21 | 74.55 | 93.45 | 94.40 |
| 9: Road (193/1059) | 84.32 | 88.20 | 79.89 | 89.52 | 81.49 | 77.43 | 78.00 | 83.95 |
| 10: Highway (193/1059) | 63.71 | 70.85 | 65.54 | 66.51 | 78.09 | 68.53 | 59.65 | 96.81 |
| 11: Railway (181/1054) | 99.15 | 93.74 | 95.64 | 96.39 | 97.91 | 96.11 | 83.11 | 96.02 |
| 12: Parking lot 1 (192/1041) | 97.50 | 98.17 | 98.56 | 98.75 | 94.43 | 100.00 | 95.00 | 99.52 |
| 13: Parking lot 2 (184/285) | 70.88 | 76.14 | 76.84 | 84.56 | 82.81 | 72.28 | 82.46 | 80.70 |
| 14: Tennis court (181/247) | 100.00 | 100.00 | 100.00 | 100.00 | 99.19 | 100.00 | 100.00 | 100.00 |
| 15: Running track (187/473) | 99.79 | 99.37 | 99.15 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| OA | 85.75 | 88.13 | 87.91 | 88.26 | 89.28 | 85.73 | 87.97 | 90.95 |
| AA | 87.06 | 89.34 | 89.19 | 89.89 | 90.46 | 87.33 | 89.72 | 91.71 |
| Kappa | 84.52 | 87.10 | 86.87 | 87.25 | 88.36 | 84.49 | 86.94 | 90.17 |

    Table 3  Comparison of classification results on the Augsburg dataset (%)

| Class (train/test) | CCRNet | MFT | ExVit | HCT | M2FNet | Cross-HL | MSFMamba | M3RS |
|---|---|---|---|---|---|---|---|---|
| 1: Forest (146/13361) | 93.51 | 88.90 | 93.65 | 96.16 | 94.78 | 92.44 | 94.93 | 95.39 |
| 2: Residential area (264/30065) | 99.04 | 97.43 | 95.92 | 99.24 | 97.38 | 97.95 | 99.49 | 99.01 |
| 3: Industrial area (21/3830) | 66.11 | 33.99 | 40.26 | 38.22 | 21.57 | 61.85 | 3.66 | 69.43 |
| 4: Low plants (248/26609) | 92.37 | 87.50 | 91.68 | 93.21 | 91.09 | 89.97 | 94.51 | 92.82 |
| 5: Allotment (52/523) | 61.95 | 51.05 | 48.95 | 64.63 | 37.48 | 53.73 | 30.59 | 58.51 |
| 6: Commercial area (7/1638) | 9.52 | 12.76 | 14.22 | 5.19 | 1.89 | 3.72 | 0.18 | 7.63 |
| 7: Water (23/1507) | 48.97 | 37.62 | 17.39 | 47.91 | 11.75 | 51.09 | 13.01 | 48.64 |
| OA | 91.05 | 86.15 | 87.75 | 90.41 | 86.94 | 89.28 | 88.02 | 91.62 |
| AA | 67.35 | 58.47 | 57.44 | 63.51 | 50.85 | 64.39 | 48.05 | 67.35 |
| Kappa | 87.17 | 80.05 | 82.26 | 86.04 | 80.75 | 84.68 | 82.04 | 87.94 |

    Table 4  Ablation experiments on the Houston2013 dataset (%)

| Module | OA | AA | Kappa |
|---|---|---|---|
| Spatial Mamba only | 86.87 | 89.00 | 85.73 |
| + Spectral Mamba | 89.07 | 90.75 | 88.13 |
| + Cross-Mamba | 89.28 | 90.68 | 88.36 |
| + Channel-concatenation Mamba | 90.14 | 91.47 | 89.29 |
| + Multi-scale feature fusion module | 90.95 | 91.71 | 90.17 |

    Table 5  Hyperparameter experiments on Spatial Mamba depth and feature dimensions on the Houston2013 dataset (%)

| VSSBlock layers | Feature dimensions | OA | AA | Kappa |
|---|---|---|---|---|
| (2,2,9) | (64,128,256) | 90.95 | 91.71 | 90.17 |
| (2,2,9,2) | (64,128,256,512) | 89.52 | 91.00 | 88.62 |
| (2,2,27) | (64,128,256) | 89.88 | 91.35 | 89.01 |
| (2,2,9) | (96,192,384) | 87.29 | 88.85 | 86.20 |
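All five tables report Overall Accuracy (OA), Average Accuracy (AA), and the kappa coefficient. For reference, the standard definitions of these three metrics, computed from a confusion matrix, can be sketched as follows (this is the usual formulation, not code from the paper):

```python
import numpy as np

def classification_metrics(cm):
    """OA, AA, and Cohen's kappa (all in %) from a confusion matrix.

    cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    oa = np.trace(cm) / total                    # fraction classified correctly
    aa = (np.diag(cm) / cm.sum(axis=1)).mean()   # mean of per-class accuracies
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2  # chance agreement
    kappa = (oa - pe) / (1.0 - pe)               # agreement beyond chance
    return 100 * oa, 100 * aa, 100 * kappa

# Toy 2-class example.
oa, aa, kappa = classification_metrics([[50, 10], [10, 30]])
# oa = 80.00, aa ≈ 79.17, kappa ≈ 58.33
```

AA averages per-class recall, which is why it drops sharply on Augsburg (Table 3), where small classes such as Commercial area are frequently misclassified even when OA stays high.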
  • [1] LI Shutao, LI Congyu, and KANG Xudong. Development status and future prospects of multi-source remote sensing image fusion[J]. National Remote Sensing Bulletin, 2021, 25(1): 148–166. doi: 10.11834/jrs.20210259.
    [2] HANG Renlong, LI Zhu, GHAMISI P, et al. Classification of hyperspectral and LiDAR data using coupled CNNs[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 58(7): 4939–4950. doi: 10.1109/TGRS.2020.2969024.
    [3] REN Bo, HUA Chaoyue, HOU Biao, et al. PDCNet: A Polarimetric data-enhanced contrastive learning network for PolSAR land cover classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025, 18: 10010–10025. doi: 10.1109/JSTARS.2025.3557252.
    [4] REN Bo, WANG Zhao, GE Hanyuan, et al. Incremental land cover classification via soft label and subregion distillation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 5647322. doi: 10.1109/TGRS.2025.3615670.
    [5] LI Shutao, SONG Weiwei, FANG Leyuan, et al. Deep learning for hyperspectral image classification: An overview[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(9): 6690–6709. doi: 10.1109/TGRS.2019.2907932.
    [6] MA Xianping, ZHANG Xiaokang, and PUN M Q. RS3Mamba: Visual state space model for remote sensing image semantic segmentation[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 6011405. doi: 10.1109/LGRS.2024.3414293.
    [7] LIU Xiaomin, YU Mengjun, QIAO Zhenzhuang, et al. Scale adaptive fusion network for multimodal remote sensing data classification[J]. Journal of Electronics & Information Technology, 2024, 46(9): 3693–3702. doi: 10.11999/JEIT240178.
    [8] LIAO Diling, LAI Tao, HUANG Haifeng, et al. LightMamba: A lightweight Mamba network for the joint classification of HSI and LiDAR data[J]. Journal of Electronics & Information Technology, 2025, 47(12): 4937–4947. doi: 10.11999/JEIT250981.
    [9] LAPARRA V, MALO J, and CAMPS-VALLS G. Dimensionality reduction via regression in hyperspectral imagery[J]. IEEE Journal of Selected Topics in Signal Processing, 2015, 9(6): 1026–1036. doi: 10.1109/JSTSP.2015.2417833.
    [10] MELGANI F and BRUZZONE L. Support vector machines for classification of hyperspectral remote-sensing images[C]. IEEE International Geoscience and Remote Sensing Symposium, Toronto, Canada, 2002: 506–508. doi: 10.1109/IGARSS.2002.1025088.
    [11] ZHOU Hao, LUO Fulin, ZHUANG Huiping, et al. Attention multihop graph and multiscale convolutional fusion network for hyperspectral image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5508614. doi: 10.1109/TGRS.2023.3265879.
    [12] ZHAO Linying and JI Shunping. CNN, RNN, or VIT? An evaluation of different deep learning architectures for spatio-temporal representation of sentinel time series[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023, 16: 44–56. doi: 10.1109/JSTARS.2022.3219816.
    [13] LU Ting, DING Kexin, FU Wei, et al. Coupled adversarial learning for fusion classification of hyperspectral and LiDAR data[J]. Information Fusion, 2023, 93: 118–131. doi: 10.1016/j.inffus.2022.12.020.
    [14] XU Xiaodong, LI Wei, RAN Qiong, et al. Multisource remote sensing data classification based on convolutional neural network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(2): 937–949. doi: 10.1109/TGRS.2017.2756851.
    [15] WANG Jinzhe, ZHANG Junping, GUO Qingle, et al. Fusion of hyperspectral and LiDAR data based on dual-branch convolutional neural network[C]. Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 2019: 3388–3391. doi: 10.1109/IGARSS.2019.8899332.
    [16] WU Xin, HONG Danfeng, and CHANUSSOT J. Convolutional neural networks for multimodal remote sensing data classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5517010. doi: 10.1109/TGRS.2021.3124913.
    [17] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 6000–6010.
    [18] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16 × 16 words: Transformers for image recognition at scale[C]. Proceedings of the 9th International Conference on Learning Representations, 2021.
    [19] XUE Zhixiang, TAN Xiong, YU Xuchu, et al. Deep hierarchical vision transformer for hyperspectral and LiDAR data classification[J]. IEEE Transactions on Image Processing, 2022, 31: 3095–3110. doi: 10.1109/TIP.2022.3162964.
    [20] ROY S K, DERIA A, HONG Danfeng, et al. Multimodal fusion transformer for remote sensing image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5515620. doi: 10.1109/TGRS.2023.3286826.
    [21] YAO Jing, ZHANG Bing, LI Chenyu, et al. Extended Vision Transformer (ExViT) for land use and land cover classification: A multimodal deep learning framework[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5514415. doi: 10.1109/TGRS.2023.3284671.
    [22] ZHAO Guangrui, YE Qiaolin, SUN Le, et al. Joint classification of hyperspectral and LiDAR data using a hierarchical CNN and transformer[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5500716. doi: 10.1109/TGRS.2022.3232498.
    [23] ROY S K, SUKUL A, JAMALI A, et al. Cross hyperspectral and LiDAR attention transformer: An extended self-attention for land use and land cover classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5512815. doi: 10.1109/TGRS.2024.3374324.
    [24] SUN Le, WANG Xinyu, ZHENG Yuhui, et al. Multiscale 3-D–2-D mixed CNN and lightweight attention-free transformer for hyperspectral and LiDAR classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 2100116. doi: 10.1109/TGRS.2024.3367374.
    [25] SMITH J T H, WARRINGTON A, and LINDERMAN S W. Simplified state space layers for sequence modeling[C]. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda, 2023: 1–13.
    [26] GU A and DAO T. Mamba: Linear-time sequence modeling with selective state spaces[EB/OL]. https://arxiv.org/abs/2312.00752, 2024.
    [27] ZHU Lianghui, LIAO Bencheng, ZHANG Qian, et al. Vision mamba: Efficient visual representation learning with bidirectional state space model[C]. Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 2024.
    [28] LIU Yue, TIAN Yunjie, ZHAO Yuzhong, et al. VMamba: Visual state space model[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2024: 3273.
    [29] CHEN Keyan, CHEN Bowen, LIU Chenyang, et al. RSMamba: Remote sensing image classification with state space model[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 8002605. doi: 10.1109/LGRS.2024.3407111.
    [30] LIAO Diling, WANG Qingsong, LAI Tao, et al. Joint classification of hyperspectral and LiDAR data based on mamba[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5530915. doi: 10.1109/TGRS.2024.3459709.
    [31] GAO Feng, JIN Xuepeng, ZHOU Xiaowei, et al. MSFMamba: Multiscale feature fusion state space model for multisource remote sensing image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 5504116. doi: 10.1109/TGRS.2025.3535622.
    [32] DIAO Wenhui, GONG Shuo, XIN Linlin, et al. A model pre-training method with self-supervised strategies for multimodal remote sensing data[J]. Journal of Electronics & Information Technology, 2025, 47(6): 1658–1668. doi: 10.11999/JEIT241016.
Figures(9) / Tables(5)
Publication history
  • Revised: 2026-04-17
  • Accepted: 2026-04-17
  • Available online: 2026-05-03
