Spatial Information-Guided Diffusion for Remote Sensing Image Domain Adaptation Semantic Segmentation

LIANG Yan, LI Jun-Fan, SHAO Kai, HU Lin

Citation: LIANG Yan, LI Jun-Fan, SHAO Kai, HU Lin. Spatial Information-Guided Diffusion for Remote Sensing Image Domain Adaptation Semantic Segmentation[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT260031


doi: 10.11999/JEIT260031 cstr: 32379.14.JEIT260031
Details
    Author biographies:

    LIANG Yan: Female, senior engineer. Research interests: embedded artificial intelligence and remote sensing image processing.

    LI Jun-Fan: Male, master's student. Research interests: diffusion models, remote sensing image processing, domain adaptation, and semantic segmentation.

    SHAO Kai: Male, associate professor. Research interests: intelligent sensing and information systems, and intelligent signal and information processing.

    HU Lin: Male, lecturer. Research interests: embedded artificial intelligence, remote sensing image processing, and intelligent signal and information processing.

    Corresponding author:

    LIANG Yan, liangyan@cqupt.edu.cn

  • CLC number: TP192.14

Spatial Information-Guided Diffusion for Remote Sensing Image Domain Adaptation Semantic Segmentation

Funds: Natural Science Foundation of Chongqing (CSTB2025NSCQ-GPX1253)
  • Abstract: To improve cross-domain adaptation in Domain Adaptation Semantic Segmentation (DASS) of remote sensing images, this paper proposes Co-Training Spatial-Guided DASS (CoSG-DASS), a domain-adaptive semantic segmentation framework based on co-training and a spatially guided diffusion model. CoSG-DASS performs image translation with the spatially guided diffusion model to reduce the inter-domain gap, and uses co-training to improve the model's adaptation to the target domain. In the image-translation stage, a novel spatial-information-guided diffusion model is designed: pseudo-labels (horizontal semantic distribution) and depth estimates (vertical semantic distribution) are injected into a latent diffusion model to reconstruct spatial guidance information, achieving semantically unbiased translation from the source domain to the target domain. To address pseudo-label quality, an entropy-based adaptive guidance-intensity module is proposed; it selects features from high-confidence regions by entropy thresholding to suppress noise, effectively improving semantic alignment accuracy under cross-domain imaging differences. In the co-training stage, a training strategy that fuses depth information with an adversarial loss is proposed; by enriching multi-dimensional knowledge representation it reduces intra-class variance and enlarges inter-class variance, strengthening cross-domain adaptability. Experiments on three typical remote sensing cross-domain tasks (cross-geographic environment, cross-imaging mode, and heterogeneous label semantics) show that the proposed CoSG-DASS performs strongly, improving mean Intersection over Union (mIoU) over existing methods by 1.14%, 3.78%, and 2.49%, respectively.
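The entropy-based selection of high-confidence pseudo-label regions described in the abstract can be sketched roughly as follows. This is an illustrative snippet, not the paper's implementation; the function name, threshold value, and array shapes are assumptions.

```python
import numpy as np

def entropy_confidence_mask(probs, threshold=0.5):
    """Mask of high-confidence pixels, selected by normalized prediction entropy.

    probs: softmax probabilities with the class dimension last, e.g. (H, W, C).
    Returns a boolean map that is True where the normalized entropy falls below
    `threshold`, i.e. where the pseudo-label is trusted enough to guide the model.
    """
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps), axis=-1)
    entropy /= np.log(probs.shape[-1])  # normalize to [0, 1]
    return entropy < threshold
```

Low-entropy (confident) regions pass through, while high-entropy regions are suppressed, which is the role the abstract assigns to the entropy-based guidance-intensity module.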
  • Figure 1. Domain gaps between different datasets

    Figure 2. Overall framework of CoSG-DASS

    Figure 3. Residual feature extraction module (RFEM)

    Figure 4. Structure of the entropy-based adaptive guidance-intensity module (EAGIM)

    Figure 5. Adaptive segmentation training strategy

    Figure 6. Visualization of image translation quality

    Figure 7. Visualization of semantic segmentation results of the compared models

    Figure 8. UMAP visualization of semantic segmentation output features

    Figure 9. Contribution of the co-training strategy to training stability

    Table 1. Quantitative comparison of image translation quality (FID, lower is better; Scenario (3) spans two sub-columns)

    | Generative model | Scenario (1) | Scenario (2) | Scenario (3) | Scenario (3) |
    | --- | --- | --- | --- | --- |
    | CycleGAN[11] | 74.25 | 74.60 | 111.12 | 96.87 |
    | DiscoGAN[20] | 102.82 | 72.77 | 103.71 | 95.55 |
    | UNI-Diff[14] | 64.72 | 59.83 | 90.16 | 80.57 |
    | CRS-Diff[15] | 74.49 | 74.46 | 94.71 | 79.66 |
    | CoSG-DASS | 72.61 | 69.73 | 89.91 | 82.76 |
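Table 1 compares generators by Fréchet Inception Distance (FID). As background, FID between two Gaussians fitted to feature sets is ||μ₁−μ₂||² + Tr(Σ₁+Σ₂−2(Σ₁Σ₂)^{1/2}). A minimal sketch under the assumption that feature means and covariances have already been extracted (e.g. from an Inception network); the function name is illustrative:

```python
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two Gaussians (mu, Sigma) fitted to
    image features: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop negligible imaginary parts from numerics
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical feature distributions give FID 0; with equal covariances the score reduces to the squared distance between the means.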

    Table 2. Semantic segmentation comparison for the cross-geographic scenario, Vaihingen IRRG (source) → Potsdam IRRG (target) (%)

    | Adaptation method | Impervious surfaces | Buildings | Low vegetation | Trees | Cars | Other | mIoU |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | No adaptation | 57.82 | 44.82 | 22.18 | 28.61 | 69.60 | 23.31 | 44.61 |
    | Source only | 86.49 | 90.71 | 73.70 | 73.62 | 74.20 | 72.01 | 79.74 |
    | CycleGAN[11] | 58.01 | 67.46 | 38.61 | 39.19 | 57.02 | 22.06 | 52.06 |
    | DiscoGAN[20] | 59.08 | 66.61 | 44.94 | 34.78 | 54.23 | 16.35 | 51.93 |
    | CyCADA[12] | 12.55 | 29.24 | 47.38 | 51.27 | 3.53 | 9.83 | 28.79 |
    | CLAN[21] | 66.29 | 72.66 | 42.72 | 45.48 | 67.46 | 21.44 | 58.92 |
    | FADA[22] | 65.73 | 73.96 | 43.36 | 46.72 | 59.03 | 19.03 | 57.76 |
    | CorDA[10] | 67.39 | 69.43 | 54.83 | 48.67 | 68.52 | 9.30 | 61.77 |
    | AdaptSegNet[23] | 66.33 | 70.80 | 52.03 | 44.61 | 70.10 | 21.80 | 60.77 |
    | DAFormer[24] | 68.45 | 76.18 | 53.67 | 49.52 | 68.93 | 20.12 | 62.85 |
    | DIFF[25] | 67.82 | 73.54 | 52.51 | 49.03 | 68.21 | 17.84 | 62.10 |
    | UNI-Diff[14] | 69.81 | 78.73 | 51.95 | 50.28 | 67.84 | 23.57 | 63.72 |
    | CRS-Diff[15] | 67.81 | 80.45 | 58.40 | 50.74 | 69.44 | 18.56 | 65.37 |
    | Ours (CoSG-DASS) | 70.27 | 79.78 | 54.99 | 55.70 | 71.81 | 21.71 | 66.51 |
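The mIoU column in Tables 2–4 is the mean of per-class intersection-over-union scores. A minimal sketch of that metric from flat prediction and ground-truth label arrays (illustrative only; the paper's exact evaluation protocol, e.g. ignored classes, is not shown here):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU from a confusion matrix: per-class IoU = TP / (TP + FP + FN),
    averaged over classes."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (gt.ravel(), pred.ravel()), 1)  # unbuffered accumulation
    inter = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - np.diag(cm)
    iou = inter / np.maximum(union, 1)  # guard classes absent from both maps
    return float(iou.mean())
```

A perfect prediction yields 1.0; each class contributes equally regardless of its pixel count, which is why rare classes such as "Cars" can dominate the gap between methods.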

    Table 3. Semantic segmentation comparison for the cross-imaging-mode scenario, Vaihingen IRRG (source) → Potsdam RGB (target) (%)

    | Adaptation method | Impervious surfaces | Buildings | Low vegetation | Trees | Cars | Other | mIoU |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | No adaptation | 50.36 | 43.60 | 1.81 | 2.35 | 69.64 | 11.89 | 33.55 |
    | Source only | 86.49 | 90.71 | 73.70 | 73.62 | 74.20 | 72.01 | 79.74 |
    | CycleGAN[11] | 53.80 | 64.03 | 32.35 | 33.05 | 50.78 | 17.68 | 46.80 |
    | DiscoGAN[20] | 65.08 | 69.93 | 49.13 | 41.16 | 59.49 | 18.89 | 56.92 |
    | CyCADA[12] | 35.29 | 42.24 | 26.51 | 24.70 | 6.86 | 7.82 | 27.12 |
    | CLAN[21] | 62.98 | 71.91 | 47.51 | 47.77 | 66.19 | 10.91 | 59.27 |
    | FADA[22] | 63.99 | 73.56 | 37.11 | 39.57 | 59.01 | 15.27 | 54.65 |
    | CorDA[10] | 65.93 | 72.52 | 51.80 | 46.47 | 64.63 | 15.78 | 60.27 |
    | AdaptSegNet[23] | 58.65 | 65.31 | 37.95 | 39.34 | 69.70 | 14.34 | 54.19 |
    | DAFormer[24] | 63.25 | 71.38 | 48.76 | 46.82 | 65.74 | 14.85 | 58.80 |
    | DIFF[25] | 64.87 | 74.06 | 50.24 | 47.15 | 66.82 | 15.93 | 59.85 |
    | UNI-Diff[14] | 60.27 | 72.63 | 42.51 | 45.64 | 68.67 | 11.47 | 57.94 |
    | CRS-Diff[15] | 59.63 | 76.01 | 44.37 | 43.38 | 69.41 | 13.22 | 58.56 |
    | Ours (CoSG-DASS) | 66.80 | 79.80 | 50.20 | 44.78 | 70.15 | 6.56 | 62.34 |

    Table 4. Semantic segmentation comparison for the semantically heterogeneous scenario (mutual transfer between DFC25 and LoveDA) (%)

    DFC25 → LoveDA

    | Adaptation method | Background | Building | Road | Water | Barren/Developed | Forest/Trees | Agriculture | mIoU |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | No adaptation | 24.83 | 53.47 | 41.57 | 64.90 | 1.13 | 32.88 | 18.30 | 36.46 |
    | Source only | 76.15 | 81.78 | 68.86 | 88.50 | 71.74 | 83.60 | 72.91 | 78.44 |
    | CycleGAN[11] | 23.19 | 53.31 | 43.69 | 65.50 | 3.79 | 32.53 | 21.85 | 37.00 |
    | DiscoGAN[20] | 23.45 | 50.34 | 42.72 | 56.92 | 1.35 | 28.57 | 12.43 | 33.89 |
    | CyCADA[12] | 26.30 | 46.72 | 41.11 | 52.93 | 0.78 | 32.78 | 14.04 | 33.43 |
    | CLAN[21] | 26.99 | 52.08 | 39.79 | 68.76 | 0.77 | 31.85 | 15.61 | 36.71 |
    | FADA[22] | 24.85 | 52.41 | 42.07 | 61.32 | 1.14 | 33.04 | 18.22 | 35.81 |
    | CorDA[10] | 37.50 | 38.77 | 23.25 | 63.56 | 3.32 | 64.07 | 7.69 | 38.41 |
    | AdaptSegNet[23] | 25.69 | 52.80 | 46.38 | 57.42 | 1.63 | 36.33 | 19.47 | 36.71 |
    | DAFormer[24] | 24.97 | 53.11 | 48.25 | 70.12 | 2.18 | 34.86 | 20.35 | 38.25 |
    | DIFF[25] | 23.86 | 52.94 | 47.53 | 69.87 | 1.95 | 34.22 | 19.87 | 37.75 |
    | UNI-Diff[14] | 21.44 | 53.07 | 45.70 | 70.69 | 2.36 | 34.75 | 20.68 | 38.00 |
    | CRS-Diff[15] | 20.86 | 52.29 | 46.61 | 73.46 | 2.54 | 35.72 | 18.23 | 38.58 |
    | Ours (CoSG-DASS) | 25.82 | 53.24 | 49.67 | 71.64 | 1.27 | 33.94 | 20.46 | 39.26 |

    LoveDA → DFC25

    | Adaptation method | Background | Building | Road | Water | Barren/Developed | Forest/Trees | Agriculture | mIoU |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | No adaptation | 22.28 | 48.03 | 37.42 | 45.85 | 9.13 | 47.57 | 3.43 | 35.05 |
    | Source only | 40.42 | 63.82 | 61.46 | 71.09 | 33.08 | 43.99 | 43.17 | 52.31 |
    | CycleGAN[11] | 22.07 | 53.48 | 29.43 | 58.74 | 4.95 | 29.17 | 8.69 | 32.97 |
    | DiscoGAN[20] | 24.10 | 48.54 | 30.85 | 55.80 | 6.20 | 52.43 | 5.16 | 36.32 |
    | CyCADA[12] | 21.60 | 50.31 | 32.12 | 39.50 | 4.05 | 49.37 | 6.99 | 32.82 |
    | CLAN[21] | 22.24 | 49.10 | 33.25 | 41.58 | 6.26 | 46.73 | 11.66 | 33.19 |
    | FADA[22] | 28.23 | 48.40 | 32.32 | 53.69 | 4.61 | 45.78 | 10.64 | 35.50 |
    | CorDA[10] | 24.07 | 39.07 | 31.09 | 49.78 | 7.74 | 22.77 | 26.23 | 29.09 |
    | AdaptSegNet[23] | 21.59 | 47.48 | 29.72 | 51.83 | 11.05 | 54.99 | 6.21 | 36.11 |
    | DAFormer[24] | 30.08 | 49.82 | 33.56 | 65.43 | 5.17 | 59.04 | 11.30 | 39.20 |
    | DIFF[25] | 29.54 | 48.76 | 32.18 | 64.05 | 3.82 | 58.47 | 10.68 | 38.50 |
    | UNI-Diff[14] | 29.22 | 48.27 | 31.03 | 63.32 | 2.63 | 58.32 | 3.22 | 38.80 |
    | CRS-Diff[15] | 30.63 | 50.65 | 31.01 | 60.90 | 5.04 | 59.62 | 10.57 | 39.64 |
    | Ours (CoSG-DASS) | 31.55 | 50.19 | 35.47 | 70.07 | 6.20 | 59.32 | 14.40 | 42.13 |

    Table 5. Ablation study of the overall modules (baseline, depth control, pseudo-label, EAGIM, co-training)

    | Task | Configuration | mIoU (%) | Gain (%) |
    | --- | --- | --- | --- |
    | Vaihingen IRRG → Potsdam RGB | Baseline | 33.55 | – |
    | | + Depth control | 55.64 | 22.09 |
    | | + Pseudo-label | 50.16 | 16.61 |
    | | + EAGIM | 58.23 | 24.68 |
    | | + Co-training | 35.76 | 0.21 |
    | | All modules | 62.34 | 28.80 |
    | DFC25 → LoveDA | Baseline | 36.46 | – |
    | | + Depth control | 37.73 | 1.27 |
    | | + Pseudo-label | 37.19 | 0.73 |
    | | + EAGIM | 38.76 | 2.30 |
    | | + Co-training | 37.36 | 0.90 |
    | | All modules | 39.26 | 2.80 |

    Table 6. Effect of the EAGIM coefficient on model accuracy (mIoU, %)

    | EAGIM coefficient | Cross-geographic | Cross-imaging | Semantic heterogeneity |
    | --- | --- | --- | --- |
    | 0.3 | 64.86 | 60.12 | 37.06 |
    | 0.5 | 66.51 | 62.08 | 42.13 |
    | 0.7 | 65.72 | 60.65 | 41.95 |
    | 0.9 | 65.13 | 59.43 | 39.63 |

    Table 7. Effect of the $ {\lambda }_{\text{adv}} $ parameter on model accuracy (mIoU, %)

    | $ {\lambda }_{\text{adv}} $ | 0.05 | 0.10 | 0.15 | 0.20 | 0.50 |
    | --- | --- | --- | --- | --- | --- |
    | V IR→P R | 62.20 | 62.34 | 62.33 | 61.31 | 61.74 |
    | V IR→P IR | 66.34 | 66.51 | 65.56 | 64.17 | 63.12 |
    | D→L | 39.65 | 39.26 | 39.11 | 39.07 | 39.12 |
    | L→D | 41.15 | 42.13 | 42.14 | 40.42 | 39.07 |
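Table 7 sweeps the adversarial-loss weight $ {\lambda }_{\text{adv}} $ used in the co-training stage. A common way such a weight enters a generator-side objective is sketched below; this is a generic illustration with assumed function names and discriminator label conventions, not the paper's implementation:

```python
import numpy as np

def bce_with_logit(logit, label):
    """Numerically stable binary cross-entropy for a single discriminator logit."""
    return np.logaddexp(0.0, -logit) if label == 1 else np.logaddexp(0.0, logit)

def generator_objective(seg_loss, d_logit_on_target, lambda_adv=0.10):
    """Segmentation loss plus lambda_adv times an adversarial term that rewards
    target-domain outputs the discriminator mistakes for source (label 1)."""
    return seg_loss + lambda_adv * bce_with_logit(d_logit_on_target, 1)
```

A small weight (around 0.10 in Table 7) keeps the adversarial term from overwhelming the supervised segmentation loss, which is consistent with the drop in mIoU at 0.20 and 0.50.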
  • [1] SONG Miao, CHEN Zhiqiang, WANG Peisong, et al. DetDiffRS: A detail-enhanced diffusion model for remote sensing image super-resolution[J]. Journal of Electronics & Information Technology, 2025, 47(12): 4763–4778. doi: 10.11999/JEIT250995.
    [2] DIAO Wenhui, GONG Shuo, XIN Linlin, et al. A model pre-training method with self-supervised strategies for multimodal remote sensing data[J]. Journal of Electronics & Information Technology, 2025, 47(6): 1658–1668. doi: 10.11999/JEIT241016.
    [3] YU Xiang and PANG Zhihao. YOLOX remote sensing image object detection algorithm based on FEB[J]. Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, 2024, 36(2): 319–327. doi: 10.3979/j.issn.1673-825X.202302120032.
    [4] LI Xing, FAN Yangyu, GUO Zhe, et al. Edge domain adaptation for stereo matching[J]. Journal of Electronics & Information Technology, 2024, 46(7): 2970–2980. doi: 10.11999/JEIT231113.
    [5] TEE Y Y, HONG Xuenong, CHENG Deruo, et al. Unsupervised domain adaptation with pseudo shape supervision for IC image segmentation[C]. 2024 IEEE International Symposium on the Physical and Failure Analysis of Integrated Circuits (IPFA), Singapore, Singapore, 2024: 1–6. doi: 10.1109/IPFA61654.2024.10690992.
    [6] HOFFMAN J, WANG Dequan, YU F, et al. FCNs in the wild: Pixel-level adversarial and constraint-based adaptation[EB/OL]. https://arxiv.org/abs/1612.02649, 2016.
    [7] VU T H, JAIN H, BUCHER M, et al. ADVENT: Adversarial entropy minimization for domain adaptation in semantic segmentation[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 2512–2521. doi: 10.1109/CVPR.2019.00262.
    [8] ZOU Yang, YU Zhiding, VIJAYA KUMAR B V K, et al. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training[C]. 15th European Conference on Computer Vision, Munich, Germany, 2018: 297–313. doi: 10.1007/978-3-030-01219-9_18.
    [9] VU T H, JAIN H, BUCHER M, et al. DADA: Depth-aware domain adaptation in semantic segmentation[C]. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, 2019: 7363–7372. doi: 10.1109/ICCV.2019.00746.
    [10] WANG Qin, DAI Dengxin, HOYER L, et al. Domain adaptive semantic segmentation with self-supervised depth estimation[C]. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 2021: 8495–8505. doi: 10.1109/ICCV48922.2021.00840.
    [11] ZHU Junyan, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2242–2251. doi: 10.1109/ICCV.2017.244.
    [12] HOFFMAN J, TZENG E, PARK T, et al. CyCADA: Cycle-consistent adversarial domain adaptation[C]. 35th International Conference on Machine Learning, Stockholm, Sweden, 2018: 1989–1998.
    [13] ZHANG Lvmin, RAO Anyi, and AGRAWALA M. Adding conditional control to text-to-image diffusion models[C]. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2023: 3813–3824. doi: 10.1109/ICCV51070.2023.00355.
    [14] DONG Xiao, HUANG Runhui, WEI Xiaoyong, et al. UniDiff: Advancing vision-language models with generative and discriminative learning[EB/OL]. https://arxiv.org/abs/2306.00813, 2023.
    [15] TANG Datao, CAO Xiangyong, HOU Xingsong, et al. CRS-Diff: Controllable remote sensing image generation with diffusion model[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5638714. doi: 10.1109/TGRS.2024.3453414.
    [16] KINGMA D P and WELLING M. Auto-encoding variational Bayes[EB/OL]. https://arxiv.org/abs/1312.6114v11, 2022.
    [17] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]. 38th International Conference on Machine Learning, 2021: 8748–8763.
    [18] PARK T, LIU Mingyu, WANG Tingchun, et al. Semantic image synthesis with spatially-adaptive normalization[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 2332–2341. doi: 10.1109/CVPR.2019.00244.
    [19] LIANG Yan, YI Chunxia, WANG Guangyu, et al. Semantic segmentation of remote sensing image based on multi-scale semantic encoder-decoder network[J]. Acta Electronica Sinica, 2023, 51(11): 3199–3214. doi: 10.12263/DZXB.20220503.
    [20] ZHANG Xiaoke, HU Zongsheng, ZHANG Guoliang, et al. Dose calculation in proton therapy using a discovery cross-domain generative adversarial network (DiscoGAN)[J]. Medical Physics, 2021, 48(5): 2646–2660. doi: 10.1002/mp.14781.
    [21] LUO Yawei, ZHENG Liang, GUAN Tao, et al. Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019: 2502–2511. doi: 10.1109/CVPR.2019.00261.
    [22] XU Tao, SUN Xian, DIAO Wenhui, et al. FADA: Feature aligned domain adaptive object detection in remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5617916. doi: 10.1109/TGRS.2022.3147224.
    [23] TSAI Y H, HUNG W C, SCHULTER S, et al. Learning to adapt structured output space for semantic segmentation[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7472–7481. doi: 10.1109/CVPR.2018.00780.
    [24] HOYER L, DAI Dengxin, and VAN GOOL L. DAFormer: Improving network architectures and training strategies for domain-adaptive semantic segmentation[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 9914–9925. doi: 10.1109/CVPR52688.2022.00969.
    [25] JI Yuxiang, HE Boyong, QU Chenyuan, et al. Diffusion features to bridge domain gap for semantic segmentation[C]. 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025: 1–5. doi: 10.1109/ICASSP49660.2025.10888537.
    [26] WANG Libo, LI Rui, ZHANG Ce, et al. UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 190: 196–214. doi: 10.1016/j.isprsjprs.2022.06.008.
    [27] LIANG Yan, YANG Huilin, and SHAO Kai. A vehicle-infrastructure cooperative 3D object detection scheme based on adaptive feature selection[J]. Journal of Electronics & Information Technology, 2025, 47(12): 5214–5225. doi: 10.11999/JEIT250601.
Publication history
  • Received: 2026-01-09
  • Revised: 2026-03-15
  • Accepted: 2026-04-09
  • Published online: 2026-04-27
