Citation: JI Hong, GAO Zhi, CHEN Boan, AO Wei, CAO Min, WANG Qiao. Knowledge-Guided Few-Shot Earth Surface Anomalies Detection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251000

Knowledge-Guided Few-Shot Earth Surface Anomalies Detection

doi: 10.11999/JEIT251000 cstr: 32379.14.JEIT251000
Funds: The National Natural Science Foundation of China Major Program (42192580, 42192583) and the National Natural Science Foundation of China (42501503)
  • Received Date: 2025-09-26
  • Accepted Date: 2025-11-05
  • Rev Recd Date: 2025-11-05
  • Available Online: 2025-11-13
Objective   Earth Surface Anomalies (ESAs) are sudden natural or human-induced disasters on the Earth’s surface that pose severe risks and have widespread impacts. Timely and accurate earth surface anomalies detection is therefore crucial for social security and sustainable development. Remote sensing provides an effective means for such detection; however, the performance of existing deep learning models remains constrained by the scarcity of labeled data, the complexity of anomaly backgrounds, and the distribution shift across multi-source remote sensing imagery. To address these challenges, this paper proposes a knowledge-guided few-shot learning method. The method leverages large language models to generate abstract textual descriptions of normal and anomalous geospatial features, which are encoded and fused with visual prototypes to form a cross-modal joint representation. This integration improves prototype discriminability in few-shot settings and demonstrates the necessity of incorporating linguistic knowledge into earth surface anomalies detection, offering a promising direction for reliable disaster monitoring when annotated data are scarce.

Methods   The knowledge-guided few-shot learning method is built on a metric-based paradigm: each episode consists of a support set and a query set, and classification is performed by comparing query features with class prototypes using distance-based similarity and cross-entropy optimization (Figure 1). To supplement the limited visual prototypes, class-level textual descriptions are generated with ChatGPT through carefully designed prompts, producing semantic sentences that characterize the appearance, attributes, and contextual relations of both normal and anomalous categories (Figures 2 and 3). These descriptions encode domain-specific properties such as anomaly extent, morphology, and environmental impact, which are otherwise difficult to capture from scarce visual samples. The sentences are encoded with a CLIP (Contrastive Language–Image Pre-training) text encoder, and task-adaptive soft prompts are introduced by generating tokens from support features and concatenating them with the static embeddings, yielding adaptive word embeddings. The encoded sentence vectors are then processed by a lightweight self-attention module that models dependencies across the multiple descriptions, resulting in a coherent paragraph-level semantic representation (Figure 4). The resulting semantic prototypes are fused with the visual prototypes through weighted addition, producing cross-modal prototypes that combine visual grounding with linguistic abstraction. During training, query samples are compared with the cross-modal prototypes, and optimization is guided by two objectives: a classification loss that enforces accurate query–prototype alignment, and a prototype regularization loss that keeps the semantic prototypes discriminative and well separated. The entire process is implemented in an episodic training framework (Algorithm 1); a minimal sketch of the prototype construction follows.
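The sketch below illustrates the cross-modal prototype construction and distance-based query classification just described. It is a minimal PyTorch rendering under stated assumptions: the module names, feature dimension, attention configuration, the stand-in used for soft-prompt injection (the paper concatenates learnable tokens inside the text encoder), and the fusion weight are all illustrative, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalPrototypes(nn.Module):
    """Fuse visual prototypes with language-derived semantic prototypes (sketch)."""
    def __init__(self, feat_dim=512, num_learnable_tokens=2, alpha=0.2):
        super().__init__()
        self.alpha = alpha            # weight of the language branch (paper reports 0.2)
        self.num_tokens = num_learnable_tokens
        # Task-adaptive tokens: project pooled support features to prompt embeddings.
        self.token_proj = nn.Linear(feat_dim, num_learnable_tokens * feat_dim)
        # Lightweight self-attention over each class's sentence embeddings.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, visual_protos, sentence_embs, support_feats):
        # visual_protos: (C, D)    mean of support features per class
        # sentence_embs: (C, S, D) CLIP-encoded descriptions, S sentences per class
        # support_feats: (C, K, D) raw support features per class
        # 1) Generate task-adaptive tokens from the support set; their mean is
        #    added to the sentence embeddings as a simple stand-in for
        #    concatenation inside the text encoder (assumption).
        ctx = support_feats.mean(dim=1)                                   # (C, D)
        tokens = self.token_proj(ctx).view(-1, self.num_tokens, ctx.size(-1))
        sentence_embs = sentence_embs + tokens.mean(dim=1, keepdim=True)
        # 2) Model dependencies across the S descriptions and pool them into a
        #    paragraph-level semantic prototype per class.
        attended, _ = self.attn(sentence_embs, sentence_embs, sentence_embs)
        semantic_protos = attended.mean(dim=1)                            # (C, D)
        # 3) Weighted addition of visual grounding and linguistic abstraction.
        fused = (1 - self.alpha) * visual_protos + self.alpha * semantic_protos
        return F.normalize(fused, dim=-1), F.normalize(semantic_protos, dim=-1)

def classify_queries(query_feats, prototypes):
    # Negative squared Euclidean distance to each prototype serves as the logit.
    logits = -torch.cdist(F.normalize(query_feats, dim=-1), prototypes) ** 2
    return logits                                                          # (Q, C)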
Results and Discussions   The proposed method is evaluated under both cross-domain and in-domain few-shot settings. In the cross-domain case, models are trained on NWPU45 or AID and tested on ESAD to assess earth surface anomalies recognition. As shown in the comparisons (Table 2), traditional meta-learning methods such as MAML and Meta-SGD achieve accuracies below 50%, while metric-based baselines such as ProtoNet and RelationNet are more stable but still limited. The proposed method reaches 61.99% in the NWPU45→ESAD setting and 59.79% in the AID→ESAD setting, outperforming ProtoNet by 4.72% and 2.67%, respectively. In the in-domain setting, with training and testing on the same dataset, the method achieves 76.94% on NWPU45 and 72.98% on AID, consistently surpassing state-of-the-art baselines such as S2M2 and IDLN (Table 3). Ablation experiments further validate the contribution of each component: using only visual prototypes yields accuracies of 57.74% and 72.16%, while progressively incorporating simple class names, task-oriented templates, and ChatGPT-generated descriptions improves the results. The best performance is obtained by combining ChatGPT descriptions, learnable tokens, and the attention-based module, reaching 61.99% and 76.94% (Table 4). Parameter sensitivity analysis confirms that an appropriate weight for the language features ($\alpha $ = 0.2) and two learnable tokens yield optimal performance (Figure 5).

Conclusions   This paper addresses earth surface anomalies detection in remote sensing imagery by introducing a knowledge-guided few-shot learning method. The method exploits large language models to automatically generate abstract textual descriptions for both anomaly categories and conventional remote sensing scenes, thereby constructing multimodal training and testing resources. These descriptions are encoded into semantic feature vectors by a pretrained text encoder. To extract task-specific knowledge, a dynamic token learning strategy is designed in which a small number of learnable parameters, guided by the visual samples within each few-shot task, generate adaptive semantic vectors. An attention-based semantic knowledge module then models dependencies among the language features, producing a cross-modal semantic vector for each class. Fusing these vectors with the visual prototypes yields joint multimodal representations that are used for query–prototype matching and network optimization, as sketched below. Experimental evaluations demonstrate that the proposed method effectively leverages prior knowledge from pretrained models, compensates for the scarcity of visual data, and enhances feature discriminability for anomalies recognition. Both cross-domain and in-domain results confirm consistent improvements over competitive baselines, highlighting the method's potential for reliable application in real-world remote sensing anomalies detection scenarios.
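A companion sketch of the two training objectives is given below, again as a hedged illustration: the classification loss follows the prototypical cross-entropy described in the Methods, while the concrete form of the prototype regularization (pairwise cosine separation of the semantic prototypes) is an assumption and may differ from the paper's formulation.

import torch
import torch.nn.functional as F

def episode_loss(query_feats, query_labels, fused_protos, semantic_protos, lam=0.1):
    # Classification loss: align queries with the cross-modal prototypes.
    logits = -torch.cdist(F.normalize(query_feats, dim=-1), fused_protos) ** 2
    cls_loss = F.cross_entropy(logits, query_labels)
    # Prototype regularization (assumed form): push the semantic prototypes of
    # different classes apart by penalizing their positive cosine overlap.
    sim = semantic_protos @ semantic_protos.t()            # (C, C), rows unit-norm
    off_diag = sim - torch.eye(sim.size(0), device=sim.device)
    reg_loss = off_diag.clamp(min=0).mean()
    # lam balances the two objectives (illustrative value).
    return cls_loss + lam * reg_loss

This assumes fused_protos and semantic_protos are the normalized outputs of the sketch above; an episodic trainer would compute this loss per episode and backpropagate through both the vision backbone and the prompt parameters.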
  • [1]
    WANG Qiao. Research framework of remote sensing monitoring and real-time diagnosis of earth surface anomalies[J]. Acta Geodaetica et Cartographica Sinica, 2022, 51(7): 1141–1152. doi: 10.11947/j.AGCS.2022.20220124.
    [2]
    WEI Haishuo, JIA Kun, WANG Qiao, et al. Real-time remote sensing detection framework of the earth's surface anomalies based on a priori knowledge base[J]. International Journal of Applied Earth Observation and Geoinformation, 2023, 122: 103429. doi: 10.1016/j.jag.2023.103429.
    [3]
    GAO Zhi, HU Aohan, CHEN Boan, et al. A hierarchical geometry-to-semantic fusion GNN framework for earth surface anomalies detection[J]. National Remote Sensing Bulletin, 2024, 28(7): 1760–1770. doi: 10.11834/jrs.20243301.
    [4]
    LIU Siqi, GAO Zhi, CHEN Boan, et al. Earth surface anomaly detection using graph neural network-based representation and reasoning of remote sensing geographic object relationships[J]. Journal of Electronics & Information Technology, 2025, 47(6): 1690–1703. doi: 10.11999/JEIT240883.
    [5]
    ZHAO Chuanwu, PAN Yaozhong, WU Hanyi, et al. A novel spectral index for vegetation destruction event detection based on multispectral remote sensing imagery[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17: 11290–11309. doi: 10.1109/JSTARS.2024.3412737.
    [6]
    WU Hanyi, ZHAO Chuanwu, ZHU Yu, et al. A multiscale examination of heat health risk inequality and its drivers in mega-urban agglomeration: A case study in the Yangtze River Delta, China[J]. Journal of Cleaner Production, 2024, 458: 142528. doi: 10.1016/j.jclepro.2024.142528.
    [7]
    WEI Haishuo, JIA Kun, WANG Qiao, et al. A remote sensing index for the detection of multi-type water quality anomalies in complex geographical environments[J]. International Journal of Digital Earth, 2024, 17(1): 2313695. doi: 10.1080/17538947.2024.2313695.
    [8]
    ROY D P, JIN Y, LEWIS P E, et al. Prototyping a global algorithm for systematic fire-affected area mapping using MODIS time series data[J]. Remote Sensing of Environment, 2005, 97(2): 137–162. doi: 10.1016/j.rse.2005.04.007.
    [9]
    WANG Libo, GAO Zhi, and WANG Qiao. A novel earth surface anomaly detection method based on collaborative reasoning of deep learning and remote sensing indexes[J]. Journal of Electronics & Information Technology, 2025, 47(6): 1669–1678. doi: 10.11999/JEIT240882.
    [10]
    ZHANG Zilun, ZHAO Tiancheng, GUO Yulong, et al. RS5M and GeoRSCLIP: A large-scale vision-language dataset and a large vision-language model for remote sensing[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5642123. doi: 10.1109/TGRS.2024.3449154.
    [11]
    GE Junyao, ZHANG Xu, ZHENG Yang, et al. RSTeller: Scaling up visual language modeling in remote sensing with rich linguistic semantics from openly available data and large language models[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2025, 226: 146–163. doi: 10.1016/j.isprsjprs.2025.05.002.
    [12]
    ZHENG Zhuo, ZHONG Yanfei, WANG Junjue, et al. Building damage assessment for rapid disaster response with a deep object-based semantic change detection framework: From natural disasters to man-made disasters[J]. Remote Sensing of Environment, 2021, 265: 112636. doi: 10.1016/j.rse.2021.112636.
    [13]
    KYRKOU C and THEOCHARIDES T. EmergencyNet: Efficient aerial image classification for drone-based emergency monitoring using atrous convolutional feature fusion[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13: 1687–1699. doi: 10.1109/JSTARS.2020.2969809.
    [14]
    CHEN Boan, GAO Zhi, LI Ziyao, et al. Hierarchical GNN framework for earth’s surface anomaly detection in single satellite imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5627314. doi: 10.1109/TGRS.2024.3408330.
    [15]
    CHEN Wenyuan, LIU Yen-Cheng, KIRA Zsolt, et al. A closer look at few-shot classification[C]. International Conference on Learning Representations, New Orleans, USA, 2019.
    [16]
    SNELL J, SWERSKY K, and ZEMEL R. Prototypical networks for few-shot learning[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 4080–4090.
    [17]
    FINN C, ABBEEL P, and LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]. Proceedings of the 34th International Conference on Machine Learning - Volume 70, Sydney, Australia, 2017: 1126–1135.
    [18]
    RADFORD A, KIM J, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]. Proceedings of the 38th International Conference on Machine Learning, 2021: 8748–8763.
    [19]
    XU Jingyi and LE H. Generating representative samples for few-shot classification[C]. Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 8993–9003. doi: 10.1109/CVPR52688.2022.00880.
    [20]
    ZHANG Baoquan, LI Xutao, YE Yunming, et al. Prototype completion with primitive knowledge for few-shot learning[C]. Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 3753–3761. doi: 10.1109/CVPR46437.2021.00375.
    [21]
    LIU Fan, CHEN Delong, GUAN Zhangqingyun, et al. RemoteCLIP: A vision language foundation model for remote sensing[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5622216. doi: 10.1109/TGRS.2024.3390838.
    [22]
    张永军, 李彦胜, 党博, 等. 多模态遥感基础大模型: 研究现状与未来展望[J]. 测绘学报, 2024, 53(10): 1942–1954. doi: 10.11947/j.AGCS.2024.20240019.

    ZHANG Yongjun, LI Yansheng, DANG Bo, et al. Multi-modal remote sensing large foundation models: Current research status and future prospect[J]. Acta Geodaetica et Cartographica Sinica, 2024, 53(10): 1942–1954. doi: 10.11947/j.AGCS.2024.20240019.
    [23]
    OpenAI. Introducing ChatGPT[EB/OL]. https://openai.com/blog/chatgpt, 2022.
    [24]
    GUPTA R, HOSFELT D, DODGE S, et al. Creating xBD: A dataset for assessing building damage from satellite imagery[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, USA, 2019: 447–456.
    [25]
    RUDNER T G J, RUSSWURM M, FIL J, et al. Multi3Net: Segmenting flooded buildings via fusion of multiresolution, multisensor, and multitemporal satellite imagery[C]. The Thirty-Third AAAI Conference on Artificial Intelligence, Honolulu, USA, 2019, 33: 702–709. doi: 10.1609/aaai.v33i01.3301702.
    [26]
    ZENG Chao, CAO Zhenyu, SU Fenghuan, et al. A dataset of high-precision aerial imagery and interpretation of landslide and debris flow disaster in Sichuan and surrounding areas between 2008 and 2020[J]. China Scientific Data, 2022, 7(2): 195–205. doi: 10.11922/noda.2021.0005.zh.
    [27]
    CHENG Gong, HAN Junwei, and LU Xiaoqiang. Remote sensing image scene classification: Benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10): 1865–1883. doi: 10.1109/JPROC.2017.2675998.
    [28]
    LI Haifeng, CUI Zhenqi, ZHU Zhiqiang, et al. RS-MetaNet: Deep metametric learning for few-shot remote sensing scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(8): 6983–6994. doi: 10.1109/TGRS.2020.3027387.
    [29]
    LI Lingjun, HAN Junwei, YAO Xiwen, et al. DLA-MatchNet for few-shot remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(9): 7844–7853. doi: 10.1109/TGRS.2020.3033336.
    [30]
    XIA Guisong, HU Jingwen, HU Fan, et al. AID: A benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965–3981. doi: 10.1109/TGRS.2017.2685945.
    [31]
    MANGLA P, SINGH M, SINHA A, et al. Charting the right manifold: Manifold mixup for few-shot learning[C]. Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision, Snowmass, USA, 2020: 2207–2216. doi: 10.1109/WACV45572.2020.9093338.
    [32]
    SUNG F, YANG Yongxin, ZHANG Li, et al. Learning to compare: Relation network for few-shot learning[C]. Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1199–1208. doi: 10.1109/CVPR.2018.00131.
    [33]
    VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning[C]. Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 3637–3645.
    [34]
    NICHOL A, ACHIAM J, and SCHULMAN J. On first-order meta-learning algorithms[J]. arXiv preprint arXiv:1803.02999, 2018. doi: 10.48550/arXiv.1803.02999.
    [35]
    CHENG Gong, CAI Liming, LANG Chunbo, et al. SPNet: Siamese-prototype network for few-shot remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5608011. doi: 10.1109/TGRS.2021.3099033.
    [36]
    ZENG Qingjie, GENG Jie, JIANG Wen, et al. IDLN: Iterative distribution learning network for few-shot remote sensing image scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 8020505. doi: 10.1109/LGRS.2021.3109728.
    [37]
    LI Xiaomin, SHI Daqian, DIAO Xiaolei, et al. SCL-MLNet: Boosting few-shot remote sensing scene classification via self-supervised contrastive learning[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5801112. doi: 10.1109/TGRS.2021.3109268.