SHENG Weidong, WU Shuanglin, XIAO Chao, LONG Yunli, LI Xiaobin, ZHANG Yiming. Differentiable Sparse Mask Guided Infrared Small Target Fast Detection Network[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250989

Differentiable Sparse Mask Guided Infrared Small Target Fast Detection Network

doi: 10.11999/JEIT250989 cstr: 32379.14.JEIT250989
Funds:  National Natural Science Foundation of China under Grant No. 62501609
  • Received Date: 2025-09-24
  • Accepted Date: 2025-12-08
  • Revision Received Date: 2025-12-08
  • Available Online: 2025-12-11
  • Objective  Infrared small target detection is important in infrared guidance, environmental monitoring, and security surveillance, where tasks such as early warning, precision targeting, and pollution tracking depend on timely and accurate detection. The main difficulties come from the targets themselves: they are extremely small (typically less than 9×9 pixels), carry few spatial features because of the long imaging distance, and are easily overwhelmed by complex, cluttered backgrounds such as cloud cover, sea glint, or urban thermal noise. These factors make genuine targets hard to separate from background clutter with conventional methods. Existing approaches fall into two groups: traditional model-based methods and deep learning methods. Traditional methods rely on manually designed background suppression operators, such as morphological filters (e.g., Top-Hat) or low-rank matrix recovery (e.g., IPI). They are interpretable in simple scenes but adapt poorly to dynamic, complex real-world environments, which leads to high false alarm rates and limited robustness. Deep learning methods, particularly dense convolutional neural networks (CNNs), improve detection performance through data-driven feature learning. However, they often do not account for the extreme imbalance between target and background pixels, with targets typically occupying less than 1% of the image. The network therefore spends most of its computation on background regions that contribute little to detection, which hampers efficiency and real-time performance. Exploiting the sparsity of infrared small targets is a promising way to address this. A sparse mask generation module can coarsely extract candidate target regions while filtering out most of the redundant background, and later stages can refine these candidate regions to reach satisfactory detection performance. This paper aims for a solution that balances high detection accuracy with computational efficiency, making it suitable for real-time applications.
  • Methods  This paper proposes an end-to-end infrared small target detection network guided by a differentiable sparse mask. First, the input infrared image is preprocessed with convolution to generate raw features. A differentiable sparse mask generation module then uses two convolution branches to produce a probability map and a threshold map, and outputs a binary mask through a differentiable binarization function, which extracts candidate target regions and filters out redundant background. Next, a target region sampling module uses the binary mask to convert the dense raw features into sparse features. A U-shaped sparse feature extraction module (encoders, decoders, and skip connections) built on Minkowski Engine sparse convolution refines only the non-zero target regions, which reduces computation. Finally, a pyramid pooling module fuses the multi-scale sparse features, and the fused features are fed into a target-background binary classifier that outputs the detection results.
  • Results and Discussions  The method was evaluated on two mainstream infrared small target datasets: NUAA-SIRST, which contains 427 real-world infrared images extracted from actual videos, and NUDT-SIRST, a large-scale synthetic dataset with 1327 diverse images. It was compared against three representative traditional algorithms (e.g., Top-Hat, IPI) and six state-of-the-art deep learning methods (e.g., DNA-Net, ACM). The method achieves competitive detection performance. On NUAA-SIRST, it attains 74.38% IoU, 100% probability of detection (Pd), and a false alarm rate (Fa) of 7.98×10⁻⁶. On NUDT-SIRST, it reaches 83.03% IoU, 97.67% Pd, and an Fa of 9.81×10⁻⁶, matching the performance of leading deep learning methods. Its main advantage is efficiency: with only 0.35M parameters and 11.10 GFLOPs, it runs at 215.06 FPS, which is 4.8 times the FPS of DNA-Net, a significant reduction in computational redundancy. Ablation experiments (Fig. 6) confirm that the differentiable sparse mask module filters out most of the background while preserving target regions. Visual results (Fig. 5) show fewer false alarms than traditional methods such as PSTNN, because the "coarse-to-fine" mode reduces background interference. Together these results indicate a good balance between detection performance and efficiency.
  • Conclusions  Existing dense computing methods for infrared small target detection carry massive computational redundancy, because the target-background proportion is extremely unbalanced (targets usually occupy less than 1% of the whole image). This paper addresses the problem with a fast infrared small target detection network guided by a differentiable sparse mask. The network adaptively extracts candidate target regions and filters out redundant background through a differentiable sparse mask generation module, and builds its feature extraction module on Minkowski Engine sparse convolution to reduce computation, forming an end-to-end "coarse-to-fine" detection framework. Experiments on the NUDT-SIRST and NUAA-SIRST datasets show that the proposed method achieves detection performance comparable to existing deep learning methods while significantly improving computational efficiency, balancing detection accuracy and speed. The work offers a sparsity-based approach to reducing redundancy in infrared small target detection, is applicable to scenarios that require both real-time performance and accuracy, such as remote sensing detection, infrared guidance, and environmental monitoring, and provides a useful reference for lightweight development in the field.
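The "coarse" stage described under Methods (a probability map and a threshold map combined into a near-binary mask, followed by sampling of the surviving positions) can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the binarization function, so a steep sigmoid of the difference between the two maps is assumed here, a form commonly used for differentiable binarization. The steepness `k`, the 0.5 cutoff, the function names, and the toy 64×64 input with a single 3×3 target are all hypothetical. In the actual network the two maps come from learned convolution branches and the sparse features are processed with Minkowski Engine sparse convolution, neither of which is reproduced here.

```python
import numpy as np


def differentiable_binarization(prob, thresh, k=50.0):
    """Soft step function (assumed form): close to 1 where prob > thresh,
    close to 0 elsewhere, and differentiable in both inputs."""
    return 1.0 / (1.0 + np.exp(-k * (prob - thresh)))


def sample_sparse_features(features, soft_mask, cutoff=0.5):
    """Keep only the feature vectors at positions the mask retains.

    features:  dense feature map of shape (C, H, W)
    soft_mask: near-binary mask of shape (H, W)
    Returns (coords, feats): coords is (N, 2) in (row, col) order,
    feats is (N, C), one row per retained pixel.
    """
    coords = np.argwhere(soft_mask > cutoff)
    feats = features[:, coords[:, 0], coords[:, 1]].T
    return coords, feats


# Toy input (hypothetical): mostly background with one 3x3 "target".
rng = np.random.default_rng(0)
C, H, W = 8, 64, 64
prob = np.full((H, W), 0.1)      # low target probability everywhere
prob[30:33, 40:43] = 0.9         # high probability on the target patch
thresh = np.full((H, W), 0.5)    # constant threshold map for simplicity
features = rng.standard_normal((C, H, W))

mask = differentiable_binarization(prob, thresh)
coords, feats = sample_sparse_features(features, mask)
kept_fraction = coords.shape[0] / (H * W)

print(f"retained {coords.shape[0]} of {H * W} pixels "
      f"({100 * kept_fraction:.2f}%)")
```

In this toy case only the nine target pixels survive the mask, so any later refinement would touch well under 1% of the image. That is the source of the computational saving the abstract attributes to sparsity: the cost of the refinement stage scales with the number of retained positions rather than with the full image size.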