Citation: SONG Jiawen, WANG Qingsong. Multiscale Fractional Information Potential Field and Dynamic Gradient-Guided Energy Modeling for SAR and Multispectral Image Fusion[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250976

Multiscale Fractional Information Potential Field and Dynamic Gradient-Guided Energy Modeling for SAR and Multispectral Image Fusion

doi: 10.11999/JEIT250976 cstr: 32379.14.JEIT250976
Funds:  The National Natural Science Foundation of China (62273365), Xiaomi Young Talents Program
  • Received Date: 2025-09-24
  • Accepted Date: 2025-12-22
  • Rev Recd Date: 2025-12-22
  • Available Online: 2026-01-02
Objective  In remote sensing, fusion of Synthetic Aperture Radar (SAR) and MultiSpectral (MS) images is essential for comprehensive Earth observation. SAR sensors provide all-weather imaging capability and capture dielectric and geometric surface characteristics, although they are inherently affected by multiplicative speckle noise. In contrast, MS sensors provide rich spectral information that supports visual interpretation, although their performance is constrained by atmospheric conditions. The objective of SAR-MS image fusion is to integrate the structural details and scattering characteristics of SAR imagery with the spectral content of MS imagery, thereby improving performance in applications such as land-cover classification and target detection. However, existing fusion approaches, ranging from component substitution and multiscale transformation to Deep Learning (DL), face persistent limitations. Many methods fail to achieve an effective balance between noise suppression and texture preservation, which leads to spectral distortion or residual speckle, particularly in highly heterogeneous regions. DL-based methods, although effective in specific scenarios, exhibit strong dependence on training data and limited generalization across sensors. To address these issues, a robust unsupervised fusion framework is developed that explicitly models modality-specific noise characteristics and structural differences. Fractional calculus and dynamic energy modeling are combined to improve structural preservation and spectral fidelity without relying on large-scale training datasets.

Methods  The proposed framework adopts a multistage fusion strategy based on Relative Total Variation filtering for image decomposition and consists of four core components. First, a MultiScale Fractional Information Potential Field (MS-FIPF) method (Fig. 2) is proposed to extract robust detail layers. A fractional-order kernel is constructed in the Fourier domain to achieve nonlinear frequency weighting, and a local entropy-driven adaptive scale mechanism is applied to enhance edge information while suppressing noise. Second, to address the different noise distributions observed in SAR and MS detail layers, a Bayesian adaptive fusion model based on the minimum mean square error criterion is constructed. A dynamic regularization term is incorporated to adaptively balance structural preservation and noise suppression. Third, for base layers containing low-frequency geometric information, a Dynamic Gradient-Guided Multiresolution Local Energy (DGMLE) method (Fig. 3) is proposed. This method constructs a global entropy-driven multiresolution pyramid and applies a gradient-variance-controlled enhancement factor combined with adaptive Gaussian smoothing to emphasize significant geometric structures. Finally, a Scattering Intensity Adaptive Modulation (SIAM) strategy is applied through a nonlinear mapping regulated by joint entropy and root mean square error, enabling adaptive adjustment of SAR scattering contributions to maintain visual and spectral consistency.
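The Methods description above is conceptual only. As a rough illustration, not the paper's implementation, the following Python sketch shows a Riesz-style fractional-order frequency weighting of the kind an MS-FIPF-like detail extractor could build on, together with a per-pixel variance-weighted blend of two detail layers standing in for the Bayesian minimum-mean-square-error fusion step; the kernel form, window size, and weighting rule are assumptions made for illustration.

import numpy as np
from scipy.ndimage import uniform_filter

def fractional_detail(img, alpha=0.5):
    """Riesz-style fractional-order frequency weighting |omega|**alpha.

    Illustrative stand-in for the MS-FIPF kernel: the paper's exact kernel,
    normalization, and local-entropy-driven scale adaptation are not reproduced.
    """
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    weight = np.sqrt(fx**2 + fy**2) ** alpha   # nonlinear boost of high frequencies
    return np.real(np.fft.ifft2(np.fft.fft2(img) * weight))

def local_variance(x, win=7):
    """Local variance via box filtering, used as a crude structure proxy."""
    mu = uniform_filter(x, size=win)
    return np.maximum(uniform_filter(x * x, size=win) - mu * mu, 0.0)

def mmse_style_detail_fusion(d_sar, d_ms, win=7, eps=1e-6):
    """Per-pixel variance-weighted blend of two detail layers.

    Pixels where the SAR detail layer carries more local structure receive a
    larger SAR weight; this approximates the spirit, not the exact form, of the
    Bayesian MMSE fusion with dynamic regularization described above.
    """
    v_sar = local_variance(d_sar, win)
    v_ms = local_variance(d_ms, win)
    w = v_sar / (v_sar + v_ms + eps)
    return w * d_sar + (1.0 - w) * d_ms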
Results and Discussions  The proposed framework is evaluated on the WHU, YYX, and HQ datasets, which represent different spatial resolutions and scene complexities, and is compared with seven state-of-the-art fusion methods. Qualitative comparisons (Figs. 5–7) show that several existing approaches, including hybrid multiscale decomposition and image fusion convolutional neural networks, exhibit limited noise modeling capability. This limitation results in spectral distortion and detail blurring caused by SAR speckle interference. Methods based on infrared feature extraction and visual information preservation also show image whitening and contrast degradation due to excessive scattering feature injection. In contrast, the proposed method effectively filters redundant SAR noise through multiscale fractional denoising and adaptive scattering modulation, while preserving MS spectral consistency and salient SAR geometric structures. Improved visual clarity and detail preservation are observed, exceeding the performance of competitive approaches such as visual saliency feature fusion, which still presents residual noise. Quantitative results (Tables 1–3) demonstrate consistent superiority across six evaluation metrics. On the WHU dataset, optimal ERGAS (3.7370) and PSNR (24.7983 dB) values are achieved. Performance improvements are more evident on the high-resolution YYX dataset and the structurally complex HQ dataset, where the proposed method ranks first for all indices. The mutual information on the YYX dataset reaches 3.3535, which is nearly twice that of the second-ranked method, confirming strong multimodal information preservation. On average, the proposed framework achieves a performance improvement of 29.11% compared with the second-best baseline. Mechanism validation and efficiency analysis (Tables 4, 5) further support these results. Ablation experiments demonstrate that SIAM plays a critical role in maintaining the balance between spectral information and scattering characteristics, whereas DGMLE contributes substantially to structural fidelity. With an average runtime of 1.3033 s, the proposed method achieves an effective trade-off between computational efficiency and fusion quality and remains significantly faster than complex transform-domain approaches such as multiscale non-subsampled shearlet transform combined with parallel convolutional neural networks.

Conclusions  A robust and unsupervised framework for SAR and MS image fusion is proposed. By integrating MS-FIPF-based fractional-order saliency extraction with DGMLE-based gradient-guided energy modeling, the proposed method addresses the long-standing trade-off between noise suppression and detail preservation. Bayesian adaptive fusion and scattering intensity modulation further improve robustness to modality differences. Experimental results confirm that the proposed framework outperforms seven representative fusion algorithms, achieving an average improvement of 29.11% across comprehensive evaluation metrics. Significant gains are observed in noise suppression, structural fidelity, and spectral preservation, demonstrating strong potential for multisource remote sensing data processing.
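For reference, the quoted evaluation indices follow standard definitions. The Python sketch below shows how ERGAS, PSNR, and mutual information are commonly computed; the resolution ratio and histogram bin count are assumptions, since the abstract does not restate the paper's exact evaluation protocol.

import numpy as np

def psnr(reference, fused, peak=255.0):
    """Peak signal-to-noise ratio in dB (standard definition; higher is better)."""
    mse = np.mean((reference.astype(np.float64) - fused.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak * peak / mse)

def ergas(reference, fused, ratio=1.0):
    """ERGAS, the relative dimensionless global error in synthesis (lower is better).

    Inputs are (H, W, bands) arrays; `ratio` is the spatial resolution ratio of
    the two products and is set to 1.0 here as an assumption.
    """
    reference = reference.astype(np.float64)
    fused = fused.astype(np.float64)
    terms = []
    for k in range(reference.shape[-1]):
        rmse_k = np.sqrt(np.mean((reference[..., k] - fused[..., k]) ** 2))
        terms.append((rmse_k / np.mean(reference[..., k])) ** 2)
    return 100.0 * ratio * np.sqrt(np.mean(terms))

def mutual_information(a, b, bins=64):
    """Histogram-based mutual information between two single-band images, in bits."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))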