Infrared and Visible Image Fusion Network with Multi-Relation Perception
-
Abstract: A multi-relation perception network for infrared and visible image fusion is proposed to fully integrate the consistent and complementary features of infrared and visible images. First, a dual-branch encoder extracts features from the source images. The extracted features are then fed into a cross-modal fusion strategy built on multi-relation perception. Finally, a decoder reconstructs the fused features to generate the final fused image. The fusion strategy constructs relation perception both between features and between weights, and exploits the interactions among the shared, differential, and cumulative relationships across modalities, so that the consistent and complementary features of the source images are fully integrated into the fused features. To constrain network training so that the intrinsic characteristics of the source images are preserved, a wavelet transform-based loss function is designed to help retain the low-frequency and high-frequency components of the source images during fusion. Experimental results show that, compared with state-of-the-art deep learning-based image fusion methods, the proposed method fully integrates the consistent and complementary features of the source images, effectively preserving the background information of the visible image and the thermal targets of the infrared image, and its overall fusion performance surpasses that of the compared methods.
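The wavelet transform-based loss is only described at a high level in this excerpt. As an illustration, the sketch below shows one plausible way such a loss could be written in PyTorch, using a hand-rolled single-level Haar DWT; the band-combination rules (averaged low-frequency band, maximum-magnitude detail coefficients) and the weights `alpha`/`beta` are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor):
    """Single-level 2D Haar DWT of a (B, 1, H, W) image with even H and W.

    Returns the low-frequency band LL and the three stacked
    high-frequency bands (LH, HL, HH).
    """
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
    hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1).to(x)  # (4, 1, 2, 2)
    bands = F.conv2d(x, kernels, stride=2)                      # (B, 4, H/2, W/2)
    return bands[:, :1], bands[:, 1:]


def wavelet_fusion_loss(fused, ir, vis, alpha=1.0, beta=1.0):
    """Hypothetical wavelet-domain fusion loss (not the paper's exact form).

    The fused image is pushed towards the averaged low-frequency content of
    the two sources and towards the detail coefficients of larger magnitude.
    """
    ll_f, hi_f = haar_dwt(fused)
    ll_ir, hi_ir = haar_dwt(ir)
    ll_vis, hi_vis = haar_dwt(vis)
    low_loss = F.l1_loss(ll_f, 0.5 * (ll_ir + ll_vis))
    target_hi = torch.where(hi_ir.abs() >= hi_vis.abs(), hi_ir, hi_vis)
    high_loss = F.l1_loss(hi_f, target_hi)
    return alpha * low_loss + beta * high_loss
```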
-
Key words:
- Image fusion
- Infrared image
- Visible image
- Multi-relation perception
- Wavelet transform
-
Algorithm 1 Pseudocode of the cross-modal fusion strategy
Input: infrared image features $ {{\boldsymbol{F}}_{{\text{ir}}}} $, visible image features $ {{\boldsymbol{F}}_{{\text{vis}}}} $
Output: fused features $ {{\boldsymbol{F}}_{{\text{fuse}}}} $
(1) Step 1: compute the shared, differential, and cumulative features:
(2) $ {\hat {\boldsymbol{F}}_{\text{s}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}} * {{\boldsymbol{F}}_{{\text{vis}}}} $
(3) $ {\hat {\boldsymbol{F}}_{\text{d}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}} - {{\boldsymbol{F}}_{{\text{vis}}}} $
(4) $ {\hat {\boldsymbol{F}}_{\text{a}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}} + {{\boldsymbol{F}}_{{\text{vis}}}} $
(5) Step 2: compute the coordinate-attention-weighted representation of each modality:
(6) $ {{\boldsymbol{W}}_{{\text{ir}}}} \leftarrow {\mathrm{CA}}\left( {{{\boldsymbol{F}}_{{\text{ir}}}}} \right) $
(7) $ {{\boldsymbol{W}}_{{\text{vis}}}} \leftarrow {\mathrm{CA}}\left( {{{\boldsymbol{F}}_{{\text{vis}}}}} \right) $
(8) Step 3: compute the shared, differential, and cumulative weights:
(9) $ {\hat {\boldsymbol{W}}_{\text{s}}} \leftarrow {\mathrm{Sigmoid}}\left( {{{\boldsymbol{W}}_{{\text{ir}}}} * {{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
(10) $ {\hat {\boldsymbol{W}}_{\text{d}}} \leftarrow {\mathrm{Sigmoid}}\left( {{{\boldsymbol{W}}_{{\text{ir}}}} - {{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
(11) $ {\hat {\boldsymbol{W}}_{\text{a}}} \leftarrow {\mathrm{Sigmoid}}\left( {{{\boldsymbol{W}}_{{\text{ir}}}} + {{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
(12) Step 4: concatenate along the channel dimension to obtain the fused features:
(13) $ {{\boldsymbol{F}}_{{\text{fuse}}}} \leftarrow {\mathrm{Cat}}\left( {{{\hat {\boldsymbol{W}}}_{\text{s}}} * {{\hat {\boldsymbol{F}}}_{\text{s}}},{{\hat {\boldsymbol{W}}}_{\text{d}}} * {{\hat {\boldsymbol{F}}}_{\text{d}}},{{\hat {\boldsymbol{W}}}_{\text{a}}} * {{\hat {\boldsymbol{F}}}_{\text{a}}}} \right) $
return $ {{\boldsymbol{F}}_{{\text{fuse}}}} $
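For reference, a minimal PyTorch sketch of Algorithm 1 follows. The coordinate attention block CA is that of [12] but is left as a pluggable module here, and the assumption that both modalities share a single CA instance is made for the sketch only; channel counts and layer configuration follow the paper's network design and are not reproduced here.

```python
import torch
import torch.nn as nn


class MultiRelationFusion(nn.Module):
    """Sketch of the cross-modal fusion strategy in Algorithm 1.

    `ca` stands in for the coordinate attention block [12]; any module that
    maps a (B, C, H, W) feature map to weights of the same shape will do.
    """

    def __init__(self, ca: nn.Module):
        super().__init__()
        self.ca = ca  # assumed to be shared by both modalities in this sketch

    def forward(self, f_ir: torch.Tensor, f_vis: torch.Tensor) -> torch.Tensor:
        # Step 1: shared, differential, and cumulative features
        f_s = f_ir * f_vis
        f_d = f_ir - f_vis
        f_a = f_ir + f_vis
        # Step 2: coordinate-attention weighting of each modality
        w_ir, w_vis = self.ca(f_ir), self.ca(f_vis)
        # Step 3: shared, differential, and cumulative weights
        w_s = torch.sigmoid(w_ir * w_vis)
        w_d = torch.sigmoid(w_ir - w_vis)
        w_a = torch.sigmoid(w_ir + w_vis)
        # Step 4: weight each relation and concatenate along channels
        return torch.cat((w_s * f_s, w_d * f_d, w_a * f_a), dim=1)


# Shape check with an identity "attention" placeholder:
# fuse = MultiRelationFusion(ca=nn.Identity())
# f_ir = f_vis = torch.randn(1, 64, 128, 128)
# fuse(f_ir, f_vis).shape  ->  torch.Size([1, 192, 128, 128])
```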
Table 1 Quantitative comparison of different image fusion methods on the M3FD and MSRS datasets

| Method | M3FD MI↑ | M3FD $Q_{\text{p}}$↑ | M3FD $Q_{\text{w}}$↑ | M3FD $Q_{\text{CV}}$↓ | MSRS MI↑ | MSRS $Q_{\text{p}}$↑ | MSRS $Q_{\text{w}}$↑ | MSRS $Q_{\text{CV}}$↓ |
|---|---|---|---|---|---|---|---|---|
| CoCoNet[19] | 2.7795 | 0.3292 | 0.9922 | 778.5395 | 2.5757 | 0.3270 | 0.9892 | 847.1889 |
| LapH[16] | 2.6284 | 0.3785 | 0.9923 | 728.6423 | 2.1692 | 0.3856 | 0.9966 | 436.2333 |
| MuFusion[17] | 2.3480 | 0.2401 | 0.9946 | 875.8217 | 1.6176 | 0.2564 | 0.9964 | 1203.2658 |
| SwinFusion[15] | 3.3914 | 0.3733 | 0.9920 | 520.3612 | 3.4785 | 0.4255 | 0.9968 | 283.7614 |
| TIMFusion[18] | 3.0367 | 0.2095 | 0.9914 | 653.1787 | 3.2023 | 0.3690 | 0.9964 | 314.6660 |
| TUFusion[20] | 2.8821 | 0.1864 | 0.9956 | 611.8218 | 2.5044 | 0.2507 | 0.9973 | 664.6918 |
| Proposed | 4.4582 | 0.3835 | 0.9919 | 547.9785 | 4.2963 | 0.4779 | 0.9966 | 241.2004 |
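In Table 1, MI is the mutual-information metric of [21], and $Q_{\text{p}}$, $Q_{\text{w}}$, and $Q_{\text{CV}}$ follow [22–24]; higher MI, $Q_{\text{p}}$, and $Q_{\text{w}}$ and lower $Q_{\text{CV}}$ indicate better fusion. As a reference point, the sketch below shows a straightforward histogram-based MI estimator of the kind used for this metric (the 256-bin histogram and 8-bit intensity range are assumptions).

```python
import numpy as np


def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 256) -> float:
    """Histogram-based mutual information between two 8-bit grayscale images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal distribution of a
    py = pxy.sum(axis=0, keepdims=True)   # marginal distribution of b
    nz = pxy > 0                          # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def fusion_mi(fused: np.ndarray, ir: np.ndarray, vis: np.ndarray) -> float:
    """MI fusion metric [21]: information carried from both sources into the fused image."""
    return mutual_information(fused, ir) + mutual_information(fused, vis)
```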
Table 2 Quantitative comparison of different fusion strategies and loss functions on the M3FD and MSRS datasets

| Type | Variant | M3FD MI↑ | M3FD $Q_{\text{p}}$↑ | M3FD $Q_{\text{w}}$↑ | M3FD $Q_{\text{CV}}$↓ | MSRS MI↑ | MSRS $Q_{\text{p}}$↑ | MSRS $Q_{\text{w}}$↑ | MSRS $Q_{\text{CV}}$↓ |
|---|---|---|---|---|---|---|---|---|---|
| Fusion strategy | Shared relationship only | 2.5490 | 0.1982 | 0.9935 | 604.3228 | 2.8655 | 0.2595 | 0.9969 | 357.8136 |
| | Differential relationship only | 2.9560 | 0.2126 | 0.9910 | 615.0088 | 2.6373 | 0.3002 | 0.9963 | 333.1882 |
| | Cumulative relationship only | 2.7379 | 0.2786 | 0.9904 | 794.8503 | 2.9040 | 0.3611 | 0.9964 | 632.4200 |
| | Proposed | 4.4582 | 0.3835 | 0.9919 | 547.9785 | 4.2963 | 0.4779 | 0.9966 | 241.2004 |
| Loss function | Low-frequency loss only | 3.7387 | 0.2008 | 0.9903 | 564.4049 | 3.7731 | 0.3468 | 0.9964 | 266.5844 |
| | High-frequency loss only | 1.6337 | 0.1424 | 0.9947 | 1030.3694 | 1.2687 | 0.1732 | 0.9956 | 2393.8342 |
| | Proposed | 4.4582 | 0.3835 | 0.9919 | 547.9785 | 4.2963 | 0.4779 | 0.9966 | 241.2004 |
Table 3 Comparison of model complexity and runtime of different image fusion methods

| | CoCoNet | LapH | MuFusion | SwinFusion | TIMFusion | TUFusion | Proposed |
|---|---|---|---|---|---|---|---|
| Params (M) | 9.130 | 0.134 | 2.124 | 0.974 | 0.127 | 76.282 | 14.727 |
| FLOPs (G) | 63.447 | 16.087 | 179.166 | 259.045 | 45.166 | 272.992 | 25.537 |
| Time (s), M3FD | 0.131 | 0.062 | 0.670 | 2.471 | 0.208 | 0.234 | 0.196 |
| Time (s), MSRS | 0.129 | 0.062 | 0.683 | 2.529 | 0.211 | 0.233 | 0.203 |
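The parameter counts and per-pair runtimes in Table 3 can be measured with standard PyTorch calls; a generic sketch is given below. The two-input model signature and the warm-up/repetition counts are assumptions, and the FLOPs row would additionally require a profiler such as thop or fvcore rather than the plain PyTorch code shown here.

```python
import time

import torch


def count_parameters_m(model: torch.nn.Module) -> float:
    """Trainable parameters in millions (the 'Params (M)' row)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6


@torch.no_grad()
def average_runtime_s(model, ir, vis, warmup: int = 5, runs: int = 20) -> float:
    """Mean forward-pass time per image pair in seconds (the 'Time (s)' rows)."""
    model.eval()
    for _ in range(warmup):          # discard warm-up iterations
        model(ir, vis)
    if torch.cuda.is_available():
        torch.cuda.synchronize()     # make sure queued GPU work has finished
    start = time.perf_counter()
    for _ in range(runs):
        model(ir, vis)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```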
-
[1] YANG Shen, TIAN Lifan, LIANG Jiaming, et al. Infrared and visible image fusion based on improved dual path generation adversarial network[J]. Journal of Electronics & Information Technology, 2023, 45(8): 3012–3021. doi: 10.11999/JEIT220819.
[2] GAO Shaobing, ZHAN Zongyi, and KUANG Mei. Multi-scenario aware infrared and visible image fusion framework based on visual multi-pathway mechanism[J]. Journal of Electronics & Information Technology, 2023, 45(8): 2749–2758. doi: 10.11999/JEIT221361.
[3] XU Guoxia, HE Chunming, WANG Hao, et al. DM-Fusion: Deep model-driven network for heterogeneous image fusion[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023: 1–15. doi: 10.1109/TNNLS.2023.3238511.
[4] MA Jiayi, YU Wei, LIANG Pengwei, et al. FusionGAN: A generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11–26. doi: 10.1016/j.inffus.2018.09.004.
[5] TANG Wei, HE Fazhi, and LIU Yu. YDTR: Infrared and visible image fusion via Y-shape dynamic transformer[J]. IEEE Transactions on Multimedia, 2023, 25: 5413–5428. doi: 10.1109/TMM.2022.3192661.
[6] LI Hui and WU Xiaojun. DenseFuse: A fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2019, 28(5): 2614–2623. doi: 10.1109/TIP.2018.2887342.
[7] XU Han, ZHANG Hao, and MA Jiayi. Classification saliency-based rule for visible and infrared image fusion[J]. IEEE Transactions on Computational Imaging, 2021, 7: 824–836. doi: 10.1109/TCI.2021.3100986.
[8] QU Linhao, LIU Shaolei, WANG Manning, et al. TransMEF: A transformer-based multi-exposure image fusion framework using self-supervised multi-task learning[C]. The 36th AAAI Conference on Artificial Intelligence, Tel Aviv, Israel, 2022: 2126–2134. doi: 10.1609/aaai.v36i2.20109.
[9] QU Linhao, LIU Shaolei, WANG Manning, et al. TransFuse: A unified transformer-based image fusion framework using self-supervised learning[EB/OL]. https://arxiv.org/abs/2201.07451, 2022. doi: 10.48550/arXiv.2201.07451.
[10] LI Hui, WU Xiaojun, and KITTLER J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images[J]. Information Fusion, 2021, 73: 72–86. doi: 10.1016/j.inffus.2021.02.023.
[11] LI Junwu, LI Binhua, JIANG Yaoxi, et al. MrFDDGAN: Multireceptive field feature transfer and dual discriminator-driven generative adversarial network for infrared and color visible image fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 5006228. doi: 10.1109/TIM.2023.3241999.
[12] HOU Qibin, ZHOU Daquan, and FENG Jiashi. Coordinate attention for efficient mobile network design[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 13708–13717. doi: 10.1109/cvpr46437.2021.01350.
[13] ZHANG Pengyu, ZHAO Jie, WANG Dong, et al. Visible-thermal UAV tracking: A large-scale benchmark and new baseline[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 8876–8885. doi: 10.1109/cvpr52688.2022.00868.
[14] LIU Jinyuan, FAN Xin, HUANG Zhanbo, et al. Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 5792–5801. doi: 10.1109/cvpr52688.2022.00571.
[15] MA Jiayi, TANG Linfeng, FAN Fan, et al. SwinFusion: Cross-domain long-range learning for general image fusion via swin transformer[J]. IEEE/CAA Journal of Automatica Sinica, 2022, 9(7): 1200–1217. doi: 10.1109/JAS.2022.105686.
[16] LUO Xing, FU Guizhong, YANG Jiangxin, et al. Multi-modal image fusion via deep laplacian pyramid hybrid network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(12): 7354–7369. doi: 10.1109/TCSVT.2023.3281462.
[17] CHENG Chunyang, XU Tianyang, and WU Xiaojun. MUFusion: A general unsupervised image fusion network based on memory unit[J]. Information Fusion, 2023, 92: 80–92. doi: 10.1016/j.inffus.2022.11.010.
[18] LIU Risheng, LIU Zhu, LIU Jinyuan, et al. A task-guided, implicitly-searched and meta-initialized deep model for image fusion[EB/OL]. https://arxiv.org/abs/2305.15862, 2023. doi: 10.48550/arXiv.2305.15862.
[19] LIU Jinyuan, LIN Runjia, WU Guanyao, et al. CoCoNet: Coupled contrastive learning network with multi-level feature ensemble for multi-modality image fusion[J]. International Journal of Computer Vision, 2023. doi: 10.1007/s11263-023-01952-1.
[20] ZHAO Yangyang, ZHENG Qingchun, ZHU Peihao, et al. TUFusion: A transformer-based universal fusion algorithm for multimodal images[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(3): 1712–1725. doi: 10.1109/TCSVT.2023.3296745.
[21] QU Guihong, ZHANG Dali, YAN Pingfan, et al. Information measure for performance of image fusion[J]. Electronics Letters, 2002, 38(7): 313–315. doi: 10.1049/el:20020212.
[22] ZHAO Jiying, LAGANIERE R, and LIU Zheng. Performance assessment of combinative pixel-level image fusion based on an absolute feature measurement[J]. International Journal of Innovative Computing, Information and Control, 2007, 3(6A): 1433–1447.
[23] PIELLA G and HEIJMANS H. A new quality metric for image fusion[C]. The 2003 International Conference on Image Processing, Barcelona, Spain, 2003: 173–176. doi: 10.1109/ICIP.2003.1247209.
[24] CHEN Hao and VARSHNEY P K. A human perception inspired quality metric for image fusion based on regional information[J]. Information Fusion, 2007, 8(2): 193–207. doi: 10.1016/j.inffus.2005.10.001.
[25] ZHANG Xingchen. Deep learning-based multi-focus image fusion: A survey and a comparative study[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(9): 4819–4838. doi: 10.1109/TPAMI.2021.3078906.