Multi-Relation Perception Network for Infrared and Visible Image Fusion

LI Xiaoling, CHEN Houjin, LI Yanfeng, SUN Jia, WANG Minjun, CHEN Luyifu

Citation: LI Xiaoling, CHEN Houjin, LI Yanfeng, SUN Jia, WANG Minjun, CHEN Luyifu. Multi-Relation Perception Network for Infrared and Visible Image Fusion[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT231062

doi: 10.11999/JEIT231062
Funds: The National Natural Science Foundation of China (62172029, 62272027), The Natural Science Foundation of Beijing (4232012), The Fundamental Research Funds for the Central Universities (2022YJS013)
Article information
    About the authors:

    LI Xiaoling: Female, Ph.D. candidate; research interests include image fusion and deep learning

    CHEN Houjin: Male, Professor; research interests include image processing and pattern recognition

    LI Yanfeng: Female, Professor; research interests include image processing and deep learning

    SUN Jia: Female, Lecturer; research interests include image processing and pattern recognition

    WANG Minjun: Female, Ph.D. candidate; research interests include medical image processing

    CHEN Luyifu: Male, Ph.D. candidate; research interests include deep learning and pattern recognition

    Corresponding author: CHEN Houjin, hjchen@bjtu.edu.cn

  • CLC number: TN911.73; TP751

  • Abstract: To fully integrate the consistent and complementary features of infrared and visible images, an infrared and visible image fusion method based on multi-relation perception is proposed. The method first extracts source-image features with a dual-branch encoder, then feeds the extracted features into the designed multi-relation-perception cross-modal fusion strategy, and finally reconstructs the fused features with a decoder to generate the final fused image. By building relation perception between features and relation perception between weights, the fusion strategy exploits the interaction of the shared, differential, and cumulative relations between the two modalities to fully integrate the consistent and complementary features of the source images into the fused features. To constrain the network training so that the intrinsic characteristics of the source images are preserved, a wavelet-transform-based loss function is designed to help the fusion process retain the low-frequency and high-frequency components of the source images. Experimental results show that, compared with current deep-learning-based image fusion methods, the proposed method fully integrates the consistent and complementary features of the source images, effectively preserves the background information of the visible image and the thermal targets of the infrared image, and achieves better overall fusion quality than the compared methods.
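
    The abstract describes a three-stage pipeline: a dual-branch encoder, the multi-relation-perception cross-modal fusion strategy (Algorithm 1 below), and a decoder. The following is a minimal PyTorch skeleton of that data flow, under the assumption that the encoder branches, fusion module, and decoder are supplied as sub-modules; the class and argument names are illustrative, not the paper's implementation.

        import torch.nn as nn

        class MultiRelationFusionNet(nn.Module):
            # Hypothetical skeleton of the pipeline described in the abstract:
            # dual-branch encoder -> cross-modal fusion strategy -> decoder.
            def __init__(self, encoder_ir, encoder_vis, fusion, decoder):
                super().__init__()
                self.encoder_ir = encoder_ir    # infrared branch of the dual-branch encoder
                self.encoder_vis = encoder_vis  # visible branch of the dual-branch encoder
                self.fusion = fusion            # multi-relation-perception fusion (Algorithm 1)
                self.decoder = decoder          # reconstructs the fused image from fused features

            def forward(self, ir, vis):
                f_ir = self.encoder_ir(ir)      # source-image features, infrared
                f_vis = self.encoder_vis(vis)   # source-image features, visible
                f_fuse = self.fusion(f_ir, f_vis)
                return self.decoder(f_fuse)     # final fused image
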
  • Figure 1  Image fusion network based on multi-relation perception

    Figure 2  Training loss versus epoch

    Figure 3  Qualitative comparison of different image fusion methods on the M3FD dataset

    Figure 4  Qualitative comparison of different image fusion methods on the MSRS dataset

    Figure 5  Qualitative comparison of different fusion strategies and loss functions on the M3FD dataset

    Figure 6  Qualitative comparison of different fusion strategies and loss functions on the MSRS dataset

     Algorithm 1  Pseudocode of the cross-modal fusion strategy

     Input: infrared image features $ {{\boldsymbol{F}}_{{\text{ir}}}} $, visible image features $ {{\boldsymbol{F}}_{{\text{vis}}}} $
     Output: fused features $ {{\boldsymbol{F}}_{{\text{fuse}}}} $
     do
     (1) Step 1. Compute the shared, differential, and cumulative features:
     (2)  $ {\hat {\boldsymbol{F}}_{\text{s}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}}*{{\boldsymbol{F}}_{{\text{vis}}}} $
     (3)  $ {\hat {\boldsymbol{F}}_{\text{d}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}} - {{\boldsymbol{F}}_{{\text{vis}}}} $
     (4)  $ {\hat {\boldsymbol{F}}_{\text{a}}} \leftarrow {{\boldsymbol{F}}_{{\text{ir}}}} + {{\boldsymbol{F}}_{{\text{vis}}}} $
     (5) Step 2. Compute the coordinate-attention-weighted feature representation of each modality:
     (6)  $ {{\boldsymbol{W}}_{{\text{ir}}}} \leftarrow {{\mathrm{CA}}} \left( {{{\boldsymbol{F}}_{{\text{ir}}}}} \right) $
     (7)  $ {{\boldsymbol{W}}_{{\text{vis}}}} \leftarrow {{\mathrm{CA}}} \left( {{{\boldsymbol{F}}_{{\text{vis}}}}} \right) $
     (8) Step 3. Compute the shared, differential, and cumulative weights:
     (9)  $ {\hat {\boldsymbol{W}}_{\text{s}}} \leftarrow {{\mathrm{Sigmoid}}} \left( {{{\boldsymbol{W}}_{{\text{ir}}}}*{{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
     (10) $ {\hat {\boldsymbol{W}}_{\text{d}}} \leftarrow {{\mathrm{Sigmoid}}} \left( {{{\boldsymbol{W}}_{{\text{ir}}}} - {{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
     (11) $ {\hat {\boldsymbol{W}}_{\text{a}}} \leftarrow {{\mathrm{Sigmoid}}} \left( {{{\boldsymbol{W}}_{{\text{ir}}}} + {{\boldsymbol{W}}_{{\text{vis}}}}} \right) $
     (12) Step 4. Concatenate along the channel dimension to obtain the fused features:
     (13) $ {{\boldsymbol{F}}_{{\text{fuse}}}} \leftarrow {{\mathrm{Cat}}} \left( {{{\hat {\boldsymbol{W}}}_{\text{s}}} * {{\hat {\boldsymbol{F}}}_{\text{s}}},{{\hat {\boldsymbol{W}}}_{\text{d}}} * {{\hat {\boldsymbol{F}}}_{\text{d}}},{{\hat {\boldsymbol{W}}}_{\text{a}}} * {{\hat {\boldsymbol{F}}}_{\text{a}}}} \right) $
     return $ {{\boldsymbol{F}}_{{\text{fuse}}}} $
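
    A minimal PyTorch sketch of Algorithm 1 is given below. The CoordinateAttention class is a compact re-implementation of coordinate attention [12] (the original also uses batch normalization and h-swish); the reduction ratio and the use of one attention module shared by both modalities are assumptions rather than the paper's exact configuration.

        import torch
        import torch.nn as nn

        class CoordinateAttention(nn.Module):
            # Compact coordinate attention [12]: pool along H and W separately,
            # then re-weight the input with direction-aware attention maps.
            def __init__(self, channels, reduction=8):
                super().__init__()
                mid = max(channels // reduction, 8)
                self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
                self.act = nn.ReLU(inplace=True)
                self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
                self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

            def forward(self, x):
                _, _, h, w = x.shape
                x_h = x.mean(dim=3, keepdim=True)                           # (B, C, H, 1)
                x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)       # (B, C, W, 1)
                y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))
                y_h, y_w = torch.split(y, [h, w], dim=2)
                a_h = torch.sigmoid(self.conv_h(y_h))                       # (B, C, H, 1)
                a_w = torch.sigmoid(self.conv_w(y_w)).permute(0, 1, 3, 2)   # (B, C, 1, W)
                return x * a_h * a_w

        class CrossModalFusion(nn.Module):
            # Algorithm 1: feature-level and weight-level relation perception.
            def __init__(self, channels):
                super().__init__()
                self.ca = CoordinateAttention(channels)  # shared by both modalities (assumption)

            def forward(self, f_ir, f_vis):
                # Step 1: shared, differential, and cumulative features.
                f_s, f_d, f_a = f_ir * f_vis, f_ir - f_vis, f_ir + f_vis
                # Step 2: coordinate-attention-weighted features of each modality.
                w_ir, w_vis = self.ca(f_ir), self.ca(f_vis)
                # Step 3: shared, differential, and cumulative weights.
                w_s = torch.sigmoid(w_ir * w_vis)
                w_d = torch.sigmoid(w_ir - w_vis)
                w_a = torch.sigmoid(w_ir + w_vis)
                # Step 4: weight each relation and concatenate along channels.
                return torch.cat([w_s * f_s, w_d * f_d, w_a * f_a], dim=1)

    With C-channel inputs the fused output has 3C channels, so the decoder (or a following convolution) is assumed to reduce the channel count again.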

    Table 1  Quantitative comparison of different image fusion methods on the M3FD and MSRS datasets
    (metrics: MI, $ {Q_{\text{p}}} $, $ {Q_{\text{w}}} $, $ {Q_{{\text{CV}}}} $; ↑ higher is better, ↓ lower is better)

                       M3FD                                     MSRS
    Method             MI↑     Qp↑     Qw↑     QCV↓             MI↑     Qp↑     Qw↑     QCV↓
    CoCoNet[19]        2.7795  0.3292  0.9922   778.5395        2.5757  0.3270  0.9892   847.1889
    LapH[16]           2.6284  0.3785  0.9923   728.6423        2.1692  0.3856  0.9966   436.2333
    MuFusion[17]       2.3480  0.2401  0.9946   875.8217        1.6176  0.2564  0.9964  1203.2658
    SwinFusion[15]     3.3914  0.3733  0.9920   520.3612        3.4785  0.4255  0.9968   283.7614
    TIMFusion[18]      3.0367  0.2095  0.9914   653.1787        3.2023  0.3690  0.9964   314.6660
    TUFusion[20]       2.8821  0.1864  0.9956   611.8218        2.5044  0.2507  0.9973   664.6918
    Proposed           4.4582  0.3835  0.9919   547.9785        4.2963  0.4779  0.9966   241.2004
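
    For reference, the MI column is the information-based fusion measure of Qu et al. [21]: the mutual information between the fused image and each source image, summed. A NumPy sketch follows, assuming 8-bit grayscale inputs and 256 histogram bins ($ {Q_{\text{p}}} $, $ {Q_{\text{w}}} $ and $ {Q_{{\text{CV}}}} $ [22–24] are feature-, structure- and region-based metrics and are not reproduced here).

        import numpy as np

        def mutual_information(a, b, bins=256):
            # I(A;B) estimated from the joint grey-level histogram of two images.
            joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
            pxy = joint / joint.sum()
            px = pxy.sum(axis=1, keepdims=True)   # marginal of a
            py = pxy.sum(axis=0, keepdims=True)   # marginal of b
            nz = pxy > 0                          # avoid log(0)
            return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

        def fusion_mi(fused, ir, vis):
            # MI fusion metric [21]: MI = I(F; IR) + I(F; VIS); higher is better.
            return mutual_information(fused, ir) + mutual_information(fused, vis)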

    Table 2  Quantitative comparison of different fusion strategies and loss functions on the M3FD and MSRS datasets

                                                  M3FD                                     MSRS
    Type             Variant                      MI↑     Qp↑     Qw↑     QCV↓             MI↑     Qp↑     Qw↑     QCV↓
    Fusion strategy  Shared relation only         2.5490  0.1982  0.9935   604.3228        2.8655  0.2595  0.9969   357.8136
                     Differential relation only   2.9560  0.2126  0.9910   615.0088        2.6373  0.3002  0.9963   333.1882
                     Cumulative relation only     2.7379  0.2786  0.9904   794.8503        2.9040  0.3611  0.9964   632.4200
                     Proposed                     4.4582  0.3835  0.9919   547.9785        4.2963  0.4779  0.9966   241.2004
    Loss function    Low-frequency loss only      3.7387  0.2008  0.9903   564.4049        3.7731  0.3468  0.9964   266.5844
                     High-frequency loss only     1.6337  0.1424  0.9947  1030.3694        1.2687  0.1732  0.9956  2393.8342
                     Proposed                     4.4582  0.3835  0.9919   547.9785        4.2963  0.4779  0.9966   241.2004
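
    The low-frequency and high-frequency loss terms ablated in Table 2 come from the wavelet-transform-based loss mentioned in the abstract. The sketch below illustrates the idea in PyTorch under stated assumptions: a single-level Haar decomposition, an element-wise maximum over the two source images as the target in every sub-band, an L1 penalty, and equal term weights; the paper's actual decomposition, aggregation rule and weights are not given here.

        import torch

        def haar_dwt2(x):
            # Single-level 2D Haar decomposition of a (B, C, H, W) tensor with even H, W.
            a = x[..., 0::2, 0::2]   # top-left pixel of each 2x2 block
            b = x[..., 0::2, 1::2]   # top-right
            c = x[..., 1::2, 0::2]   # bottom-left
            d = x[..., 1::2, 1::2]   # bottom-right
            ll = (a + b + c + d) / 2     # low-frequency approximation
            lh = (a + b - c - d) / 2     # horizontal detail
            hl = (a - b + c - d) / 2     # vertical detail
            hh = (a - b - c + d) / 2     # diagonal detail
            return ll, (lh, hl, hh)

        def wavelet_loss(fused, ir, vis, w_low=1.0, w_high=1.0):
            # Hypothetical wavelet-domain loss: keep the stronger source response
            # in the low-frequency band (background/intensity) and in the
            # high-frequency bands (edges, thermal target contours).
            ll_f, hf_f = haar_dwt2(fused)
            ll_i, hf_i = haar_dwt2(ir)
            ll_v, hf_v = haar_dwt2(vis)
            loss_low = (ll_f - torch.maximum(ll_i, ll_v)).abs().mean()
            loss_high = sum((f - torch.maximum(i, v)).abs().mean()
                            for f, i, v in zip(hf_f, hf_i, hf_v))
            return w_low * loss_low + w_high * loss_high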

    Table 3  Comparison of model complexity and runtime of different image fusion methods

                      CoCoNet   LapH     MuFusion   SwinFusion   TIMFusion   TUFusion   Proposed
    Params (M)          9.130    0.134      2.124        0.974       0.127     76.282     14.727
    FLOPs (G)          63.447   16.087    179.166      259.045      45.166    272.992     25.537
    Time (s), M3FD      0.131    0.062      0.670        2.471       0.208      0.234      0.196
    Time (s), MSRS      0.129    0.062      0.683        2.529       0.211      0.233      0.203
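
    The quantities in Table 3 are standard bookkeeping: trainable parameter count, FLOPs from a profiler, and averaged inference time. A PyTorch sketch of the first and last is given below; FLOPs counting typically relies on an external profiler and is omitted. The model is assumed to take the infrared and visible images as two inputs, and the warm-up and repetition counts are arbitrary choices, not those used for the table.

        import time
        import torch

        def param_count_m(model):
            # Trainable parameters in millions, as in the "Params (M)" row.
            return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

        @torch.no_grad()
        def mean_runtime_s(model, ir, vis, warmup=5, runs=50):
            # Mean per-image inference time in seconds on the tensors' device.
            model.eval()
            for _ in range(warmup):
                model(ir, vis)
            if ir.is_cuda:
                torch.cuda.synchronize()
            start = time.perf_counter()
            for _ in range(runs):
                model(ir, vis)
            if ir.is_cuda:
                torch.cuda.synchronize()
            return (time.perf_counter() - start) / runs
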
  • [1] YANG Shen, TIAN Lifan, LIANG Jiaming, et al. Infrared and visible image fusion based on improved dual path generation adversarial network[J]. Journal of Electronics & Information Technology, 2023, 45(8): 3012–3021. doi: 10.11999/JEIT220819.
    [2] GAO Shaobing, ZHAN Zongyi, and KUANG Mei. Multi-scenario aware infrared and visible image fusion framework based on visual multi-pathway mechanism[J]. Journal of Electronics & Information Technology, 2023, 45(8): 2749–2758. doi: 10.11999/JEIT221361.
    [3] XU Guoxia, HE Chunming, WANG Hao, et al. DM-Fusion: Deep model-driven network for heterogeneous image fusion[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023. doi: 10.1109/TNNLS.2023.3238511.
    [4] MA Jiayi, YU Wei, LIANG Pengwei, et al. FusionGAN: A generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11–26. doi: 10.1016/j.inffus.2018.09.004.
    [5] TANG Wei, HE Fazhi, and LIU Yu. YDTR: Infrared and visible image fusion via Y-shape dynamic transformer[J]. IEEE Transactions on Multimedia, 2023, 25: 5413–5428. doi: 10.1109/TMM.2022.3192661.
    [6] LI Hui and WU Xiaojun. DenseFuse: A fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2019, 28(5): 2614–2623. doi: 10.1109/TIP.2018.2887342.
    [7] XU Han, ZHANG Hao, and MA Jiayi. Classification saliency-based rule for visible and infrared image fusion[J]. IEEE Transactions on Computational Imaging, 2021, 7: 824–836. doi: 10.1109/TCI.2021.3100986.
    [8] QU Linhao, LIU Shaolei, WANG Manning, et al. TransMEF: A transformer-based multi-exposure image fusion framework using self-supervised multi-task learning[C]. The 36th AAAI Conference on Artificial Intelligence, Tel Aviv, Israel, 2022: 2126–2134. doi: 10.1609/aaai.v36i2.20109.
    [9] QU Linhao, LIU Shaolei, WANG Manning, et al. TransFuse: A unified transformer-based image fusion framework using self-supervised learning[EB/OL]. https://arxiv.org/abs/2201.07451, 2022. doi: 10.48550/arXiv.2201.07451.
    [10] LI Hui, WU Xiaojun, and KITTLER J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images[J]. Information Fusion, 2021, 73: 72–86. doi: 10.1016/j.inffus.2021.02.023.
    [11] LI Junwu, LI Binhua, JIANG Yaoxi, et al. MrFDDGAN: Multireceptive field feature transfer and dual discriminator-driven generative adversarial network for infrared and color visible image fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 5006228. doi: 10.1109/TIM.2023.3241999.
    [12] HOU Qibin, ZHOU Daquan, and FENG Jiashi. Coordinate attention for efficient mobile network design[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 13708–13717. doi: 10.1109/cvpr46437.2021.01350.
    [13] ZHANG Pengyu, ZHAO Jie, WANG Dong, et al. Visible-thermal UAV tracking: A large-scale benchmark and new baseline[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 8876–8885. doi: 10.1109/cvpr52688.2022.00868.
    [14] LIU Jinyuan, FAN Xin, HUANG Zhanbo, et al. Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 5792–5801. doi: 10.1109/cvpr52688.2022.00571.
    [15] MA Jiayi, TANG Linfeng, FAN Fan, et al. SwinFusion: Cross-domain long-range learning for general image fusion via swin transformer[J]. IEEE/CAA Journal of Automatica Sinica, 2022, 9(7): 1200–1217. doi: 10.1109/JAS.2022.105686.
    [16] LUO Xing, FU Guizhong, YANG Jiangxin, et al. Multi-modal image fusion via deep laplacian pyramid hybrid network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(12): 7354–7369. doi: 10.1109/TCSVT.2023.3281462.
    [17] CHENG Chunyang, XU Tianyang, and WU Xiaojun. MUFusion: A general unsupervised image fusion network based on memory unit[J]. Information Fusion, 2023, 92: 80–92. doi: 10.1016/j.inffus.2022.11.010.
    [18] LIU Risheng, LIU Zhu, LIU Jinyuan, et al. A task-guided, implicitly-searched and meta-initialized deep model for image fusion[EB/OL]. https://arxiv.org/abs/2305.15862, 2023. doi: 10.48550/arXiv.2305.15862.
    [19] LIU Jinyuan, LIN Runjia, WU Guanyao, et al. CoCoNet: Coupled contrastive learning network with multi-level feature ensemble for multi-modality image fusion[J]. International Journal of Computer Vision, 2023. doi: 10.1007/s11263-023-01952-1.
    [20] ZHAO Yangyang, ZHENG Qingchun, ZHU Peihao, et al. TUFusion: A transformer-based universal fusion algorithm for multimodal images[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(3): 1712–1725. doi: 10.1109/TCSVT.2023.3296745.
    [21] QU Guihong, ZHANG Dali, YAN Pingfan, et al. Information measure for performance of image fusion[J]. Electronics Letters, 2002, 38(7): 313–315. doi: 10.1049/el:20020212.
    [22] ZHAO Jiying, LAGANIERE R, and LIU Zheng. Performance assessment of combinative pixel-level image fusion based on an absolute feature measurement[J]. International Journal of Innovative Computing, Information and Control, 2007, 3(6(A)): 1433–1447.
    [23] PIELLA G and HEIJMANS H. A new quality metric for image fusion[C]. The 2003 International Conference on Image Processing, Barcelona, Spain, 2003: 173–176. doi: 10.1109/ICIP.2003.1247209.
    [24] CHEN Hao and VARSHNEY P K. A human perception inspired quality metric for image fusion based on regional information[J]. Information Fusion, 2007, 8(2): 193–207. doi: 10.1016/j.inffus.2005.10.001.
    [25] ZHANG Xingchen. Deep learning-based multi-focus image fusion: A survey and a comparative study[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(9): 4819–4838. doi: 10.1109/TPAMI.2021.3078906.
Publication history
  • Received: 2023-10-07
  • Revised: 2024-04-15
  • Published online: 2024-04-27
