FCSNet: A Frequency-Domain Aware Cross-Feature Fusion Network for Smoke Segmentation

WANG Kaizheng, ZENG Yao, ZHANG Zhanxi, TAN Yizhang, WEN Gang

Citation: WANG Kaizheng, ZENG Yao, ZHANG Zhanxi, TAN Yizhang, WEN Gang. FCSNet: A Frequency-Domain Aware Cross-Feature Fusion Network for Smoke Segmentation[J]. Journal of Electronics & Information Technology, 2025, 47(7): 2320-2333. doi: 10.11999/JEIT241021


doi: 10.11999/JEIT241021 cstr: 32379.14.JEIT241021
Funds: The National Natural Science Foundation of China (52107017), Yunnan Provincial Department of Science and Technology Basic Research Special Youth Project (202201AU070172)
Article information
    Author biographies:

    WANG Kaizheng: Male, Associate Professor. Research interests: power grid disaster prevention and mitigation, and insulation failure mechanisms and protection techniques for power equipment

    ZENG Yao: Male, Master's student. Research interests: power grid disaster prevention and mitigation, and insulation failure mechanisms and protection techniques for power equipment

    ZHANG Zhanxi: Male, Master. Research interests: deep learning and object segmentation

    TAN Yizhang: Male, Master's student. Research interests: power grid disaster prevention and mitigation, and insulation failure mechanisms and protection techniques for power equipment

    WEN Gang: Male, Master. Research interests: power grid disaster detection and deep learning

    Corresponding author:

    WANG Kaizheng, kz.wang@foxmail.com

  • CLC number: TP391

  • Abstract: Owing to its non-rigid structure, semi-transparency, and highly variable shape, smoke is considerably harder to segment semantically than most other objects. To address this, this paper designs a Frequency-domain aware Cross-feature fusion smoke Segmentation Network (FCSNet) for smoke segmentation in real-world scenes. The network consists of a Transformer branch, a Convolutional Neural Network (CNN) branch, a feature fusion branch, and a multi-level high-frequency perception branch. To capture global contextual information while reducing the computational complexity of the Transformer branch, a frequency Transformer branch is proposed, which extracts global features via the Fourier transform and uses the low-frequency amplitude in the frequency domain to represent large-scale semantic structure. In addition, a Domain Interaction Module (DIM) is proposed, which uses weighted fusion operations and a coordinate attention mechanism to promote the collaborative learning and integration of information from different feature sources, effectively fusing global and local information. To fully exploit the segmentation cues inherent in edge information and to resolve inaccurate segmentation in confusable regions, a Multi-level High-Frequency perception Module (MHFM) is proposed to extract accurate high-frequency edge information. Considering the non-rigid nature of smoke, a Multi-directional Cross-Attention module (MHCA) is designed, which computes the similarity between edge feature maps and decoder feature maps to guide the segmentation of confusable regions. Experimental results show that the network achieves mean Intersection over Union (mIoU) scores of 58.59% and 63.92% on the two test sets of a real-scene smoke segmentation dataset, and 78.94% on the SMOKE5K dataset. Compared with other methods, the network delivers more accurate smoke localization and finer smoke edges.
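The abstract's central idea, recovering large-scale semantic structure from the low-frequency part of the spectrum while treating the high-frequency remainder as edge information, can be sketched compactly. The following is an illustrative sketch only, not the authors' implementation: the function name, the circular low-pass mask, and the `radius_ratio` cutoff are assumptions.

```python
import torch
import torch.fft

def frequency_decouple(x: torch.Tensor, radius_ratio: float = 0.25):
    """Split a (B, C, H, W) feature map into low- and high-frequency parts.

    `radius_ratio` is a hypothetical cutoff, not a value from the paper.
    """
    _, _, H, W = x.shape
    # 2D FFT, with the zero-frequency component shifted to the spectrum centre.
    freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
    # Circular low-pass mask around the centre of the spectrum.
    yy, xx = torch.meshgrid(
        torch.arange(H, device=x.device) - H // 2,
        torch.arange(W, device=x.device) - W // 2,
        indexing="ij",
    )
    dist = torch.sqrt(yy.float() ** 2 + xx.float() ** 2)
    low_mask = (dist <= radius_ratio * min(H, W) / 2).to(x.dtype)
    # Low frequencies carry large-scale semantic structure; the residual
    # high-frequency content carries edges and fine texture.
    low = torch.fft.ifft2(
        torch.fft.ifftshift(freq * low_mask, dim=(-2, -1)), norm="ortho"
    ).real
    return low, x - low
```

The guidance step described for the MHCA, in which similarity between edge feature maps and decoder feature maps steers the segmentation of confusable regions, can likewise be sketched as a single cross-attention layer. The multi-directional design of the paper is not reproduced here; the class name and head count are assumptions.

```python
import torch
import torch.nn as nn

class EdgeGuidedCrossAttention(nn.Module):
    """Decoder features attend to edge features: a simplified,
    single-direction stand-in for the paper's MHCA."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, decoder_feat: torch.Tensor, edge_feat: torch.Tensor):
        # (B, C, H, W) -> (B, H*W, C) token sequences.
        B, C, H, W = decoder_feat.shape
        q = decoder_feat.flatten(2).transpose(1, 2)
        kv = edge_feat.flatten(2).transpose(1, 2)
        # Query-key similarity re-weights edge values, guiding the decoder
        # in confusable (e.g. thin-smoke) regions.
        out, _ = self.attn(q, kv, kv)
        return out.transpose(1, 2).reshape(B, C, H, W)
```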
  • Figure 1  The smoke segmentation network

    Figure 2  Different Transformer architectures

    Figure 3  Illustration of Fourier spectrum decoupling

    Figure 4  Framework of the Domain Interaction Module (DIM)

    Figure 5  Detailed structure of the MHFM

    Figure 6  Structure of the MHCA

    Figure 7  Visualization of spatial attention maps

    Figure 8  Results on the RealSmoke-1 dataset

    Figure 9  Results on the RealSmoke-2 dataset

    Figure 10  Results on the SMOKE5K dataset

    Table 1  Ablation study of the frequency Transformer branch (values in parentheses are reductions relative to Model1)

    Method          mIoU (%)                      Params (M)       FLOPs (G)       FPS
                    RealSmoke-1    RealSmoke-2
    Model1          58.81          64.89          23.67            11.95           28.34
    Model2          56.12          58.37          19.16 (19.05%)   11.11 (7.03%)   30.93
    Model3          57.74          61.74          20.29 (14.28%)   11.11 (7.03%)   30.15
    FCSNet (ours)   58.59 (0.22)   63.92 (0.97)   19.72 (16.69%)   11.11 (7.03%)   30.49

    Table 2  Ablation study of the DIM

    Method          mIoU (%)
                    RealSmoke-1    RealSmoke-2
    Model1          53.78          51.23
    Model2          54.73          52.81
    Model3          58.15          55.31
    FCSNet (ours)   58.59          63.92

    Table 3  Ablation study of the MHFM and MHCA

    Method          mIoU (%)
                    RealSmoke-1    RealSmoke-2
    Model1          53.51          54.32
    Model2          55.87          58.69
    Model3          56.53          57.53
    Model4          58.21          59.19
    FCSNet (ours)   58.59          63.92

    Table 4  Smoke segmentation performance on the RealSmoke datasets

    Method          mIoU (%)                      Params (M)   FLOPs (G)
                    RealSmoke-1    RealSmoke-2
    FCN8s           52.47          54.32          14.42        19.69
    W-Net           56.63          57.53          31.75        12.50
    DMAENet         56.86          58.37          32.61        44.08
    Trans-BVM       58.17          63.24          51.62        /
    DSS             50.46          52.32          28.80        21.62
    TransUnet       54.56          61.32          75.65        22.23
    TransFuse       53.71          61.56          25.02        8.44
    FCSNet (ours)   58.59          63.92          19.72        11.11

    Table 5  Segmentation performance on the SMOKE5K dataset

    Method          mIoU (%)
    FCN8s           71.39
    W-Net           74.65
    Trans-BVM       77.69
    DSS             68.87
    TransFuse       75.87
    DMAENet         76.06
    FCSNet (ours)   78.94
Figures (10) / Tables (5)
Publication history
  • Received: 2024-11-14
  • Revised: 2025-05-28
  • Published online: 2025-06-10
  • Issue published: 2025-07-22
