Integrating Multiple Context and Hybrid Interaction for Salient Object Detection

XIA Chenxing, CHEN Xinyu, SUN Yanguang, GE Bin, FANG Xianjin, GAO Xiuju, ZHANG Yan

Citation: XIA Chenxing, CHEN Xinyu, SUN Yanguang, GE Bin, FANG Xianjin, GAO Xiuju, ZHANG Yan. Integrating Multiple Context and Hybrid Interaction for Salient Object Detection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT230719

doi: 10.11999/JEIT230719
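Author information:

    XIA Chenxing: male, associate professor; research interest: computer vision

    CHEN Xinyu: female, master's student; research interest: camouflaged object detection

    SUN Yanguang: male, PhD student; research interest: salient object detection

    GE Bin: male, professor; research interests: image encryption, computer vision

    FANG Xianjin: male, professor; research interests: artificial intelligence, information security

    GAO Xiuju: female, lecturer; research interest: computer vision

    ZHANG Yan: female, associate professor; research interests: pattern recognition and image processing

    Corresponding author:

    CHEN Xinyu 18900573647@163.com

  • CLC number: TP391.41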

Funds: National Natural Science Foundation of China (62102003), Natural Science Foundation of Anhui Province (2108085QF258), Anhui Postdoctoral Science Foundation (2022B623), Huainan City Science and Technology Plan Project (2023A316), University Synergy Innovation Program of Anhui Province (GXXT-2021-006, GXXT-2022-038), University-level General Projects of Anhui University of Science and Technology (xjyb2020-04), Central Guiding Local Technology Development Special Funds (202107d06020001)
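  • Abstract: Salient object detection (SOD) aims to identify and segment the visually salient objects in an image, and it is an important research topic in computer vision and related fields. Methods based on fully convolutional networks have achieved good performance. However, salient objects in real scenes vary in type and size, so detecting them accurately and segmenting them completely remains a major challenge. To this end, this paper proposes a salient object detection method that integrates multiple contexts and hybrid interaction, using a dense context information exploration (DCIE) module and a multi-source feature hybrid interaction (MFHI) module to predict salient objects efficiently. The DCIE module uses dilated convolution, asymmetric convolution, and dense guided connections to progressively capture strongly correlated multi-scale and multi-receptive-field context information, and integrates this information to strengthen the representation of each initial multi-level feature. The MFHI module contains several feature aggregation operations and adaptively exchanges complementary information among multi-level features to generate high-quality feature representations for accurate saliency map prediction. The method is evaluated on five public datasets, and the experimental results show that it outperforms 19 deep-learning-based salient object detection methods under different evaluation metrics.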
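The abstract mentions dilated and asymmetric convolutions as the way to obtain several receptive fields. As a small worked illustration, the sketch below computes the span covered by a dilated kernel and compares the weight count of an asymmetric pair (1×k followed by k×1) with a full k×k kernel. The function names and the dilation rates (1, 2, 4) are illustrative assumptions, not values taken from the paper.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Span (in pixels per side) covered by a k-tap kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def weights_full(k: int) -> int:
    """Spatial weights of a standard k x k kernel (channels and bias ignored)."""
    return k * k


def weights_asymmetric(k: int) -> int:
    """Spatial weights of a 1 x k kernel followed by a k x 1 kernel."""
    return 2 * k


if __name__ == "__main__":
    for d in (1, 2, 4):  # hypothetical dilation rates, chosen only for illustration
        span = effective_kernel_size(3, d)
        print(f"3x3 kernel, dilation {d}: covers {span}x{span} pixels")
    print("7x7 full:", weights_full(7), "weights; 1x7 + 7x1:", weights_asymmetric(7))
```

With these assumed rates, a 3×3 kernel covers 3, 5, and 9 pixels per side while keeping nine weights, which is why stacking several rates gives several receptive fields at little extra cost.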
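  • Figure 1  Predicted saliency maps of several existing methods and the proposed method

    Figure 2  Complete pipeline of the proposed salient object detection method

    Figure 3  Dense context information exploration (DCIE) module

    Figure 4  Multi-source feature hybrid interaction (MFHI) module

    Figure 5  Comparison of PR curves of different salient object detection methods

    Figure 6  Visual comparison of the proposed method with 10 recent SOD methods

    Figure 7  Comparison of F-measure (Fm) curves of different salient object detection methods

    Figure 8  Qualitative comparison of the proposed method with different modules

    Figure 9  Qualitative comparison of different aggregation strategies in the MFHI module

    Figure 10  Comparison of saliency maps predicted by different modules

    Figure 11  Comparison of saliency maps predicted by different structures of the DCIE module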

    Table 1  Quantitative comparison in terms of MAE, AFm, and WFm (number of test images in parentheses)

    Method   Year  ECSSD (1000)          PASCAL-S (850)        HKU-IS (4447)         DUT-OMRON (5168)      DUTS-TE (5019)
                   MAE    AFm    WFm     MAE    AFm    WFm     MAE    AFm    WFm     MAE    AFm    WFm     MAE    AFm    WFm
    Amulet   2017  0.059  0.868  0.840   0.100  0.757  0.728   0.051  0.841  0.817   0.098  0.647  0.626   0.085  0.678  0.658
    UCF      2017  0.069  0.844  0.806   0.116  0.726  0.726   0.062  0.823  0.779   0.120  0.621  0.574   0.112  0.631  0.596
    DGRL     2018  0.046  0.893  0.871   0.077  0.794  0.772   0.041  0.875  0.851   0.066  0.711  0.688   0.054  0.755  0.748
    BDMPM    2018  0.045  0.869  0.871   0.074  0.758  0.774   0.039  0.871  0.859   0.064  0.692  0.681   0.049  0.746  0.761
    PoolNet  2019  0.039  0.915  0.896   0.075  0.815  0.793   0.032  0.900  0.883   0.056  0.739  0.721   0.040  0.809  0.807
    CPD      2019  0.037  0.917  0.898   0.071  0.820  0.794   0.033  0.895  0.879   0.056  0.747  0.719   0.043  0.805  0.795
    AFNet    2019  0.042  0.908  0.886   0.070  0.815  0.792   0.036  0.888  0.869   0.057  0.739  0.717   0.046  0.793  0.785
    R2Net    2020  0.038  0.914  0.899   0.069  0.817  0.793   0.033  0.896  0.880   0.054  0.744  0.728   0.041  0.801  0.804
    GateNet  2020  0.040  0.916  0.894   0.067  0.819  0.797   0.033  0.899  0.880   0.055  0.746  0.729   0.040  0.807  0.809
    ITSD     2020  0.035  0.895  0.911   0.066  0.785  0.812   0.031  0.899  0.894   0.061  0.756  0.750   0.041  0.804  0.824
    MINet    2020  0.034  0.924  0.911   0.064  0.829  0.809   0.029  0.909  0.897   0.056  0.756  0.738   0.037  0.828  0.825
    SUCA     2021  0.036  0.915  0.906   0.067  0.818  0.803   0.031  0.897  0.890   -      -      -       0.044  0.803  0.802
    CANet    2021  0.044  0.900  0.878   0.073  0.813  0.792   0.037  0.882  0.866   0.058  0.731  0.720   0.044  0.785  0.788
    DSRNet   2021  0.039  0.910  0.891   0.067  0.819  0.801   0.035  0.893  0.873   0.061  0.727  0.711   0.043  0.791  0.794
    VST      2021  0.033  0.920  0.910   0.061  0.829  0.816   0.029  0.900  0.897   0.058  0.756  0.755   0.037  0.818  0.828
    DNA      2022  0.042  0.891  0.883   0.079  0.790  0.772   0.035  0.863  0.864   0.063  0.694  0.696   0.046  0.747  0.765
    DCENet   2022  0.035  0.926  0.913   0.061  0.845  0.825   0.029  0.908  0.898   0.055  0.771  0.754   0.038  0.842  0.834
    DNTDF    2022  0.034  0.900  0.909   0.064  0.810  0.814   0.028  0.905  0.901   0.051  0.748  0.732   0.033  0.822  0.839
    ICON     2023  0.032  0.928  0.918   0.064  0.833  0.818   0.029  0.910  0.902   0.057  0.772  0.761   0.037  0.838  0.837
    Ours     -     0.032  0.935  0.922   0.060  0.841  0.822   0.026  0.923  0.911   0.049  0.778  0.759   0.034  0.862  0.846
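The page does not define the metrics in Table 1. Assuming the conventional definitions used in the SOD literature (MAE as the mean absolute pixel difference, and the adaptive F-measure AFm computed after binarising at twice the mean saliency with β² = 0.3), a minimal sketch looks as follows. WFm is presumably the weighted F-measure, which requires distance transforms and is left out here. The toy maps and function names are illustrative only, and this is not the authors' evaluation code.

```python
from typing import List

Map = List[List[float]]  # saliency values in [0, 1]


def mae(pred: Map, gt: Map) -> float:
    """Mean absolute error between a predicted map and its ground-truth mask."""
    total = sum(abs(p - g) for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    count = sum(len(row) for row in pred)
    return total / count


def adaptive_f_measure(pred: Map, gt: Map, beta2: float = 0.3) -> float:
    """F-measure after thresholding at twice the mean saliency (capped at 1)."""
    flat_pred = [p for row in pred for p in row]
    flat_gt = [g for row in gt for g in row]
    threshold = min(2.0 * sum(flat_pred) / len(flat_pred), 1.0)
    binary = [1 if p >= threshold else 0 for p in flat_pred]
    tp = sum(b for b, g in zip(binary, flat_gt) if g >= 0.5)
    if tp == 0:  # also covers empty predictions and empty ground truth
        return 0.0
    precision = tp / sum(binary)
    recall = tp / sum(1 for g in flat_gt if g >= 0.5)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


# Toy 2x2 example (hypothetical values)
pred = [[0.9, 0.7], [0.0, 0.0]]
gt = [[1.0, 1.0], [0.0, 0.0]]
print(f"MAE = {mae(pred, gt):.3f}")
print(f"AFm = {adaptive_f_measure(pred, gt):.4f}")
```

Lower MAE is better, while higher AFm and WFm are better, which is how the columns of the table should be read.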

    Table 2  Comparison of parameter count, inference speed, and model memory

    Method    Input size  Params (M)  Speed (frames/s)  Model memory (MB)
    Amulet    320×320     33.15       8                 132
    DGRL      384×384     161.74      8                 631
    BDMPM     256×256     -           22                259
    PoolNet   384×384     68.26       17                410
    GateNet   384×384     128.63      30                503
    MINet     320×320     162.38      25                635
    DSRNet    400×400     75.29       15                290
    Ours      320×320     29.97       26                117

    Table 3  Quantitative comparison of the proposed method with different modules

    Method         HKU-IS (4447)         DUTS-TE (5019)
                   MAE    AFm    WFm     MAE    AFm    WFm
    Res            0.042  0.866  0.843   0.053  0.772  0.760
    Res+FPN        0.037  0.884  0.870   0.045  0.800  0.784
    Res+DCIE+FPN   0.028  0.913  0.900   0.037  0.845  0.828
    Res+DCIE+MFHI  0.026  0.923  0.911   0.034  0.862  0.846
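As a worked reading of Table 3, the snippet below computes the relative MAE reduction obtained when moving from the Res+FPN row to the Res+DCIE+MFHI row. The four MAE values are copied from the table; the helper name is an assumption introduced for illustration.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# MAE of Res+FPN versus Res+DCIE+MFHI, taken from Table 3
hku = relative_reduction(0.037, 0.026)   # HKU-IS
duts = relative_reduction(0.045, 0.034)  # DUTS-TE
print(f"HKU-IS: {hku:.1f}%  DUTS-TE: {duts:.1f}%")
# prints "HKU-IS: 29.7%  DUTS-TE: 24.4%"
```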

    Table 4  Quantitative comparison of different aggregation strategies in the MFHI module

    Method          HKU-IS (4447)         DUTS-TE (5019)
                    MAE    AFm    WFm     MAE    AFm    WFm
    Res+MFHI(cat)   0.030  0.903  0.892   0.041  0.822  0.813
    Res+MFHI(mul)   0.032  0.899  0.888   0.041  0.818  0.811
    Res+MFHI(add)   0.033  0.894  0.884   0.042  0.818  0.803
    Res+MFHI(h1)    0.030  0.903  0.892   0.039  0.825  0.815
    Res+MFHI(h2)    0.031  0.902  0.891   0.039  0.824  0.815
    Res+MFHI        0.029  0.906  0.898   0.039  0.832  0.819
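The row labels cat, mul, and add in Table 4 presumably denote channel concatenation, element-wise multiplication, and element-wise addition of two feature maps. The sketch below shows those three basic operations on toy features stored as nested lists (channels × positions). It is an illustration under that assumption, not the MFHI module itself; the labels h1 and h2 are not explained on this page and are not covered.

```python
from typing import List

Feature = List[List[float]]  # channels x positions, a flattened toy feature map


def fuse_add(a: Feature, b: Feature) -> Feature:
    """Element-wise sum; the channel count is unchanged."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def fuse_mul(a: Feature, b: Feature) -> Feature:
    """Element-wise product; responses survive only where both inputs agree."""
    return [[x * y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def fuse_cat(a: Feature, b: Feature) -> Feature:
    """Channel concatenation; the channel count is the sum of both inputs."""
    return [list(c) for c in a] + [list(c) for c in b]


# Hypothetical low-level and high-level features with 2 channels each
low = [[1.0, 2.0], [3.0, 4.0]]
high = [[0.5, 0.0], [1.0, 2.0]]
print(fuse_add(low, high))       # [[1.5, 2.0], [4.0, 6.0]]
print(fuse_mul(low, high))       # [[0.5, 0.0], [3.0, 8.0]]
print(len(fuse_cat(low, high)))  # 4
```

Addition and multiplication keep the channel count fixed, whereas concatenation doubles it and usually needs a following convolution to reduce the channels again.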

    Table 5  Comparison tests of different modules

    Method         HKU-IS (4447)         DUTS-TE (5019)
                   MAE    AFm    WFm     MAE    AFm    WFm
    Res+ASPP+FPN   0.031  0.905  0.892   0.040  0.825  0.815
    Res+RFB+FPN    0.031  0.903  0.891   0.039  0.828  0.818
    Res+PDC+FPN    0.032  0.899  0.899   0.040  0.826  0.813
    Res+DCIE+FPN   0.028  0.913  0.900   0.037  0.845  0.828

    Table 6  Ablation analysis of the DCIE module

    Method               HKU-IS (4447)         DUTS-TE (5019)
                         MAE    AFm    WFm     MAE    AFm    WFm
    Res+DCIE(w D)+FPN    0.032  0.898  0.886   0.039  0.823  0.815
    Res+DCIE(w A)+FPN    0.032  0.898  0.886   0.040  0.822  0.814
    Res+DCIE(w D+A)+FPN  0.030  0.907  0.895   0.038  0.839  0.823
    Res+DCIE+FPN         0.028  0.913  0.900   0.037  0.845  0.828
Publication history
  • Received:  2023-07-18
  • Revised:  2024-01-06
  • Published online:  2024-01-28
