面向360度全景图像显著目标检测的相邻协调网络

陈晓雷; 王兴; 张学功; 杜泽龙

doi:10.11999/JEIT240502

面向360度全景图像显著目标检测的相邻协调网络

doi: 10.11999/JEIT240502 cstr: 32379.14.JEIT240502

兰州理工大学电气工程与信息工程学院兰州 730050

基金项目: 国家自然科学基金 (61967012)

详细信息

作者简介:
陈晓雷：男，博士，副教授，研究方向为人工智能、计算机视觉、虚拟现实

王兴：男，硕士生，研究方向为虚拟现实、360°全景图像显著目标检测

张学功：男，硕士生，研究方向为显著目标检测、人工智能

杜泽龙：男，硕士生，研究方向为图像处理、虚拟现实

通讯作者:
陈晓雷　chenxl703@lut.edu.cn

中图分类号: TN911; TP391
计量
- 文章访问数: 200
- HTML全文浏览量: 165
- PDF下载量: 38
- 被引次数: 0
出版历程
- 收稿日期: 2024-06-19
- 修回日期: 2024-11-15
- 网络出版日期: 2024-11-27
- 刊出日期: 2024-12-01

Adjacent Coordination Network for Salient Object Detection in 360 Degree Omnidirectional Images

College of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China

Funds: The National Natural Science Foundation of China (61967012)

摘要

摘要: 为解决360°全景图像显著目标检测(SOD)中的显著目标尺度变化和边缘不连续、易模糊的问题，该文提出一种基于相邻协调网络的360°全景图像显著目标检测方法(ACoNet)。首先，利用相邻细节融合模块获取相邻特征中的细节和边缘信息，以促进显著目标的精确定位。其次，使用语义引导特征聚合模块来聚合浅层特征和深层特征之间不同尺度上的语义特征信息，并抑制浅层特征传递的噪声，缓解解码阶段显著目标与背景区域不连续、边界易模糊的问题。同时构建多尺度语义融合子模块扩大不同卷积层的多尺度感受野，实现精确训练显著目标边界的效果。在2个公开的数据集上进行的大量实验结果表明，相比于其他13种先进方法，所提方法在6个客观评价指标上均有明显的提升，同时主观可视化检测的显著图边缘轮廓性更好，空间结构细节信息更清晰。
- 显著目标检测 /
- 深度学习 /
- 360°全景图像 /
- 多尺度特征
Abstract: To address the issues of significant target scale variation, edge discontinuity, and blurring in 360° omnidirectional images Salient Object Detection (SOD), a method based on the Adjacent Coordination Network (ACoNet) is proposed. First, an adjacent detail fusion module is used to capture detailed and edge information from adjacent features, which facilitates accurate localization of salient objects. Then, a semantic-guided feature aggregation module is employed to aggregate semantic feature information from different scales between shallow and deep features, suppressing the noise transmitted by shallow features. This helps alleviate the problem of discontinuous salient objects and blurred boundaries between the object and background in the decoding stage. Additionally, a multi-scale semantic fusion submodule is constructed to enlarge the receptive field across different convolution layers, thereby achieving better training of the salient object boundaries. Extensive experimental results on two public datasets demonstrate that, compared to 13 other advanced methods, the proposed approach achieves significant improvements in six objective evaluation metrics. Moreover, the subjective visualized detection results show better edge contours and clearer spatial structural details of the salient maps.
- Salient Object Detection (SOD) /
- Deep learning /
- 360° omnidirectional images /
- Multi-scale features

HTML全文

图 1 ACoNet网络整体结构图

下载: 全尺寸图片幻灯片

图 2 相邻细节融合模块ADFM结构图

下载: 全尺寸图片幻灯片

图 3 语义引导特征聚合模块SGFA结构图

下载: 全尺寸图片幻灯片

图 4 多尺度语义融合子模块结构图

下载: 全尺寸图片幻灯片

图 5 360-SOD数据集上不同模型的主观可视化对比结果

下载: 全尺寸图片幻灯片

图 6 360-SSOD数据集上不同模型的主观可视化对比结果

下载: 全尺寸图片幻灯片

图 7 本文模型与不同先进方法损失函数收敛对比结果

下载: 全尺寸图片幻灯片

图 8 相关模块以及结构的消融实验可视化对比

下载: 全尺寸图片幻灯片

表 1 360-SOD 数据集上不同方法客观指标对比

		MAE↓	max-F↑	mean-F↑	max-Em↑	mean-Em↑	Sm↑
360°SOD	本文方法	0.0181	0.7893	0.7815	0.9141	0.9043	0.8493
	MPFRNet(2024)	0.0190	0.7653	0.7556	0.8854	0.8750	0.8416
	LDNet(2023)	0.0289	0.6561	0.6414	0.8655	0.8437	0.7680
	FANet(2020)	0.0208	0.7699	0.7485	0.9002	0.8729	0.8261
2D SOD	MEANet(2022)	0.0243	0.7098	0.7016	0.8675	0.8398	0.7808
	BSCGNet(2023)	0.0246	0.6795	0.6725	0.8666	0.8380	0.7844
	SeaNet(2023)	0.0293	0.6080	0.5981	0.8611	0.8319	0.7374
	AGNet(2022)	0.0221	0.7503	0.7410	0.8754	0.8574	0.8055
	ACCoNet(res)(2023)	0.0243	0.6855	0.6735	0.8690	0.8159	0.7864
	ACCoNet(vgg)(2023)	0.0245	0.6910	0.6782	0.8629	0.8060	0.7711
	MSCNet(2023)	0.0247	0.7286	0.7162	0.8740	0.8672	0.8139
	CorrNet(2022)	0.0474	0.2515	0.1673	0.5465	0.3867	0.5322
	PGNet(2021)	0.0237	0.6944	0.6861	0.8625	0.8360	0.7895
	CSDF(2023)	0.0275	0.6621	0.6433	0.8521	0.7829	0.7530
	SwinNet(2022)	0.0240	0.7023	0.6818	0.8624	0.8315	0.7897

下载: 导出CSV

表 2 360-SSOD 数据集上不同方法客观指标对比

		MAE↓	max-F↑	mean-F↑	max-Em↑	mean-Em↑	Sm↑
360°SOD	本文方法	0.0288	0.6641	0.6564	0.8695	0.8632	0.7796
	MPFRNet(2024)	---	---	---	---	---	---
	LDNet(2023)	0.0341	0.5868	0.5695	0.8415	0.8221	0.7252
	FANet(2020)	0.0421	0.5939	0.5215	0.8426	0.7324	0.7186
2D SOD	MEANet(2022)	0.0289	0.6343	0.6242	0.8578	0.8437	0.7479
	BSCGNet(2023)	0.0316	0.5969	0.5832	0.8216	0.8079	0.7395
	SeaNet(2023)	0.0383	0.3929	0.3432	0.7525	0.5695	0.5978
	AGNet(2022)	0.0297	0.6371	0.6291	0.8409	0.8176	0.7701
	ACCoNet(res)(2023)	0.0365	0.5752	0.5555	0.8312	0.7929	0.7341
	ACCoNet(vgg)(2023)	0.0312	0.6043	0.5904	0.8297	0.7938	0.7358
	MSCNet(2023)	0.0370	0.6239	0.6026	0.8512	0.8328	0.7613
	CorrNet(2022)	0.0540	0.2586	0.0904	0.7046	0.3347	0.5006
	PGNet(2021)	0.0946	0.0772	0.0545	0.5889	0.5613	0.4493
	CSDF(2023)	0.0326	0.5490	0.5211	0.8117	0.7399	0.6942
	SwinNet(2022)	0.0940	0.0772	0.0469	0.5877	0.5379	0.4482

下载: 导出CSV

表 3 本文模型与不同先进方法复杂度对比结果

复杂度指标	本文模型	FANet	MEANet	BSCGNet	SeaNet	AGNet
GFLOPs(G)	76.24	340.94	11.99	86.46	2.86	25.28
Params(M)	24.15	25.40	3.27	36.99	2.75	24.55

复杂度指标	ACCoNet	MSCNet	CorrNet	PGNet	CSDF	SwinNet
GFLOPs(G)	406.06	15.47	42.63	14.37	78.14	43.56
Params(M)	127.00	3.26	4.07	72.67	26.24	97.56

下载: 导出CSV

表 4 相关模块以及多分支结构的消融实验结果

Baseline	ADFM	SGFA	多分支结构	MAE↓	max-F↑	mean-F↑	max-Em↑	mean-Em↑	Sm↑
√				0.0238	0.7193	0.7107	0.8823	0.8525	0.7912
√	√			0.0220	0.7580	0.7466	0.9133	0.8898	0.8185
√		√		0.0209	0.7562	0.7517	0.8915	0.8796	0.8252
√			√	0.0206	0.7584	0.7483	0.8959	0.8745	0.8304
√	√	√		0.0207	0.7819	0.7642	0.9170	0.8650	0.8244
√	√		√	0.0202	0.7704	0.7587	0.9007	0.8899	0.8322
√		√	√	0.0184	0.7777	0.7660	0.9113	0.8878	0.8407
√	√	√	√	0.0181	0.7893	0.7815	0.9141	0.9043	0.8493

下载: 导出CSV

参考文献(49)

[1]	CONG Runmin, LEI Jianjun, FU Huazhu, et al. Review of visual saliency detection with comprehensive information[J]. IEEE Transactions on circuits and Systems for Video Technology, 2019, 29(10): 2941–2959. doi: 10.1109/TCSVT.2018.2870832.
[2]	丁颖, 刘延伟, 刘金霞, 等. 虚拟现实全景图像显著性检测研究进展综述[J]. 电子学报, 2019, 47(7): 1575–1583. doi: 10.3969/j.issn.0372-2112.2019.07.024. DING Ying, LIU Yanwei, LIU Jinxia, et al. An overview of research progress on saliency detection of panoramic VR images[J]. Acta Electronica Sinica, 2019, 47(7): 1575–1583. doi: 10.3969/j.issn.0372-2112.2019.07.024.
[3]	GONG Xuan, XIA Xin, ZHU Wentao, et al. Deformable Gabor feature networks for biomedical image classification[C]. 2021 IEEE Winter Conference on Applications of Computer Vision, Waikoloa, USA, 2021: 4003–4011. doi: 10.1109/WACV48630.2021.00405.
[4]	XU K, BA J, KIROS R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]. The 32nd International Conference on Machine Learning, Lille, France, 2048–2057.
[5]	张德祥, 王俊, 袁培成. 基于注意力机制的多尺度全场景监控目标检测方法[J]. 电子与信息学报, 2022, 44(9): 3249–3257. doi: 10.11999/JEIT210664. ZHANG Dexiang, WANG Jun, and YUAN Peicheng. Object detection method for multi-scale full-scene surveillance based on attention mechanism[J]. Journal of Electronics & Information Technology, 2022, 44(9): 3249–3257. doi: 10.11999/JEIT210664.
[6]	GAO Yuan, SHI Miaojing, TAO Dacheng, et al. Database saliency for fast image retrieval[J]. IEEE Transactions on Multimedia, 2015, 17(3): 359–369. doi: 10.1109/TMM.2015.2389616.
[7]	LI Jia, SU Jinming, XIA Changqun, et al. Distortion-adaptive salient object detection in 360°omnidirectional images[J]. IEEE Journal of Selected Topics in Signal Processing, 2020, 14(1): 38–48. doi: 10.1109/JSTSP.2019.2957982.
[8]	MA Guangxiao, LI Shuai, CHEN Chenglizhao, et al. Stage-wise salient object detection in 360°omnidirectional image via object-level semantical saliency ranking[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 26(12): 3535–3545. doi: 10.1109/TVCG.2020.3023636.
[9]	HUANG Mengke, LIU Zhi, LI Gongyang, et al. FANet: Features adaptation network for 360°omnidirectional salient object detection[J]. IEEE Signal Processing Letters, 2020, 27: 1819–1823. doi: 10.1109/LSP.2020.3028192.
[10]	LIU Nian and HAN Junwei. DHSNet: Deep hierarchical saliency network for salient object detection[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 678–686. doi: 10.1109/CVPR.2016.80.
[11]	PANG Youwei, ZHAO Xiaoqi, ZHANG Lihe, et al. Multi-scale interactive network for salient object detection[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 9410–9419. doi: 10.1109/CVPR42600.2020.00943.
[12]	ZENG Yi, ZHANG Pingping, LIN Zhe, et al. Towards high-resolution salient object detection[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 7233–7242. doi: 10.1109/ICCV.2019.00733.
[13]	ZHANG Lu, DAI Ju, LU Huchuan, et al. A bi-directional message passing model for salient object detection[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1741–1750. doi: 10.1109/CVPR.2018.00187.
[14]	FENG Mengyang, LU Huchuan, and DING E. Attentive feedback network for boundary-aware salient object detection[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 1623–1632. doi: 10.1109/CVPR.2019.00172.
[15]	WU Zhe, SU Li, and HUANG Qingming. Cascaded partial decoder for fast and accurate salient object detection[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 3902–3911. doi: 10.1109/CVPR.2019.00403.
[16]	MA Mingcan, XIA Changqun, and LI Jia. Pyramidal feature shrinking for salient object detection[C]. The 35th AAAI Conference on Artificial Intelligence, 2021: 2311–2318. doi: 10.1609/aaai.v35i3.16331.
[17]	QIN Xuebin, ZHANG Zichen, HUANG Chenyang, et al. U²-Net: Going deeper with nested U-structure for salient object detection[J]. Pattern Recognition, 2020, 106: 107404. doi: 10.1016/j.patcog.2020.107404.
[18]	PAN Chen, LIU Jianfeng, YAN Weiqi, et al. Salient object detection based on visual perceptual saturation and two-stream hybrid networks[J]. IEEE Transactions on Image Processing, 2021, 30: 4773–4787. doi: 10.1109/TIP.2021.3074796.
[19]	MA Guangxiao, LI Shuai, CHEN Chenglizhao, et al. Rethinking image salient object detection: Object-level semantic saliency reranking first, pixelwise saliency refinement later[J]. IEEE Transactions on Image Processing, 2021, 30: 4238–4252. doi: 10.1109/TIP.2021.3068649.
[20]	REN Guangyu, XIE Yanchu, DAI Tianhong, et al. Progressive multi-scale fusion network for RGB-D salient object detection[J]. arXiv: 2106.03941, 2022. doi: 10.48550/arXiv.2106.03941.
[21]	CHENG Mingming, MITRA N J, HUANG Xiaolei, et al. Global contrast based salient region detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 569–582. doi: 10.1109/TPAMI.2014.2345401.
[22]	LIU Qing, HONG Xiaopeng, ZOU Beiji, et al. Hierarchical contour closure-based holistic salient object detection[J]. IEEE Transactions on Image Processing, 2017, 26(9): 4537–4552. doi: 10.1109/TIP.2017.2703081.
[23]	ZHAO Rui, OUYANG Wanli, LI Hongsheng, et al. Saliency detection by multi-context deep learning[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1265–1274. doi: 10.1109/CVPR.2015.7298731.
[24]	LIU Jiangjiang, HOU Qibin, CHENG Mingming, et al. A simple pooling-based design for real-time salient object detection[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 3912–3921. doi: 10.1109/CVPR.2019.00404.
[25]	WU Yuhuan, LIU Yun, ZHANG Le, et al. EDN: Salient object detection via extremely-downsampled network[J]. IEEE Transactions on Image Processing, 2022, 31: 3125–3136. doi: 10.1109/TIP.2022.3164550.
[26]	ZHAO Jiaxing, LIU Jiangjiang, FAN Dengping, et al. EGNet: Edge guidance network for salient object detection[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 8778–8787. doi: 10.1109/ICCV.2019.00887.
[27]	LV Yunqiu, LIU Bowen, ZHANG Jing, et al. Semi-supervised active salient object detection[J]. Pattern Recognition, 2022, 123: 108364. doi: 10.1016/j.patcog.2021.108364.
[28]	WANG Tiantian, BORJI A, ZHANG Lihe, et al. A stagewise refinement model for detecting salient objects in images[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 4039–4048. doi: 10.1109/ICCV.2017.433.
[29]	HOU Qibin, CHENG Mingming, HU Xiaowei, et al. Deeply supervised salient object detection with short connections[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 5300–5309. doi: 10.1109/CVPR.2017.563.
[30]	LIU Nian, HAN Junwei, and YANG M H. PiCANet: Learning pixel-wise contextual attention for saliency detection[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3089–3098. doi: 10.1109/CVPR.2018.00326.
[31]	OUYANG Wentao, ZHANG Xiuwu, ZHAO Lei, et al. MiNet: Mixed interest network for cross-domain click-through rate prediction[C/Ol]. The 29th ACM International Conference on Information & Knowledge Management, 2020: 2669–2676. doi: 10.1145/3340531.3412728.
[32]	HUANG Tongwen, SHE Qingyun, WANG Zhiqiang, et al. GateNet: Gating-enhanced deep network for click-through rate prediction[J]. arXiv: 2007.03519, 2020. doi: 10.48550/arXiv.2007.03519.
[33]	ZHANG Yi, ZHANG Lu, HAMIDOUCHE W, et al. A fixation-based 360° benchmark dataset for salient object detection[C]. Proceedings of 2020 IEEE International Conference on Image Processing, Abu Dhabi, United Arab Emirates, 2020: 3458–3462. doi: 10.1109/ICIP40778.2020.9191158.
[34]	CONG Runmin, HUANG Ke, LEI Jianjun, et al. Multi-projection fusion and refinement network for salient object detection in 360° omnidirectional image[J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(7): 9495–9507. doi: 10.1109/TNNLS.2022.3233883.
[35]	CHEN Dongwen, QING Chunmei, XU Xiangmin, et al. SalBiNet360: Saliency prediction on 360° images with local-global bifurcated deep network[C]. 2020 IEEE Conference on Virtual Reality and 3D User Interfaces, Atlanta, USA, 2020: 92–100. doi: 10.1109/VR46266.2020.00027.
[36]	CHEN Gang, SHAO Feng, CHAI Xiongli, et al. Multi-stage salient object detection in 360° omnidirectional image using complementary object-level semantic information[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2024, 8(1): 776–789. doi: 10.1109/TETCI.2023.3259433.
[37]	ZHANG Yi, HAMIDOUCHE W, and DEFORGES O. Channel-spatial mutual attention network for 360° salient object detection[C]. The 2022 26th International Conference on Pattern Recognition, Montreal, Canada, 2022: 3436–3442. doi: 10.1109/ICPR56361.2022.9956354.
[38]	WU Junjie, XIA Changqun, YU Tianshu, et al. View-aware salient object detection for 360° omnidirectional image[J]. IEEE Transactions on Multimedia, 2023, 25: 6471–6484. doi: 10.1109/TMM.2022.3209015.
[39]	LIN Yuhan, SUN Han, LIU Ningzhong, et al. A lightweight multi-scale context network for salient object detection in optical remote sensing images[C]. The 2022 26th International Conference on Pattern Recognition, Montreal, Canada, 2022: 238–244. doi: 10.1109/ICPR56361.2022.9956350.
[40]	JIANG Yao, ZHANG Wenbo, FU Keren, et al. MEANet: Multi-modal edge-aware network for light field salient object detection[J]. Neurocomputing, 2022, 491: 78–90. doi: 10.1016/j.neucom.2022.03.056.
[41]	LI Gongyang, LIU Zhi, ZHANG Xinpeng, et al. Lightweight salient object detection in optical remote-sensing images via semantic matching and edge alignment[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5601111. doi: 10.1109/TGRS.2023.3235717.
[42]	LI Gongyang, LIU Zhi, BAI Zhen,et al. Lightweight Salient Object Detection in Optical Remote Sensing Images via Feature Correlation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60:5617712. doi: 10.1109/TGRS.2022.3145483.
[43]	FENG Dejun, CHEN Hongyu, LIU Suning, et al. Boundary-semantic collaborative guidance network with dual-stream feedback mechanism for salient object detection in optical remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 4706317. doi: 10.1109/TGRS.2023.3332282.
[44]	LIN Yuhan, SUN Han, LIU Ningzhong, et al. Attention guided network for salient object detection in optical remote sensing images[C]. The 31st International Conference on Artificial Neural Networks, Bristol, UK, 2022: 25–36. doi: 10.1007/978-3-031-15919-0_3.
[45]	WANG Pengfei, ZHANG Chengquan, QI Fei, et al. PGNet: Real-time arbitrarily-shaped text spotting with point gathering network[C/OL]. The 35th AAAI Conference on Artificial Intelligence, 2021: 2782–2790. doi: 10.1609/aaai.v35i4.16383.
[46]	LI Gongyang, LIU Zhi, ZENG Dan, et al. Adjacent context coordination network for salient object detection in optical remote sensing images[J]. IEEE Transactions on Cybernetics, 2023, 53(1): 526–538. doi: 10.1109/TCYB.2022.3162945.
[47]	SONG Yue, TANG Hao, SEBE N, et al. Disentangle saliency detection into cascaded detail modeling and body filling[J]. ACM Transactions on Multimedia Computing, Communications and Applications, 2023, 19(1): 7. doi: 10.1145/3513134.
[48]	LIU Zhengyi, TAN Yacheng, HE Qian, et al. SwinNet: Swin transformer drives edge-aware RGB-D and RGB-T salient object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(7): 4486–4497. doi: 10.1109/TCSVT.2021.3127149.
[49]	HUANG Mengke, LI Gongyang, LIU Zhi, et al. Lightweight distortion-aware network for salient object detection in omnidirectional images[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(10): 6191–6197. doi: 10.1109/TCSVT.2023.3253685.