Multitask Collaborative Multi-modal Remote Sensing Target Segmentation Algorithm

MAO Xiuhua; ZHANG Qiang; RUAN Hang; YANG Yuang

doi:10.11999/JEIT231267

Volume 46 Issue 8

Aug. 2024



MAO Xiuhua, ZHANG Qiang, RUAN Hang, YANG Yuang. Multitask Collaborative Multi-modal Remote Sensing Target Segmentation Algorithm[J]. Journal of Electronics & Information Technology, 2024, 46(8): 3363-3371. doi: 10.11999/JEIT231267


doi: 10.11999/JEIT231267 cstr: 32379.14.JEIT231267

1. Beijing Institute of Tracking and Telecommunications Technology, Beijing 100094, China
2. National Key Laboratory of Space Integrated Information System, Beijing 100094, China

Received Date: 2023-11-05
Rev Recd Date: 2024-03-27

Available Online: 2024-05-11

Publish Date: 2024-08-10

Abstract


Extracting objects from high-resolution remote sensing images with semantic segmentation has important application prospects. With the rapid development of multi-sensor technology, the complementary strengths of multimodal remote sensing images have received widespread attention, and their joint analysis has become a research hotspot. This article analyzes optical remote sensing images together with elevation data. In real scenarios, fully registered elevation data are scarce, which limits the classification accuracy obtained by fusing the two types of data. To address this, a multi-task collaborative model based on multimodal remote sensing data, United Refined PSPNet (UR-PSPNet), is proposed. The model extracts deep features from optical images, predicts both semantic labels and elevation values, and embeds elevation data as supervisory information to improve the accuracy of target segmentation. Comparative experiments on ISPRS data demonstrate that the algorithm fuses multimodal data features more effectively and improves the accuracy of object segmentation in optical remote sensing images.
Keywords:
- Semantic segmentation
- Remote sensing images
- Multi-modal data
- Deep learning
- Elevation estimation
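
The multi-task idea in the abstract (one set of features feeding a semantic-label prediction and an elevation prediction, with elevation acting as supervision) can be illustrated with a combined training loss. The sketch below is not the UR-PSPNet implementation, which the abstract does not specify. The function names, the choice of softmax cross-entropy and L1 error, the `height_weight` parameter, and the use of `None` to mark pixels without registered elevation are all assumptions made for illustration.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one pixel; logits is a list of class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def multitask_loss(seg_logits, labels, pred_heights, true_heights, height_weight=0.5):
    """Mean segmentation loss plus a weighted mean absolute height error.

    Assumption: true_heights holds None where no registered elevation exists;
    those pixels contribute only to the segmentation term.
    """
    seg = sum(cross_entropy(z, y) for z, y in zip(seg_logits, labels)) / len(labels)
    pairs = [(p, t) for p, t in zip(pred_heights, true_heights) if t is not None]
    height = sum(abs(p - t) for p, t in pairs) / len(pairs) if pairs else 0.0
    return seg + height_weight * height, seg, height

# Toy data (hypothetical): three pixels, two classes, one pixel without elevation.
seg_logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
labels = [0, 1, 0]
pred_heights = [3.0, 10.0, 5.0]
true_heights = [2.0, 12.0, None]

total, seg, height = multitask_loss(seg_logits, labels, pred_heights, true_heights)
print(total, seg, height)
```

In this sketch the height term is averaged only over pixels that have an elevation value, so a scene with partial elevation coverage still produces a usable training signal for the segmentation branch.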

