Scene Constrained Object Detection Method in High-Resolution Remote Sensing Images by Relation-Aware Global Attention

ZHANG Jing; WU Xinjia; ZHAO Xiaolei; ZHUO Li; ZHANG Jie

doi:10.11999/JEIT210466

Volume 44 Issue 8

Aug. 2022

ZHANG Jing, WU Xinjia, ZHAO Xiaolei, ZHUO Li, ZHANG Jie. Scene Constrained Object Detection Method in High-Resolution Remote Sensing Images by Relation-Aware Global Attention[J]. Journal of Electronics & Information Technology, 2022, 44(8): 2924-2931. doi: 10.11999/JEIT210466

doi: 10.11999/JEIT210466 cstr: 32379.14.JEIT210466

ZHANG Jing^{1, 2},
WU Xinjia¹,
ZHAO Xiaolei¹,
ZHUO Li^{1, 2},
ZHANG Jie³

1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China
3. Department of Resource Information Engineering, China University of Geosciences, Wuhan 430074, China

Funds: The National Natural Science Foundation of China (61370189), Beijing Municipal Education Commission Cooperation Beijing Natural Science Foundation (KZ201810005002), The General Program of Beijing Municipal Education Commission (KM202110005027)

Received Date: 2021-05-25
Revision Received Date: 2021-09-01

Available Online: 2022-04-13

Publish Date: 2022-08-17

Abstract

Ground objects in high-resolution remote sensing images are often closely related to the scene category in which they appear. If the constraints that the scene places on ground objects can be exploited effectively, object detection performance is expected to improve further. Considering this relationship between scene information and objects, a scene-constrained object detection method for high-resolution remote sensing images based on Relation-aware Global Attention (RGA) is proposed. First, global scene features are learned by adding RGA to the base network of the Feature fusion and Scaling-based Single Shot Detector (FS-SSD). Then, under the constraints of the learned global scene features, objects are predicted by combining the oriented response convolution module with the multiscale feature module. Finally, two loss functions are used to jointly optimize the network for object detection. Four experiments are conducted on the NWPU VHR-10 dataset, and better object detection performance is achieved under the constraints of scene information.
- High-resolution remote sensing image
- Deep learning
- Object detection
- Scene constraint
- Relation-aware global attention
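
The abstract names relation-aware global attention as the mechanism for learning global scene features, so a small illustration of the general idea may help. The sketch below is not the authors' implementation: it assumes a feature map flattened to a list of position vectors, replaces the learned embeddings with fixed matrices (the hypothetical names `theta`, `phi`, `w_rel` and `w_feat`), and omits FS-SSD, the oriented response convolution module and the loss functions entirely. It only shows how each position's attention weight can depend on its affinities with every other position.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relation_aware_attention(features, theta, phi, w_rel, w_feat):
    """Simplified spatial attention driven by global pairwise relations.

    features: N position vectors of length C (a flattened feature map).
    theta, phi: D x C embedding matrices (assumed; learned in practice).
    w_rel: 2N weights applied to each position's relation vector.
    w_feat: C weights applied to the position's own feature.
    Returns (attended_features, attention_weights).
    """
    n = len(features)
    queries = [matvec(theta, f) for f in features]
    keys = [matvec(phi, f) for f in features]
    # affinity[i][j]: relation from position i to position j.
    affinity = [[dot(queries[i], keys[j]) for j in range(n)] for i in range(n)]
    weights = []
    for i in range(n):
        # Stack relations in both directions: i -> all and all -> i.
        relation = affinity[i] + [affinity[j][i] for j in range(n)]
        score = dot(w_rel, relation) + dot(w_feat, features[i])
        weights.append(sigmoid(score))
    # Rescale every position by its attention weight.
    attended = [[w * x for x in f] for w, f in zip(weights, features)]
    return attended, weights


def rand_matrix(rows, cols):
    """Small random matrix standing in for learned parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_pos, channels, embed = 4, 3, 2  # illustrative sizes only
features = rand_matrix(n_pos, channels)
theta = rand_matrix(embed, channels)
phi = rand_matrix(embed, channels)
w_rel = [random.uniform(-0.5, 0.5) for _ in range(2 * n_pos)]
w_feat = [random.uniform(-0.5, 0.5) for _ in range(channels)]

attended, weights = relation_aware_attention(features, theta, phi, w_rel, w_feat)
print(weights)
```

Because the relation vector for each position collects its affinities with all positions, the resulting weight reflects the whole map and not just a local neighbourhood, which is the property that makes this kind of attention a plausible carrier of scene-level context.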
