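Scene Constrained Object Detection Method in High-Resolution Remote Sensing Images by Relation-Aware Global Attention

ZHANG Jing^{1, 2}, WU Xinjia¹, ZHAO Xiaolei¹, ZHUO Li^{1, 2}, ZHANG Jie³

doi: 10.11999/JEIT210466

1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China
3. Department of Resource Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China

Funds: National Natural Science Foundation of China (61370189); Joint Program of the Beijing Municipal Education Commission and the Beijing Natural Science Foundation (KZ201810005002); General Program of Science and Technology of the Beijing Municipal Education Commission (KM202110005027)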

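Details

Author biographies:
ZHANG Jing: female, born in 1975, professor. Her research interests include content analysis and understanding of remote sensing images.

WU Xinjia: female, born in 1998, master's student. Her research interest is object detection in remote sensing images.

ZHAO Xiaolei: female, born in 1995, holds a master's degree. Her research interest is object detection in remote sensing images.

ZHUO Li: female, born in 1971, professor. Her research interest is image/video signal processing.

ZHANG Jie: female, born in 1977, Ph.D. Her research interest is remote sensing data processing.

Corresponding author:
ZHANG Jing, zhj@bjut.edu.cn

CLC number: TN911.73; TP751.1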
Publication history
- Received: 2021-05-25
- Revised: 2021-09-01
- Published online: 2022-04-13
- Issue date: 2022-08-17


Abstract

Abstract: Ground objects in high-resolution remote sensing images are often closely related to the category of the scene in which they appear. Making full use of the constraints that the scene imposes on ground objects is therefore expected to further improve object detection performance. Considering the relationship between scene information and ground objects, a scene-constrained object detection method for high-resolution remote sensing images guided by Relation-aware Global Attention (RGA) is proposed. First, RGA is added after the base network of the Feature fusion and Scaling-based Single Shot Detector (FS-SSD) to learn global scene features. Then, with the learned global scene features as constraints, objects are predicted by combining the oriented response convolution module with the multiscale feature module. Finally, two loss functions are used to jointly optimize the network for object detection. Four groups of experiments are conducted on the NWPU VHR-10 dataset, and better object detection performance is achieved under the constraint of scene information.
- High-resolution remote sensing image /
- Deep learning /
- Object detection /
- Scene constraint /
- Relation-aware global attention
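
The idea behind spatial relation-aware global attention can be illustrated with a small sketch. This is not the authors' implementation: the function name `spatial_relation_attention`, the random projection matrices (standing in for learned embedding layers), the embedding size, and the channel-mean summary of each position are all assumptions made for illustration. Each position is described by its pairwise affinities to and from every other position, and that global descriptor is mapped to a single attention weight.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def spatial_relation_attention(feats, seed=0, embed_dim=4):
    """feats: list of N position vectors (C floats each), i.e. a flattened H*W map.

    Returns one attention weight per position. All weights are random
    placeholders (assumption); a real network would learn them.
    """
    rng = random.Random(seed)
    n, c = len(feats), len(feats[0])

    def rand_matrix(rows, cols):
        scale = 1.0 / math.sqrt(cols)
        return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

    def embed(weights, vec):
        # Linear projection followed by ReLU.
        return [max(0.0, dot(row, vec)) for row in weights]

    w_theta, w_phi = rand_matrix(embed_dim, c), rand_matrix(embed_dim, c)
    theta = [embed(w_theta, f) for f in feats]
    phi = [embed(w_phi, f) for f in feats]

    # relation[i][j]: affinity from position i to position j.
    relation = [[dot(theta[i], phi[j]) for j in range(n)] for i in range(n)]

    w_out = rand_matrix(1, 2 * n + 1)[0]
    attention = []
    for i in range(n):
        outgoing = relation[i]
        incoming = [relation[j][i] for j in range(n)]
        own = sum(feats[i]) / c  # crude summary of the position's own feature
        descriptor = [own] + outgoing + incoming
        score = dot(w_out, descriptor)
        attention.append(1.0 / (1.0 + math.exp(-score)))  # sigmoid
    return attention


# Toy 4x4 feature map with 3 channels, flattened to 16 positions.
demo_rng = random.Random(1)
feature_map = [[demo_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(16)]
weights = spatial_relation_attention(feature_map)
reweighted = [[w * v for v in f] for w, f in zip(weights, feature_map)]
```

Because every position's weight depends on its relations with all other positions, the attention reflects scene-wide structure rather than a local neighbourhood, which is the property the abstract relies on when it uses the attended features as a scene constraint.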

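Figure 1 Scene-constrained object detection method for high-resolution remote sensing images guided by relation-aware global attention (OR-FS-SSD+RGA)

Figure 2 Structure of relation-aware global attention

Figure 3 Spatial relation-aware global attention

Figure 4 Channel relation-aware global attention

Figure 5 Detection results with different attention modules

Figure 6 Per-class detection accuracy of the four networks

Figure 7 Qualitative comparison of FSSD, FS-SSD, Faster-RCNN, OR-FS-SSD+CA, and OR-FS-SSD+RGA-S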

Table 1 Sizes of the final prediction feature maps in OR-FS-SSD+RGA

Pred1	Pred2	Pred3	Pred4	Pred5	Pred6	Pred_avg
64×64	32×32	16×16	8×8	4×4	2×2	16×16

Table 2 Network hyperparameter settings

Iterations	Learning rate	Batch size	Momentum	Weight decay
150	0.001	12	0.9	0.005
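
To make the role of these hyperparameters concrete, the sketch below applies one parameter update using the values in Table 2. The page does not name the optimizer; the presence of momentum and weight decay suggests SGD with momentum, so that is assumed here, along with the common convention of adding the weight-decay term to the gradient. The names `HPARAMS` and `sgd_momentum_step` are hypothetical.

```python
# Values copied from Table 2; the key names are illustrative.
HPARAMS = {
    "iterations": 150,
    "learning_rate": 0.001,
    "batch_size": 12,
    "momentum": 0.9,
    "weight_decay": 0.005,
}


def sgd_momentum_step(params, grads, velocity, hp=HPARAMS):
    """One SGD-with-momentum update (assumed optimizer, not confirmed by the source).

    v <- momentum * v + (grad + weight_decay * p)
    p <- p - learning_rate * v
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        v = hp["momentum"] * v + g + hp["weight_decay"] * p
        new_velocity.append(v)
        new_params.append(p - hp["learning_rate"] * v)
    return new_params, new_velocity


# Single scalar parameter: p = 1.0, gradient 0.5, zero initial velocity.
params, velocity = sgd_momentum_step([1.0], [0.5], [0.0])
```

With a learning rate of 0.001 each step moves a parameter only slightly, and the weight-decay term of 0.005 adds a small pull towards zero on top of the gradient.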

Table 3 Comparison of detection accuracy with mainstream networks

Network	mAP (%)	FPS
Faster-RCNN	93.10	0.09
YOLOv3	91.04	14.68
OR-FS-SSD+CA^[6]	94.74	29.57
LCFFN^[14]	93.67	0.35
GBD^[15]	93.95	2.20
CBD-E^[16]	94.98	2.00
ORSIm^[17]	95.39	4.72
OR-FS-SSD+RGA-S (this paper)	95.59	30.07
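
The margins implied by Table 3 can be read off with a few lines of arithmetic. The sketch below copies the mAP and FPS values from the table and computes, for each baseline, the mAP gap in percentage points and the FPS ratio relative to OR-FS-SSD+RGA-S. The names `RESULTS`, `OURS`, and `margins` are illustrative only.

```python
# (mAP in %, FPS) per network, copied from Table 3.
RESULTS = {
    "Faster-RCNN": (93.10, 0.09),
    "YOLOv3": (91.04, 14.68),
    "OR-FS-SSD+CA": (94.74, 29.57),
    "LCFFN": (93.67, 0.35),
    "GBD": (93.95, 2.20),
    "CBD-E": (94.98, 2.00),
    "ORSIm": (95.39, 4.72),
    "OR-FS-SSD+RGA-S": (95.59, 30.07),
}
OURS = "OR-FS-SSD+RGA-S"


def margins(results, ours):
    """Return {baseline: (mAP gap in points, FPS ratio)} relative to `ours`."""
    our_map, our_fps = results[ours]
    return {
        name: (round(our_map - m, 2), round(our_fps / f, 1))
        for name, (m, f) in results.items()
        if name != ours
    }


gaps = margins(RESULTS, OURS)
best_map = max(RESULTS, key=lambda name: RESULTS[name][0])
best_fps = max(RESULTS, key=lambda name: RESULTS[name][1])
for name, (map_gap, fps_ratio) in sorted(gaps.items(), key=lambda kv: kv[1][0]):
    print(f"{name}: +{map_gap} mAP points, {fps_ratio}x FPS")
```

The gap to the closest accuracy competitor, ORSIm, is 0.2 points, and the gap to the channel-attention variant OR-FS-SSD+CA is 0.85 points, while OR-FS-SSD+RGA-S also has the highest frame rate in the table.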

References (17)

[1]	CHENG Gong, ZHOU Peicheng, and HAN Junwei. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(12): 7405–7415. doi: 10.1109/TGRS.2016.2601622
[2]	RADOVIC M, ADARKWA O, and WANG Qiaosong. Object recognition in aerial images using convolutional neural networks[J]. Journal of Imaging, 2017, 3(2): 21. doi: 10.3390/jimaging3020021
[3]	LI Ke, WAN Gang, CHENG Gong, et al. Object detection in optical remote sensing images: A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296–307. doi: 10.1016/j.isprsjprs.2019.11.023
[4]	WANG Chen, BAI Xiao, WANG Shuai, et al. Multiscale visual attention networks for object detection in VHR remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2019, 16(2): 310–314. doi: 10.1109/LGRS.2018.2872355
[5]	LIANG Xi, ZHANG Jing, ZHUO Li, et al. Small object detection in unmanned aerial vehicle images using feature fusion and scaling-based single shot detector with spatial context analysis[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(6): 1758–1770. doi: 10.1109/TCSVT.2019.2905881
[6]	ZHAO Xiaolei, ZHANG Jing, TIAN Jimiao, et al. Multiscale object detection in high-resolution remote sensing images via rotation invariant deep features driven by channel attention[J]. International Journal of Remote Sensing, 2021, 42(15): 5764–5783. doi: 10.1080/01431161.2021.1931537
[7]	DIVVALA S K, HOIEM D, HAYS J H, et al. An empirical study of context in object detection[C]. 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, 2009: 1271–1278.
[8]	LIU Yong, WANG Ruiping, SHAN Shiguang, et al. Structure inference net: Object detection using scene-level context and instance-level relationships[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6985–6994.
[9]	ZHANG Zhizheng, LAN Cuiling, ZENG Wenjun, et al. Relation-aware global attention for person re-identification[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 3183–3192.
[10]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788.
[11]	HU Jie, SHEN Li, and SUN Gang. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
[12]	FU Jun, LIU Jing, TIAN Haijie, et al. Dual attention network for scene segmentation[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 3141–3149.
[13]	WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 3–19.
[14]	LI Ke, CHENG Gong, BU Shuhui, et al. Rotation-insensitive and context-augmented object detection in remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2337–2348. doi: 10.1109/TGRS.2017.2778300
[15]	ZENG Xingyu, OUYANG Wanli, YAN Junjie, et al. Crafting GBD-net for object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(9): 2109–2123. doi: 10.1109/TPAMI.2017.2745563
[16]	ZHANG Jun, XIE Changming, XU Xia, et al. A contextual bidirectional enhancement method for remote sensing image object detection[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13: 4518–4531. doi: 10.1109/JSTARS.2020.3015049
[17]	WU Xin, HONG Danfeng, TIAN Jiaojiao, et al. ORSIm detector: A novel object detection framework in optical remote sensing imagery using spatial-frequency channel features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(7): 5146–5158. doi: 10.1109/TGRS.2019.2897139
