Journal of Geo-information Science, 2023, Vol. 25, Issue 3: 638-653. DOI: 10.12082/dqxxkx.2023.220708
Received: 2022-09-20
Revised: 2022-12-02
Online: 2023-03-25
Published: 2023-04-19
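Corresponding author:
* LI Ke (b. 1977), male, from Qingxian, Hebei; Ph.D., professor. Research interests: intelligent perception of geographic environments and data engineering. E-mail: like19771223@163.com
About the author:
GAO Pengfei (b. 1997), male, from Kaifeng, Henan; master's student. Research interests: deep learning and geospatial data engineering. E-mail: gao_pengfei2020@163.com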
GAO Pengfei, CAO Xuefeng, LI Ke, YOU Xiong
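Abstract:
Object detection in remote sensing images has broad application value in urban planning, natural resource surveys, national land surveying and mapping, military reconnaissance, and other fields. To address three difficulties of the task (large variation in object scale, high similarity in object appearance, and highly complex backgrounds), this paper proposes a new object detection algorithm that combines a multi-neuron sparse feature extraction module (MNB) with a hierarchical depth feature fusion module (HDFB). The MNB uses multiple convolutional branches to mimic the multiple synapses of a neuron and extract sparsely distributed features; as network layers are stacked, it captures sparse features over progressively larger receptive fields, which improves the quality of the multi-scale object features. The HDFB uses atrous (dilated) convolution to extract contextual features at different depths and then passes them through a novel tree-like fusion network, fusing local and global features at the feature-map level. The algorithm is validated on the large-scale public dataset DIOR, and the experiments show that: (1) combining the MNB and the HDFB yields an overall accuracy (mean AP) of 72.5%, with an average detection time of 3.8 ms per remote sensing image; (2) the MNB improves detection accuracy for multi-scale objects and for objects of similar appearance, raising overall accuracy by 5.8 percentage points over detection with Step-wise branches; (3) the HDFB extracts hierarchical depth features through its multi-receptive-field deep feature fusion network, offers a new approach to fusing local and global features at the feature-map level, and strengthens the network's ability to capture contextual information; (4) rebuilding the PANet feature fusion network so that the MNB fuses sparse features of different scales makes the PANet structure more effective for object detection in remote sensing images. Many factors strongly affect the final performance of the algorithm. On the one hand, a high-quality dataset is the basis for higher accuracy: image quality, object occlusion, and large intra-class variation all strongly affect detector training. On the other hand, model parameter settings matter: clustering the dataset to obtain candidate (anchor) boxes raises the best possible recall, and ensuring that the receptive field of the HDFB covers the feature map is key to accuracy. We conclude that the multi-neuron sparse feature extraction network improves feature quality, while the HDFB fuses contextual information and reduces the influence of complex background noise, which together give better performance in object detection for remote sensing images.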
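The abstract notes that the receptive field of the atrous-convolution branches should cover the feature map. The following is a minimal sketch of that arithmetic, not the authors' implementation: it uses the standard effective kernel size of a dilated convolution, k + (k - 1)(d - 1), and assumes stride-1 layers. The branch names, kernel sizes, dilation rates, and the 13 x 13 feature-map size are hypothetical values chosen for illustration only.

```python
from typing import Iterable, Tuple

Layer = Tuple[int, int]  # (kernel_size, dilation)


def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent spanned by one atrous (dilated) convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers: Iterable[Layer]) -> int:
    """Receptive field of a stack of stride-1 convolutions, in pixels."""
    rf = 1
    for kernel_size, dilation in layers:
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf


def covers_feature_map(layers: Iterable[Layer], feature_map_size: int) -> bool:
    """Simplified check: is the receptive field at least as wide as the map?"""
    return stacked_receptive_field(layers) >= feature_map_size


# Hypothetical branches of increasing depth (NOT the paper's configuration).
BRANCHES = {
    "shallow": [(3, 1)],
    "middle": [(3, 1), (3, 2)],
    "deep": [(3, 1), (3, 2), (3, 4)],
}
FEATURE_MAP_SIZE = 13  # assumed side length of the coarsest feature map

for name, layers in BRANCHES.items():
    rf = stacked_receptive_field(layers)
    ok = covers_feature_map(layers, FEATURE_MAP_SIZE)
    print(f"{name}: receptive field {rf}, covers {FEATURE_MAP_SIZE}x{FEATURE_MAP_SIZE} map: {ok}")
```

Under these assumed settings only the deepest branch spans the whole map, which illustrates why branches of different depths supply context at different ranges.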
GAO Pengfei, CAO Xuefeng, LI Ke, YOU Xiong. Object Detection in Remote Sensing Images by Fusing Multi-neuron Sparse Features and Hierarchical Depth Features[J]. Journal of Geo-information Science, 2023, 25(3): 638-653. DOI:10.12082/dqxxkx.2023.220708
Table 2
Comparison of test results for different multi-branch models (per-class AP and mean AP, %)
Multi-branch model | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 | C19 | C20 | mean AP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Step-wise MNB | 69.7 | 74.9 | 65.3 | 85.3 | 35.4 | 68.8 | 59.9 | 51.9 | 57.1 | 70.9 | 66.7 | 57.8 | 52.7 | 87.1 | 63.9 | 72.5 | 85.1 | 58.3 | 53.1 | 79.2 | 65.5 |
Inception | 77.0 | 79.7 | 71.7 | 87.8 | 42.9 | 77.3 | 61.7 | 59.0 | 59.9 | 76.7 | 72.0 | 60.6 | 57.3 | 89.4 | 66.8 | 72.7 | 86.7 | 62.7 | 56.2 | 78.8 | 69.8 |
MNSB | 75.7 | 82.1 | 70.5 | 88.8 | 46.0 | 77.8 | 64.7 | 61.6 | 61.7 | 78.8 | 73.9 | 62.3 | 59.2 | 89.4 | 71.4 | 73.9 | 86.9 | 66.4 | 56.6 | 79.3 | 71.3 |
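The 5.8 figure quoted in the abstract can be checked against the mean AP column of Table 2. The short sketch below is only an arithmetic illustration; it also shows that the gain is an absolute difference in percentage points, which is distinct from the relative improvement. The dictionary and function names are illustrative.

```python
# Mean AP values (%) copied from the last column of Table 2.
MEAN_AP = {
    "Step-wise MNB": 65.5,
    "Inception": 69.8,
    "MNSB": 71.3,
}


def gain_in_points(new: float, old: float) -> float:
    """Absolute gain, in percentage points."""
    return new - old


def gain_relative(new: float, old: float) -> float:
    """Relative gain, as a percentage of the old value."""
    return (new - old) / old * 100.0


abs_gain = gain_in_points(MEAN_AP["MNSB"], MEAN_AP["Step-wise MNB"])
rel_gain = gain_relative(MEAN_AP["MNSB"], MEAN_AP["Step-wise MNB"])

print(f"Absolute gain: {abs_gain:.1f} percentage points")
print(f"Relative gain: {rel_gain:.1f}%")
```

The absolute gain matches the 5.8 reported in the abstract, so that figure is best read as percentage points of mean AP.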
Table 3
Comparison of test results for different ACG (atrous convolution group) models (per-class AP and mean AP, %)
Experiment | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 | C19 | C20 | mean AP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SPP+ACG | 71.2 | 57.6 | 69.7 | 79.8 | 44.0 | 75.4 | 58.0 | 53.6 | 55.2 | 61.8 | 58.7 | 48.4 | 55.3 | 88.4 | 49.7 | 70.7 | 83.7 | 35.0 | 53.9 | 74.0 | 60.9 |
Three-ACG | 76.4 | 79.1 | 72.2 | 88.3 | 44.1 | 78.2 | 63.6 | 59.8 | 58.6 | 78.0 | 73.4 | 61.1 | 58.2 | 88.6 | 68.0 | 74.0 | 86.6 | 63.2 | 55.9 | 77.7 | 70.2 |
SPP | 76.6 | 78.5 | 71.7 | 88.4 | 44.4 | 78.2 | 60.4 | 61.6 | 61.5 | 75.5 | 74.1 | 59.9 | 59.6 | 89.7 | 69.2 | 73.8 | 86.5 | 61.1 | 56.3 | 79.0 | 70.3 |
HDFB | 76.8 | 80.0 | 72.6 | 88.7 | 42.5 | 78.7 | 63.6 | 57.9 | 60.7 | 79.6 | 73.0 | 60.8 | 57.7 | 88.6 | 70.4 | 76.9 | 86.4 | 65.1 | 54.7 | 76.3 | 70.6 |
Table 4
Comparison of methods on the DIOR dataset (per-class AP and mean AP, %); bold indicates the best accuracy
Method | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 | C19 | C20 | mean AP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Faster R-CNN | 54.0 | 74.5 | 63.6 | 80.7 | 44.8 | 72.5 | 60.0 | 75.6 | 62.3 | 76.0 | 76.8 | 46.4 | 57.2 | 71.8 | 68.3 | 53.8 | 81.1 | 59.5 | 43.1 | 81.2 | 65.1 |
Mask R-CNN | 53.9 | 76.6 | 63.2 | 80.9 | 40.2 | 72.5 | 60.4 | 76.3 | 62.5 | 76.0 | 75.9 | 46.5 | 57.4 | 71.8 | 68.3 | 53.7 | 81.0 | 62.3 | 43.0 | 81.0 | 65.2 |
Retina-Net | 53.3 | 77.0 | 69.3 | 85.0 | 44.1 | 73.2 | 62.4 | 78.6 | 62.8 | 78.6 | 76.6 | 49.9 | 59.6 | 71.1 | 68.4 | 45.8 | 81.3 | 55.2 | 44.4 | **85.8** | 66.1 |
PANet | 60.2 | 72.0 | 70.6 | 80.5 | 43.6 | 72.3 | 61.4 | 72.1 | 66.7 | 72.0 | 73.4 | 45.3 | 56.9 | 71.7 | 70.4 | 62.0 | 80.9 | 57.0 | 47.2 | 84.5 | 66.1 |
YOLOv3 | 72.2 | 29.2 | **74.0** | 78.6 | 31.2 | 69.7 | 26.9 | 48.6 | 54.4 | 31.1 | 61.1 | 44.9 | 49.7 | 87.4 | 70.6 | 68.7 | **87.3** | 29.4 | 48.3 | 78.7 | 57.1 |
Corner-Net | 58.8 | **84.2** | 72.0 | 80.8 | 46.4 | 75.3 | 64.3 | **81.6** | **76.3** | 79.5 | **79.5** | 26.1 | 60.6 | 37.6 | **70.7** | 45.2 | 84.0 | 57.1 | 43.0 | 75.9 | 64.9 |
YOLOv5 | 76.6 | 78.5 | 71.7 | 88.4 | 44.4 | 78.2 | 60.4 | 61.6 | 61.5 | 75.5 | 74.1 | 59.9 | 59.6 | **89.7** | 69.2 | 73.8 | 86.5 | 61.1 | 56.3 | 79.0 | 70.3 |
Proposed method | **78.1** | 83.9 | 73.0 | **89.0** | **48.2** | **79.4** | **65.6** | 63.9 | 61.9 | **80.6** | 76.6 | **63.5** | **61.6** | 89.6 | 68.7 | **76.4** | 87.0 | **66.4** | **57.0** | 78.7 | **72.5** |
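To read Table 4 at a glance, the sketch below finds the best-scoring method for each of the 20 classes and counts how often each method comes first. It is an illustrative summary of the tabulated values only; the per-class AP figures are copied from Table 4, the label "Proposed method" stands for the paper's algorithm, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Per-class AP (%) for C1..C20, copied from Table 4.
AP: Dict[str, List[float]] = {
    "Faster R-CNN": [54.0, 74.5, 63.6, 80.7, 44.8, 72.5, 60.0, 75.6, 62.3, 76.0,
                     76.8, 46.4, 57.2, 71.8, 68.3, 53.8, 81.1, 59.5, 43.1, 81.2],
    "Mask R-CNN": [53.9, 76.6, 63.2, 80.9, 40.2, 72.5, 60.4, 76.3, 62.5, 76.0,
                   75.9, 46.5, 57.4, 71.8, 68.3, 53.7, 81.0, 62.3, 43.0, 81.0],
    "Retina-Net": [53.3, 77.0, 69.3, 85.0, 44.1, 73.2, 62.4, 78.6, 62.8, 78.6,
                   76.6, 49.9, 59.6, 71.1, 68.4, 45.8, 81.3, 55.2, 44.4, 85.8],
    "PANet": [60.2, 72.0, 70.6, 80.5, 43.6, 72.3, 61.4, 72.1, 66.7, 72.0,
              73.4, 45.3, 56.9, 71.7, 70.4, 62.0, 80.9, 57.0, 47.2, 84.5],
    "YOLOv3": [72.2, 29.2, 74.0, 78.6, 31.2, 69.7, 26.9, 48.6, 54.4, 31.1,
               61.1, 44.9, 49.7, 87.4, 70.6, 68.7, 87.3, 29.4, 48.3, 78.7],
    "Corner-Net": [58.8, 84.2, 72.0, 80.8, 46.4, 75.3, 64.3, 81.6, 76.3, 79.5,
                   79.5, 26.1, 60.6, 37.6, 70.7, 45.2, 84.0, 57.1, 43.0, 75.9],
    "YOLOv5": [76.6, 78.5, 71.7, 88.4, 44.4, 78.2, 60.4, 61.6, 61.5, 75.5,
               74.1, 59.9, 59.6, 89.7, 69.2, 73.8, 86.5, 61.1, 56.3, 79.0],
    "Proposed method": [78.1, 83.9, 73.0, 89.0, 48.2, 79.4, 65.6, 63.9, 61.9, 80.6,
                        76.6, 63.5, 61.6, 89.6, 68.7, 76.4, 87.0, 66.4, 57.0, 78.7],
}


def best_per_class(ap: Dict[str, List[float]]) -> List[str]:
    """Return the top-scoring method for each class (ties go to the first listed)."""
    num_classes = len(next(iter(ap.values())))
    return [max(ap, key=lambda method: ap[method][c]) for c in range(num_classes)]


best = best_per_class(AP)
wins = Counter(best)

for method, count in wins.most_common():
    print(f"{method}: best on {count} of {len(best)} classes")
```

The count complements the mean AP column: a method can lead on mean AP while other detectors remain stronger on individual classes.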
[1] Zhou P C, Cheng G, Yao X W, et al. Machine learning paradigms in high-resolution remote sensing image interpretation[J]. National Remote Sensing Bulletin, 2021, 25(1):182-197. DOI:10.11834/jrs.20210164
[2] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2014:580-587. DOI:10.1109/CVPR.2014.81
[3] Girshick R. Fast R-CNN[C]// 2015 IEEE International Conference on Computer Vision. IEEE, 2015:1440-1448. DOI:10.1109/ICCV.2015.169
[4] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149. DOI:10.1109/TPAMI.2016.2577031. PMID:27295650
[5] Liu W, Anguelov D, Erhan D, et al. SSD: Single Shot MultiBox Detector[C]// European Conference on Computer Vision. Cham: Springer, 2016:21-37. DOI:10.1007/978-3-319-46448-0_2
[6] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:779-788. DOI:10.1109/CVPR.2016.91
[7] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017:6517-6525. DOI:10.1109/CVPR.2017.690
[8] Redmon J, Farhadi A. YOLOv3: An incremental improvement[EB/OL]. 2018: arXiv:1804.02767. https://arxiv.org/abs/1804.02767
[9] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. 2020: arXiv:2004.10934. https://arxiv.org/abs/2004.10934
[10] Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images: A survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159:296-307. DOI:10.1016/j.isprsjprs.2019.11.023
[11] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4):834-848. DOI:10.1109/TPAMI.2017.2699184
[12] Liu S T, Huang D, Wang Y H. Receptive Field Block Net for Accurate and Fast Object Detection[C]// European Conference on Computer Vision. Cham: Springer, 2018:404-419. DOI:10.1007/978-3-030-01252-6_24
[13] Guo W, Yang W, Zhang H J, et al. Geospatial object detection in high resolution satellite images based on multi-scale convolutional neural network[J]. Remote Sensing, 2018, 10(1):131. DOI:10.3390/rs10010131
[14] Chen D, Wan G, Li K. Object detection in optical remote sensing images based on combination of multi-layer feature and context information[J]. Acta Geodaetica et Cartographica Sinica, 2019, 48(10):1275-1284. DOI:10.11947/j.AGCS.2019.20180431
[15] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2015:1-9. DOI:10.1109/CVPR.2015.7298594
[16] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016:2818-2826. DOI:10.1109/CVPR.2016.308
[17] Huang J, Jiang Z G, Zhang H P, et al. Ship object detection in remote sensing images using convolutional neural networks[J]. Journal of Beijing University of Aeronautics and Astronautics, 2017, 43(9):1841-1848. DOI:10.13700/j.bh.1001-5965.2016.0755
[18] Zhang Y L, Yuan Y, Feng Y C, et al. Hierarchical and robust convolutional neural network for very high-resolution remote sensing object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(8):5535-5548. DOI:10.1109/TGRS.2019.2900302
[19] Li K, Cheng G, Bu S H, et al. Rotation-insensitive and context-augmented object detection in remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4):2337-2348. DOI:10.1109/TGRS.2017.2778300
[20] Çatalyürek Ü V, Aykanat C, Uçar B. On two-dimensional sparse matrix partitioning: Models, methods, and a recipe[J]. SIAM Journal on Scientific Computing, 2010, 32(2):656-683. DOI:10.1137/080737770
[21] Motta A, Berning M, Boergens K M, et al. Dense connectomic reconstruction in layer 4 of the somatosensory cortex[J]. Science, 2019, 366(6469): eaay3134. DOI:10.1126/science.aay3134
[22] Liu S, Qi L, Qin H F, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018:8759-8768. DOI:10.1109/CVPR.2018.00913
[23] He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]// 2017 IEEE International Conference on Computer Vision. IEEE, 2017:2980-2988. DOI:10.1109/ICCV.2017.322
[24] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision. IEEE, 2017:2999-3007. DOI:10.1109/ICCV.2017.324
[25] Law H, Deng J. CornerNet: detecting objects as paired keypoints[J]. International Journal of Computer Vision, 2020, 128(3):642-656. DOI:10.1007/s11263-019-01204-1
[26] Wang P Q, Chen P F, Yuan Y, et al. Understanding convolution for semantic segmentation[C]// 2018 IEEE Winter Conference on Applications of Computer Vision. IEEE, 2018:1451-1460. DOI:10.1109/WACV.2018.00163
[27] Chen K, Wang J Q, Pang J M, et al. MMDetection: open MMLab detection toolbox and benchmark[EB/OL]. 2019: arXiv:1906.07155. https://arxiv.org/abs/1906.07155