An Adaptive Dilated and Structural Embedding Asymmetric Hashing Algorithm for Remote Sensing Image Retrieval
Li Qiangqiang (b. 1997), male, from Qingyang, Gansu Province; master's student, mainly engaged in research on remote sensing image retrieval. E-mail: 329172947@qq.com
Copy editors: Jiang Shufang, Huang Guangyu
Received date: 2024-03-27
Revised date: 2024-05-23
Online published: 2024-07-24
Supported by
National Key Research and Development Program of China (2022YFB3903604)
National Natural Science Foundation of China (42161069)
National Natural Science Foundation of China (41861055)
China Postdoctoral Science Foundation (2019M653795)
Li Qiangqiang, Li Xiaojun, Li Yikun, Yang Shuwen, Yang Ruizhe. An adaptive dilated and structural embedding asymmetric hashing algorithm for remote sensing image retrieval[J]. Journal of Geo-information Science, 2024, 26(8): 1926-1940. DOI: 10.12082/dqxxkx.2024.240168
With the rapid development of remote sensing platforms, the number of remote sensing images is growing exponentially, and selecting the required images from remote sensing big data has become a core problem in remote sensing applications. Extracting deep image features with deep Convolutional Neural Networks (CNNs) is currently regarded as the most effective approach to remote sensing image retrieval. However, the high dimensionality of these features makes similarity measurement difficult, which reduces both retrieval speed and retrieval accuracy. Hashing methods map images from a high-dimensional space to compact binary codes and can therefore reduce feature dimensionality efficiently in remote sensing image retrieval. This paper proposes a ResNet-based asymmetric hashing algorithm for remote sensing image retrieval that combines adaptive dilated convolution with a structural embedding network. First, an adaptive dilated convolution module is designed to capture multi-scale features of remote sensing images adaptively without introducing additional model parameters. Second, to address the insufficient extraction of structural information from remote sensing images, an existing structural embedding module is optimized so that it effectively extracts geometric structure features. Finally, to tackle the low retrieval efficiency caused by intra-class differences and inter-class similarities, pairwise similarity constraints are introduced so that the similarity between remote sensing images in the original feature space is preserved in the hash space. Comparative experiments were conducted on four datasets (UCM, NWPU, AID, and PatternNet). With 64-bit hash codes, the mean average precision (mAP) reached 98.07%, 93.65%, 97.92%, and 97.53% on these four datasets, respectively, higher than that of the other deep hashing image retrieval methods compared. In addition, four ablation experiments were carried out to verify each module of the proposed method. Using only the ResNet18 backbone network, the mAP was 68.90%. Introducing the structural self-similarity coding module raised it to 81.71%, an improvement of 12.81 percentage points. Adding the adaptive dilated convolution module increased it by a further 10.53 percentage points, and adding the pairwise similarity constraints raised it to 98.07%, a further gain of 5.83 percentage points. These results confirm the effectiveness of the proposed network framework, which improves the retrieval accuracy of remote sensing images while retaining the advantages of deep hashing features.
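The pairwise similarity constraint described in the abstract can be illustrated with a minimal sketch. It assumes an ADSH-style asymmetric loss, in which the relaxed (tanh) network output for each query image is compared against fixed binary database codes. The function name `asymmetric_pairwise_loss`, the +1/-1 similarity encoding, and the toy inputs are illustrative assumptions, not the formulation used in this paper.

```python
import math


def asymmetric_pairwise_loss(query_outputs, database_codes, similarity):
    """ADSH-style asymmetric pairwise loss (illustrative assumption).

    query_outputs  : list of real-valued network outputs, one list per query
    database_codes : list of binary codes with entries in {-1, +1}
    similarity     : similarity[i][j] is +1 if query i and database item j
                     share a class, otherwise -1
    """
    code_length = len(database_codes[0])
    loss = 0.0
    for i, output in enumerate(query_outputs):
        # tanh relaxes the non-differentiable sign() used at retrieval time
        relaxed = [math.tanh(x) for x in output]
        for j, code in enumerate(database_codes):
            inner = sum(a * b for a, b in zip(relaxed, code))
            # The inner product of two codes lies in [-L, +L]; similar pairs
            # are pushed toward +L and dissimilar pairs toward -L.
            target = code_length * similarity[i][j]
            loss += (inner - target) ** 2
    return loss


# Toy example: one query, two database items with 4-bit codes.
query_outputs = [[10.0, -10.0, 10.0, -10.0]]
database_codes = [[1, -1, 1, -1], [-1, 1, -1, 1]]
loss_consistent = asymmetric_pairwise_loss(
    query_outputs, database_codes, [[1, -1]])
loss_inconsistent = asymmetric_pairwise_loss(
    query_outputs, database_codes, [[-1, 1]])
```

In this toy case the loss is close to zero when the codes agree with the similarity labels and large when they contradict them, which is the behaviour a similarity-preserving constraint is meant to enforce.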
Tab. 1 Retrieval evaluation indicators under different network frameworks
Method | mAP | ANMRR |
---|---|---|
ResNet18 | 0.6890 | 0.260 |
ResNet18 + structural self-similarity coding | 0.8171 | 0.150 |
ResNet18 + structural self-similarity coding + adaptive dilated convolution | 0.9224 | 0.061 |
ResNet18 + structural self-similarity coding + adaptive dilated convolution + pairwise similarity constraints | 0.9807 | 0.009 |
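The incremental gains quoted for the ablation study follow directly from the mAP column of Tab. 1. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# mAP values copied from Tab. 1, in the order the modules were added.
ablation_map = [
    ("ResNet18", 0.6890),
    ("+ structural self-similarity coding", 0.8171),
    ("+ adaptive dilated convolution", 0.9224),
    ("+ pairwise similarity constraints", 0.9807),
]


def incremental_gains(rows):
    """Return the mAP gain of each step over the previous one,
    in percentage points, rounded to two decimals."""
    gains = []
    for (_, previous), (name, current) in zip(rows, rows[1:]):
        gains.append((name, round((current - previous) * 100, 2)))
    return gains


gains = incremental_gains(ablation_map)
# Overall gain of the full model over the bare backbone.
total_gain = round((ablation_map[-1][1] - ablation_map[0][1]) * 100, 2)

for name, gain in gains:
    print(f"{name}: +{gain:.2f} percentage points")
print(f"total: +{total_gain:.2f} percentage points")
```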
Tab. 2 Quantitative comparison of mAP
Dataset | Hash code length/bits | DSAH | AHCL[28] | ADSH[27] | FAH[26] | DPSH[23] | DHN[19] |
---|---|---|---|---|---|---|---|
UCM | 32 | 0.9655 | 0.8816 | 0.9495 | 0.9331 | 0.8267 | 0.8567 |
UCM | 64 | 0.9807 | 0.9378 | 0.9526 | 0.9154 | 0.8367 | 0.8679 |
UCM | 128 | 0.9905 | 0.9707 | 0.9810 | 0.9506 | 0.8154 | 0.8198 |
NWPU | 32 | 0.8688 | 0.8504 | 0.7894 | 0.8132 | 0.3557 | 0.4807 |
NWPU | 64 | 0.9365 | 0.8931 | 0.9284 | 0.7246 | 0.5012 | 0.6870 |
NWPU | 128 | 0.9968 | 0.9207 | 0.9635 | 0.8240 | 0.5746 | 0.6777 |
AID | 32 | 0.9060 | 0.8595 | 0.8954 | 0.8334 | 0.7451 | 0.7936 |
AID | 64 | 0.9792 | 0.7772 | 0.9563 | 0.8659 | 0.7656 | 0.7877 |
AID | 128 | 0.9877 | 0.9380 | 0.9626 | 0.8662 | 0.8172 | 0.7538 |
PatternNet | 32 | 0.9197 | 0.8760 | 0.9249 | 0.9087 | 0.7937 | 0.7758 |
PatternNet | 64 | 0.9753 | 0.9625 | 0.9704 | 0.9678 | 0.9182 | 0.9204 |
PatternNet | 128 | 0.9967 | 0.9885 | 0.9753 | 0.9672 | 0.9185 | 0.9252 |
Note: DSAH is the method proposed in this paper (its values are set in bold in the original table).
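For readers unfamiliar with how mAP values such as those in Tab. 2 are obtained for hash codes, the following is a minimal sketch assuming Hamming-distance ranking over the whole database with class labels as ground truth. The function names and toy codes are hypothetical, and the paper's exact evaluation protocol (for example, any top-k cutoff) is not specified here.

```python
def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def average_precision(query_code, query_label, db_codes, db_labels):
    """Average precision of one query under Hamming ranking."""
    # Rank database items by increasing Hamming distance to the query.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming_distance(query_code, db_codes[i]))
    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if db_labels[index] == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries, db_codes, db_labels):
    """Mean of the per-query average precision.

    queries is a list of (code, label) pairs.
    """
    scores = [average_precision(code, label, db_codes, db_labels)
              for code, label in queries]
    return sum(scores) / len(scores)


# Toy database of three 4-bit codes with class labels.
db_codes = [[1, 1, -1, -1], [1, -1, -1, -1], [-1, -1, 1, 1]]
db_labels = ["A", "B", "A"]
queries = [([1, 1, -1, -1], "A"), ([1, -1, -1, -1], "B")]

ap_first = average_precision(*queries[0], db_codes, db_labels)
map_score = mean_average_precision(queries, db_codes, db_labels)
```

Because ranking only needs bitwise comparisons between short codes, this step stays cheap even for large databases, which is the practical motivation for hashing given in the abstract.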
Tab. 3 Comparison of training time and retrieval time on the AID dataset (s)
Method | Training time/s | Retrieval time/s |
---|---|---|
DSAH | 704.11 | 16.92 |
AHCL | 715.48 | 17.15 |
ADSH | 743.82 | 17.05 |
FAH | 780.48 | 90.29 |
DPSH | 708.54 | 86.85 |
DHN | 706.78 | 76.32 |
Note: DSAH is the method proposed in this paper (its values are set in bold in the original table).
[9] Sivic J, Zisserman A. Video Google: A text retrieval approach to object matching in videos[C]// Proceedings Ninth IEEE International Conference on Computer Vision. IEEE, 2003, 2:1470-1477. DOI:10.1109/ICCV.2003.1238663
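[12] Ge Yun, Ma Lin, Jiang Shunliang, et al. High-resolution remote sensing image retrieval based on combination and pooling of high-level feature maps[J]. Journal of Electronics & Information Technology, 2019, 41(10):2487-2494. (in Chinese)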
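[13] Zhang Jianbing, Yan Zexiao, Ma Shufang. Multi-scale cross dual attention network for building change detection in remote sensing images[J]. Journal of Geo-information Science, 2023, 25(12):2487-2500. (in Chinese)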
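[16] He Yue, Chen Guangsheng, Jing Weipeng, et al. Remote sensing image retrieval based on a deep multi-similarity hashing method[J]. Computer Engineering, 2023, 49(2):206-212. (in Chinese)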