Research Advances and Development Trends of Deep Learning Methods for Multimodal Remote Sensing Image Matching

YU Hanyang ¹, LAN Chaozhen ¹*, WANG Longhao ¹, WEI Zijun ¹, GAO Tian ¹, WANG Yiqiao ¹, LIU Ruimeng ²

Journal of Geo-information Science, 2025, Vol. 27, Issue 8: 1896-1919

DOI: https://doi.org/10.12082/dqxxkx.2025.250052

Section: Remote Sensing Science and Application Technology

1. Institute of Geospatial Information, PLA Cyberspace Information Engineering University, Zhengzhou 450001, China
2. Troops 91351, Huludao 125100, China

*LAN Chaozhen (b. 1979), male, from Shanghang, Fujian; Ph.D., associate professor. His research focuses on photogrammetry and intelligent digital processing of remote sensing imagery. E-mail: lan_cz@163.com

YU Hanyang (b. 2002), male, from Qingdao, Shandong; doctoral student. His research focuses on digital processing of remote sensing imagery. E-mail: 2920131781@qq.com

Author Contributions

LAN Chaozhen guided the structure of the article; WANG Longhao, WEI Zijun, and GAO Tian summarized and organized the relevant materials; WANG Longhao, WEI Zijun, GAO Tian, YU Hanyang, and WANG Yiqiao wrote and revised the article; YU Hanyang and LIU Ruimeng proofread the article and revised the figures and tables. All authors have read the final version of the paper and consented to its submission.

Received date: 2025-01-27

Revised date: 2025-06-10

Online published: 2025-07-23

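Abstract (translated from the Chinese)

[Significance] Image matching is the method and process of spatially aligning multiple images, and automated image matching is a key step in modern photogrammetry and remote sensing data processing. [Progress] With advances in Earth observation technology and the growing capacity to acquire multi-source remote sensing data, the demand for integrated, collaborative processing of multi-source data has driven steadily deeper research on multimodal remote sensing image matching, and in recent years deep learning has profoundly shaped the development of image matching techniques. Building on an introduction to the traditional remote sensing image matching framework, this paper analyzes the types, characteristics, and matching difficulties of multimodal remote sensing images; focuses on recent progress in the different deep learning methods developed for such images and weighs their strengths and weaknesses; catalogs the datasets currently suited to multimodal remote sensing image matching tasks; and summarizes the achievements and current challenges of deep learning methods in this field. In terms of achievements, algorithms in this field have improved markedly in efficiency, robustness, and accuracy; multimodal fusion strategies and a variety of innovative frameworks and models have advanced the research and reflect the field's shift from modular adaptation to holistic modeling, revealing a deeper integration of data-driven representation learning with geometric reasoning. Significant bottlenecks remain, however. Regarding modality differences, heterogeneity severely limits matching performance and models generalize poorly. At the data and computation level, high-quality labeled data are scarce and computational demands are high. At the engineering deployment level, algorithms lack practical field capability, mismatches are difficult to remove, and models generalize poorly when processing mixed-modality data. [Prospect] Finally, the paper discusses in depth the development trends and future prospects of deep learning matching methods for multimodal remote sensing images, including modality-agnostic designs, physics-informed network architectures, and lightweight solutions adapted to complex environments.

Keywords: multimodal remote sensing imagery; image matching; deep learning; single-session approach; end-to-end approach; matching datasets; research advancements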

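Cite this article

YU Hanyang, LAN Chaozhen, WANG Longhao, WEI Zijun, GAO Tian, WANG Yiqiao, LIU Ruimeng. Research advances and development trends of deep learning methods for multimodal remote sensing image matching[J]. Journal of Geo-information Science, 2025, 27(8): 1896-1919. DOI: 10.12082/dqxxkx.2025.250052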

Abstract

[Significance] Multimodal remote sensing image matching has become a fundamental task in integrated Earth observation, enabling precise spatial alignment across heterogeneous image sources. [Progress] As the diversity of sensing modalities, acquisition geometries, and temporal conditions increases, traditional matching frameworks have proven inadequate for capturing complex variations in radiometric responses, geometric configurations, and semantic representations. This technological gap has driven a significant paradigm shift from handcrafted feature engineering to deep learning-based solutions, which now form the core of current research and application development. This paper provides a comprehensive and structured review of recent advances in deep learning methods for multimodal remote sensing image matching, with an emphasis on the evolution of methodological paradigms and technical frameworks. It establishes a clear dual-path classification: the single-session approach and the end-to-end approach. The former selectively replaces or enhances individual components of traditional pipelines, such as feature encoding or similarity estimation, using neural network modules. The latter integrates the entire matching process into a unified network architecture, enabling joint optimization of feature learning, transformation modeling, and correspondence inference within a closed loop. This progression reflects the field's transition from modular adaptation to holistic modeling, revealing a deeper integration of data-driven representation learning with geometric reasoning. The review further examines the development of architectural strategies supporting this evolution, including attention mechanisms, graph-based structures, hierarchical feature fusion, and modality-bridging transformations. These innovations contribute to improved robustness, semantic consistency, and adaptability across diverse matching scenarios. 
Recent trends also demonstrate a growing reliance on pretrained vision foundation models, which provide transferable feature spaces and reduce the dependence on large-scale labeled datasets. In addition to summarizing technical advancements, the paper analyzes representative datasets, performance evaluation strategies, and the current challenges that constrain real-world deployment. These include limited data availability, weak cross-scene generalization, computational inefficiency, and insufficient interpretability. [Prospect] By synthesizing methodological progress with practical demands, the review identifies key directions for future research, including the design of modality-invariant representations, physics-informed neural architectures, and lightweight solutions tailored for scalable, real-time image registration in complex operational environments.

Keywords: multimodal remote sensing imagery; image matching; deep learning; single-session approach; end-to-end approach; matching datasets; research advancements

Conflicts of Interest

All authors declare that they have no relevant conflicts of interest.
