Research Advances and Development Trends of Deep Learning Methods for Multimodal Remote Sensing Image Matching
Author Contributions
LAN Chaozhen guided the structure of the article; WANG Longhao, WEI Zijun, and GAO Tian summarized and organized the relevant materials; WANG Longhao, WEI Zijun, GAO Tian, YU Hanyang, and WANG Yiqiao wrote and revised the article; YU Hanyang and LIU Ruimeng proofread the article and revised the figures and tables. All authors have read the final version of the paper and consented to its submission.
YU Hanyang (born 2002), male, from Qingdao, Shandong; PhD candidate; his research focuses on digital processing of remote sensing imagery. E-mail: 2920131781@qq.com
Received date: 2025-01-27
Revised date: 2025-06-10
Online published: 2025-07-23
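[Significance] Image matching is the method and process of spatially aligning multiple images, and automated image matching is a key step in modern photogrammetry and remote sensing data processing. [Progress] With the development of Earth observation technology and the growing capacity to acquire multi-source remote sensing data, the demand for integrated, collaborative processing of multi-source data has driven ever deeper research into multimodal remote sensing image matching, and in recent years deep learning has profoundly influenced the development of image matching techniques. After introducing the traditional remote sensing image matching framework, this paper analyzes the types, characteristics, and matching difficulties of multimodal remote sensing images; focuses on recent progress in the different deep learning methods for multimodal remote sensing images and analyzes their advantages and disadvantages; summarizes the datasets currently suited to multimodal remote sensing image matching tasks; and sums up the achievements and current challenges of deep learning methods in this area. In terms of achievements, algorithms in this field have improved markedly in efficiency, robustness, and accuracy; multimodal fusion strategies and a variety of innovative frameworks and models have advanced the research and reflect the field's transition from modular adaptation to holistic modeling, revealing a deeper integration of data-driven representation learning with geometric reasoning. Nevertheless, current research still faces significant bottlenecks. Regarding multimodal differences, heterogeneity severely limits matching performance and model generalization is insufficient; regarding data and computation, high-quality annotated data are scarce and computational demands are high; regarding engineering deployment, algorithms lack practical readiness, mismatches are hard to remove, and models generalize poorly on mixed-modality data. [Prospect] Finally, the paper discusses in depth the development trends and future prospects of deep learning matching methods for multimodal remote sensing images, including modality-independent design, physics-informed network architectures, and lightweight solutions adapted to complex environments.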
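YU Hanyang, LAN Chaozhen, WANG Longhao, WEI Zijun, GAO Tian, WANG Yiqiao, LIU Ruimeng. Research Advances and Development Trends of Deep Learning Methods for Multimodal Remote Sensing Image Matching[J]. Journal of Geo-information Science, 2025, 27(8): 1896-1919. DOI: 10.12082/dqxxkx.2025.250052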
[Significance] Multimodal remote sensing image matching has become a fundamental task in integrated Earth observation, enabling precise spatial alignment across heterogeneous image sources. [Progress] As the diversity of sensing modalities, acquisition geometries, and temporal conditions increases, traditional matching frameworks have proven inadequate for capturing complex variations in radiometric responses, geometric configurations, and semantic representations. This technological gap has driven a significant paradigm shift from handcrafted feature engineering to deep learning-based solutions, which now form the core of current research and application development. This paper provides a comprehensive and structured review of recent advances in deep learning methods for multimodal remote sensing image matching, with an emphasis on the evolution of methodological paradigms and technical frameworks. It establishes a clear dual-path classification: the single-component approach and the end-to-end approach. The former selectively replaces or enhances individual components of traditional pipelines, such as feature encoding or similarity estimation, using neural network modules. The latter integrates the entire matching process into a unified network architecture, enabling joint optimization of feature learning, transformation modeling, and correspondence inference within a closed loop. This progression reflects the field's transition from modular adaptation to holistic modeling, revealing a deeper integration of data-driven representation learning with geometric reasoning. The review further examines the development of architectural strategies supporting this evolution, including attention mechanisms, graph-based structures, hierarchical feature fusion, and modality-bridging transformations. These innovations contribute to improved robustness, semantic consistency, and adaptability across diverse matching scenarios.
Recent trends also demonstrate a growing reliance on pretrained vision foundation models, which provide transferable feature spaces and reduce the dependence on large-scale labeled datasets. In addition to summarizing technical advancements, the paper analyzes representative datasets, performance evaluation strategies, and the current challenges that constrain real-world deployment. These include limited data availability, weak cross-scene generalization, computational inefficiency, and insufficient interpretability. [Prospect] By synthesizing methodological progress with practical demands, the review identifies key directions for future research, including the design of modality-invariant representations, physics-informed neural architectures, and lightweight solutions tailored for scalable, real-time image registration in complex operational environments.
Conflicts of Interest
All authors declare no relevant conflicts of interest.