Journal of Geo-information Science ›› 2022, Vol. 24 ›› Issue (7): 1375-1390. doi: 10.12082/dqxxkx.2022.210809

• Remote Sensing Science and Application Technology •

Weakly Supervised Remote Sensing Scene Classification Based on Multi-Scale Contrastive Learning

PENG Rui1,2, ZHAO Wenzhi1,2,*, ZHANG Liqiang1, CHEN Xuehong1,2

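  1. Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
  2. State Key Laboratory of Remote Sensing Science, Beijing Normal University, Beijing 100875, China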
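  • Received: 2021-12-15 Revised: 2022-03-13 Online: 2022-07-25 Published: 2022-09-25
  • Corresponding author: * ZHAO Wenzhi (b. 1990), male, from Heze, Shandong Province, Ph.D., master's supervisor. His research focuses on deep learning and intelligent interpretation of remote sensing imagery, and on real-time detection of land-surface anomalies. E-mail: wenzhi.zhao@bnu.edu.cn
  • About the author: PENG Rui (b. 1998), male, from Changde, Hunan Province, master's student. His research focuses on deep learning and intelligent interpretation of remote sensing imagery, and on real-time detection of land-surface anomalies. E-mail: pengrui@mail.bnu.edu.cn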
  • Funding:
    National Natural Science Foundation of China (42192584); Beijing Natural Science Foundation (4214065)

Multi-Scale Contrastive Learning based Weakly Supervised Learning for Remote Sensing Scene Classification

PENG Rui1,2, ZHAO Wenzhi1,2,*, ZHANG Liqiang1, CHEN Xuehong1,2

  1. Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
  2. State Key Laboratory of Remote Sensing Science, Beijing Normal University, Beijing 100875, China
  • Received: 2021-12-15 Revised: 2022-03-13 Online: 2022-07-25 Published: 2022-09-25
  • Contact: ZHAO Wenzhi
  • Supported by:
    National Natural Science Foundation of China (42192584); Beijing Natural Science Foundation (4214065)

Abstract (Chinese version):

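Remote sensing scene classification is an important way of understanding remote sensing imagery and has important applications in areas such as object detection and fast image retrieval. Current mainstream scene classification methods mostly focus on accurately extracting deep image features and overlook how scene objects differ across distribution scales. In addition, the limited supply of high-quality scene labels further constrains classification performance. To address these problems, this study proposes a weakly supervised remote sensing scene classification method based on multi-scale contrastive learning. First, a self-supervised multi-scale contrastive learning strategy automatically learns feature representations of images at different scales from a large amount of unlabeled data. Second, building on these robust multi-scale features, the classification model is fine-tuned with a small number of labels, and label propagation is used to generate high-quality sample labels. Finally, a weakly supervised classification model is built by incorporating a large amount of unlabeled data, which further improves scene classification. Using 1%, 5%, and 10% of the labeled samples, the method reaches classification accuracies of 87.7%, 93.67%, and 95.56% on the AID dataset and 86.02%, 93.15%, and 95.38% on the NWPU-RESISC45 dataset. This is a clear advantage over the other baseline models under limited labeled samples and demonstrates the effectiveness of the proposed model.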

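Key words: scene classification, multi-scale, deep learning, contrastive learning, weakly supervised learning, limited samples, label propagation, remote sensing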
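The first step named in the abstract, multi-scale contrastive pre-training, can be illustrated with a small sketch. The abstract does not give the loss the authors use, so the code below assumes an NT-Xent (SimCLR-style) contrastive loss as a stand-in, in which the embeddings of one scene at two scales form a positive pair and all other embeddings in the batch act as negatives. The function names `cosine` and `multi_scale_nt_xent`, the `temperature` value, and the toy embeddings are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_scale_nt_xent(z_scale_a, z_scale_b, temperature=0.5):
    """NT-Xent loss (assumed stand-in, not confirmed as the paper's loss).

    z_scale_a[i] and z_scale_b[i] are embeddings of scene i at two
    different scales and are treated as a positive pair.
    """
    n = len(z_scale_a)
    z = z_scale_a + z_scale_b  # 2n embeddings in one batch
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the other-scale view of the same scene
        positive = cosine(z[i], z[j]) / temperature
        # Similarities to every other embedding, positive included.
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        # Numerically stable log-sum-exp over the logits.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - positive
    return total / (2 * n)


# Toy embeddings (hypothetical): two scenes, each seen at two scales.
scale_a = [[1.0, 0.0], [0.0, 1.0]]
consistent = [[0.9, 0.1], [0.1, 0.9]]    # same scene stays close across scales
inconsistent = [[0.1, 0.9], [0.9, 0.1]]  # same scene drifts across scales

print(multi_scale_nt_xent(scale_a, consistent))
print(multi_scale_nt_xent(scale_a, inconsistent))
```

The loss is smaller when a scene's embeddings agree across scales, which is the scale-invariance property the abstract attributes to the pre-training stage. In practice the embeddings would come from an encoder applied to crops or resamplings of unlabeled images, a detail the abstract does not specify.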

Abstract:

Remote sensing scene classification is an important approach to understanding remote sensing images, with applications in areas such as target recognition and fast image retrieval. Although many deep-learning-based scene classification algorithms have achieved excellent results, these methods extract deep features of scene images only at a single, specific scale and ignore the instability of the extracted features across scales. Furthermore, the shortage of annotated data limits the performance of these methods, and this problem remains unsolved. For multi-scale remote sensing scene classification with limited labels, this article therefore proposes a Multi-Scale Contrastive Learning Label Propagation based Weakly Supervised Learning (MSCLLP-WSL) approach. First, a multi-scale contrastive learning method improves the model's ability to obtain invariant features of scene images at different scales. Second, to address the shortage of reliable labels, this research draws on Weakly Supervised Learning (WSL), which trains on a small number of labeled samples together with unlabeled samples, to make full use of the limited labels that exist in the data usage and production process. Label propagation is also used to annotate the unlabeled data, which further improves the performance of the proposed scene classification model. The proposed MSCLLP-WSL method was tested extensively on the AID and NWPU-RESISC45 datasets with limited annotated data and compared with three benchmark algorithms: fine-tuned VGG16, fine-tuned Wide ResNet-50, and the Skip-Connected Covariance (SCCov) network. Experiments demonstrate that multi-scale contrastive learning enhances label propagation accuracy, which in turn improves the classification accuracy of complicated scenes with limited labeled samples. To represent the limited-label case, 1%, 5%, and 10% of the annotated data are used. The results show that the proposed MSCLLP-WSL method achieves an overall accuracy of 85.85%, 93.94%, and 95.65% on the AID dataset using 1%, 5%, and 10% labeled samples, respectively. Similarly, on the NWPU-RESISC45 dataset, the overall classification accuracy with 1%, 5%, and 10% annotated samples reaches 87.83%, 93.67%, and 95.47%, respectively. Although the overall accuracy on the latter dataset is lower than on the former, the smaller amount of misclassification indicates that the proposed method is stable when classifying scenes in large-scale datasets. The experimental results show that the proposed method achieves strong performance on these two large-scale scene datasets with limited annotated samples and outperforms the benchmark methods considered in this article.

Key words: remote sensing scene classification, multi-scale, deep learning, contrastive learning, weakly supervised learning, limited training samples, label propagation, remote sensing
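The label propagation step mentioned in the abstract can be sketched as follows. The abstract does not state which propagation variant the authors use, so this example assumes a classic graph-based scheme: a Gaussian affinity graph over feature embeddings, row-normalized into transition probabilities, with the known labels clamped after every iteration. The function name `propagate_labels`, the parameters `sigma` and `n_iter`, and the toy features are hypothetical.

```python
import math


def propagate_labels(features, labels, n_classes, sigma=1.0, n_iter=50):
    """Graph-based label propagation (assumed variant, for illustration).

    labels[i] is a class index, or None for an unlabeled sample.
    Returns one predicted class index per sample.
    """
    n = len(features)
    # Gaussian affinity between every pair of samples, no self-loops.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist_sq = sum((a - b) ** 2
                              for a, b in zip(features[i], features[j]))
                weights[i][j] = math.exp(-dist_sq / (2.0 * sigma ** 2))
    # Row-normalize so each row is a probability distribution over neighbors.
    transition = [[w / sum(row) for w in row] for row in weights]

    def one_hot(label):
        return [1.0 if label == c else 0.0 for c in range(n_classes)]

    # Unlabeled samples start with all-zero class scores.
    scores = [one_hot(label) for label in labels]
    for _ in range(n_iter):
        updated = [[sum(transition[i][j] * scores[j][c] for j in range(n))
                    for c in range(n_classes)]
                   for i in range(n)]
        # Clamp the samples whose labels are known.
        for i, label in enumerate(labels):
            if label is not None:
                updated[i] = one_hot(label)
        scores = updated
    return [max(range(n_classes), key=lambda c: scores[i][c])
            for i in range(n)]


# Toy features (hypothetical): two well-separated clusters, one label each.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = [0, None, None, 1, None, None]
print(propagate_labels(features, labels, n_classes=2))
```

With one labeled sample per cluster, the unlabeled samples inherit the label of their nearest labeled neighbor. How well this works depends on how cleanly the embeddings separate the classes, which is consistent with the abstract's claim that better multi-scale features improve label propagation accuracy.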