Journal of Geo-information Science ›› 2023, Vol. 25 ›› Issue (10): 2012-2025. doi: 10.12082/dqxxkx.2023.230171

• Theory and Methods of Geo-information Science •

AED-Net: A Semantic Segmentation Model for Landslide Hazards in Remote Sensing Images

JIANG Weijie, ZHANG Chunju*, XU Bing, LUO Chenchen, ZHOU Han, ZHOU Kang

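  1. School of Civil Engineering, Hefei University of Technology, Hefei 230009
  • Received: 2023-04-03 Revised: 2023-06-01 Publication date: 2023-10-25 Release date: 2023-09-22
  • Corresponding author: * ZHANG Chunju (born 1984), female, from Suzhou, Anhui; Ph.D., associate professor; her research focuses on intelligent processing and services for geographic information. E-mail: zcjtwz@sina.com
  • About the author: JIANG Weijie (born 1998), male, from Fuyang, Anhui; master's student; his research focuses on intelligent processing of remote sensing images. E-mail: jwj_1219@163.com
  • Funding:
    National Natural Science Foundation of China (42171453); National Key Research and Development Program of China (2022YFB3904200)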

AED-Net: Semantic Segmentation Model for Landslide Recognition from Remote Sensing Images

JIANG Weijie, ZHANG Chunju*, XU Bing, LUO Chenchen, ZHOU Han, ZHOU Kang

  1. School of Civil Engineering, Hefei University of Technology, Hefei 230009, China
  • Received: 2023-04-03 Revised: 2023-06-01 Online: 2023-10-25 Published: 2023-09-22
  • Contact: * ZHANG Chunju, E-mail: zcjtwz@163.com
  • Supported by:
    National Natural Science Foundation of China (42171453); National Key Research and Development Program of China (2022YFB3904200)

Abstract:

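Remote sensing images carry rich semantic information and play an important role in landslide hazard monitoring. Traditional landslide identification relies mainly on visual interpretation of remote sensing imagery and human-computer interactive recognition, which is time-consuming, labor-intensive, highly subjective, and of low extraction accuracy. Semantic segmentation, an important task in deep learning, has played an important role in automated recognition from remote sensing images thanks to its end-to-end, pixel-level classification capability. Existing semantic segmentation models for landslides in remote sensing images usually fail to account for multi-scale ground-object features, and boundaries become blurred as network depth increases. This paper proposes AED-Net (Attention combined with Encoder-Decoder Network), which uses a shallow feature extraction network to mitigate the boundary blurring caused by deep neural networks, exploits the multi-scale feature extraction capability of the atrous spatial pyramid pooling (ASPP) structure, restores boundary information through the feature recovery capability of the encoder-decoder structure, and uses a channel attention mechanism to strengthen the model's ability to learn key features. Comparative experiments on the GID-5 dataset were conducted on the dilation rate settings of the atrous convolutions and the choice of channel attention mechanism to obtain the best configuration. The resulting model achieved the best performance on the Bijie landslide dataset, with a pixel accuracy (PA) of 95.58%, a mean pixel accuracy (MPA) of 89.24%, and a mean intersection over union (MIoU) of 82.68%. Compared with semantic segmentation models such as PSP-Net, Attention U-Net, DeeplabV3+ with the ECA attention mechanism, PA-Fov, and LandsNet, PA improved by 0.73%~1.97%, MPA by 1.0%~2.84%, and MIoU by 2.25%~5.11%, giving the best segmentation results.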
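As a rough illustration of why the dilation rate matters for multi-scale feature extraction, the sketch below computes the span covered by a dilated (atrous) kernel and applies a 1-D dilated convolution in plain Python. It is a simplified, hypothetical example, not the authors' implementation: the function names are invented here, and the rates in `ASSUMED_RATES` are placeholders, since the abstract does not state which rates the comparison on GID-5 selected.

```python
def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span (in samples) covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1-D dilated convolution (cross-correlation, as used in deep learning).

    Kernel taps are spaced `rate` samples apart, so a larger rate sees a wider
    context with the same number of weights.
    """
    span = effective_kernel_size(len(kernel), rate)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return out


# Assumed dilation rates for the parallel pyramid branches (placeholders only;
# the rates chosen in the paper are not given in the abstract).
ASSUMED_RATES = (1, 6, 12, 18)
spans = {rate: effective_kernel_size(3, rate) for rate in ASSUMED_RATES}
```

With a 3-tap kernel, the branches above would cover 3, 13, 25, and 37 samples respectively, which is the sense in which parallel branches with different rates capture features at several scales.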

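Keywords: deep learning, landslide, remote sensing image, encoder-decoder, feature enhancement, multi-scale features, semantic segmentation, attention mechanism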

Abstract:

Remote sensing images contain rich semantic information and play an important role in landslide disaster monitoring. Traditional landslide recognition relies mainly on visual interpretation of remote sensing images and human-computer interactive recognition, which is time-consuming and labor-intensive, highly subjective, and of low extraction accuracy. Semantic segmentation, an important task in deep learning, has played an important role in automatic recognition from remote sensing images thanks to its end-to-end, pixel-level classification capability, and has great potential for automatic landslide recognition. Existing semantic segmentation models for landslides in remote sensing images usually fail to capture multi-scale ground-object features, and boundaries become blurred as network depth increases. In this paper, Attention combined with Encoder-Decoder Network (AED-Net) is proposed for landslide recognition. A shallow feature extraction network is used to alleviate the boundary blurring caused by deep neural networks. The multi-scale feature extraction capability of the atrous spatial pyramid pooling (ASPP) structure is exploited, the feature restoring ability of the encoder-decoder structure is used to recover boundary information, and a channel attention mechanism is used to enhance the model's ability to learn key features. The focal loss function is used to alleviate the imbalance between positive and negative samples. First, comparative experiments are conducted on the GID-5 dataset on the dilation rate settings of the atrous convolutions and the choice of channel attention mechanism to obtain the best configuration. Then, the learned feature weights are transferred to the landslide semantic segmentation task by transfer learning, and hyperparameter analysis and ablation experiments are carried out.
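The focal loss mentioned above can be sketched for the binary landslide/background case as follows. This is a minimal pure-Python version of the standard formulation FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the function name is hypothetical, and the defaults `alpha=0.25` and `gamma=2.0` are the values commonly used in the literature, assumed here because the abstract does not report the settings used for AED-Net.

```python
import math


def binary_focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over a flat list of pixels.

    probs  : predicted probability of the positive (landslide) class per pixel
    labels : ground-truth labels, 1 for landslide and 0 for background
    alpha  : weight of the positive class (assumed default, not from the paper)
    gamma  : focusing parameter; larger values down-weight easy pixels more
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

The factor (1 - p_t)^gamma shrinks the contribution of pixels that are already classified confidently, so the abundant, easy background pixels dominate the gradient less than they would under plain cross-entropy.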
The resulting model achieves the best segmentation performance on the Bijie City landslide dataset, with a Pixel Accuracy (PA) of 95.58%, a Mean Pixel Accuracy (MPA) of 89.24%, and a Mean Intersection over Union (MIoU) of 82.68%. Compared with classical semantic segmentation networks such as PSP-Net, Attention U-Net, and DeeplabV3+ with the ECA attention mechanism, and with landslide-specific semantic segmentation models such as PA-Fov and LandsNet, our model improves PA by 0.73%~1.97%, MPA by 1.0%~2.84%, and MIoU by 2.25%~5.11%. Moreover, the predicted landslide boundaries are smoother and the segmentation accuracy for landslides of different scales is better than that of the other deep learning models, which demonstrates the effectiveness of the proposed model for semantic segmentation of landslides from remote sensing images.
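For readers unfamiliar with the three metrics reported above, the following sketch computes PA, MPA, and MIoU from a confusion matrix using their usual definitions in the segmentation literature. The function names and the toy labels are illustrative assumptions; the abstract does not spell out the exact formulas used in the paper's evaluation.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (PA, MPA, MIoU) from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(n)]
    row_sums = [sum(cm[i]) for i in range(n)]                         # true pixels per class
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]    # predicted pixels per class

    pa = sum(diag) / total                                            # overall pixel accuracy
    class_acc = [diag[i] / row_sums[i] for i in range(n) if row_sums[i] > 0]
    unions = [row_sums[i] + col_sums[i] - diag[i] for i in range(n)]
    ious = [diag[i] / unions[i] for i in range(n) if unions[i] > 0]
    return pa, sum(class_acc) / len(class_acc), sum(ious) / len(ious)


# Toy example: 0 = background, 1 = landslide (made-up labels, not paper data).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
pa, mpa, miou = segmentation_metrics(confusion_matrix(y_true, y_pred, 2))
```

On the toy labels, PA is 0.8 while MIoU is about 0.657, which illustrates why MIoU tends to be the strictest of the three numbers: every false positive and false negative enlarges a union term.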

Key words: deep learning, landslide, remote sensing image, encoder-decoder, feature enhancement, multi-scale characteristics, semantic segmentation, attention mechanism