Journal of Geo-information Science ›› 2022, Vol. 24 ›› Issue (4): 792-801. doi: 10.12082/dqxxkx.2022.210530

• Remote Sensing Science and Application Technology •

Building Extraction Using a Neural Network Based on Residual Multi-attention and the ACON Activation Function

WU Xinhui1,2, MAO Zhengyuan1,2,*, WENG Qian3,4, SHI Wenzao5,6

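    1. Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350108, China
    2. Key Laboratory of Spatial Data Mining and Information Sharing of the Ministry of Education, Fuzhou University, Fuzhou 350108, China
    3. College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
    4. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing (Fuzhou University), Fuzhou 350108, China
    5. College of Opto-Electronic and Information Engineering, Fujian Normal University, Fuzhou 350007, China
    6. Fujian Engineering Technology Research Center of Photoelectric Sensing Application, Fujian Normal University, Fuzhou 350007, China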
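  • Received: 2021-09-02 Revised: 2021-10-10 Online: 2022-04-25 Published: 2022-06-25
  • Corresponding author: *MAO Zhengyuan (b. 1964), male, from Shaoyang, Hunan; Ph.D., professor, and doctoral supervisor. His research covers cognition and measurement of spatiotemporal systems, information extraction from high-resolution imagery and land-surface change detection, uncertainty analysis of geospatial data and its applications, and informatized land-resource management and decision support. E-mail: zymao@fzu.edu.cn
  • About the author: WU Xinhui (b. 1995), male, from Putian, Fujian; master's student. His research covers deep learning and the analysis and application of remote sensing imagery. E-mail: wdlan4869@163.com
  • Funding:
    National Natural Science Foundation of China (41801324); National Natural Science Foundation of China (41701491); General Program of the Natural Science Foundation of Fujian Province (2019J01244); General Program of the Natural Science Foundation of Fujian Province (2019J01791)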

A Neural Network Based on Residual Multi-attention and ACON Activation Function for Building Extraction

WU Xinhui1,2, MAO Zhengyuan1,2,*, WENG Qian3,4, SHI Wenzao5,6

    1. Academy of Digital China, Fuzhou University, Fuzhou 350108, China
    2. Key Laboratory of Spatial Data Mining & Information Sharing of Ministry of Education, Fuzhou University, Fuzhou 350108, China
    3. College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
    4. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing (Fuzhou University), Fuzhou 350108, China
    5. College of Opto-Electronic and Information Engineering, Fujian Normal University, Fuzhou 350007, China
    6. Fujian Engineering Technology Research Center of Photoelectric Sensing Application, Fujian Normal University, Fuzhou 350007, China
  • Received: 2021-09-02 Revised: 2021-10-10 Online: 2022-04-25 Published: 2022-06-25
  • Contact: MAO Zhengyuan
  • Supported by:
    Youth Project of National Natural Science Foundation of China (41801324); Youth Project of National Natural Science Foundation of China (41701491); General Project of Natural Science Foundation of Fujian Province (2019J01244); General Project of Natural Science Foundation of Fujian Province (2019J01791)

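Abstract:

Mainstream deep learning models applied to building extraction from high-spatial-resolution remote sensing imagery tend to produce internal holes, discontinuities, missing edges, and irregular boundaries. To address these problems, this paper builds on the U-Net architecture and proposes the RMAU-Net model by designing a new activation function (ACON) and integrating residual blocks with channel-spatial and criss-cross attention modules. The ACON activation function lets each neuron adaptively decide whether or not to activate, which helps improve the model's generalization ability and transmission performance. The residual module is used to deepen the network while reducing the difficulty of training and learning, so that deep semantic feature information can be obtained. The channel-spatial attention module strengthens the association between encoder and decoder information, suppresses the influence of irrelevant background regions, and improves the model's sensitivity. The criss-cross attention module aggregates the contextual information of all pixels on the criss-cross path and, by applying the operation recurrently, captures global context and improves the global correlation between pixels. Building extraction experiments on the Massachusetts dataset show that, among the seven models compared, the proposed RMAU-Net achieves the best intersection over union (IoU) and F1-score and near-best precision and recall, and its overall performance is better than that of comparable models. Adding the modules one at a time further verifies the effectiveness of each module and the reliability of the proposed method.

Keywords: high-resolution imagery, building extraction, convolutional neural network, ACON activation function, residual block, spatial attention, channel attention module, criss-cross attention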
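
The abstract does not say which ACON variant RMAU-Net uses, so the following is only a minimal scalar sketch under the assumption that it resembles the commonly cited ACON-C form, f(x) = (p1 - p2)·x·σ(β(p1 - p2)·x) + p2·x. The names `acon_c`, `p1`, `p2`, and `beta` are illustrative and not taken from the paper; in a trained network they would be learnable parameters rather than fixed arguments.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """Assumed ACON-C form: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x.

    beta acts as a switch: beta = 0 gives a linear response (the unit does not
    "activate"), while a large beta approaches max(p1 * x, p2 * x).
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


if __name__ == "__main__":
    # Compare the response at a few inputs for small, moderate, and large beta.
    for beta in (0.0, 1.0, 50.0):
        values = [round(acon_c(x, beta=beta), 4) for x in (-2.0, 0.0, 2.0)]
        print(f"beta={beta}: {values}")
```

With the default `p1 = 1` and `p2 = 0`, the function reduces to Swish at `beta = 1` and behaves almost like ReLU for large `beta`, which is one way to read the abstract's statement that each neuron can adaptively activate or not.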

Abstract:

Current mainstream deep learning models suffer from problems such as internal holes, discontinuities, missing edges, and irregular boundaries when applied to building extraction from high spatial resolution remote sensing images. This paper proposes the RMAU-Net model, which builds on the U-Net structure by designing a new activation function (Activate or Not, ACON) and integrating residual blocks with channel-spatial and criss-cross attention modules. The ACON activation function allows each neuron to be activated or not activated adaptively, which helps improve the generalization ability and transmission performance of the model. The residual module is used to increase the depth of the network, reduce the difficulty of training and learning, and obtain deep semantic feature information. The channel-spatial attention module is used to strengthen the association between encoder and decoder information, suppress the influence of irrelevant background regions, and improve the sensitivity of the model. The criss-cross attention module aggregates the context information of all pixels on the criss-cross path and captures global context information through a recurrent operation, improving the global correlation between pixels. Building extraction experiments on the Massachusetts dataset show that, among the 7 models compared, the proposed RMAU-Net is optimal in terms of intersection over union (IoU) and F1-score and close to optimal in terms of precision and recall, and that its overall performance is better than that of similar models. The modules are also added step by step to further verify the validity of each module and the reliability of the proposed method.

Key words: high resolution image, building extraction, convolutional neural network, ACON activation function, residual block, spatial attention, channel attention, criss-cross attention
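
The four evaluation indexes named in the abstract (intersection over union, F1-score, precision, and recall) can be illustrated with a short sketch. It assumes the standard pixel-wise definitions on binary building masks; the function name `building_metrics` and the toy masks are hypothetical and are not data or code from the paper.

```python
from typing import Dict, List

Mask = List[List[int]]  # 1 = building pixel, 0 = background pixel


def building_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel-wise precision, recall, F1-score, and IoU for binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # building predicted and present
            elif p and not t:
                fp += 1  # building predicted but absent
            elif t and not p:
                fn += 1  # building present but missed

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero on empty masks.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }


if __name__ == "__main__":
    # Hypothetical 2 x 3 masks, used only to exercise the function.
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(building_metrics(predicted, reference))
```

Because IoU counts false positives and false negatives in a single denominator, a model can lead on IoU and F1-score while being only close to the best on precision or recall individually, which matches the pattern the abstract reports.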