Journal of Geo-information Science ›› 2021, Vol. 23 ›› Issue (9): 1662-1674. doi: 10.12082/dqxxkx.2021.200711

• Remote Sensing Science and Application Technology •

Soil Salinization Retrieval Based on an Optimized Random Forest Regression Model

YANG Lianbing1,2,3, CHEN Chunbo1,2,3,4, ZHENG Hongwei1,2,3,4,*, LUO Geping1,2,3,4, SHANG Baijun1,3, Olaf Hellwich1,5

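  1. State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011
  2. Xinjiang Uygur Autonomous Region Key Laboratory of Remote Sensing and GIS Application, Urumqi 830011
  3. University of Chinese Academy of Sciences, Beijing 100049
  4. Research Center for Ecology and Environment of Central Asia, Chinese Academy of Sciences, Urumqi 830011
  5. Institute of Computer Vision and Remote Sensing, Technical University of Berlin, Berlin 10623, Germany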
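  • Received: 2020-11-26 Online: 2021-09-25 Published: 2021-11-25
  • Corresponding author: *ZHENG Hongwei (b. 1972), male, from Weifang, Shandong; Ph.D., research professor. His research covers machine intelligence and pattern analysis, ecological, geographical, and climatic effects, and remote sensing and GIS applications. E-mail: hzheng@ms.xjb.ac.cn
  • About the author: YANG Lianbing (b. 1993), male, from Ezhou, Hubei; master's student. His research focuses on remote sensing and GIS applications. E-mail: yanglianbing18@mails.ucas.ac.cn
  • Supported by:
    National Natural Science Foundation of China (41877012); Belt and Road Program of the Chinese Academy of Sciences (2018-YDYLTD-002); Characteristic Institute Program of the Chinese Academy of Sciences (TSS-2015-014-FW-1-3)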

Retrieval of Soil Salinity Content based on Random Forests Regression Optimized by Bayesian Optimization Algorithm and Genetic Algorithm

YANG Lianbing1,2,3, CHEN Chunbo1,2,3,4, ZHENG Hongwei1,2,3,4,*, LUO Geping1,2,3,4, SHANG Baijun1,3, Olaf Hellwich1,5

  1. State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China
  2. Key Laboratory of GIS & RS Application, Xinjiang Uygur Autonomous Region, Urumqi 830011, China
  3. University of Chinese Academy of Sciences, Beijing 100049, China
  4. Research Center for Ecology and Environment of Central Asia, Chinese Academy of Sciences, Urumqi 830011, China
  5. Computer Vision & Remote Sensing, Technical University of Berlin, Berlin 10623, Germany
  • Received: 2020-11-26 Online: 2021-09-25 Published: 2021-11-25
  • Supported by:
    National Natural Science Foundation of China (41877012); The Belt and Road Program of the Chinese Academy of Sciences (2018-YDYLTD-002); Characteristic Institutes Main Service Program (Program 1, Topic 3) of the Chinese Academy of Sciences (TSS-2015-014-FW-1-3)

Abstract:

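Random Forests Regression (RFR) models currently applied to the retrieval of Soil Salinity Content (SSC) pay little attention to the simultaneous optimization of the inversion parameter subset and the model parameters, both of which strongly affect model accuracy. This study selected the Weiku Oasis and the Qitai Oasis as experiment areas and constructed inversion parameters from Landsat-5 TM, SRTM, and MOD11A2.006 remote sensing data. First, Elastic Net (EN) was used to select a subset of the inversion parameters, and then the Genetic Algorithm (GA) and the Bayesian Optimization Algorithm (BOA) were each used to optimize the RFR parameters, yielding RFR models in which the inversion parameter subset and the model parameters are optimized stepwise (EN-GA-RFR, EN-BOA-RFR). RFR models in which GA and BOA each optimize the inversion parameter subset and the model parameters simultaneously (GA-RFR, BOA-RFR) were also built. In each experiment area, the prediction accuracy of EN-GA-RFR, EN-BOA-RFR, GA-RFR, and BOA-RFR was compared. Finally, the spatial distribution of each class of saline soil in each experiment area was analyzed, and the inversion parameters of the two experiment areas were compared. The results show that: in each experiment area the models rank by prediction accuracy, from high to low, as BOA-RFR > GA-RFR > EN-BOA-RFR = EN-GA-RFR, and overall BOA optimizes better than GA; the saline soil classes with the largest area share in the Weiku Oasis and the Qitai Oasis are saline soil and moderately saline soil, respectively; and the ability of the inversion parameters to characterize SSC varies spatially.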

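Key words: soil salinity content, simultaneous optimization, random forests regression, Bayesian optimization algorithm, genetic algorithm, elastic net, inversion parameter subset, model parameters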

Abstract:

Random Forests Regression (RFR) is often used to retrieve Soil Salinity Content (SSC). However, existing applications of RFR pay little attention to the simultaneous optimization of the inversion parameter subset and the model parameters, both of which strongly affect model accuracy. In this study, we selected the Weiku Oasis and the Qitai Oasis as experiment areas and constructed inversion parameters from remote sensing data, including Landsat-5 TM, SRTM, and MOD11A2.006. First, we applied Elastic Net (EN) to select a subset of the inversion parameters and then used the Genetic Algorithm (GA) and the Bayesian Optimization Algorithm (BOA) separately to optimize the RFR parameters, establishing RFR models in which the inversion parameter subset and the model parameters are optimized stepwise (EN-GA-RFR, EN-BOA-RFR). We then used GA and BOA to optimize the inversion parameter subset and the model parameters simultaneously, establishing the GA-RFR and BOA-RFR models. In each experiment area, we compared the prediction accuracy of EN-GA-RFR, EN-BOA-RFR, GA-RFR, and BOA-RFR. Finally, we analyzed the spatial distribution of the saline soil classes in each experiment area and compared the inversion parameters of the two areas. The results show that, in each experiment area, the models rank by prediction accuracy from high to low as BOA-RFR > GA-RFR > EN-BOA-RFR = EN-GA-RFR, and that BOA overall optimizes better than GA. The saline soil types with the largest area share in the Weiku Oasis and the Qitai Oasis are saline soil and moderately saline soil, respectively. The ability of the inversion parameters to characterize SSC varies spatially.

Key words: soil salinity content, synchronization optimization, Random Forests Regression, Bayesian Optimization Algorithm, Genetic Algorithm, Elastic Net, inversion parameters subset, model parameters
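
The simultaneous optimization described in the abstract can be pictured as a search over a joint encoding, in which one candidate solution carries both the inversion parameter subset (a feature mask) and the model parameters. The following is a minimal sketch of that idea with a small genetic algorithm, not the authors' implementation. Everything named here is an assumption for illustration: the number of candidate features, the two hyperparameters and their ranges, the GA operators, and above all `toy_fitness`, which stands in for what would in practice be a cross-validated RFR accuracy score.

```python
import random
from typing import Callable, List, Tuple

# One chromosome jointly encodes the feature mask and the model parameters,
# so both are searched at the same time. All names and ranges are hypothetical.
Chromosome = Tuple[Tuple[bool, ...], int, int]  # (feature mask, n_trees, max_depth)

N_FEATURES = 8          # assumed number of candidate inversion parameters
TREE_RANGE = (10, 500)  # assumed search range for the number of trees
DEPTH_RANGE = (2, 20)   # assumed search range for the maximum tree depth


def random_chromosome(rng: random.Random) -> Chromosome:
    mask = tuple(rng.random() < 0.5 for _ in range(N_FEATURES))
    return mask, rng.randint(*TREE_RANGE), rng.randint(*DEPTH_RANGE)


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    # Uniform crossover on mask bits; each hyperparameter comes from either parent.
    mask = tuple(x if rng.random() < 0.5 else y for x, y in zip(a[0], b[0]))
    return mask, rng.choice((a[1], b[1])), rng.choice((a[2], b[2]))


def mutate(c: Chromosome, rng: random.Random, rate: float = 0.1) -> Chromosome:
    # Flip mask bits and resample hyperparameters with a small probability.
    mask = tuple((not bit) if rng.random() < rate else bit for bit in c[0])
    n_trees = rng.randint(*TREE_RANGE) if rng.random() < rate else c[1]
    depth = rng.randint(*DEPTH_RANGE) if rng.random() < rate else c[2]
    return mask, n_trees, depth


def toy_fitness(c: Chromosome) -> float:
    """Illustrative stand-in for a cross-validated RFR score (higher is better)."""
    mask, n_trees, depth = c
    informative = (True, True, False, True, False, False, True, False)  # made up
    mask_score = sum(m == i for m, i in zip(mask, informative))
    return mask_score - abs(n_trees - 200) / 100 - abs(depth - 8) / 4


def run_ga(
    fitness: Callable[[Chromosome], float],
    generations: int = 30,
    pop_size: int = 20,
    seed: int = 0,
) -> Tuple[Chromosome, List[float]]:
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    history: List[float] = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 4]  # elitism: best candidates survive unchanged
        children: List[Chromosome] = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = elite + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = run_ga(toy_fitness)
print("selected features:", [i for i, keep in enumerate(best[0]) if keep])
print("n_trees:", best[1], "max_depth:", best[2])
```

Under the same assumptions, a stepwise variant would fix the mask first (for example, from an Elastic Net selection) and let the search vary only the two model parameters, which is the structural difference between the EN-GA-RFR and GA-RFR models.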