Journal of Geo-information Science ›› 2021, Vol. 23 ›› Issue (7): 1312-1324. doi: 10.12082/dqxxkx.2021.200605

• Remote Sensing Science and Application Technology •

Remote Sensing Estimation of Grassland Aboveground Biomass Based on Random Forest

XING Xiaoyu1, YANG Xiuchun1,2,*, XU Bin1,2, JIN Yunxiang1, GUO Jian3,4, CHEN Ang2, YANG Dong1, WANG Ping2, ZHU Libo5

  1. Key Laboratory of Agri-informatics, Ministry of Agriculture/Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. Research Center of Grassland Ecology and Resources, School of Grassland Science, Beijing Forestry University, Beijing 100083, China
  3. State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
  4. Beijing Key Laboratory for Remote Sensing of Environment and Digital Cities, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
  5. Hulunbeier Institute of Animal Husbandry, Inner Mongolia, Hailar 021008, China
  • Received: 2020-10-15 Revised: 2021-01-21 Online: 2021-07-25 Published: 2021-09-25
  • Corresponding author: * YANG Xiuchun (born 1975), female, from Qian'an, Hebei, Ph.D., research professor, mainly engaged in remote sensing of grassland resources. E-mail: Yangxiuchun@bjfu.edu.cn
  • About the author: XING Xiaoyu (born 1996), female, from Zibo, Shandong, master's student, mainly engaged in research on ecosystem services. E-mail: xingxiaoyu51@163.com
  • Supported by:
    National Key Research and Development Program of China (2017YFC0506504); National Natural Science Foundation of China (41571105)

Abstract:

Grassland is the largest terrestrial ecosystem in China, and biomass is a key indicator of ecosystem quality and function. Accurately estimating grassland biomass is important for the rational use of grassland resources, ecological restoration, and the high-quality development of animal husbandry. This study took Xilinguole League in the Inner Mongolia Autonomous Region as the study area. Using GF-1 satellite images and field data from 216 sample sites, we assessed the applicability of the Random Forest (RF) algorithm to remote sensing estimation of grassland Aboveground Biomass (AGB) and applied it. While applying the algorithm, we carried out a series of analyses, including k-fold cross-validation, multicollinearity diagnosis, and partial effect analysis, to construct the random forest model. We then compared its results with those of other models and used it to invert grassland AGB across Xilinguole League. The main conclusions are as follows: (1) The random forest algorithm largely avoids the problem of multicollinearity among the independent variables in biomass modeling; (2) The random forest model is more applicable than the other models to grassland AGB estimation, with a model accuracy of 85% and an RMSE of 202.13 kg/hm²; (3) The grassland AGB of the study area in 2017, estimated with the random forest model, decreases gradually from east to west. By grassland type, mountain meadow has the highest AGB yield per unit area, while temperate grassland has the highest total yield. The results provide a reference for the monitoring and assessment of grassland ecosystems and for macro-level grassland management.

Key words: grassland aboveground biomass, random forest, support vector machine, GF-1, multicollinearity, partial effect, machine learning, regression model
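
The abstract reports k-fold cross-validation and an RMSE without showing how the two fit together. The following is a minimal Python sketch of that evaluation loop, not the authors' implementation. To stay dependency-free it uses a one-variable least-squares line as a stand-in for the random forest regressor, and the data are synthetic. The function names (`k_fold_indices`, `fit_line`, `cross_validated_rmse`), the single predictor labeled `ndvi`, and the choice of k = 5 are all assumptions made for illustration; the abstract does not state the predictors or the number of folds.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Fit y = a + b*x by least squares.

    Stand-in model only: the study uses a random forest, which would be
    plugged in here in a real workflow.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def cross_validated_rmse(xs, ys, k=5, seed=0):
    """Return one RMSE per fold, each computed on the held-out samples."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return scores


if __name__ == "__main__":
    # Synthetic data only: 216 made-up (predictor, AGB) pairs, not field samples.
    rng = random.Random(42)
    ndvi = [rng.uniform(0.1, 0.8) for _ in range(216)]
    agb = [300 + 2500 * v + rng.gauss(0, 200) for v in ndvi]
    fold_scores = cross_validated_rmse(ndvi, agb, k=5)
    print("per-fold RMSE (kg/hm2):", [round(s, 1) for s in fold_scores])
    print("mean RMSE (kg/hm2):", round(mean(fold_scores), 1))
```

Because each sample is held out exactly once, the per-fold RMSE values describe how the model performs on data it was not fitted to, which is the property a reported cross-validated RMSE is meant to capture.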