Remote Sensing Inversion of Near Surface Air Temperature Based on Random Forest
About the author: BAI Lin (1991-), female, master's student; research interests: 3S integration and meteorological applications. E-mail: bailin@nuist.edu.cn
Received date: 2016-06-13
Request revised date: 2016-08-26
Online published: 2017-03-20
Funding
National Natural Science Foundation of China (41201369)
Major Special Project of the High-Resolution Earth Observation System
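Near-surface air temperature is an important indicator of the urban thermal environment and an important factor that changes and influences the urban climate. To obtain spatially continuous near-surface air temperature, this study takes Beijing as the study area and uses Landsat5/TM data to derive land surface temperature, the normalized difference vegetation index, the modified normalized difference water index, surface albedo, and impervious surface cover. These are combined with air temperature from meteorological stations and elevation as input parameters to build a random forest model that retrieves near-surface air temperature. The results show that the near-surface air temperature retrieved by the random forest has a mean absolute error (MAE) of 0.80 ℃ and a root mean square error (RMSE) of 1.06 ℃; compared with the traditional multiple linear regression method, MAE and RMSE improve by 0.06 ℃ and 0.09 ℃, respectively. The study shows that retrieving near-surface air temperature with a random forest model is feasible and offers certain advantages. In addition, an importance analysis of the model's input parameters shows that land surface temperature has the greatest influence on the retrieval model, followed by elevation.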
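BAI Lin, XU Yongming, HE Miao, LI Ning. Remote Sensing Inversion of Near Surface Air Temperature Based on Random Forest[J]. Journal of Geo-information Science, 2017, 19(3): 390-397. DOI: 10.3724/SP.J.1047.2017.00390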
Near-surface air temperature is an important indicator of the urban thermal environment and an important factor affecting and changing the climate of a city. Near-surface air temperature data are often unavailable because meteorological stations are sparse. To obtain spatially continuous near-surface air temperature data, this study takes Beijing as the study area and uses Landsat5/TM data to retrieve land surface temperature, the normalized difference vegetation index, the modified normalized difference water index, albedo, and impervious surface cover. These are combined with meteorological station temperature and a DEM as input parameters to a random forest regression model that retrieves near-surface air temperature. Land surface temperature was retrieved with the single-channel algorithm proposed by Jiménez-Muñoz in 2003. Impervious surface cover was calculated with the linear spectral unmixing method and the Vegetation-Impervious surface-Soil (VIS) model. Random forest is an ensemble learning method that constructs multiple decision trees during training; for classification it outputs the majority class, and for regression, as used here, it outputs the average of the trees' predictions. The random forest was implemented in R, a free software environment for statistical computing and graphics. The results show that the random forest method is well suited to near-surface air temperature retrieval. The mean absolute error (MAE) and root mean square error (RMSE) of the random forest method are 0.80 and 1.07, respectively. Compared with the ordinary regression model, MAE and RMSE improve by 0.06 and 0.09, respectively. A variable importance analysis in R shows that land surface temperature has the greatest influence on the results: its increase in mean square error is 14.28% and its increase in node purity is 241.36.
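The two accuracy measures quoted in the abstract can be written out explicitly. The following is a minimal Python sketch of MAE and RMSE, not the authors' R code; the function names and the four station values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def mae(observed, predicted):
    """Mean absolute error between paired observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

# Hypothetical station air temperatures in deg C (not data from the study).
observed = [20.0, 22.0, 25.0, 19.0]
predicted = [21.0, 22.0, 23.0, 19.0]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

RMSE squares each residual before averaging, so it penalizes a few large errors more heavily than MAE does; this is why RMSE is never smaller than MAE on the same data.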
Fig. 1 Distribution map of meteorological observing stations in the study area
Fig. 2 The building process of Random Forest
Fig. 3 Change in model error with the number of decision trees
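The building process named in Fig. 2 and the error-versus-tree-count behaviour named in Fig. 3 can be illustrated with a small, self-contained sketch. The study itself used R; the pure-Python code below is only an assumed, simplified random forest regressor (bootstrap samples, a random subset of `n_try` features per split, averaged tree outputs). The synthetic predictors, their coefficients, and all parameter values are hypothetical and are not taken from the paper.

```python
import random

def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, ys, n_try, rng, min_leaf=3, depth=0, max_depth=8):
    """Grow one regression tree; each split tries n_try randomly chosen features."""
    leaf_value = sum(ys) / len(ys)
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return leaf_value
    best = None  # (sum of squared errors, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if min(len(left), len(right)) < min_leaf:
                continue
            sse = sum_sq(left) + sum_sq(right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:
        return leaf_value
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo],
                       n_try, rng, min_leaf, depth + 1, max_depth),
            build_tree([r for r, _ in hi], [y for _, y in hi],
                       n_try, rng, min_leaf, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are plain floats."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, ys, n_trees, n_try, seed=0):
    """Fit each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        forest.append(build_tree([rows[i] for i in idx],
                                 [ys[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Regression output: the average of the individual tree predictions."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

def mae(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Synthetic, hypothetical data: [LST, elevation, NDVI] -> air temperature.
data_rng = random.Random(0)
def make_sample():
    lst = data_rng.uniform(20.0, 40.0)
    elev = data_rng.uniform(0.0, 1000.0)
    ndvi = data_rng.uniform(0.0, 0.8)
    air = 0.6 * lst - 0.006 * elev + data_rng.gauss(0.0, 0.5)
    return [lst, elev, ndvi], air

samples = [make_sample() for _ in range(150)]
train_set, test_set = samples[:100], samples[100:]
forest = fit_forest([x for x, _ in train_set], [y for _, y in train_set],
                    n_trees=50, n_try=2, seed=1)

# Error on held-out samples when only the first k trees are averaged.
test_y = [y for _, y in test_set]
for k in (1, 10, 50):
    preds = [predict_forest(forest[:k], x) for x, _ in test_set]
    print(k, "trees -> MAE", round(mae(test_y, preds), 2))
```

Averaging over more trees generally reduces the variance of the prediction, which is why curves of error against tree count tend to flatten after an initial drop; the exact numbers printed here depend on the synthetic data and seeds.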
Tab. 1 Equations for the related indices and albedo
Variable | Equation | Reference |
---|---|---|
NDVI | | [24] |
MNDWI | | [25] |
Albedo | | [26] |
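The equation cells of Tab. 1 did not survive in this copy. As a hedged illustration, the sketch below uses the commonly cited definitions NDVI = (NIR - Red) / (NIR + Red) and MNDWI = (Green - MIR) / (Green + MIR), which for Landsat 5 TM correspond to bands 4 and 3, and bands 2 and 5, respectively. It is an assumption that references [24] and [25] use exactly these forms; the albedo equation is omitted because it cannot be recovered from the page. The function names and reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); 0.0 is returned when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)

def mndwi(green, mir):
    """Modified normalized difference water index from green and mid-infrared reflectance."""
    return normalized_difference(green, mir)

# Hypothetical surface reflectances for two pixels (not data from the study).
vegetation = {"green": 0.08, "red": 0.05, "nir": 0.45, "mir": 0.20}
water = {"green": 0.06, "red": 0.04, "nir": 0.02, "mir": 0.01}

for name, px in (("vegetation", vegetation), ("water", water)):
    print(name,
          "NDVI =", round(ndvi(px["nir"], px["red"]), 3),
          "MNDWI =", round(mndwi(px["green"], px["mir"]), 3))
```

With these made-up values the vegetated pixel has a high NDVI and a negative MNDWI, while the water pixel shows the opposite pattern, which is the contrast the two indices are designed to capture.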
Fig. 4 Scatter plots of measured air temperature versus air temperature derived from Random Forest and Linear Regression
Fig. 5 Distribution of the absolute error of the estimated near-surface air temperature in Beijing
Tab. 2 Variable importance in the random forest
Variable | Mean decrease in accuracy/% | Mean decrease in node impurity |
---|---|---|
LST | 14.28 | 241.36 |
Altitude | 12.82 | 213.30 |
NDVI | 3.43 | 60.15 |
MNDWI | 3.77 | 82.89 |
Albedo | 4.82 | 61.09 |
ISC | 2.50 | 42.74 |
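The first column of Tab. 2 is a permutation-style measure: a predictor is important if scrambling its values makes the model's error grow. The sketch below illustrates that idea in plain Python under stated assumptions. It is not the R procedure used in the study, which, as far as we understand, works on out-of-bag samples and scales the result; here the "fitted model" is a hypothetical fixed linear function and the data are synthetic.

```python
import random

def mse(model, rows, ys):
    """Mean squared error of a prediction function over a data set."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, ys)) / len(ys)

def permutation_importance(model, rows, ys, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling each column in turn."""
    rng = random.Random(seed)
    base = mse(model, rows, ys)
    increases = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += mse(model, shuffled, ys) - base
        increases.append(total / n_repeats)
    return increases

# Hypothetical stand-in for a fitted model: uses LST and elevation, ignores NDVI.
def model(row):
    lst, elev, _ndvi = row
    return 0.6 * lst - 0.006 * elev

data_rng = random.Random(42)
rows, ys = [], []
for _ in range(200):
    row = [data_rng.uniform(20.0, 40.0),    # LST, deg C
           data_rng.uniform(0.0, 1000.0),   # elevation, m
           data_rng.uniform(0.0, 0.8)]      # NDVI
    rows.append(row)
    ys.append(model(row) + data_rng.gauss(0.0, 0.5))

importance = permutation_importance(model, rows, ys)
for name, value in zip(("LST", "Altitude", "NDVI"), importance):
    print(name, round(value, 3))
```

Because the stand-in model never reads NDVI, shuffling that column leaves the error unchanged, while shuffling LST or elevation raises it; the same logic underlies the ranking in Tab. 2, where LST and altitude dominate.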
Fig. 6 Map of near-surface air temperature in Beijing
The authors have declared that no competing interests exist.