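Remote Sensing Inversion of Near Surface Air Temperature Based on Random Forest

  • BAI Lin,
  • XU Yongming, *,
  • HE Miao,
  • LI Ning
  • School of Geography and Remote Sensing, Nanjing University of Information Science and Technology, Nanjing 210044
*Corresponding author: XU Yongming (1980-), male, PhD, associate professor; research interests: thermal infrared remote sensing and remote sensing of resources and environment. E-mail:

About the first author: BAI Lin (1991-), female, master's student; research interests: 3S integration and meteorological applications. E-mail:

Received: 2016-06-13

  Revision requested: 2016-08-26

  Published online: 2017-03-20

Funding

National Natural Science Foundation of China (41201369)

Major Special Project of the High-Resolution Earth Observation System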

Copyright

Editorial Office of Journal of Geo-Information Science. All rights reserved.

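Summary

Near-surface air temperature is an important indicator of the urban thermal environment and an important factor that alters and influences urban climate. To obtain spatially continuous near-surface air temperature, this study takes Beijing as the study area and derives land surface temperature, the normalized difference vegetation index, the modified normalized difference water index, surface albedo and impervious surface cover from Landsat5/TM data. These are combined with elevation as input parameters, with air temperature observed at meteorological stations as the target, to build a random forest model for retrieving near-surface air temperature. The results show that the mean absolute error (MAE) of the random forest retrieval is 0.80 ℃ and the root mean square error (RMSE) is 1.06 ℃; compared with the traditional multiple linear regression method, MAE and RMSE are lower by 0.06 ℃ and 0.09 ℃, respectively. Retrieving near-surface air temperature with a random forest model is therefore feasible and offers some advantage. In addition, an importance analysis of the input parameters shows that land surface temperature has the greatest influence on the retrieval model, followed by elevation.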

Cite this article

Bai L, Xu Y M, He M, Li N. Remote sensing inversion of near surface air temperature based on random forest[J]. Journal of Geo-Information Science, 2017,19(3):390-397. DOI: 10.3724/SP.J.1047.2017.00390

Abstract

Near-surface air temperature is an important indicator of the urban thermal environment and an important factor affecting urban climate. Near-surface air temperature data are often lacking because meteorological stations are sparse. To obtain spatially continuous near-surface air temperature data, this study takes Beijing as the study area and uses Landsat5/TM data to retrieve land surface temperature, the normalized difference vegetation index, the modified normalized difference water index, albedo and impervious surface cover. These are combined with elevation from a DEM as inputs to a random forest regression model, trained against air temperature observed at meteorological stations, to retrieve near-surface air temperature. Land surface temperature was retrieved with the single-channel algorithm proposed by Jiménez-Muñoz and Sobrino in 2003. Impervious surface cover was calculated with linear spectral unmixing and the Vegetation-Impervious surface-Soil (VIS) model. Random forest is an effective method for classification and regression; it constructs multiple decision trees during training and, for regression, outputs the average of their predictions. The model was implemented in R, a free software environment for statistical computing and graphics. The results show that the random forest method is well suited to near-surface air temperature retrieval. The mean absolute error (MAE) and root mean square error (RMSE) of the random forest method are 0.80 ℃ and 1.06 ℃, respectively. Compared with the ordinary regression model, MAE and RMSE are lower by 0.06 ℃ and 0.09 ℃. A variable importance analysis in R shows that land surface temperature has the greatest influence on the results: its increase in mean square error is 14.28% and its increase in node purity is 241.36.

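1 Introduction

Near-surface air temperature is an important climate parameter in land surface energy balance models[1], one of the most basic items in meteorological observation[2], and an important input to a wide range of meteorological, hydrological and environmental models[3]. At present, near-surface air temperature data come mainly from meteorological stations. Stations provide accurate temperature data, but their distribution means that they supply only a limited set of discrete point measurements; the number of stations and the accuracy with which urban and rural stations are classified both limit the use of near-surface air temperature in related research[4]. Compared with station observations, remote sensing data provide large-area, spatially continuous information on the land surface and atmosphere and better capture spatial heterogeneity[5].
In recent years, researchers in China and abroad have carried out extensive work on retrieving near-surface air temperature from remote sensing data. The retrieval methods fall broadly into four groups: conventional statistical methods, the temperature-vegetation index (TVX) method, neural network methods and energy balance methods. Conventional statistical methods calculate air temperature from a linear relationship between land surface temperature and air temperature observed at stations. Zhao et al.[6] built a multiple regression model of monthly air temperature and compared it with several geostatistical interpolation methods; the linear regression model was the more accurate. Cresswell et al.[7] considered the effect of solar zenith angle in addition to land surface temperature and built a multiple regression model of land surface temperature, solar zenith angle and air temperature, with estimation errors between 0.09 and 1.69 ℃. Qu et al.[8] built regression equations from Terra/MODIS and Aqua/MODIS data, each combined with other geographic factors, and analyzed which remote sensing data were optimal for estimating air temperature at different times of day. The TVX method assumes that the surface temperature of a dense vegetation canopy approximates air temperature, and it retrieves air temperature from the relationship between land surface temperature and a vegetation index. Its key step is choosing the size of the neighborhood window: 7 × 7 pixels for SEVIRI data with 3 km spatial resolution[9], and 13 × 13 pixels for MODIS and AVHRR data with 1 km spatial resolution[10]. Stisen et al.[11] combined the TVX method with sine-function interpolation to estimate near-surface air temperature, with root mean square errors of 2.55 to 2.99 ℃. Xu et al.[12] improved the TVX method, raising its accuracy and widening its range of applicability. Energy balance methods retrieve air temperature from the surface energy balance. Sun et al.[13] derived a quantitative relationship between land surface temperature and air temperature from the energy balance equation and several other formulas, with errors ranging from 0.3 to 3.16 ℃.
So far, no study has applied the random forest method to the retrieval of near-surface air temperature. This study extracts land surface temperature (LST) from a 2011 Landsat5/TM image of Beijing. Because LST and near-surface air temperature differ, the normalized difference vegetation index (NDVI), the modified normalized difference water index (MNDWI), surface albedo (Albedo), impervious surface cover (ISC) and elevation (Altitude) are added, giving six factors that affect near-surface air temperature as inputs to the random forest. Near-surface air temperature is then retrieved, and the retrieval results and variable importance are discussed.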

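2 Study area and data sources

2.1 Study area

Beijing lies at the northern edge of the North China Plain, close to Bohai Bay, and is surrounded by mountains on three sides. It spans 115.7 to 117.4°E and 39.4 to 41.6°N, centered at 39°54′20″N, 116°25′29″E. Its total area is 16 410.54 km²: mountainous areas cover about 10 200 km² (62%) and plains about 6200 km² (38%). Mountain elevations are 1000 to 1500 m and plain elevations 20 to 60 m. Beijing has a north temperate, semi-humid continental monsoon climate, with hot, rainy summers, cold, dry winters, short springs and autumns, and long winters and summers. On the plains the annual mean air temperature is 11 to 13 ℃, the annual extreme maximum is generally 35 to 40 ℃, and annual precipitation is 470 to 600 mm.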

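2.2 Data sources

The remote sensing data are a Landsat5/TM scene acquired at 10:24 on the morning of 26 July 2011. In preprocessing, the image was precisely geometrically corrected against topographic maps, and bands 1 to 5 and band 7 were atmospherically corrected with the 6S radiative transfer model to remove atmospheric effects. A digital elevation model (DEM) of Beijing with 30 m spatial resolution provides the elevation data. In addition, water vapor data for Beijing were extracted from the MODIS water vapor product MOD05_L2 for the corresponding time.
The meteorological data are hourly observations from automatic surface weather stations in Beijing on 26 July 2011; there are 224 stations within the study area. Because the satellite overpass does not fall on the hour, air temperature at the overpass time was obtained by linear interpolation between the observations at the two adjacent hours. Fig. 1 shows the distribution of the meteorological stations in Beijing.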
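The interpolation to the overpass time can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the two hourly readings are hypothetical, and only the 10:24 overpass time comes from the text.

```python
def interpolate_to_overpass(temp_before, temp_after, minutes_past_hour):
    """Linearly interpolate between two consecutive hourly readings (in ℃).

    temp_before is the reading at the hour before the overpass,
    temp_after the reading at the hour after it.
    """
    if not 0 <= minutes_past_hour <= 60:
        raise ValueError("minutes_past_hour must lie within [0, 60]")
    fraction = minutes_past_hour / 60.0
    return temp_before + fraction * (temp_after - temp_before)


# Hypothetical station: 28.0 ℃ at 10:00 and 29.5 ℃ at 11:00, overpass at 10:24.
overpass_temp = interpolate_to_overpass(28.0, 29.5, 24)
```

With these made-up readings the overpass falls 40% of the way through the hour, so the estimate sits 40% of the way between the two readings.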
Fig. 1 Distribution of meteorological observation stations in the study area

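3 Methods

3.1 Random forest model

Random forest was developed in 2001 by Leo Breiman and Adele Cutler as a data mining method[14]. It is a modern machine learning technique for classification and regression and an ensemble learning technique. Its basic building block is the decision tree. Its advantages are high predictive accuracy at comparable computational cost, a better fit to nonlinear data than traditional statistical methods[15], and the ability to analyze variable importance, which gives it an edge over other black-box methods such as neural networks and support vector machines when analyzing relationships among variables[16]. In remote sensing, random forest has mainly been used for image classification, where it gives better accuracy than traditional classification methods[17]. It has, however, rarely been applied to quantitative remote sensing retrieval[18].
This study builds the random forest model with the randomForest package in R to retrieve near-surface air temperature in Beijing. The independent variables are LST, NDVI, MNDWI, Albedo, Altitude and ISC; the dependent variable is the air temperature observed at meteorological stations. The model-building process is shown in Fig. 2 and consists of the following steps: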
Fig. 2 Construction of the random forest model

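(1) Draw n bootstrap samples, each by sampling with replacement from the N total samples, to obtain n new training sets; the samples not drawn for a given training set form its out-of-bag (OOB) data;
(2) Grow one decision tree from each training set; at each node, mtry of the independent variables are selected at random and the split that minimizes node impurity is chosen;
(3) Repeat step (2) n times to obtain n decision trees, which together form the random forest;
(4) The random forest output is the simple average of the outputs of all decision trees, and prediction accuracy is assessed from the OOB error averaged over the trees.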
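Steps (1) and (4) can be sketched in a few lines of Python. This is an illustrative sketch only: the study itself used R, the helper names are hypothetical, and each bootstrap sample is assumed to have the same size as the original sample set, which is the usual convention. Tree growing in step (2) is omitted.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw one bootstrap sample and return (in-bag indices, OOB indices)."""
    # Sampling with replacement: an index may be drawn more than once.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Samples never drawn form the out-of-bag data for this tree.
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob


def forest_predict(tree_predictions):
    """Step (4): simple average of the per-tree predictions for one sample."""
    return sum(tree_predictions) / len(tree_predictions)


rng = random.Random(0)          # fixed seed, an arbitrary choice for repeatability
n_samples, n_trees = 168, 70    # training-set size and tree count reported in the text
bags = [bootstrap_with_oob(n_samples, rng) for _ in range(n_trees)]
```

Because each tree sees only its own in-bag samples, its OOB samples act as an independent check on that tree, which is what allows accuracy to be estimated without a separate test set.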
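Building the random forest model requires two key parameters to be set, the number of variables preselected at each tree node and the number of decision trees, in order to obtain an optimized model. The number of preselected variables should be smaller than the number of input parameters; following the principle of lowest model error, mtry was set to 2. The number of trees n was determined from a plot, drawn in R, of model error against the number of decision trees in the forest (Fig. 3). As Fig. 3 shows, model error fluctuates strongly when there are fewer than 70 trees and levels off beyond 70, so the number of trees was set to 70.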
Fig. 3 Change in model error with the number of decision trees

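3.2 Independent variables

(1) Land surface temperature
Land surface temperature was retrieved with the single-channel algorithm proposed by Jiménez-Muñoz and Sobrino[19]:
$$T_s = \gamma\left[\varepsilon^{-1}\left(\psi_1 L + \psi_2\right) + \psi_3\right] + \delta \tag{1}$$
where
$$\gamma = \left\{\frac{c_2 L}{T_6^{2}}\left[\frac{\lambda^{4}}{c_1}L + \lambda^{-1}\right]\right\}^{-1} \tag{2}$$
$$\delta = -\gamma L + T_6 \tag{3}$$
Here $\varepsilon$ is the land surface emissivity, calculated with the mixed-pixel method; $L$ is the at-sensor radiance (W·m⁻²·sr⁻¹·μm⁻¹); $T_6$ is the at-sensor brightness temperature of band 6 (K); $\lambda$ is the effective wavelength (11.457 μm for band 6); $c_1$ and $c_2$ are radiation constants, 1.19104×10⁸ W·μm⁴·m⁻²·sr⁻¹ and 1.43877×10⁴ μm·K respectively; $\psi_1$, $\psi_2$ and $\psi_3$ are atmospheric parameters that can be obtained from the total atmospheric column water vapor content $w$. For Landsat5/TM band 6:
$$\begin{aligned}
\psi_1 &= 0.14714\,w^{2} - 0.15583\,w + 1.1234\\
\psi_2 &= -1.1836\,w^{2} - 0.37607\,w - 0.52894\\
\psi_3 &= -0.04554\,w^{2} + 1.8719\,w - 0.39071
\end{aligned} \tag{4}$$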
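As a hedged illustration of the single-channel algorithm, the sketch below evaluates it for one pixel. The function and constant names are hypothetical, the input values are made up, and the ψ coefficients are the TM band 6 values as published in [19]; the water vapor content is assumed to be in g·cm⁻².

```python
C1 = 1.19104e8       # radiation constant, W·μm^4·m^-2·sr^-1
C2 = 1.43877e4       # radiation constant, μm·K
LAMBDA_TM6 = 11.457  # effective wavelength of TM band 6, μm


def atmospheric_functions(w):
    """Atmospheric functions for TM band 6 from column water vapor w."""
    psi1 = 0.14714 * w**2 - 0.15583 * w + 1.1234
    psi2 = -1.1836 * w**2 - 0.37607 * w - 0.52894
    psi3 = -0.04554 * w**2 + 1.8719 * w - 0.39071
    return psi1, psi2, psi3


def single_channel_lst(radiance, brightness_temp, emissivity, water_vapor):
    """Land surface temperature (K) from the single-channel equation.

    radiance: at-sensor radiance, W·m^-2·sr^-1·μm^-1
    brightness_temp: at-sensor brightness temperature, K
    """
    psi1, psi2, psi3 = atmospheric_functions(water_vapor)
    gamma = 1.0 / ((C2 * radiance / brightness_temp**2)
                   * (LAMBDA_TM6**4 * radiance / C1 + 1.0 / LAMBDA_TM6))
    delta = -gamma * radiance + brightness_temp
    return gamma * ((psi1 * radiance + psi2) / emissivity + psi3) + delta


# Made-up pixel: radiance 10.0, brightness temperature 305.7 K,
# emissivity 0.97, water vapor 2.0 g·cm^-2.
lst_kelvin = single_channel_lst(10.0, 305.7, 0.97, 2.0)
```

For this made-up pixel the retrieved surface temperature comes out roughly ten kelvin above the brightness temperature, the expected direction of the atmospheric and emissivity correction for a warm surface under a moist atmosphere.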
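(2) Impervious surface cover
The degree and extent of urbanization can be quantified by impervious surfaces[20]. Impervious surfaces are man-made features through which water cannot pass directly to infiltrate the soil[21]. They directly change land surface properties and directly affect the urban ecological environment, especially the urban thermal environment. This study calculates impervious surface cover with the V-I-S (Vegetation-Impervious Surface-Soil) model proposed by Ridd[22] in 1995, which holds that, once water bodies are excluded, urban land cover consists mainly of three typical types: vegetation, soil and impervious surface. To determine the endmember spectra, a minimum noise fraction (MNF) transform was applied to the image to reduce data redundancy and inter-band correlation, and pixel purity index (PPI) calculation and N-dimensional divergence analysis were then used to refine the vegetation, soil and impervious-surface endmember spectra. The characteristic values of each endmember in each spectral band within a pixel were then determined, from which the fraction of each endmember was obtained[23].
(3) Other variables
Considering the effects of surface elevation, vegetation cover, water distribution and incoming solar radiation on near-surface air temperature retrieval, Altitude, NDVI, MNDWI and Albedo were used, in addition to LST and ISC, as input variables of the random-forest-based retrieval model. Table 1 gives the methods used to calculate the three surface parameters NDVI, MNDWI and Albedo from Landsat5/TM data.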
Tab. 1 Equations for the spectral indices and albedo

Variable   Equation                                                        Reference
NDVI       NDVI = (NIR - R) / (NIR + R)                                    [24]
MNDWI      MNDWI = (Green - MIR) / (Green + MIR)                           [25]
Albedo     αshort = 0.356α1 + 0.13α3 + 0.373α4 + 0.085α5 + 0.072α7         [26]
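The equations in Table 1 translate directly into code. The sketch below is illustrative: the function names are hypothetical, the inputs are assumed to be atmospherically corrected surface reflectances, and the band assignment (Green = TM band 2, R = band 3, NIR = band 4, MIR = band 5, with αi the reflectance of TM band i) is an assumption based on the usual Landsat5/TM convention rather than something stated in the table.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    if a + b == 0:
        raise ValueError("band sum is zero; index is undefined")
    return (a - b) / (a + b)


def ndvi(nir, red):
    return normalized_difference(nir, red)


def mndwi(green, mir):
    return normalized_difference(green, mir)


def shortwave_albedo(a1, a3, a4, a5, a7):
    """Broadband albedo as the weighted band sum given in Table 1."""
    return 0.356 * a1 + 0.13 * a3 + 0.373 * a4 + 0.085 * a5 + 0.072 * a7


# Made-up reflectances for a vegetated pixel, TM bands 1, 2, 3, 4, 5, 7.
b1, b2, b3, b4, b5, b7 = 0.04, 0.06, 0.05, 0.40, 0.20, 0.10
pixel_ndvi = ndvi(b4, b3)
pixel_mndwi = mndwi(b2, b5)
pixel_albedo = shortwave_albedo(b1, b3, b4, b5, b7)
```

For a vegetated pixel like this one NDVI is strongly positive and MNDWI negative; over open water the signs would typically reverse.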

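3.3 Model validation

The randomness of sample selection gives random forest a built-in form of cross-validation: when the number of trees is large enough, nearly every sample serves both as a training sample and as a test sample, which effectively guards against overfitting. To validate the algorithm further, however, 3/4 of the samples (168) were randomly drawn from the dataset as the training set, and the remaining 1/4 (56) formed the test set. The random forest model was first built on the training set and its accuracy was then evaluated on the test set, using the mean absolute error (MAE) and root mean square error (RMSE) to judge model performance.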
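The hold-out split and the two error measures can be sketched as below. The helper names and the random seed are hypothetical; only the 224-station total and the 3/4 to 1/4 split come from the text.

```python
import math
import random


def train_test_split(samples, train_fraction, rng):
    """Shuffle a copy of the samples and cut it into training and test parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


station_ids = list(range(224))   # one id per meteorological station
train_ids, test_ids = train_test_split(station_ids, 0.75, random.Random(42))
```

RMSE weights large errors more heavily than MAE, so the gap between the two gives a rough sense of how much a few poorly predicted stations contribute to the total error.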

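4 Results and analysis

4.1 Validation results

A random forest model was built from the LST, Altitude, NDVI, MNDWI, Albedo and ISC values of the 168 training samples and the corresponding station air temperatures, and was then validated against the other 56 samples. The MAE was 0.80 ℃ and the RMSE 1.06 ℃, indicating good retrieval accuracy and good suitability of random forest for near-surface air temperature retrieval. Fig. 4(a) is a scatter plot of the air temperature retrieved by random forest for the test set against the observed air temperature. Most samples cluster around the 1:1 line, indicating a good fit. Taking 30 ℃ as a dividing line, samples above 30 ℃ lie closer to the 1:1 line than those below 30 ℃, which indicates that random forest retrieves more accurately at higher temperatures.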
Fig. 4 Scatter plots of observed air temperature versus air temperature retrieved by (a) random forest and (b) linear regression

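For comparison, traditional multiple linear regression was applied to the same training and test sets: a multiple linear equation was built with LST, NDVI, MNDWI, Albedo, Altitude and ISC as independent variables and air temperature as the dependent variable. Fig. 4(b) is a scatter plot of the test-set air temperature retrieved by the linear regression equation against station observations. Most samples again lie around the 1:1 line, but they are slightly more scattered than the random forest results and the fit is poorer. The linear regression has an MAE of 0.86 ℃ and an RMSE of 1.15 ℃. Overall, the random forest model retrieves more accurately than linear regression. This is because random forest is not restricted to a linear fit, so it is more flexible and predicts better when many factors are involved. In addition, at lower air temperatures the estimates of both the random forest model and the linear regression differ considerably from the station values, showing that retrieval error is relatively high at lower temperatures.
The absolute difference between the air temperature retrieved by random forest and the observed air temperature was calculated at each station to produce a map of the spatial distribution of absolute error in near-surface air temperature over Beijing (Fig. 5). Fig. 5 shows a clear spatial pattern in the retrieval error: errors are generally low in the central urban area and higher in the suburbs, and this is especially marked at high elevations. A possible reason is that mountain terrain is variable and land-atmosphere energy exchange there is more complex. This is also consistent with the larger errors of the random forest model at low air temperatures seen in Fig. 4.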
Fig. 5 Distribution of absolute error of the estimated near-surface air temperature in Beijing

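4.2 Variable importance analysis

The importance function provided in R analyzes variable importance directly. The main indicators are the mean decrease in accuracy, IncMSE, and the mean decrease in node impurity, IncNodePurity. IncMSE is the increase in the random forest model's estimation error, relative to the original error, after the values of a variable are randomly permuted; the larger the IncMSE, the more important the variable. IncNodePurity measures how much a variable contributes to the node splits across the decision trees; the larger the IncNodePurity, the more important the variable. Table 2 lists variable importance for the air temperature random forest model. It shows that LST is the most important input: the land surface exchanges energy with the near-surface air through longwave radiation, evapotranspiration and turbulent exchange, so LST and near-surface air temperature are strongly correlated and LST has the greatest influence on the model. Elevation is also an important factor affecting the spatial distribution of air temperature and the relationship between land surface and air temperature; the study area includes both mountains and plains, and the resulting elevation differences make elevation second only to LST in importance. NDVI, MNDWI, ISC and Albedo characterize vegetation, water and impervious surface cover and the capacity of the surface to reflect solar radiation. These underlying-surface properties affect retrieval accuracy indirectly, through their influence on the land surface-air temperature relationship, and are markedly less important than LST and elevation.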
Tab. 2 Variable importance in the random forest model

Variable   Mean decrease in accuracy (IncMSE)/%   Mean decrease in node impurity (IncNodePurity)
LST        14.28                                  241.36
Altitude   12.82                                  213.30
NDVI       3.43                                   60.15
MNDWI      3.77                                   82.89
Albedo     4.82                                   61.09
ISC        2.50                                   42.74
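The idea behind IncMSE, measuring how much the error grows when one variable is randomly permuted, can be illustrated with a simplified sketch. Everything here is hypothetical: the data are synthetic, `toy_model` is a stand-in for a fitted forest, and the calculation is done once on a single dataset, whereas the R implementation works tree by tree on out-of-bag data and scales the result differently, so the numbers are not comparable with Table 2.

```python
import random


def mse(model, rows, targets):
    """Mean square error of a model over a dataset."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, column, rng):
    """Percent increase in MSE after shuffling the values of one column."""
    baseline = mse(model, rows, targets)
    shuffled_values = [r[column] for r in rows]
    rng.shuffle(shuffled_values)
    permuted_rows = [r[:column] + [v] + r[column + 1:]
                     for r, v in zip(rows, shuffled_values)]
    return 100.0 * (mse(model, permuted_rows, targets) - baseline) / baseline


def toy_model(row):
    """Hypothetical stand-in for a fitted forest: uses column 0 only."""
    return 10.0 + 0.5 * row[0]


rng = random.Random(1)
# Synthetic samples: column 0 plays the role of LST (℃), column 1 of NDVI.
rows = [[rng.uniform(25.0, 45.0), rng.uniform(0.0, 1.0)] for _ in range(200)]
targets = [toy_model(r) + rng.gauss(0.0, 0.5) for r in rows]

lst_importance = permutation_importance(toy_model, rows, targets, 0, rng)
ndvi_importance = permutation_importance(toy_model, rows, targets, 1, rng)
```

Shuffling the column the model depends on inflates the error sharply, while shuffling a column the model ignores leaves it unchanged, which is the contrast that the ranking in Table 2 expresses.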

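4.3 Calculation results

The six independent variables for Beijing (LST, NDVI, Altitude, MNDWI, Albedo and ISC) were fed into the random forest model to produce a map of near-surface air temperature over Beijing (Fig. 6). Fig. 6 shows marked spatial variation: air temperature is high in the central urban area, showing an urban heat island; it falls gradually from the urban center to the suburbs, with farmland around the city typically 3 to 5 ℃ cooler than the urban area; the mountains are markedly cooler than farmland, and within the mountains air temperature falls as elevation rises. The retrieved air temperature is generally consistent with the distribution of air temperature at the stations, the maximum and minimum values are within a reasonable range and there are no outliers, so the map represents the air temperature distribution of Beijing well.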
Fig. 6 Near-surface air temperature in Beijing

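5 Conclusions

This study is the first to use the random forest method for remote sensing retrieval of near-surface air temperature in Beijing, and it demonstrates the usability of random forest in quantitative remote sensing and its advantage in air temperature retrieval. The results show that: ① the random forest model is suitable for retrieving near-surface air temperature, with an MAE of 0.80 ℃ and an RMSE of 1.06 ℃, which is better than the multiple linear regression model (MAE 0.86 ℃, RMSE 1.15 ℃); ② among the input parameters, LST has the greatest influence on retrieval accuracy, followed by elevation, and together they dominate the random forest model. Cloud-free summer Landsat/TM images of Beijing have been scarce in recent years, and air temperature data matching the satellite overpass time are hard to obtain, so this study is limited to a single date; subsequent work could use other remote sensing data for further analysis.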

The authors have declared that no competing interests exist.

[1] Mao K B, Tang H J, Wang X F, et al. Near-surface air temperature estimation from ASTER data based on neural network algorithm[J]. International Journal of Remote Sensing, 2008,29(20):6021-6028.

[2] Qi S H, Wang J B, Zhang Q Y, et al. Study on the estimation of air temperature from MODIS data[J]. Journal of Remote Sensing, 2005,9(5):570-575. (in Chinese)

[3] Zhu S Y, Zhang G X. Progress in near surface air temperature retrieved by remote sensing technology[J]. Advances in Earth Science, 2011,26(7):724-730. (in Chinese)

[4] Qu P Q, Shi R H, Liu K, et al. Discrimination of urban and rural meteorological stations based on remote sensing and BP artificial neural network[J]. Journal of Geo-Information Science, 2010,12(5):726-732. (in Chinese)

[5] Xu Y M, Qin Z H, Wan H X. Advances in the study of near surface air temperature retrieval from thermal remote sensing[J]. Remote Sensing for Land & Resources, 2011(1):9-14. (in Chinese)

[6] Zhao C, Nan Z, Cheng G. Methods for modelling of temporal and spatial distribution of air temperature at landscape scale in the southern Qilian mountains, China[J]. Ecological Modelling, 2005,189(1-2):209-220.

[7] Cresswell M P, Morse A P, Thomson M C, et al. Estimating surface air temperatures from Meteosat land surface temperatures using an empirical solar zenith angle model[J]. International Journal of Remote Sensing, 1999,20(6):1125-1132.

[8] Qu P Q, Shi R H, Liu C S, et al. The evaluation of MODIS data and geographic data for estimating near surface air temperature: A case of Anhui Province[J]. Remote Sensing for Land & Resources, 2011(4):78-82. (in Chinese)

[9] Prihodko L, Goward S N. Estimation of air temperature from remotely sensed surface observations[J]. Remote Sensing of Environment, 1997,60(3):335-346.

[10] Vancutsem C, Ceccato P, Dinku T, et al. Evaluation of MODIS land surface temperature data to estimate air temperature in different ecosystems over Africa[J]. Remote Sensing of Environment, 2010,114(2):449-465.

[11] Stisen S, Sandholt I, Norgaard A, et al. Estimation of diurnal air temperature using MSG SEVIRI data in West Africa[J]. Remote Sensing of Environment, 2007,110(2):262-274.

[12] Xu Y M, Qin Z H, Shen Y. Estimation of near surface air temperature from MODIS data in the Yangtze River Delta[J]. Transactions of the CSAE, 2011,27(9):63-68. (in Chinese)

[13] Sun Y, Wang J F, Zhang R H, et al. Air temperature retrieval from remote sensing data based on thermodynamics[J]. Theoretical & Applied Climatology, 2005,80(1):37-48.

[14] Breiman L. Random forests[J]. Machine Learning, 2001,45(1):5-32.

[15] Fang K N, Wu J B, Zhu J P, et al. A review of technologies on random forests[J]. Statistics & Information Forum, 2011,26(3):32-38. (in Chinese)

[16] Zhang L, Wang L L, Zhang X D, et al. The basic principle of random forest and its applications in ecology: A case study of Pinus yunnanensis[J]. Acta Ecologica Sinica, 2014,34(3):650-659. (in Chinese)

[17] Tian S H, Zhang X F. Random forest classification of land cover information of urban areas in arid regions based on TH-1 data[J]. Remote Sensing for Land & Resources, 2016,28(1):43-49. (in Chinese)

[18] Gleason C J, Im J. Forest biomass estimation from airborne LiDAR data using machine learning approaches[J]. Remote Sensing of Environment, 2012,125:80-91.

[19] Jiménez-Muñoz J C, Sobrino J A. A generalized single-channel method for retrieving land surface temperature from remote sensing data[J]. Journal of Geophysical Research, 2003,108(D22):2015-2023.

[20] Zou C C, Zhang Y S, Huang H H. Impacts of impervious surface area and landscape metrics on urban heat environment in Fuzhou City, China[J]. Journal of Geo-Information Science, 2014,16(3):490-498. (in Chinese)

[21] Arnold C L Jr, Gibbons C J. Impervious surface coverage: The emergence of a key environmental indicator[J]. Journal of the American Planning Association, 1996,62(2):243-258.

[22] Ridd M K. Exploring a V-I-S (Vegetation-Impervious Surface-Soil) model for urban ecosystem analysis through remote sensing[J]. International Journal of Remote Sensing, 1995,16(12):2165-2185.

[23] Xu Y M, Liu Y H. Study on the thermal environment and its relationship with impervious surface in Beijing city using TM image[J]. Ecology and Environmental Sciences, 2013,22(4):639-643. (in Chinese)

[24] Rouse J W, Haas R H, Schell J A, et al. Monitoring vegetation systems in the Great Plains with ERTS[A]. Third Earth Resources Technology Satellite-1 Symposium[C]. 1974:309-317.

[25] Xu H Q. A study on information extraction of water body with the modified normalized difference water index (MNDWI)[J]. Journal of Remote Sensing, 2005,9(5):589-595. (in Chinese)

[26] Ghulam A, Qin Q M. Calculation of ETM+ broadband albedos by radiative simulations[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2007,43(4):474-483. (in Chinese)
