Journal of Geo-information Science
Random Forest Classification of Landsat 8 Imagery for the Complex Terrain Area based on the Combination of Spectral, Topographic and Texture Information
Received date: 2018-07-27
Request revised date: 2018-12-27
Online published: 2019-03-15
Supported by
Natural Science Fund Project of Qinghai Science and Technology Department, No. 2016-ZJ-907
National Natural Science Foundation of China, No. 41550003
Random forest classification has become an effective machine learning method for remote sensing image classification. Combining Landsat satellite data with the random forest method makes it possible to build long time series for complex terrain areas and to study their land use/land cover change. Using multi-spectral Landsat 8 OLI data, this paper applied random forest classification (RFC) to map the land use types of the topographically complex Huangshui basin in Qinghai Province. To account for the characteristics of complex terrain, the study area was divided into different geographical regions. Topographic parameters were then selected, and an optimal feature set was constructed for each region by extracting spectral and texture information from the Landsat 8 data. The objective of this paper was to explore the applicability of the random forest method to land use classification in regions with complex topography. The results showed that RFC with Landsat 8 OLI data can be used effectively to obtain the land use types of the Huangshui basin. The combination of spectral, topographic, and texture information performed differently in different areas. In the middle and high mountain areas, combining spectral and topographic information gave the best results, with an overall accuracy of 91.33% and a Kappa coefficient of 0.886. In the shallow mountain areas and the valley plain, however, the best results came from combining spectral, topographic, and texture information, with overall accuracies of 92.09% and 87.85% and Kappa coefficients of 0.902 and 0.859, respectively. Using the random forest algorithm to optimize the selection of texture features allows land use information to be extracted quickly while maintaining accuracy.
Random forest classification combined with multi-source information can classify land use types effectively, and the results provide a reference for updating land use status and for social and economic development in the study area.
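The workflow summarized in the abstract (stack per-pixel features, train a random forest on training samples, score it on validation samples) can be sketched roughly as follows. This is a minimal illustration using scikit-learn's `RandomForestClassifier`, not the authors' implementation: the data are synthetic, and the feature count, class count, 50/50 split, and `n_estimators=200` are all assumptions, since the source does not state the software or settings used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a per-pixel feature stack (NOT the paper's data):
# 10 columns, e.g. 6 spectral bands + NDVI + elevation + slope + aspect.
n_classes, n_per_class, n_features = 5, 60, 10
centers = rng.normal(0.0, 3.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_class, n_features))
               for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Random split into training and validation samples (hypothetical 50/50 split).
order = rng.permutation(len(y))
train_idx, valid_idx = order[: len(y) // 2], order[len(y) // 2:]

# n_estimators=200 is an assumed setting; the source does not state one.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[train_idx], y[train_idx])
pred = clf.predict(X[valid_idx])

overall_accuracy = accuracy_score(y[valid_idx], pred)
kappa = cohen_kappa_score(y[valid_idx], pred)
# One non-negative weight per feature; the weights sum to 1.
importances = clf.feature_importances_
```

The `importances` array is one way to rank candidate features when comparing feature sets, which is the kind of comparison the accuracy tables in this article report.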
MA Huijuan, GAO Xiaohong, GU Xiaotian. Random Forest Classification of Landsat 8 Imagery for the Complex Terrain Area based on the Combination of Spectral, Topographic and Texture Information[J]. Journal of Geo-information Science, 2019, 21(3): 359-371. DOI: 10.12082/dpxxkx.2019.180346
Fig. 1 Location of the study area and distribution of sampling sites
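Tab. 1 Image data information
Scene ID | Satellite | Sensor | Acquisition date | Resolution (m) | Bands
---|---|---|---|---|---
LC81330342016211LGN00 | Landsat 8 | Operational Land Imager (OLI) | 2016-07-29 | 30 | 2-7
LC81310352016213LGN00 | Landsat 8 | Operational Land Imager (OLI) | 2016-07-31 | 30 | 2-7
LC81320342015233LGN00 | Landsat 8 | Operational Land Imager (OLI) | 2015-08-21 | 30 | 2-7
LC81320352014198LGN00 | Landsat 8 | Operational Land Imager (OLI) | 2014-07-17 | 30 | 2-7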
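The scene identifiers in Tab. 1 follow the pre-collection Landsat naming pattern `LXSPPPRRRYYYYDDDGSIVV` (sensor, satellite, WRS path, WRS row, year, day of year, ground station, version), so the acquisition dates can be cross-checked against the IDs. The helper below is a small sketch for that check; the function name and the returned dictionary keys are hypothetical.

```python
from datetime import date, timedelta


def parse_scene_id(scene_id: str) -> dict:
    """Split a pre-collection Landsat scene ID (LXSPPPRRRYYYYDDDGSIVV)."""
    if len(scene_id) != 21:
        raise ValueError(f"expected 21 characters, got {len(scene_id)}")
    year = int(scene_id[9:13])
    day_of_year = int(scene_id[13:16])
    return {
        "sensor": scene_id[1],            # "C" = combined OLI/TIRS
        "satellite": int(scene_id[2]),    # 8 = Landsat 8
        "path": int(scene_id[3:6]),       # WRS-2 path
        "row": int(scene_id[6:9]),        # WRS-2 row
        # Day 1 of the year is 1 January, hence the "- 1".
        "acquired": date(year, 1, 1) + timedelta(days=day_of_year - 1),
        "station": scene_id[16:19],
        "version": scene_id[19:21],
    }


scene_ids = [
    "LC81330342016211LGN00",
    "LC81310352016213LGN00",
    "LC81320342015233LGN00",
    "LC81320352014198LGN00",
]
acquired = {s: parse_scene_id(s)["acquired"].isoformat() for s in scene_ids}
```

Decoding the day-of-year fields this way reproduces the four acquisition dates listed in Tab. 1.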
Fig. 2 Schematic diagram of the random forest classification process[8]
Fig. 3 Spatial distribution of training and validation samples of land use types in the Huangshui basin
Tab. 2 Statistics of training and validation samples from each geographical region
Sample set | Geographical region | Irrigated land | Dry land | Forest land | High-coverage grassland | Medium-coverage grassland | Low-coverage grassland | Water | Urban-rural, industrial-mining and residential land | Unused land | Total
---|---|---|---|---|---|---|---|---|---|---|---
Training | Middle-high mountain area | - | - | 37 | 43 | 21 | - | 17 | - | 31 | 149
Training | Shallow mountain (loess hilly) area | 25 | 68 | 46 | 34 | 20 | - | 30 | 29 | - | 252
Training | Valley plain area | 35 | 39 | 29 | 27 | 28 | 30 | 30 | 55 | 25 | 298
Training | Total | 60 | 107 | 112 | 104 | 69 | 59 | 77 | 55 | 56 | 699
Validation | Middle-high mountain area | - | - | 50 | 72 | 29 | - | 28 | - | 36 | 215
Validation | Shallow mountain (loess hilly) area | 30 | 80 | 61 | 40 | 35 | - | 25 | 40 | - | 311
Validation | Valley plain area | 50 | 49 | 40 | 35 | 35 | 42 | 35 | 65 | 30 | 381
Validation | Total | 80 | 129 | 151 | 147 | 99 | 42 | 88 | 105 | 66 | 907
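Tab. 3 Characteristics of the feature parameters and their significance for classification
Feature group | Feature parameter | Characteristics and significance
---|---|---
Landsat 8 OLI multi-spectral bands 2-7 | Band 2, blue | Penetrates water well and so reveals more underwater information; separates soil from vegetation and supports analysis of changes in land use structure.
Landsat 8 OLI multi-spectral bands 2-7 | Band 3, green | Captures the green reflectance peak of vegetation, which lies between the two chlorophyll absorption bands; improves the ability to discriminate vegetation.
Landsat 8 OLI multi-spectral bands 2-7 | Band 4, red | Main chlorophyll absorption band; enhances the contrast between vegetated and non-vegetated surfaces and among vegetation of the same type; reflects chlorophyll absorption and plant health; used to distinguish plant species and vegetation cover.
Landsat 8 OLI multi-spectral bands 2-7 | Band 5, near infrared | Most sensitive to differences among vegetation classes, so it can separate vegetation types; lies in a region of strong water absorption, so water bodies have sharp outlines and are easy to separate from other features.
Landsat 8 OLI multi-spectral bands 2-7 | Band 6, shortwave infrared 1 | Sensitive to the moisture content of vegetation and soil; can separate snow from cloud.
Landsat 8 OLI multi-spectral bands 2-7 | Band 7, shortwave infrared 2 | Can be used to distinguish the main rock types; lies in a strong water absorption band, so water bodies appear black in this band; sensitive to plant moisture.
Index information | NDVI | Normalized difference vegetation index, also used as a biomass indicator; separates vegetation from water and soil; the best indicator of vegetation growth and vegetation cover.
Index information | NDBI | Normalized difference built-up index; exploits the fact that impervious surfaces reflect more strongly in the mid-infrared than in the near infrared; helps extract urban-rural, industrial-mining and residential land.
Index information | MNDWI | Modified normalized difference water index; applies a modified normalized difference to specific image bands to highlight water bodies.
Topographic information | DEM | Digital elevation model, from which slope and aspect can be derived; topography influences the distribution pattern of land use/land cover types in the study area.
Texture information | Texture features | Based mainly on the gray-level co-occurrence matrix (GLCM), a common method that describes texture through the spatial correlation of gray levels; reflects the uniformity of the gray-level distribution and the coarseness of the texture.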
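The three indices in Tab. 3 are all normalized differences of two bands. The sketch below uses the commonly cited definitions: NDVI = (NIR - Red)/(NIR + Red), NDBI = (SWIR1 - NIR)/(SWIR1 + NIR), and MNDWI = (Green - SWIR1)/(Green + SWIR1). Mapping these onto OLI bands 3 to 6 is an assumption here, because the source does not list the exact bands used, and the reflectance values are invented for illustration.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) element-wise, with NaN where a + b == 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out


# Hypothetical surface reflectances for two pixels: [vegetation, water].
green = np.array([0.08, 0.06])   # OLI band 3
red = np.array([0.05, 0.04])     # OLI band 4
nir = np.array([0.45, 0.02])     # OLI band 5
swir1 = np.array([0.20, 0.01])   # OLI band 6

ndvi = normalized_difference(nir, red)        # high for dense vegetation
ndbi = normalized_difference(swir1, nir)      # high for built-up surfaces
mndwi = normalized_difference(green, swir1)   # high for open water
```

With these made-up values the vegetation pixel has a high NDVI and a negative MNDWI, while the water pixel shows the opposite pattern, which is the separation the table describes.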
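Tab. 4 Feature parameters extracted for each geographical subregion
Geographical subregion (elevation, m) | Feature information | Feature parameters
---|---|---
Middle-high mountain area (>3200) | Spectral | B, G, R, NIR, SWIR1, SWIR2, NDVI
Middle-high mountain area (>3200) | Topographic | Elevation, slope, aspect
Loess hilly area (2600~3200) | Spectral | B, G, R, NIR, SWIR1, SWIR2, NDVI, NDBI, MNDWI
Loess hilly area (2600~3200) | Texture | Mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, correlation
Loess hilly area (2600~3200) | Topographic | Elevation, slope, aspect
Valley plain area (<2600) | Spectral | B, G, R, NIR, SWIR1, SWIR2, NDVI, NDBI, MNDWI, PCA1, PCA2
Valley plain area (<2600) | Texture | Mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, correlation
Valley plain area (<2600) | Topographic | Elevation, slope, aspect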
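The elevation thresholds in Tab. 4 amount to a simple zoning rule applied before classification. A minimal sketch of that rule follows; the function and dictionary names are hypothetical, and assigning elevations of exactly 2600 m and 3200 m to the loess hilly area is an assumption read from the "2600~3200" range, since the source does not say how boundary values were handled.

```python
def geographic_region(elevation_m: float) -> str:
    """Assign an elevation (m) to one of the three subregions of Tab. 4."""
    if elevation_m > 3200:
        return "middle-high mountain"
    if elevation_m >= 2600:   # assumed: both boundaries fall in this zone
        return "loess hilly"
    return "valley plain"


# Feature groups listed for each subregion in Tab. 4.
FEATURE_GROUPS = {
    "middle-high mountain": ("spectral", "topographic"),
    "loess hilly": ("spectral", "texture", "topographic"),
    "valley plain": ("spectral", "texture", "topographic"),
}

# Example: look up the feature groups for a pixel at 2850 m.
region = geographic_region(2850)
groups = FEATURE_GROUPS[region]
```

In practice the same rule would be applied to every DEM cell to build a mask per subregion, and each mask would then be classified with its own feature set.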
Tab. 5 Accuracy assessment of sample classification for the classification feature sets in the middle-high mountain area (%)
Land use type | 6MS PA | 6MS UA | 6MS+NDVI PA | 6MS+NDVI UA | 6MS+DEM PA | 6MS+DEM UA | 6MS+DEM+NDVI PA | 6MS+DEM+NDVI UA
---|---|---|---|---|---|---|---|---
Forest land | 84.62 | 97.06 | 84.62 | 97.06 | 85.71 | 93.33 | 85.71 | 95.45
High-coverage grassland | 94.07 | 86.72 | 95.76 | 86.92 | 92.65 | 88.73 | 94.12 | 88.89
Medium-coverage grassland | 85.71 | 87.50 | 85.71 | 87.50 | 96.30 | 83.87 | 96.30 | 83.87
Water | 92.73 | 94.44 | 90.91 | 94.34 | 92.31 | 94.70 | 92.31 | 97.70
Unused land | 93.48 | 91.49 | 93.48 | 91.49 | 88.46 | 92.00 | 88.46 | 92.00
Overall accuracy | 90.46 | | 90.75 | | 90.82 | | 91.33 |
Kappa coefficient | 0.876 | | 0.879 | | 0.880 | | 0.886 |
Note: PA, producer's accuracy; UA, user's accuracy; 6MS denotes bands 2-7 of the multi-spectral image; SL denotes slope; AS denotes aspect.
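The measures reported in these accuracy tables (producer's accuracy, user's accuracy, overall accuracy, Kappa coefficient) are all derived from a confusion matrix of validation samples. The sketch below shows the standard calculations on a made-up 3-class matrix; the numbers are hypothetical, not the paper's, and the layout (rows are reference classes, columns are mapped classes) is a convention assumed for this example.

```python
def accuracy_measures(matrix):
    """Compute accuracy measures from a square confusion matrix.

    matrix[i][j] = number of validation samples whose reference class is i
    and whose mapped (predicted) class is j.
    """
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    row_totals = [sum(row) for row in matrix]          # reference totals
    col_totals = [sum(col) for col in zip(*matrix)]    # mapped totals

    producers = [matrix[i][i] / row_totals[i] for i in range(k)]
    users = [matrix[i][i] / col_totals[i] for i in range(k)]

    observed = sum(matrix[i][i] for i in range(k)) / n  # overall accuracy
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return producers, users, observed, kappa


# Hypothetical confusion matrix for three classes (150 validation samples).
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
pa, ua, oa, kappa = accuracy_measures(example)
# oa is 129/150 = 0.86 and kappa is (0.86 - 1/3) / (2/3) = 0.79
```

Kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance given the class totals.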
Fig. 4 Classification results from the optimal feature set 6MS+SL+AS+NDVI in the middle-high mountain area
Tab. 6 Accuracy assessment of sample classification for the classification feature sets in the loess hilly area (%)
Land use type | 6MS+SL+AS PA | 6MS+SL+AS UA | 6MS+SL+AS+NDVI PA | 6MS+SL+AS+NDVI UA | 6MS+SL+AS+OIF+TXT PA | 6MS+SL+AS+OIF+TXT UA
---|---|---|---|---|---|---
Irrigated land | 90.91 | 95.00 | 90.91 | 95.24 | 90.91 | 95.24
Dry land | 94.74 | 88.34 | 94.74 | 91.14 | 94.74 | 90.57
Forest land | 91.78 | 97.10 | 92.47 | 96.43 | 92.47 | 96.43
High-coverage grassland | 88.46 | 92.00 | 88.46 | 90.20 | 94.23 | 96.08
Medium-coverage grassland | 94.83 | 85.94 | 97.10 | 84.06 | 96.55 | 90.32
Water | 88.89 | 94.12 | 88.89 | 94.12 | 91.67 | 80.49
Urban-rural, industrial-mining and residential land | 83.93 | 85.45 | 80.36 | 90.00 | 78.57 | 89.80
Overall accuracy | 91.54 | | 91.91 | | 92.09 |
Kappa coefficient | 0.895 | | 0.900 | | 0.902 |
Note: PA, producer's accuracy; UA, user's accuracy; OIF denotes the index information for the loess hilly area (NDVI, NDBI, MNDWI); TXT denotes the 8 texture features extracted from the multi-spectral bands.
Fig. 5 Classification results from the optimal feature set 6MS+SL+AS+OIF+TXT in the loess hilly area
Tab. 7 Accuracy assessment of sample classification for the classification feature sets in the valley plain area (%)
Land use type | 6MS+SL+AS PA | 6MS+SL+AS UA | 6MS+SL+AS+NDVI PA | 6MS+SL+AS+NDVI UA | 6MS+SL+AS+OIF+TXT PA | 6MS+SL+AS+OIF+TXT UA
---|---|---|---|---|---|---
Irrigated land | 88.64 | 87.64 | 88.64 | 89.66 | 92.05 | 94.19
Dry land | 94.94 | 91.18 | 95.45 | 90.00 | 98.48 | 86.67
Forest land | 72.17 | 73.16 | 72.17 | 76.67 | 77.83 | 78.75
High-coverage grassland | 75.00 | 73.18 | 75.00 | 78.18 | 75.00 | 73.75
Medium-coverage grassland | 84.44 | 95.00 | 84.44 | 95.00 | 93.33 | 90.00
Low-coverage grassland | 94.83 | 84.62 | 94.83 | 82.09 | 89.66 | 77.61
Water | 94.70 | 98.08 | 94.70 | 98.08 | 94.70 | 92.73
Urban-rural, industrial-mining and residential land | 95.45 | 87.50 | 94.32 | 90.22 | 90.91 | 86.96
Unused land | 76.67 | 80.95 | 73.33 | 76.19 | 70.00 | 75.00
Overall accuracy | 87.63 | | 87.63 | | 87.85 |
Kappa coefficient | 0.857 | | 0.857 | | 0.859 |
Note: PA, producer's accuracy; UA, user's accuracy.
Fig. 6 Classification results from the optimal feature set 6MS+SL+AS+OIF+TXT in the valley plain area
Tab. 8 Accuracy assessment of classification with optimized selection of texture features in the loess hilly area (%)
Land use type | 6MS+DEM+NDVI+TXT(8) PA | 6MS+DEM+NDVI+TXT(8) UA | 6MS+DEM+NDVI+TXT(Var) PA | 6MS+DEM+NDVI+TXT(Var) UA
---|---|---|---|---
Irrigated land | 90.91 | 95.24 | 90.91 | 97.56
Dry land | 94.74 | 89.44 | 94.74 | 91.14
Forest land | 92.47 | 95.74 | 94.52 | 97.18
High-coverage grassland | 90.38 | 97.92 | 94.23 | 98.00
Medium-coverage grassland | 98.28 | 90.48 | 96.55 | 87.50
Water | 91.67 | 80.49 | 91.67 | 89.19
Urban-rural, industrial-mining and residential land | 76.79 | 89.58 | 78.57 | 84.62
Overall accuracy | 91.73 | | 91.54 |
Kappa coefficient | 0.898 | | 0.895 |
Note: PA, producer's accuracy; UA, user's accuracy; TXT(8) denotes the 8 texture features extracted from the multi-spectral bands; TXT(Var) denotes the variance texture computed from the multi-spectral band texture features.
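The texture features compared here, including the variance texture TXT(Var), are statistics of a gray-level co-occurrence matrix (GLCM). The sketch below computes a symmetric, normalized GLCM for a single pixel offset and the eight measures named in this article, using one common set of definitions. It is an illustration only: the source gives neither the window size, the gray-level quantization, nor the offset used, so those choices and the tiny 4-level test image are assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1   # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_measures(p):
    """Eight GLCM statistics; p must be symmetric and sum to 1."""
    n = range(len(p))
    cells = [(i, j, p[i][j]) for i in n for j in n]
    mean = sum(i * v for i, j, v in cells)
    variance = sum((i - mean) ** 2 * v for i, j, v in cells)
    covariance = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": variance,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
        "second_moment": sum(v * v for i, j, v in cells),
        # Undefined for a perfectly uniform window (zero variance).
        "correlation": covariance / variance if variance > 0 else float("nan"),
    }


# Small 4-level test image (hypothetical), horizontal neighbor offset.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
features = texture_measures(p)
```

In an image-wide workflow these statistics would be computed in a moving window around every pixel of each band, producing one texture layer per statistic.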
Tab. 9 Accuracy assessment of classification with optimized selection of texture features in the valley plain area (%)
Land use type | 6MS+DEM+NDVI+TXT(8) PA | 6MS+DEM+NDVI+TXT(8) UA | 6MS+DEM+NDVI+TXT(Var) PA | 6MS+DEM+NDVI+TXT(Var) UA | 6MS+TXT(Var)+PCA1 PA | 6MS+TXT(Var)+PCA1 UA | 6MS+TXT(Var)+PCA2 PA | 6MS+TXT(Var)+PCA2 UA
---|---|---|---|---|---|---|---|---
Irrigated land | 90.91 | 96.39 | 88.64 | 88.65 | 90.59 | 86.52 | 90.59 | 86.52
Dry land | 96.97 | 81.01 | 95.45 | 88.73 | 84.29 | 90.77 | 84.29 | 90.77
Forest land | 77.83 | 74.71 | 78.17 | 70.59 | 70.00 | 80.00 | 70.00 | 71.30
High-coverage grassland | 75.00 | 85.00 | 80.00 | 94.12 | 70.77 | 88.89 | 70.77 | 80.00
Medium-coverage grassland | 91.11 | 88.70 | 93.33 | 95.45 | 85.45 | 87.04 | 83.64 | 90.20
Low-coverage grassland | 89.66 | 76.47 | 89.66 | 80.00 | 76.19 | 73.85 | 77.78 | 75.38
Water | 92.30 | 92.73 | 94.70 | 89.47 | 98.08 | 78.46 | 70.00 | 77.61
Urban-rural land | 89.77 | 86.81 | 92.05 | 88.04 | 87.95 | 73.00 | 87.95 | 70.19
Unused land | 76.67 | 70.00 | 78.00 | 73.33 | 82.50 | 75.79 | 75.00 | 72.22
Overall accuracy | 86.78 | | 87.42 | | 80.44 | | 80.85 |
Kappa coefficient | 0.847 | | 0.854 | | 0.774 | | 0.779 |
Note: PA, producer's accuracy; UA, user's accuracy; PCA1 and PCA2 denote the first and second principal components after principal component transformation; urban-rural land stands for urban-rural, industrial-mining and residential land.
Fig. 7 Land use information extraction results in the Huangshui basin based on the random forest method
The authors have declared that no competing interests exist.