Loess Landform Classification based on Random Forest
Cao Zetao (b. 1997), male, from Yangzhou, Jiangsu; master's student whose research focuses on DEM-based digital terrain analysis. E-mail: zetao_cao_1997@163.com
Received: 2019-05-23
Revision requested: 2019-12-09
Published online: 2020-05-18
Supported by
National Natural Science Foundation of China (41601411)
National Natural Science Foundation of China (41671389)
Priority Academic Program Development of Jiangsu Higher Education Institutions
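Landform classification plays an important role in guiding the scale and layout of human construction activities. However, traditional landform classification methods based on digital elevation models (DEMs) tend to rely on a narrow set of terrain factors and landform characteristics. This paper proposes a landform classification method based on watershed units that considers several aspects of each unit: statistics of basic terrain factors, statistics of terrain feature points and lines, small-watershed characteristics, and texture characteristics. The study first performs DEM-based hydrological analysis to divide the study area into small watersheds. Digital terrain analysis is then used to extract 29 features that describe watershed morphology, and feature selection and parameter calibration are carried out with the random forest (RF) algorithm. RF is an ensemble classifier built on decision trees; it handles high-dimensional data effectively and achieves high classification accuracy. Finally, training-set watersheds are selected to train the RF classifier, and the trained classifier is applied to the whole study area to classify its landforms and to study the pattern of landform differentiation. The experiment achieved good results in a typical loess landform region of the Loess Plateau in northern Shaanxi, China. The results show clear regional boundaries between different landforms, and specific landform types exhibit marked spatial clustering. Validated against manual interpretation, the classification accuracy reached 85%, with a Kappa coefficient of 0.83.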
Cao Zetao, Fang Zidong, Yao Jin, Xiong Liyang. Loess Landform Classification based on Random Forest[J]. Journal of Geo-information Science, 2020, 22(3): 452-463. DOI: 10.12082/dqxxkx.2020.190247
Landform classification is an important step toward revealing the mechanisms of surface matter flow and energy conversion, and it can inform the scale and layout of human construction activities. However, traditional landform classification methods based on Digital Elevation Models (DEMs) often use only a small number of topographic derivatives or landform characteristics, which yields insufficiently precise classification results. Object-oriented landform classification performs better, because reliable classification can be achieved by maximizing the homogeneity within objects and the heterogeneity between them, but how to set the conditions for object segmentation remains a challenge. In this paper, a geomorphological classification method based on watershed units is proposed. It accounts for several characteristics of each watershed unit: statistics of basic topographic factors, statistics of feature points and feature lines, basin characteristics, and texture characteristics. Firstly, DEM-based hydrological analysis was used to divide the study area into small basins, which served as the experimental units. Then, 29 features were extracted within each unit by digital terrain analysis to represent watershed morphology, and feature selection and parameter calibration were carried out with the Random Forest (RF) algorithm. RF is a supervised ensemble learning model that aggregates the outputs of many individual decision trees to reduce the variance that can lead to classification errors in a single tree. Finally, the watersheds in the training set were used to train the RF classifier, and the trained classifier was used to classify the landforms of the whole study area, on which basis we studied the spatial differentiation pattern of the landforms. The experiment achieved good results for the Loess Plateau in northern Shaanxi Province, one of the regions with the most severe soil erosion and the most fragile ecological environment in the world. Most of the region is covered by thick loess, and the terrain is strongly undulating. The results show that: (1) Validated against manual interpretation, the small-watershed-based classification of the study area reached an accuracy of 85% with a Kappa coefficient of 0.83. (2) All small watersheds were assigned to eight landform types. Watersheds of the same type showed obvious spatial aggregation, and boundaries and transitional zones existed between different types. (3) The different geomorphological regions reflect differing conditions of loess deposition and runoff erosion. Our findings suggest that combining the RF algorithm with DEM data can achieve good classification results.
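The feature-selection and training steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn and NumPy are available, the feature table is synthetic placeholder data (8 classes, 29 features), and the cut-off `top_k`, the number of trees, and the 70/30 split are hypothetical choices that the source does not state.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the watershed feature table (placeholder values only):
# 8 landform classes, 30 watersheds per class, 29 features per watershed.
n_classes, n_per_class, n_features = 8, 30, 29
y = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(y.size, n_features))
X[:, :5] += y[:, None] * 1.5  # only the first 5 features carry class signal

# Hold out part of the labelled watersheds, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: rank the 29 features by impurity-based importance.
ranker = RandomForestClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
top_k = 10  # hypothetical cut-off, not taken from the paper
selected = np.argsort(ranker.feature_importances_)[::-1][:top_k]

# Step 2: retrain on the selected features; the out-of-bag score gives an
# internal accuracy estimate that can guide parameter calibration.
clf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
clf.fit(X_train[:, selected], y_train)

# Step 3: evaluate on the held-out watersheds.
y_pred = clf.predict(X_test[:, selected])
accuracy = accuracy_score(y_test, y_pred)
kappa = cohen_kappa_score(y_test, y_pred)
print(f"OOB score: {clf.oob_score_:.3f}")
print(f"Held-out accuracy: {accuracy:.3f}, kappa: {kappa:.3f}")
```

With real data, `X` would hold the 29 features extracted per watershed and `y` the landform labels of the sample watersheds; the numbers printed here say nothing about the paper's results.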
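Tab. 1 Preliminary selection of small-watershed features
Type | Small-watershed features | Count |
---|---|---|
Basic terrain factor statistics | Mean elevation, elevation standard deviation, mean slope, slope standard deviation, mean relief, relief standard deviation, mean incision depth, incision depth standard deviation, mean plan curvature, plan curvature standard deviation, mean profile curvature, profile curvature standard deviation, mean aspect, aspect standard deviation | 14 |
Terrain feature point and line statistics | Peak point density, peak point elevation standard deviation, shoulder line density, shoulder line mean elevation, dissection degree, gully line density, gully line mean elevation | 7 |
Small-watershed characteristics | Relative elevation difference, gully depth, hypsometric integral, slope spectrum information entropy | 4 |
Texture characteristics | Texture contrast, texture angular second moment, texture information entropy, texture inverse difference moment | 4 |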
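A few of the features listed in Table 1 can be illustrated with commonly used textbook formulas. The sketch below is an assumption-laden illustration rather than the paper's definitions, which are not given here: the hypsometric integral uses the elevation-relief ratio approximation, the slope spectrum entropy uses a hypothetical 3-degree class width, and the texture measures come from a symmetric grey-level co-occurrence matrix (GLCM) with a hypothetical one-cell horizontal offset. All function names are illustrative.

```python
import math
from collections import Counter


def hypsometric_integral(elevations):
    """Elevation-relief ratio approximation: (mean - min) / (max - min)."""
    lo, hi = min(elevations), max(elevations)
    if hi == lo:
        raise ValueError("undefined for a perfectly flat watershed")
    mean = sum(elevations) / len(elevations)
    return (mean - lo) / (hi - lo)


def slope_spectrum_entropy(slopes_deg, class_width=3.0):
    """Shannon entropy (natural log) of the slope-class frequencies."""
    counts = Counter(int(s // class_width) for s in slopes_deg)
    n = len(slopes_deg)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def glcm_features(grid, levels, dx=1, dy=0):
    """Contrast, ASM, entropy and IDM from a normalised symmetric GLCM.

    `grid` is a 2-D list of integer grey levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = grid[r][c], grid[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1  # so that the matrix is symmetric
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("grid too small for the chosen offset")
    contrast = asm = entropy = idm = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total
            if p > 0:
                contrast += (i - j) ** 2 * p
                asm += p * p
                entropy -= p * math.log(p)
                idm += p / (1 + (i - j) ** 2)
    return {"contrast": contrast, "asm": asm, "entropy": entropy, "idm": idm}


# Tiny usage example on made-up values.
print(hypsometric_integral([0.0, 50.0, 100.0]))
print(slope_spectrum_entropy([1.0, 1.0, 4.0, 4.0]))
print(glcm_features([[0, 1], [1, 0]], levels=2))
```

In practice the elevation and slope values would be the DEM cells inside one watershed, and the grey levels would come from quantising a DEM-derived raster; the quantisation scheme is another choice left open here.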
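Tab. 2 Number of sample areas per geomorphological type
Landform type | Number of sample small watersheds |
---|---|
Sand dune and grass-flat basin | 30 |
Loess low hill | 30 |
Loess hill (mao) and gully | 30 |
Loess ridge (liang) hill and gully | 30 |
Loess residual tableland (yuan) hill and gully | 30 |
Rocky mountain | 30 |
Loess tableland (yuan) | 16 |
Loess terrace tableland (taiyuan) | 14 |
Total | 210 |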
Fig. 9 Spatial distribution of randomly sampled watersheds (used for accuracy verification)
Tab. 3 Confusion matrix between RF classification results and manual interpretation results
Manual interpretation (rows) / RF classification (columns) | T1 | T2 | T2/T1 | T3 | T4 | T4/T3 | T5 | T6 | T7 | T8 |
---|---|---|---|---|---|---|---|---|---|---|
T1 | 13 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
T2 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
T2/T1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
T3 | 0 | 1 | 0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 |
T4 | 0 | 1 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 |
T4/T3 | 0 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 0 |
T5 | 0 | 0 | 0 | 0 | 0 | 0 | 13 | 0 | 1 | 0 |
T6 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 13 | 0 | 0 |
T7 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 9 | 0 |
T8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
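The reported overall accuracy and Kappa coefficient can be checked directly against the confusion matrix in Table 3. The sketch below uses the standard Cohen's kappa formula and assumes that the mixed categories T2/T1 and T4/T3 are treated as separate classes; that treatment is an assumption, although it does reproduce the reported figures.

```python
# Confusion matrix from Table 3: rows are manual interpretation,
# columns are RF classification, in the order given by `labels`.
labels = ["T1", "T2", "T2/T1", "T3", "T4", "T4/T3", "T5", "T6", "T7", "T8"]
matrix = [
    [13, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 10, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 2, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 12, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 10, 0, 0, 0, 0, 0],
    [0, 0, 0, 3, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 13, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 2, 13, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 9, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
]

n = sum(map(sum, matrix))                        # number of validated watersheds
row_totals = [sum(row) for row in matrix]        # manual interpretation totals
col_totals = [sum(col) for col in zip(*matrix)]  # RF classification totals

# Observed agreement: share of watersheds on the diagonal.
p_o = sum(matrix[i][i] for i in range(len(labels))) / n
# Chance agreement expected from the row and column totals.
p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
kappa = (p_o - p_e) / (1 - p_e)

print(f"n = {n}, overall accuracy = {p_o:.2f}, kappa = {kappa:.2f}")
```

Here the chance agreement is p_e = 1230 / 10000 = 0.123, so kappa = (0.85 - 0.123) / (1 - 0.123), which rounds to 0.83.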
Tab. 4 Classification accuracies of different landforms
Landform type | Correctly classified | Misclassified | Classification accuracy/% |
---|---|---|---|
T1 | 13 | 1 | 92.9 |
T2 | 10 | 5 | 66.7 |
T3 | 12 | 3 | 80.0 |
T4 | 10 | 2 | 83.3 |
T5 | 13 | 3 | 81.3 |
T6 | 13 | 0 | 100.0 |
T7 | 9 | 1 | 90.0 |
T8 | 5 | 0 | 100.0 |
Total | 85 | 15 | 85.0 |
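The accuracy column of Table 4 appears to be the number of correctly classified watersheds divided by the sum of correct and misclassified ones. A short check under that assumption, with the counts copied from the table:

```python
# (correctly classified, misclassified) per landform type, from Table 4.
table4 = {
    "T1": (13, 1),
    "T2": (10, 5),
    "T3": (12, 3),
    "T4": (10, 2),
    "T5": (13, 3),
    "T6": (13, 0),
    "T7": (9, 1),
    "T8": (5, 0),
}
# Accuracy values as reported in Table 4, in percent.
reported = {"T1": 92.9, "T2": 66.7, "T3": 80.0, "T4": 83.3,
            "T5": 81.3, "T6": 100.0, "T7": 90.0, "T8": 100.0}


def accuracy_percent(correct, wrong):
    """Share of correctly classified watersheds, in percent."""
    return 100.0 * correct / (correct + wrong)


per_class = {name: accuracy_percent(c, w) for name, (c, w) in table4.items()}
total_correct = sum(c for c, _ in table4.values())
total_wrong = sum(w for _, w in table4.values())
overall = accuracy_percent(total_correct, total_wrong)

for name, value in per_class.items():
    print(f"{name}: computed {value:.2f}%, reported {reported[name]:.1f}%")
print(f"Total: {total_correct} correct, {total_wrong} wrong, {overall:.1f}%")
```

The computed values agree with the reported ones to within rounding; T5, for example, is 13/16 = 81.25%, which the table reports as 81.3.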