Performance Comparison of BP Artificial Neural Network and CART Decision Tree Model in Landslide Susceptibility Prediction
TIAN Naiman (田乃满, born 1996), male, from Tongliao, Inner Mongolia; PhD candidate whose research focuses on geological hazards and geographic information science. E-mail: tiannm@lreis.ac.cn
Received: 2019-12-11
Revision requested: 2020-04-03
Published online: 2021-02-25
Supported by
Strategic Priority Research Program of the Chinese Academy of Sciences, Class A (XDA23090301)
National Natural Science Foundation of China (41701458)
National Natural Science Foundation of China (41525010)
National Natural Science Foundation of China (41790443)
National Natural Science Foundation of China (41807291)
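Machine learning models are widely used in regional landslide susceptibility analysis. The choice of model affects the credibility, accuracy, and stability of the assessment results. Existing comparative studies of landslide susceptibility models focus on prediction accuracy, yet model stability and sensitivity to data volume are equally important when evaluating machine learning models. Taking the Caiyuan basin in Nanping City, Fujian Province, as the study area and Beichuan County in Mianyang City, Sichuan Province, as the verification area, this paper compares the BP (Back Propagation) artificial neural network model and the CART (Classification and Regression Tree) decision tree model in landslide susceptibility analysis from three aspects: prediction accuracy, stability, and sensitivity to data volume. The main conclusions are as follows: ① As the number of training samples is gradually increased, the prediction accuracy of the BP artificial neural network model grows faster. In the Caiyuan basin, when the number of training samples increases by 10 000, the prediction accuracy of the BP artificial neural network model rises by 5.22% and that of the CART decision tree model rises by 2.11%. ② The BP artificial neural network has higher prediction accuracy than the CART decision tree model and is more stable. Over 100 datasets, the mean validation-set prediction accuracy and the mean prediction accuracy on validation-set landslide samples are 81.60% and 84.86% for the BP artificial neural network model, higher than the 72.97% and 76.59% of the CART decision tree model. The corresponding standard deviations are 0.32% and 0.37% for the BP artificial neural network model, smaller than the 0.35% and 0.67% of the CART decision tree model. ③ The landslide-prone areas identified by the BP artificial neural network model are closer to the actual spatial distribution of landslides than those identified by the CART decision tree model. Finally, the verification experiment in Beichuan County showed the same pattern.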
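Tian Naiman, Lan Hengxing, Wu Yuming, Li Langping. Performance comparison of BP artificial neural network and CART decision tree model in landslide susceptibility prediction[J]. Journal of Geo-information Science, 2020, 22(12): 2304-2316. DOI: 10.12082/dqxxkx.2020.190766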
Machine learning models, such as artificial neural networks and decision trees, have been widely applied to regional landslide susceptibility analysis. Model selection affects the reliability and accuracy of the results, so a comprehensive evaluation of model performance is necessary. Previous studies of landslide susceptibility focused mainly on prediction accuracy. However, model stability and sensitivity to data volume reflect other important aspects of model performance. In this study, we compared the Back-Propagation (BP) artificial neural network and the Classification and Regression Tree (CART) decision tree model in landslide susceptibility prediction. We evaluated model performance from three aspects: sensitivity to data volume, prediction accuracy, and model stability. The Caiyuan basin in Fujian Province was taken as the study area, with 11 landslide-related factors selected. Beichuan County in Sichuan Province was taken as the verification area, with 12 landslide-related factors selected. First, both models were trained with different amounts of input data. With increasing data volume, the prediction accuracy of the BP artificial neural network increased faster than that of the CART model. Specifically, in the Caiyuan basin, the prediction accuracy of the BP artificial neural network and the CART decision tree model increased by 5.22% and 2.11%, respectively, for every additional 10 000 samples. In Beichuan County, the prediction accuracy of the two models increased by 4.88% and 3.40%, respectively. Second, 100 sets of training and validation data generated by random sampling were used to train and validate both models. For the Caiyuan basin, the mean prediction accuracy was 81.60% for the BP artificial neural network and 72.97% for the CART model, with standard deviations of 0.32% and 0.35%, respectively. For Beichuan County, the mean prediction accuracy of the two models was 77.45% and 72.61%, with standard deviations of 0.47% and 0.61%, respectively. Finally, landslide susceptibility maps were generated from both models. Compared with the map of actual landslide locations, the result of the BP artificial neural network was more consistent with the actual landslide distribution. Overall, our study shows that the BP artificial neural network is more sensitive to increases in data volume and has better stability and prediction accuracy than the CART model. It is worth noting, however, that the two models perform similarly when the data volume is small. The study provides a new perspective on model selection for landslide susceptibility analysis.
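The stability measure described in the abstract (mean and standard deviation of validation accuracy over 100 randomly sampled datasets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 70/30 split, the synthetic one-feature data, and the threshold classifier standing in for BP or CART are all hypothetical, and the names `repeated_split_accuracy`, `fit_threshold`, and `predict_threshold` are introduced here only for the example.

```python
import random
import statistics


def repeated_split_accuracy(samples, fit, predict, n_sets=100,
                            train_fraction=0.7, seed=0):
    """Validation accuracy of one model over n_sets random train/validation splits.

    train_fraction=0.7 is an assumption; the source does not state the split.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sets):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, valid = shuffled[:cut], shuffled[cut:]
        model = fit(train)
        hits = sum(predict(model, x) == y for x, y in valid)
        scores.append(hits / len(valid))
    return scores


# Hypothetical stand-in model: a one-feature threshold classifier.
def fit_threshold(train):
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    # Threshold halfway between the two class means.
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic samples (feature value, label); label 1 stands for "landslide".
data_rng = random.Random(42)
samples = [(data_rng.gauss(1.0, 1.0), 1) for _ in range(500)]
samples += [(data_rng.gauss(-1.0, 1.0), 0) for _ in range(500)]

scores = repeated_split_accuracy(samples, fit_threshold, predict_threshold)
print(f"mean accuracy = {statistics.mean(scores):.4f}")
print(f"standard deviation = {statistics.stdev(scores):.4f}")
```

A smaller standard deviation across the splits indicates a model whose accuracy depends less on which samples happen to fall in the training set, which is the sense in which the abstract calls one model more stable than the other.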
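Tab. 1 Data used in this study
Type | Scale or resolution | Source | Time |
---|---|---|---|
SPOT imagery | 2.5 m | Fujian Geological Environment Monitoring Center | June 2010 |
Terrain data | 5 m | | Before the landslide event |
Normalized difference vegetation index (NDVI) | 30 m | Landsat imagery | April 2010 |
Geological map | 1:500 000 | China Geological Survey data | Before the landslide event |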
Tab. 2 Principal components of landslide influence factors
Principal component | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 |
---|---|---|---|---|---|---|---|---|
Eigenvalue | 2.86 | 2.45 | 1.32 | 1.23 | 0.94 | 0.82 | 0.53 | 0.47 |
Contribution rate | 0.26 | 0.22 | 0.12 | 0.11 | 0.09 | 0.07 | 0.05 | 0.04 |
Cumulative contribution rate | 0.26 | 0.48 | 0.60 | 0.71 | 0.80 | 0.87 | 0.92 | 0.96 |
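The contribution rates in Tab. 2 can be reproduced from the eigenvalues with a short calculation. The sketch below assumes that the principal component analysis was run on 11 standardized factors (the number given for the Caiyuan basin), so that the eigenvalues sum to 11; the source does not state this explicitly, but the tabulated rates are consistent with it.

```python
# Eigenvalues of PC1..PC8 as listed in Tab. 2.
eigenvalues = [2.86, 2.45, 1.32, 1.23, 0.94, 0.82, 0.53, 0.47]

# ASSUMPTION: PCA on 11 standardized factors, so the total variance is 11.
n_factors = 11

# Contribution rate of each component = eigenvalue / total variance.
contribution = [ev / n_factors for ev in eigenvalues]

# Cumulative contribution rate = running sum of the unrounded rates.
cumulative = []
running = 0.0
for c in contribution:
    running += c
    cumulative.append(running)

for i, (c, cum) in enumerate(zip(contribution, cumulative), start=1):
    print(f"PC{i}: contribution = {c:.2f}, cumulative = {cum:.2f}")
```

Under this assumption the individual rates round to the tabulated values. The cumulative value for PC8 comes out near 0.965, which rounds differently from a running sum of the already rounded rates (0.96), so the last digit of the cumulative row depends on where the rounding is applied.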
Fig. 8 Prediction accuracy of the CART decision tree model and the BP artificial neural network
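Tab. 3 Prediction accuracy and standard deviation on the validation sets (%)
Metric | Model | Mean | 95% confidence interval of the mean | Standard deviation | 95% confidence interval of the standard deviation |
---|---|---|---|---|---|
Validation-set prediction accuracy | BP | 81.60 | (81.53, 81.66) | 0.32 | (0.28, 0.38) |
Validation-set prediction accuracy | CART | 72.97 | (72.90, 73.04) | 0.35 | (0.31, 0.40) |
Validation-set positive-sample prediction accuracy | BP | 84.86 | (84.78, 84.93) | 0.37 | (0.33, 0.43) |
Validation-set positive-sample prediction accuracy | CART | 76.59 | (76.45, 76.72) | 0.67 | (0.59, 0.78) |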
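The confidence intervals in Tab. 3 can be approximated from the reported means and standard deviations. The sketch below is one plausible reconstruction, not the authors' documented procedure: it assumes n = 100 datasets (as stated in the abstract), a normal approximation for the interval of the mean, and a chi-square interval for the standard deviation, with the chi-square quantile obtained from the Wilson-Hilferty approximation so that only the Python standard library is needed. The function names are introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean of n results."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sd / sqrt(n)
    return mean - half, mean + half


def chi2_quantile(p, k):
    """Wilson-Hilferty approximation to the chi-square quantile, k degrees of freedom."""
    z = NormalDist().inv_cdf(p)
    return k * (1 - 2 / (9 * k) + z * sqrt(2 / (9 * k))) ** 3


def sd_ci(sd, n, level=0.95):
    """Chi-square confidence interval for a standard deviation (assumes normality)."""
    k = n - 1
    alpha = 1 - level
    lower = sd * sqrt(k / chi2_quantile(1 - alpha / 2, k))
    upper = sd * sqrt(k / chi2_quantile(alpha / 2, k))
    return lower, upper


N_SETS = 100  # ASSUMPTION: one accuracy value per dataset, 100 datasets

# (metric, model, mean %, standard deviation %) from Tab. 3.
rows = [
    ("validation accuracy", "BP", 81.60, 0.32),
    ("validation accuracy", "CART", 72.97, 0.35),
    ("positive-sample accuracy", "BP", 84.86, 0.37),
    ("positive-sample accuracy", "CART", 76.59, 0.67),
]

for metric, model, mean, sd in rows:
    m_lo, m_hi = mean_ci(mean, sd, N_SETS)
    s_lo, s_hi = sd_ci(sd, N_SETS)
    print(f"{metric:26s} {model:5s} mean CI = ({m_lo:.2f}, {m_hi:.2f})  "
          f"sd CI = ({s_lo:.2f}, {s_hi:.2f})")
```

Under these assumptions the computed bounds should land within about 0.01 of the tabulated ones; small gaps are expected because the inputs are already rounded to two decimals. The intervals of the mean for BP and CART do not overlap for either metric, which supports reading the accuracy difference as more than sampling noise.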