The Random Forest Classification of Wetland from GF-2 Imagery Based on the Optimized Feature Space

ZHAN Guoqi; YANG Guodong; WANG Fengyan; XIN Xiuwen; GUO Ce; ZHAO Qiang

doi:10.12082/dqxxkx.2018.180119

Journal of Geo-information Science, 2018, Vol. 20, Issue 10: 1520-1528

College of Geo-exploration Science and Technology, Jilin University, Changchun 130012, China

*Corresponding author: YANG Guodong, E-mail: gqzhan16@mails.jlu.edu.cn

Received date: 2018-03-01

Request revised date: 2018-07-25

Online published: 2018-10-17

Supported by

National Natural Science Foundation of China, No.41472243.

Copyright

All rights reserved by the Editorial Office of Journal of Geo-information Science

Abstract

Seasonal vegetation dynamics and hydrological fluctuations make wetland classification from remote sensing images difficult. In this paper, a preprocessed GF-2 image of eastern Tongyu County, Baicheng City, Jilin Province, was classified by Random Forest (RF) with an optimized feature space. The method has two steps. The first step performs multi-scale segmentation of the image and extracts object features. Because the segmentation scale is often chosen subjectively, this paper determines the optimal segmentation scale with an improved global optimal segmentation method. The second step, based on the optimal segmentation, optimizes the feature space of the RF classifier according to feature importance to obtain the best RF classification result. This result is then compared with the results of the K-NN, SVM, and CART algorithms, run with the same data, segmentation scale, training samples, and feature space, and with the result of the RF algorithm using the unoptimized feature space. The overall accuracy and Kappa coefficient of the RF algorithm with the optimized feature space are 93.038% and 0.9177, respectively. The overall accuracies of K-NN, SVM, and CART are 83.357%, 78.068%, and 77.136%, respectively, and that of the RF algorithm with the unoptimized feature space is 90.937%. Compared with K-NN, SVM, and CART, the RF algorithm classifies GF-2 wetland imagery more accurately. Optimizing the feature space further improves the accuracy of the RF algorithm, which can play an important role in wetland resource management.

Key words: GF-2 imagery; object-oriented; random forest; wetland classification; optimal segmentation scale; feature space optimization

Cite this article

ZHAN Guoqi, YANG Guodong, WANG Fengyan, XIN Xiuwen, GUO Ce, ZHAO Qiang. The Random Forest Classification of Wetland from GF-2 Imagery Based on the Optimized Feature Space[J]. Journal of Geo-information Science, 2018, 20(10): 1520-1528. DOI: 10.12082/dqxxkx.2018.180119

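1 Introduction

Wetlands are the most species-rich ecosystems and among the most important living environments for humans^[1]. In recent years, human activities have damaged some wetlands in China. In northwestern Jilin Province, for example, wetland has been reclaimed as farmland and salinization has kept intensifying, so the functions of the wetlands have continued to degrade. Wetland monitoring is therefore essential. Studies have shown that satellite remote sensing imagery provides important information for wetland monitoring. However, because of seasonal vegetation dynamics and hydrological fluctuations, classifying wetlands from remote sensing imagery is difficult, and its accuracy depends on many factors, such as the data source and the classification method.

In terms of data sources, foreign satellite data such as Landsat optical imagery, MODIS, SPOT, and RADARSAT have been applied reliably to land cover classification and wetland classification^[2,3,4,5], and Chinese scholars have also used these data in many wetland classification studies^[6,7,8]. China's Gaofen-2 (GF-2) remote sensing satellite provides 0.8 m panchromatic and 3.2 m multispectral image data, but it has not yet been widely studied or applied in wetland classification. Compared with low- and medium-resolution imagery, high-resolution data such as GF-2 present the geometric information of geographic objects in greater detail. To make full use of the spectral, textural, geometric, and spatial information in high-resolution imagery, the object-oriented image analysis strategy has been proposed and widely applied^[9,10,11,12].

Segmentation is one of the key steps of object-oriented image analysis and is important for improving classification accuracy^[13], and segmentation quality is closely related to the segmentation scale parameter. Qiao Ting et al.^[14] applied NDVI to multi-scale segmentation to improve the segmentation result, but this approach requires repeated trials, is inefficient, and offers no quantitative evaluation of segmentation quality. To address this problem, scholars in China and abroad have proposed object-oriented models for calculating the global optimal segmentation scale^[15,16,17]. Yin Ruijuan et al.^[18] further improved the global optimal segmentation algorithm: the principal component rasters obtained from a principal component transformation serve as the basis for image segmentation, the eigenvalue percentage of each principal component serves as its weight both in segmentation and in the calculation of the segmentation quality score, and cubic spline interpolation is used to calculate the optimal segmentation scale. This improvement broadened the applicability of the global optimal segmentation scale algorithm.

Classification methods applied after segmentation include random forest (RF), nearest neighbor, support vector machine (SVM), Bayes, decision tree (CART), and neural network algorithms. Among them, the random forest algorithm, a machine learning classifier, has shown high classification speed, high accuracy, and good stability in remote sensing image classification^[19]. Studies have shown that random forest can handle high-dimensional data and suits large data volumes; its advantages in speed, accuracy, and stability are most evident in the classification of high-dimensional data^[20,21,22,23,24]. In recent years, scholars in China and abroad have therefore gradually introduced the random forest algorithm into wetland classification research. For example, Liu Shu et al.^[25] proposed a multi-objective genetic random forest combined feature selection algorithm (MOGARF) and used it for object-oriented wetland classification in the Nanweng River basin; Liu Jiafu et al.^[26] used the Relief-F (relevant features) algorithm to optimize the features of a random forest model and studied the extraction of coastal wetland information in the Yellow River Estuary; Li Fangfang et al.^[27] proposed an object-oriented random forest method for wetland vegetation classification; and Masoud et al.^[28] used the random forest algorithm with SAR data from three different sensors over Newfoundland, Canada, for wetland classification. These studies show that the composition of the feature space strongly affects the classification accuracy of the random forest algorithm.

The goal of this study is to establish a method suited to wetland classification from GF-2 remote sensing imagery. The study focuses on obtaining the optimal segmentation scale with an improved global optimal segmentation method, then optimizing the feature space and using the random forest algorithm to obtain a high-accuracy classification result. The result is compared with those of three object-oriented classification algorithms, K-NN, SVM, and CART, to analyze the performance of the random forest algorithm in GF-2 wetland classification and to provide a reference for the application of GF-2 data in wetland classification.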

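2 Study Area and Data Source

2.1 Study Area

The study area lies in northwestern Jilin Province, in Tongyu County, Baicheng City (Fig. 1). It extends over 44°58′36″~45°01′49″ N and 123°22′26″~123°27′10″ E, covers about 36.977 km², has a mean elevation of 160 m, and is flat. The area has a north temperate continental monsoon climate, with a mean annual temperature of 6.6 ℃ and a mean annual precipitation of 3324 mm. The Chagan Lake wetland lies to the northeast of the study area and the Momoge wetland to the north. The area consists mainly of saline-alkali wetland, with few buildings and relatively complex land cover. Taking full account of the characteristics and distribution of the wetland types in the study area, and exploiting the ability of the high-dimensional feature information in high-resolution imagery to represent wetland vegetation, water bodies, and their combinations, this study builds its wetland remote sensing classification system on four principles of wetland remote sensing classification: hierarchy, operability, extensibility, and feasibility. The system comprises nine classes: saline-alkali pond, saline-alkali bare land, weed salt marsh, cattail marsh, brushwood marsh, dry farmland, woodland, buildings, and roads (Tab. 1).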

Fig. 1 The geographical location map of the study area

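Tab. 1 Classification system of wetland

Category	Type	Description
Natural wetland	Saline-alkali pond	Lakes and ponds with high salt content in salinized areas
Natural wetland	Weed salt marsh	Around saline-alkali ponds and on saline-alkali land, covered by salt-tolerant vegetation
Natural wetland	Cattail marsh	Vegetated marsh near ponds, dominated by cattail, dark green in appearance
Natural wetland	Brushwood marsh	Between cattail marsh and cultivated land, covered by clumps of grass
Non-wetland	Saline-alkali bare land	Land with salts accumulated at the surface and almost no vegetation
Non-wetland	Dry farmland	Cultivated land without irrigation facilities, relying on natural water sources
Non-wetland	Woodland	Planted and natural woodland, including that around dry farmland
Non-wetland	Buildings	Buildings covering at least 2×2 pixels
Non-wetland	Roads	Mainly cement roads built between villages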

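Wetland classification in the study area faces three difficulties: ① because of human activities, much land near ponds has been converted to agricultural use, which complicates wetland classification; ② the wetlands show both "same object, different spectra" and "different objects, same spectrum" phenomena, which seriously reduce classification accuracy; ③ weed salt marsh and brushwood marsh have similar texture. These difficulties give conventional methods low classification accuracy. This paper therefore uses an object-oriented random forest classification algorithm that jointly exploits the spectral, textural, geometric, and spatial information of image objects to improve the accuracy and stability of wetland classification.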

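2.2 Data Source

GF-2 is a civilian remote sensing satellite with a ground resolution better than 1 m, launched from the Taiyuan Satellite Launch Center, China, on 19 August 2014. This study uses GF-2 image data acquired on 29 July 2016^[29], comprising multispectral data at 3.2 m resolution (four bands: blue, green, red, and near-infrared) and panchromatic data at 0.8 m resolution. The image measures 7813 × 7395 pixels, and the coordinate reference system is WGS 84 / UTM zone 51N. The multispectral and panchromatic data were first orthorectified and co-registered and then fused to produce multispectral data at 0.8 m resolution, which were clipped to obtain the final remote sensing data of the study area.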
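As a quick consistency check, the stated image size and the 0.8 m pixel size of the fused data reproduce the area given in Section 2.1. The sketch below is only illustrative arithmetic; the variable names are not part of the original workflow.

```python
# Image footprint implied by the pixel grid and the fused resolution.
columns, rows = 7813, 7395   # image size in pixels, as stated in the text
pixel_size_m = 0.8           # resolution of the fused multispectral data

area_km2 = columns * rows * pixel_size_m ** 2 / 1e6
print(f"{area_km2:.3f} km^2")  # 36.977 km^2, the area stated in Section 2.1
```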

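3 Methods

After data preprocessing, this paper first improves the global optimal segmentation algorithm. Building on the improvement of Yin Ruijuan et al.^[18], it adds NDVI and NDWI index layers, which serve as segmentation layers together with the principal component layers, calculates the segmentation quality index at multiple scales, and obtains the optimal segmentation scale. It then measures feature importance with the out-of-bag (OOB) misclassification rate of the random forest algorithm and optimizes the feature space according to feature importance. Finally, the optimal result is obtained and its accuracy assessed, and it is compared with other machine learning classification algorithms. The technical workflow is shown in Fig. 2.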

Fig. 2 The technology roadmap of the study

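3.1 Calculating the GF-2 Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI)

Preprocessing of the GF-2 image yielded four bands at 0.8 m resolution: blue (450~520 nm), green (530~590 nm), red (630~690 nm), and near-infrared (770~890 nm). NDVI enhances the contrast between near-infrared and red reflectance through a nonlinear stretch; it reflects the background influence on the plant canopy and strengthens the response to vegetation. NDWI is widely used to detect water bodies: water reflects relatively strongly in the green band and absorbs strongly in the near-infrared band, whereas vegetation and soil reflect strongly in the near-infrared band, so NDWI readily separates water bodies from vegetated areas in multispectral imagery.

Wetlands have abundant vegetation cover and numerous water bodies. After preprocessing the GF-2 image, eCognition Developer was used to calculate NDVI and NDWI from the multispectral data and to create NDVI and NDWI feature layers. Including these layers in the multi-scale segmentation prevents large water bodies and vegetated areas from being segmented into overly fragmented objects and improves the segmentation result.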
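The text does not spell out the index formulas. A minimal sketch, assuming the commonly used definitions NDVI = (NIR − R) / (NIR + R) and NDWI = (G − NIR) / (G + NIR); the reflectance values are made up for illustration, and the paper itself computed the layers in eCognition Developer rather than in Python.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 where the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def ndvi(nir, red):
    """Vegetation index: high where NIR reflectance exceeds red."""
    return normalized_difference(nir, red)

def ndwi(green, nir):
    """Water index: high where green reflectance exceeds NIR."""
    return normalized_difference(green, nir)

# Hypothetical per-pixel reflectances (not taken from the GF-2 scene).
vegetation = {"green": 0.08, "red": 0.05, "nir": 0.45}
water = {"green": 0.06, "red": 0.04, "nir": 0.02}

for name, px in (("vegetation", vegetation), ("water", water)):
    print(name, round(ndvi(px["nir"], px["red"]), 3),
          round(ndwi(px["green"], px["nir"]), 3))
```

With these values the vegetation pixel has a high NDVI and a negative NDWI, while the water pixel has a positive NDWI, which is the separation the paragraph above describes.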

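3.2 Principal Component Analysis (PCA)

A principal component transformation compresses the information in a multi-band image into a few transformed bands that are more efficient than the original ones^[30]. This reduces the amount of data for subsequent processing and analysis, and the eigenvalues obtained from the transformation provide an objective basis for setting weights in image segmentation^[18]. This study used ENVI to apply the principal component transformation to the image of the study area. As Tab. 2 shows, the first two principal components account for more than 99% of the image information, so the first two principal component images were output.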

Tab. 2 The statistical properties of principal component transformation in the study area

Principal component (PC)	Eigenvalue	Cumulative eigenvalue percentage/%	Eigenvalue percentage/%
1	40101.9563	91.27	91.27
2	3572.0327	99.40	8.13
3	222.2999	99.91	0.51
4	40.8220	100.00	0.09
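The percentages in Tab. 2 follow directly from the eigenvalues. The sketch below recomputes them and also shows one reading, an assumption rather than something the paper states, of how the segmentation weights 0.92 and 0.08 used in Section 3.3 could arise: renormalizing the eigenvalues over the two retained components.

```python
# Eigenvalues of the four principal components, from Tab. 2.
eigenvalues = [40101.9563, 3572.0327, 222.2999, 40.8220]

total = sum(eigenvalues)
percent = [100 * ev / total for ev in eigenvalues]

# Running sum of the percentages.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Assumed reading: weights are shares within the two retained components.
retained = eigenvalues[:2]
weights = [round(ev / sum(retained), 2) for ev in retained]

print([round(p, 2) for p in percent])      # eigenvalue percentages
print([round(c, 2) for c in cumulative])   # cumulative percentages
print(weights)                             # candidate PC1, PC2 weights
```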

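3.3 Improved Global Optimal Segmentation Algorithm

This paper improves the global optimal segmentation algorithm as follows. First, the eigenvalue percentages from the principal component transformation are used as the segmentation weights of the principal component images, namely 0.92 and 0.08. Second, NDVI and NDWI index layers are created, each with a segmentation weight of 1. Third, after the image is segmented, the within-object homogeneity V and the heterogeneity between neighboring objects MI are calculated. Finally, these two indicators are combined into a single segmentation quality function GS^[31]. The segmentation scale starts at 20 and increases in steps of 10 up to 200; GS is calculated at each scale, giving 19 GS values. Cubic spline interpolation then fits a smooth curve relating segmentation quality to segmentation scale, and the scale at which segmentation quality peaks is the optimal segmentation scale. The calculation proceeds as follows: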

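(1) Within-object homogeneity (Eq. (1))

$$V = \frac{\sum_{i=1}^{n} a_i v_i}{\sum_{i=1}^{n} a_i} \tag{1}$$

where $a_i$ is the area of segmented object $i$, expressed as its number of pixels; $v_i$ is the standard deviation of segmented object $i$; and $n$ is the total number of segmented objects in the whole image at the given scale. The smaller $V$ is, the lower the heterogeneity within objects, that is, the more homogeneous the objects are.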
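Eq. (1) is an area-weighted mean of the per-object standard deviations. A minimal sketch with made-up object values; the function name is hypothetical.

```python
def weighted_homogeneity(areas, stds):
    """Area-weighted mean of per-object standard deviations (Eq. (1))."""
    if len(areas) != len(stds):
        raise ValueError("areas and stds must have the same length")
    return sum(a * v for a, v in zip(areas, stds)) / sum(areas)

# Two hypothetical objects: 100 pixels with std 2.0, 300 pixels with std 4.0.
v = weighted_homogeneity([100, 300], [2.0, 4.0])
print(v)  # (100*2.0 + 300*4.0) / 400 = 3.5
```

Because the weights are pixel counts, large objects dominate V, so a few small noisy objects do not mask an otherwise homogeneous segmentation.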

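(2) Between-object heterogeneity (Eq. (2))

$$MI = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} \omega_{ij} (y_i - y_0)(y_j - y_0)}{\left[\sum_{i=1}^{n} (y_i - y_0)^2\right] \left(\sum_{i \neq j} \sum \omega_{ij}\right)} \tag{2}$$

where $n$ is the total number of segmented objects; $\omega_{ij}$ is the spatial weight, with $\omega_{ij} = 1$ if objects $i$ and $j$ are adjacent and $\omega_{ij} = 0$ otherwise; $y_i$ is the mean feature value of object $i$; and $y_0$ is the mean feature value of the whole image. The smaller $MI$ is, the lower the spatial correlation between objects, that is, the better the objects are separated.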
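Eq. (2) is Moran's I computed over the adjacency graph of the objects. A minimal sketch under two assumptions that are not from the paper: adjacency is given as a 0/1 matrix, and when no whole-image mean is supplied, $y_0$ is approximated by the unweighted mean of the object means.

```python
def morans_i(values, adjacency, global_mean=None):
    """Moran's I of object means over a binary adjacency matrix (Eq. (2))."""
    n = len(values)
    y0 = sum(values) / n if global_mean is None else global_mean
    dev = [y - y0 for y in values]
    cross = 0.0    # sum of w_ij * (y_i - y0) * (y_j - y0)
    weight_sum = 0  # sum of w_ij over i != j
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i][j]:
                cross += dev[i] * dev[j]
                weight_sum += 1
    return n * cross / (sum(d * d for d in dev) * weight_sum)

# Four hypothetical objects in a chain: 0-1, 1-2, 2-3 are neighbors.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

smooth = morans_i([1, 2, 3, 4], chain)       # neighbors are similar
alternating = morans_i([1, 4, 1, 4], chain)  # neighbors differ strongly
print(smooth, alternating)
```

The alternating pattern gives a strongly negative value, which is the well-separated case the text describes, while the smooth gradient gives a positive value.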

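(3) Normalization

To combine within-object homogeneity and between-object heterogeneity, the V and MI values obtained at the different scales must first be normalized to the range 0 to 1, as in Eq. (3).

$$\frac{X_{\max} - X}{X_{\max} - X_{\min}} \tag{3}$$

where $X$ is the value of V or MI at a given scale, and $X_{\max}$ and $X_{\min}$ are its maximum and minimum over all scales. After normalization with Eq. (3), the closer V and MI are to 1, the better the homogeneity within objects and the separability between objects.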

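(4) Segmentation quality (Eq. (4))

$$GS = \sum_{i=1}^{4} \omega_i (V_i + MI_i) \tag{4}$$

where the sum runs over the four layers used in the segmentation; $\omega_i$ is the weight of layer $i$; $V_i$ is the normalized within-object homogeneity at the given scale; and $MI_i$ is the normalized between-object heterogeneity at the given scale.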
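Eqs. (3) and (4) can be sketched together: normalize V and MI per layer across scales, then take the weighted sum at each scale. The layer weights follow the values given in the text (0.92, 0.08, 1, 1); the V and MI numbers are made up, and the function names are hypothetical.

```python
def normalize(values):
    """Eq. (3): map values to [0, 1] so that the smallest raw value scores 1."""
    hi, lo = max(values), min(values)
    return [(hi - x) / (hi - lo) for x in values]  # assumes hi != lo

def segmentation_quality(v_by_layer, mi_by_layer, weights):
    """Eq. (4): weighted sum of normalized V and MI, one GS value per scale.

    v_by_layer[k][s] and mi_by_layer[k][s] hold the raw V and MI of layer k
    at scale index s.
    """
    v_norm = [normalize(v) for v in v_by_layer]
    mi_norm = [normalize(m) for m in mi_by_layer]
    n_scales = len(v_by_layer[0])
    return [
        sum(w * (v[s] + m[s]) for w, v, m in zip(weights, v_norm, mi_norm))
        for s in range(n_scales)
    ]

# Four layers (PC1, PC2, NDVI, NDWI) at three hypothetical scales.
weights = [0.92, 0.08, 1.0, 1.0]
v_layers = [[1, 2, 4]] * 4    # V grows with scale (objects get less uniform)
mi_layers = [[5, 2, 1]] * 4   # MI shrinks with scale (objects separate better)

gs = segmentation_quality(v_layers, mi_layers, weights)
best_index = max(range(len(gs)), key=gs.__getitem__)
print(gs, best_index)
```

In this toy case the middle scale wins because it balances the two opposing trends, which is the trade-off GS is meant to capture.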

(5) Cubic spline interpolation

Cubic spline interpolation is smooth and stable; this paper implements it in MATLAB.
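The interpolation step can be sketched without MATLAB. The code below fits a natural cubic spline through 19 (scale, GS) pairs and searches a fine grid for the maximum. Two assumptions should be kept in mind: the GS values are synthetic, chosen only so that the peak falls near the reported optimum, and the natural end condition used here differs from the not-a-knot condition that MATLAB's `spline` applies by default, so results near the ends of the range may differ slightly.

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the quadratic coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal solve.
    diag, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution; c[0] = c[n] = 0 is the natural end condition.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        # Locate the segment containing x, clamped to the valid range.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        t = x - xs[j]
        return ys[j] + t * (b[j] + t * (c[j] + t * d[j]))

    return evaluate

# Scales 20, 30, ..., 200 as in the text; GS values are synthetic.
scales = list(range(20, 201, 10))
gs_values = [1.0 - ((s - 78.5) / 180.0) ** 2 for s in scales]

spline = natural_cubic_spline(scales, gs_values)
grid = [20 + k / 10 for k in range(1801)]   # 0.1-unit search grid
best_scale = max(grid, key=spline)
print(len(scales), best_scale)
```

A grid search is used instead of solving for the stationary point analytically because it is simpler and precise enough for a scale parameter reported to one decimal place.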

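3.4 Random Forest Algorithm

RF is an ensemble learning classifier. Using bootstrap resampling, it repeatedly draws samples at random with replacement from the original training set N to generate k new training sets; m of the M features are then selected at random (m << M), and each decision tree is grown on its resampled data by fully splitting the nodes. The k classification trees grown from the bootstrap sets form the random forest, and a new sample is assigned a class according to the votes of the trees. About one third of the samples are never drawn into the new training set used to grow each tree; these are called out-of-bag (OOB) samples. OOB samples can be used to evaluate model performance, and the OOB estimate has been shown to be unbiased^[32].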
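The "about one third" figure follows from sampling with replacement: the chance that a given sample is never drawn in n draws is (1 − 1/n)^n, which approaches e^(−1) ≈ 0.368 for large n. A minimal sketch of one bootstrap draw, with a hypothetical function name and an arbitrary sample count:

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in-bag, OOB) indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob

rng = random.Random(0)          # fixed seed so the run is repeatable
n_samples = 1000                # arbitrary size for illustration
in_bag, oob = bootstrap_split(n_samples, rng)

oob_fraction = len(oob) / n_samples
expected = (1 - 1 / n_samples) ** n_samples
print(round(expected, 3))       # theoretical OOB share for this n
print(oob_fraction)             # share observed in this draw
```

Each tree gets its own in-bag and OOB split, so every training sample serves as an OOB test case for roughly a third of the trees.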

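3.5 Building and Optimizing the Feature Space

Wetland remote sensing imagery shows many cases of "same object, different spectra", "different objects, same spectrum", and "different objects, same texture", which reflects how much the choice of features matters for wetland classification results. Optimizing the feature space can improve wetland classification accuracy.

(1) Feature extraction

To address the difficulties of wetland classification, 51 spectral, textural, geometric, and other features of the image objects were extracted after optimal segmentation to form the initial feature space (Tab. 3).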

Tab. 3 The initial feature space

Feature type	Feature names
Spectral features	Mean B, Mean G, Mean R, Mean NIR, Standard deviation B, Standard deviation G, Standard deviation R, Standard deviation NIR, Brightness, AVE (mean of the B, R, and G bands), Ratio to scene R, Ratio to scene G, Ratio to scene B, Ratio to scene NIR
Geometric features	Area, Length/Width, Width, Asymmetry, Border index, Compactness, Density, Rectangular Fit, Shape index, Number of edges (polygon), Stddev of length of edges (polygon)
Texture features	GLCM Homogeneity PC2 (all dir.), GLCM Contrast PC2 (all dir.), GLCM Dissimilarity PC2 (all dir.), GLCM Entropy PC2 (all dir.), GLCM Ang. 2nd moment PC2 (all dir.), GLCM Mean PC2 (all dir.), GLCM StdDev PC2 (all dir.), GLDV Entropy PC2 (all dir.), GLDV Ang. 2nd moment PC2 (all dir.), GLDV Mean PC2 (all dir.), GLDV Contrast PC2 (all dir.)
Custom features	BRITHTEN DIFFER (brightness difference between neighboring objects), SR, SRWC, percent(B), percent(G), percent(R), Max.diff, Mean NDVI, Mean NDWI, Mean PC1, Mean PC2, Standard deviation NDVI, Standard deviation NDWI, Standard deviation PC1, Standard deviation PC2

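(2) Feature space optimization

Random forest measures the importance of each feature from the OOB misclassification rate, and the feature space is then optimized according to feature importance. The calculation is as follows: ① for each decision tree, compute the error on its OOB data, denoted err1; ② randomly add noise to feature F in all OOB samples and compute the OOB error again, denoted err2; ③ with N trees in the forest, estimate the importance of feature F with Eq. (5).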

$$\tau_f = \frac{\sum (err_2 - err_1)}{N} \tag{5}$$
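Eq. (5) can be sketched as follows. The text says noise is added to feature F; the sketch assumes the common choice of randomly permuting that feature's values among the OOB samples. The trees are stand-in prediction functions rather than trained decision trees, and all names are hypothetical.

```python
import random

def oob_error(predict, samples, labels):
    """Share of OOB samples that the tree misclassifies."""
    wrong = sum(1 for x, y in zip(samples, labels) if predict(x) != y)
    return wrong / len(samples)

def permute_feature(samples, feature, rng):
    """Copy the samples with one feature column shuffled (assumed noise model)."""
    column = [x[feature] for x in samples]
    rng.shuffle(column)
    return [x[:feature] + [v] + x[feature + 1:] for x, v in zip(samples, column)]

def feature_importance(trees, feature, rng):
    """Eq. (5): mean increase in OOB error over all trees.

    trees is a list of (predict, oob_samples, oob_labels) tuples.
    """
    total = 0.0
    for predict, samples, labels in trees:
        err1 = oob_error(predict, samples, labels)
        err2 = oob_error(predict, permute_feature(samples, feature, rng), labels)
        total += err2 - err1
    return total / len(trees)

# Stand-in "tree" that looks only at feature 0 and ignores feature 1.
predict = lambda x: int(x[0] > 0.5)
oob_samples = [[0.1, 7.0], [0.2, 3.0], [0.3, 9.0],
               [0.7, 1.0], [0.8, 4.0], [0.9, 6.0]]
oob_labels = [0, 0, 0, 1, 1, 1]
trees = [(predict, oob_samples, oob_labels)] * 3

rng = random.Random(1)
importance_used = feature_importance(trees, 0, rng)
importance_ignored = feature_importance(trees, 1, rng)
print(importance_used, importance_ignored)
```

Shuffling a feature the tree never consults cannot change its predictions, so that feature's importance is exactly zero, while shuffling the feature it relies on can only keep or raise the error.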

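The features are ranked by importance and then selected in steps of 5: the top 5 features, the top 10, the top 15, and so on until all features are included. Each feature set is used for random forest classification and accuracy assessment, and the set that gives the highest classification accuracy is taken as the final optimized feature space.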
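The selection loop can be sketched as below. The `evaluate` callback stands in for "train a random forest on this subset and return its overall accuracy"; here it is replaced by a toy scoring function that peaks at 40 features, purely so the example runs on its own. All names are hypothetical.

```python
def candidate_sizes(n_features, step=5):
    """Subset sizes step, 2*step, ... and finally all features."""
    sizes = list(range(step, n_features, step))
    sizes.append(n_features)
    return sizes

def select_feature_space(ranked_features, evaluate, step=5):
    """Return the top-k prefix of the ranking with the best score."""
    best_subset, best_score = None, float("-inf")
    for size in candidate_sizes(len(ranked_features), step):
        subset = ranked_features[:size]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score

# 51 placeholder feature names, already sorted by importance.
ranked = [f"feature_{i:02d}" for i in range(1, 52)]

# Toy stand-in for classification accuracy (not the paper's results).
toy_accuracy = lambda subset: -abs(len(subset) - 40)

best, score = select_feature_space(ranked, toy_accuracy)
print(candidate_sizes(51))
print(len(best))
```

Because only prefixes of one fixed ranking are tried, the search costs 11 classifier runs for 51 features, at the price of never testing subsets that skip a highly ranked feature.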

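4 Experiments and Results

According to the wetland classification system established in this paper, inspection of the image of the study area shows that, apart from buildings and dry farmland, which have fairly regular shapes, the main wetland types are irregular in shape but spectrally distinct. For segmentation, the shape weight was therefore set to 0.4 and the compactness weight to 0.5, leaving a total spectral weight of 0.6 (with layer weights of 1 each for NDVI and NDWI and 0.92 and 0.08 for PC1 and PC2). Under these settings, the method of Section 3.3 was applied at different segmentation scales in 19 segmentation experiments on the GF-2 image, and interpolation gave an optimal segmentation scale of 78.5 (Fig. 3). Through field survey and visual interpretation of the image, 382 sample points were selected, and on the optimally segmented image they were used to create 365 object-level training samples, 1,625,563 pixels in total. Features were then extracted from the GF-2 image and optimized with the methods of Section 3: 51 features were extracted initially, and on the basis of feature importance (Fig. 4) and experiments (Fig. 5), the top 40 features were selected as the feature space for random forest classification. The resulting overall accuracy was 93.038% with a Kappa coefficient of 0.9177, against an overall accuracy of 90.937% before the feature space was optimized, an improvement of 2.101 percentage points. Using the same training set, the SVM, CART, and KNN classification algorithms were also applied to the image of the study area; the results show that the random forest algorithm achieved the highest accuracy in wetland classification (Tab. 4). The classification results are shown in Fig. 6.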

Fig. 3 The relationship between segmentation quality and scale

Fig. 4 The importance of features

Fig. 5 The relationship between the overall accuracy and the number of features

Tab. 4 Comparison of the overall accuracies of the four classifiers

Classifier	RF	KNN	SVM	CART
Overall accuracy/%	93.038	83.357	78.068	77.136
Kappa coefficient	0.9177	0.8029	0.7406	0.7290

Fig. 6 The classification results of the four models

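The overall accuracy of random forest is about 10 to 16 percentage points higher than that of the other three classifiers, which shows that the RF decision strategy based on ensemble voting outperforms the KNN, SVM, and CART classifiers. The classification performance of the random forest algorithm was analyzed with a confusion matrix; the per-class accuracies are listed in Tab. 5.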

Tab. 5 The User's and Producer's accuracy of the RF classification (%)

Class	Commission error	Omission error	Producer's accuracy	User's accuracy
Saline-alkali pond	0	0	100	100
Cattail marsh	6.02	26.6	73.4	93.98
Weed salt marsh	13.2	12.79	87.21	86.8
Brushwood marsh	24.19	3.3	96.7	75.81
Saline-alkali bare land	0	2.01	97.99	100
Woodland	0	0	100	100
Dry farmland	2.63	1.19	98.81	97.37
Buildings	0	0	100	100
Roads	8.11	0	100	91.87
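The quantities in Tab. 4 and Tab. 5 all derive from a confusion matrix. The sketch below shows the standard definitions on a made-up two-class matrix, not the paper's data; rows are taken to be reference classes and columns predicted classes, which is an assumption about orientation. Commission error is 100% minus user's accuracy, and omission error is 100% minus producer's accuracy.

```python
def accuracy_report(matrix):
    """Overall accuracy, Kappa, and per-class accuracies from a confusion matrix.

    matrix[i][j] is the number of samples of reference class i assigned to
    class j (assumed orientation).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    producers = [d / r for d, r in zip(diagonal, row_totals)]
    users = [d / c for d, c in zip(diagonal, col_totals)]
    return {
        "overall": overall,
        "kappa": kappa,
        "producers": producers,                    # 1 - omission error
        "users": users,                            # 1 - commission error
        "omission": [1 - p for p in producers],
        "commission": [1 - u for u in users],
    }

# Hypothetical two-class example with 100 samples.
report = accuracy_report([[40, 10],
                          [5, 45]])
print(report["overall"], report["kappa"])
print(report["producers"], report["users"])
```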

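As Tab. 5 shows, saline-alkali ponds, saline-alkali bare land, woodland, dry farmland, and buildings were classified with accuracies above 95%, cattail marsh and weed salt marsh with accuracies between 85% and 95%, and brushwood marsh with relatively low accuracy. Although the overall accuracy of the experiment is satisfactory, weed salt marsh and brushwood marsh still show large commission errors, and cattail marsh and weed salt marsh show large omission errors. Weed salt marsh and brushwood marsh are heavily confused because their geographic distributions are interleaved (Fig. 6) and both are mixtures of white bare land and dark green grass. They differ in that brushwood marsh has larger grass patches with clear boundaries against the bare land, whereas weed salt marsh shows an overall fish-scale texture, with small grass patches whose boundaries with the bare land are indistinct (Fig. 7). Overall, the two classes are very similar in spectral and geometric features. Cattail marsh, in turn, is distributed near saline-alkali ponds and adjacent to brushwood marsh, and the two have similar spectral features, so some omission and commission errors occur along their boundary.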

Fig. 7 Brushwood marsh and weed salt marshes

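5 Conclusions

Using GF-2 remote sensing data, this study demonstrated the application of the random forest algorithm to wetland classification. To address several difficulties in wetland classification, the paper first improved the global optimal segmentation method to segment the image optimally, then built a more effective feature space according to feature importance, comprising spectral, geometric, and textural features extracted from the high-resolution image together with custom features such as band percentages, a vegetation index, and a water index, and obtained a high-accuracy classification with the random forest algorithm. The overall accuracy of random forest classification was found not to increase monotonically with the number of features. In this experiment, fewer than 40 features did not provide enough information to distinguish the land cover classes, and accuracy was low; accuracy rose as features were added and peaked at 40 features; beyond 40 features, noise accumulated to the point of interfering with the classifier's decisions, and accuracy fell. These 40 features were selected by importance, and geometric features make up only a small share of them because most land cover objects in the wetland are irregular in shape. Finally, the overall accuracy of random forest classification reached 93.038%, about 10, 15, and 16 percentage points higher than that of the KNN, SVM, and CART classifiers, respectively, which shows that the improved random forest method of this paper performed best among the compared classifiers for wetland classification from GF-2 imagery.

This study also found that when 3 of the 51 features were removed and the remaining 48 were used as the initial feature space for optimization, the resulting optimal feature space contained 35 features rather than 40, with a different composition, yet the overall accuracy was higher. This indicates that optimizing the feature space purely on the basis of feature importance is not the best approach. The next research goal is therefore to establish a broadly applicable method for building the optimal feature space of the random forest algorithm, so as to further improve classification accuracy and make the algorithm more reliable in wetland classification and management.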

The authors have declared that no competing interests exist.

References

[1]	Tong L, Xu X, Fu Y, et al. Wetland changes and their responses to climate change in the "Three-River Headwaters" region of China since the 1990s[J]. Energies, 2014,7(4):2515-2534.

[2]	Thakur J, Srivastava P, Singh S, et al. Ecological monitoring of wetlands in semiarid region of Konya closed Basin Turkey[J]. Regional Environmental Change, 2012,12(1):133-144.

[3]	Davranche, Lefebvre, Poulin. Wetland monitoring using classification trees and SPOT-5 seasonal time series[J]. Remote Sensing of Environment, 2010,114(3):552-562.

[4]	Hong S, Jang H, Kim N, Sohn H. Water area extraction using RADARSAT SAR imagery combined with Landsat imagery and terrain information[J]. Sensors, 2015,15(3):6652-6667.

[5]	Chen, He, Wang. Classification of coastal wetlands in eastern China using polarimetric SAR data[J]. Arabian Journal of Geosciences, 2015,8(12):10203-10211.

[6]	Han X, Chen X, Feng L. Four decades of winter wetland changes in Poyang Lake based on Landsat observations between 1973 and 2013[J]. Remote Sensing of Environment, 2015,156:426-437.

[7]	Zhang S, Zhang C, Zhang L, et al. Wetland remote sensing classification using support vector machine optimized with genetic algorithm: A case study in Honghe Nature National Reserve[J]. Scientia Geographica Sinica, 2012,32(4):435-441.

[8]	Liu L, Zang S, Na X, et al. Wetland mapping in Zhalong Natural Reserve using optical and radar remotely sensed data and ancillary topographical data[J]. Geography and Geo-information Science, 2013,29(1):36-40.

[9]	Laba M, Blair B, Downs R, et al. Use of textural measurements to map invasive wetland plants in the Hudson River National Estuarine Research Reserve with IKONOS satellite imagery[J]. Remote Sensing of Environment, 2010,114(4):876-886.

[10]	Dronova I, Gong P, Wang L. Object-based analysis and change detection of major wetland cover types and their classification uncertainty during the low water period at Poyang Lake, China[J]. Remote Sensing of Environment, 2011,115(12):3220-3236.

[11]	Blaschke, Hay G, Kelly, et al. Geographic object-based image analysis-towards a new paradigm[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2014,87(100):180-191.

[12]	Dronova I, Gong P, Wang L, et al. Mapping dynamic cover types in a large seasonally flooded wetland using extended principal component analysis and object-based classification[J]. Remote Sensing of Environment, 2015,158:193-206.

[13]	Blaschke. Object-based image analysis for remote sensing[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2010,65(1):2-16.

[14]	Qiao T, Zhang H Q, Chen Y F, et al. Extraction of vegetation information based on NDVI segmentation and object oriented method[J]. Journal of Northwest Forestry College, 2013,28(4):170-175. (in Chinese)

[15]	Gonzalez-Audicana M, Saleta J L, Catalan R G, et al. Fusion of multispectral and panchromatic images using improved IHS and PCA mergers based on wavelet decomposition[J]. IEEE Transactions on Geoscience and Remote Sensing, 2004,42(6):1291-1299.

[16]	He M, Zhang W J, Wang W H. Optimal segmentation scale model based on object-oriented analysis method[J]. Geodetic measurement and geodynamics, 2009,29(1):106-109. (in Chinese)

[17]	Liu Z Y, Li X H, Shen R P, et al. Selection of the best segmentation scale in high-resolution image segmentation[J]. Computer Engineering and Applications, 2014,50(6):144-147. (in Chinese)

[18]	Yin R J, Shi R H, Li J Y. Automatic selection of optimal segmentation scale of high-resolution remote sensing images[J]. Journal of Geo-information Science, 2013,15(6):902-910. (in Chinese)

[19]	Liu Y, Du P J, Zheng H, et al. Classification of China small satellite remote sensing image based on random forests[J]. Science of Surveying and Mapping, 2012,37(4):194-196. (in Chinese)

[20]	Verikas, Gelzinis, Bacauskiene. Mining data with random forests: A survey and results of new tests[J]. Pattern Recognition, 2011,44(2):330-349.

[21]	Belgiu, Dragut. Random forest in remote sensing: A review of applications and future directions[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2016,114:24-31.

[22]	Yan T T, Bian H F, Liao G X, et al. Research status of methods for mapping forested wetlands based on remote sensing[J]. Remote Sensing for Land & Resources, 2014,26(2):11-18. (in Chinese)

[23]	Wang S Y, Zhang Y W, Yu Z H. Classification of Honghe wetland remote sensing image based on random forests[J]. Geomatics & Spatial Information Technology, 2014,37(4):83-85,93. (in Chinese)

[24]	Dronova I. Object-based image analysis in wetland research: A review[J]. Remote Sensing, 2015,7(5):6380-6413.

[25]	Liu S, Jiang Q G, Ma Y, et al. Object-oriented wetland classification based on hybrid feature selection method combining with relief multi-objective genetic algorithm and random forest[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017,48(1):119-127. (in Chinese)

[26]	Liu J F, Li L F, Ren C Y, et al. Information extraction of coastal wetlands in Yellow River Estuary by optimal feature-based random forest model[J]. Wetland Science, 2018,16(2):97-105. (in Chinese)

[27]	Li F F, Liu Z J, Xu Q Q, et al. Application of object-oriented random forest method in wetland vegetation classification[J]. Remote Sensing Information, 2018,33(1):111-116. (in Chinese)

[28]	Masoud M, Bahram S, Fariba M, et al. An assessment of simulated compact polarimetric SAR data for wetland classification using random forest algorithm[J]. Canadian Journal of Remote Sensing, 2017,43(5):468-484.

[29]	High Resolution Earth Observation System Jilin Data and Application Center. GF-2 image data collected on July 29, 2016[DB/DK]. [2016-7-29]. (in Chinese)

[30]	Saucier A, Muller J. Using principal component analysis to enhance the generalized multifractal analysis approach to textural segmentation: Theory and application to micro-resistivity well logs[J]. Physica A: Statistical Mechanics and its Applications, 2002,309(3-4):419-444.

[31]	Chabrier, Emile, Rosenberger, et al. Unsupervised performance evaluation of image segmentation[J]. EURASIP Journal on Applied Signal Processing, 2006(1):096306.

[32]	Breiman L. Random forests[J]. Machine Learning, 2001,45(1):5-32.
