Journal of Geo-information Science
A Method for Detecting Outliers of Soil Heavy Metal Data Based on Spatial Autocorrelation and Probability Theory
Received date: 2016-09-23
Request revised date: 2016-11-01
Online published: 2017-05-20
Data are the basis of research on soil environmental quality. However, systematic and human errors introduced during the experimental process may produce outliers, which reduce data quality and can lead to erroneous judgements in pollution assessment and management decisions. At present, this problem has not been studied thoroughly. This study therefore proposes a method for detecting outliers in soil heavy metal data. The soil Cd concentration in Beijing, China, was taken as an example to verify the validity of the method. The method detected 34 outliers in the Cd concentration data for Beijing, and these samples were re-analysed. The re-analysis showed that 76.5% of the outliers were caused by systematic and human errors, while 20.6% were objectively present in the soil. After correction, interpolation accuracy improved significantly: the mean relative error and mean square error at the outliers were reduced by 44.56% and 33.33%, respectively, and those at the nearest neighboring points influenced by the outliers were reduced by 20.59% and 17.33%, respectively. These results indicate that the proposed method can effectively detect outliers in soil heavy metal data. At the cost of a limited increase in sample size and analysis time, the method improves the quality of survey data and provides an effective tool for regional-scale soil investigation and data quality assurance.
Key words: soil heavy metal; outlier data; check method; cross-validation; prediction accuracy
WANG Jingyun, YANG Jun, YANG Junxing, LEI Mei, WAN Xiaoming, ZHOU Xiaoyong, CHEN Tongbin, ZHANG Hongri, ZHAO Xiangwei. A Method for Detecting Outliers of Soil Heavy Metal Data Based on Spatial Autocorrelation and Probability Theory[J]. Journal of Geo-information Science, 2017, 19(5): 605-612. DOI: 10.3724/SP.J.1047.2017.00605
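Fig. 1 Distribution of soil samples in Beijing
Tab. 1 Check results of Cd outliers
Administrative unit | Number of samples | Spatial autocorrelation method | Range method (mathematical statistics) | Range method (background value) | Combined spatial autocorrelation and range method |
---|---|---|---|---|---|
Beijing | 727 | 100 | 33 | 70 | 34 |
Note: The outliers of the combined check are the intersection of the results of the spatial autocorrelation method and the two range methods.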
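The combination rule in the note to Tab. 1 can be sketched with set operations. Because the combined count (34) exceeds the count of the statistical range method alone (33), a strict three-way intersection is not possible; the sketch below therefore assumes the spatial result is intersected with the union of the two range results. That reading is an inference from the table, and the function name and sample IDs are hypothetical, not taken from the study.

```python
def combine_checks(spatial, range_stat, range_background):
    """Return sample IDs flagged by the spatial check AND by at least one range check.

    Assumption: 'intersection' in the table note means
    spatial ∩ (range_stat ∪ range_background).
    """
    return set(spatial) & (set(range_stat) | set(range_background))


# Hypothetical sample IDs, for illustration only (not data from the study).
spatial = {1, 2, 3, 4, 5, 6}
range_stat = {2, 3, 9}
range_background = {3, 5, 7, 8}

combined = combine_checks(spatial, range_stat, range_background)
print(sorted(combined))  # → [2, 3, 5]
```

Samples flagged only by a range check (7, 8, 9) or only by the spatial check (1, 4, 6) are left out, so the combined list is never longer than the spatial list.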
Fig. 2 Distribution of outliers detected by mathematical statistics method
Fig. 3 Distribution of outliers detected by background value method
Fig. 4 Distribution of outliers detected by spatial autocorrelation method
Fig. 5 Distribution of outliers detected by combined method
Fig. 6 Comparison graph of Cd outliers between the original results and re-analysed results
Fig. 7 Diagram of mean relative error and its variation
Fig. 8 Diagram of mean absolute error and its variation
Fig. 9 Diagram of root mean square error and its variation
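The error measures named in Figs. 7 to 9 can be sketched as follows. The formulas are the conventional textbook definitions, which is an assumption, since the paper's own equations are not reproduced on this page. The function names and the numbers are hypothetical.

```python
import math


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed| (observed values must be non-zero)."""
    return sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


def mean_absolute_error(observed, predicted):
    """Mean of |predicted - observed|."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def percent_reduction(before, after):
    """Relative decrease of an error measure after correction, in percent."""
    return (before - after) / before * 100.0


# Hypothetical measured values and cross-validation predictions (not study data).
observed = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]

print(mean_absolute_error(observed, predicted))   # → 0.5
print(mean_relative_error(observed, predicted))   # → 0.25
print(root_mean_square_error(observed, predicted))
```

Calling `percent_reduction` with an error measure computed before and after correcting the outliers gives a figure of the kind quoted in the abstract, for example `percent_reduction(0.5, 0.25)` returns `50.0`.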
The authors have declared that no competing interests exist.