Journal of Geo-information Science ›› 2016, Vol. 18 ›› Issue (7): 902-909. doi: 10.3724/SP.J.1047.2016.00902

• Theories and Methods of Geo-information Science •

Self-adaptive Local Co-location Pattern Mining Algorithm for Continuous Variables

FAN Xieyu, CHEN Hanyue, XING Shihe*

  1. College of Resource and Environmental Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
  • Received: 2015-07-27 Revised: 2015-11-09 Online: 2016-07-15 Published: 2016-07-15
  • Contact: XING Shihe E-mail: xunbei100@aliyun.com; fafuxsh@126.com
  • About the author: FAN Xieyu (born 1985), male, from Yongchun, Fujian; Ph.D., lecturer. Research interests: spatial data mining and web geographic information systems. E-mail: xunbei100@aliyun.com
  • Funding: Science and Technology Program of the Education Department of Fujian Province (JA14102); Young Scientists Fund of the National Natural Science Foundation of China (41401399)

Abstract:

Existing methods for mining local spatial co-location patterns have two main shortcomings: (1) they require the neighborhood range to be preset, and (2) their results carry no statistical significance, which makes the conclusions difficult to assess scientifically. For example, with the commonly used K-nearest-neighbor method it is hard to choose a suitable search radius, while with the fixed-distance method the multi-scale nature of spatial datasets makes the results highly sensitive to the chosen distance threshold. To address these problems, this paper proposes a self-adaptive local spatial co-location pattern mining algorithm for spatial sample-point datasets containing continuous variables. First, an interestingness function and a pattern indicator function for spatial co-location patterns of continuous variables are defined, together with a Voronoi neighborhood; constructing a Voronoi neighborhood matrix removes the need for a preset neighborhood threshold. Then the local Getis-Ord Gi* statistic is applied to the interestingness values to discover local spatial co-location patterns and the regions where they occur, so that the mining results are statistically significant and experts can judge them more rigorously. The algorithm was tested on two real datasets: cropland productivity survey sample points joined with tobacco suitability evaluation results, and water pollution data. The experimental results show that the algorithm requires no preset neighborhood range and can find different spatial co-location patterns within the same region. The local co-location patterns found in the cropland productivity survey data reveal phenomena specific to the study area, which gives the approach practical guidance value for cropland productivity surveys.

Key words: spatial co-location pattern, local, statistical significance, continuous variables
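
The last step described in the abstract, scoring each sample point with the local Getis-Ord Gi* statistic over its neighborhood, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the interestingness function, so the input values below are hypothetical placeholders. The neighbor lists are also an assumption: the helper `grid_neighbors` (a name introduced here) uses four-neighbor adjacency on a regular lattice, where the Voronoi cells are squares that share edges with exactly those neighbors. Binary weights are assumed, and each point counts as its own neighbor, as Gi* requires.

```python
import math


def grid_neighbors(rows, cols):
    """Edge-sharing neighbors on a regular lattice (hypothetical stand-in
    for a Voronoi neighborhood matrix built from irregular sample points)."""
    neighbors = []
    for r in range(rows):
        for c in range(cols):
            adj = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    adj.append(rr * cols + cc)
            neighbors.append(adj)
    return neighbors


def gi_star(values, neighbors):
    """Getis-Ord Gi* z-scores with binary weights.

    values    -- one number per site (here: assumed interestingness values)
    neighbors -- neighbors[i] lists the indices adjacent to site i
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of all values.
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if s == 0:
        raise ValueError("values are constant; Gi* is undefined")
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}  # Gi* includes the site itself
        w_sum = len(members)  # sum of binary weights (= sum of squared weights)
        local_sum = sum(values[j] for j in members)
        spread = (n * w_sum - w_sum ** 2) / (n - 1)
        if spread == 0:
            # The neighborhood covers every site, so there is no local contrast.
            scores.append(0.0)
            continue
        scores.append((local_sum - mean * w_sum) / (s * math.sqrt(spread)))
    return scores


# Toy data: a 5 x 5 lattice with high values in the top-left 2 x 2 block.
values = [1.0 if (r < 2 and c < 2) else 0.0 for r in range(5) for c in range(5)]
z = gi_star(values, grid_neighbors(5, 5))
hot = [i for i, score in enumerate(z) if score > 1.96]  # roughly the 5% level
print("sites with significantly high Gi*:", hot)
```

For irregularly spaced sample points, the neighbor lists would instead come from Voronoi (equivalently Delaunay) adjacency, for example via a computational geometry library; that part is omitted here to keep the sketch dependency-free.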