Journal of Geo-information Science ›› 2011, Vol. 13 ›› Issue (4): 455-464. doi: 10.3724/SP.J.1047.2011.00455

• Geographic Information Systems and Applications •

Period-Table-Based Spatio-temporal Association Rule Mining: Method and Experiment

柴思跃1,2, 苏奋振1, 周成虎1   

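  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101;
  2. Graduate University of Chinese Academy of Sciences, Beijing 100049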
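  • Received: 2010-11-16 Revised: 2011-06-07 Online: 2011-08-25 Published: 2011-08-23
  • About the author: CHAI Siyue (born 1985), male, from Beijing, master's degree; research interest: spatio-temporal data mining. E-mail: chaisy@lreis.ac.cn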

Period Table Based Spatio-temporal Association Rules Mining

CHAI Siyue1,2, SU Fenzhen1, ZHOU Chenghu1   

  1. State Key Lab of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China;
  2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2010-11-16 Revised: 2011-06-07 Online: 2011-08-25 Published: 2011-08-23

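Abstract: The periodicity of geographical phenomena often masks many geoscientific regularities, and uncovering them is a major task of geographic data mining. This paper uses a period table to design a hierarchical spatio-temporal association rule mining method, PRules-Miner. The model organizes spatio-temporal data in the form of a period table and, through a two-step mining process, discovers change patterns between "teleconnected" geographical objects. The algorithm consists of three steps: (1) filtering unordered data in the period table: the frequent items of spatio-temporal status across multiple periods are extracted row by row to generate a new table of spatio-temporal frequent statuses; (2) based on the downward closure lemma, matching the objects in the frequent-status table under spatio-temporal topology to obtain a candidate set of spatio-temporal association rules; (3) verifying the candidate set under spatio-temporal topology to obtain the set of spatio-temporal association rules. To demonstrate the reliability of the algorithm, 20 years of AVHRR Product 016 sea surface temperature data, retrieved from remote sensing and provided by PO.DAAC, and daily precipitation records for the Nanjing area, provided by the National Academy of Meteorological Sciences, are used to study the spatio-temporal association rules between the ocean warm pool and Nanjing precipitation. The experiment shows that the method has the following characteristics: (1) The algorithm is object-oriented and describes the status of each geographical object independently. The resulting rules are therefore independent of spatio-temporal granularity, and the method can mine associations between objects whose spatio-temporal granularities differ. (2) The algorithm uses the Cartesian product to obtain the candidate set matched within the spatio-temporal topology thresholds, and it can find association rules between objects that are adjacent in neither time nor space, that is, associations between geographical phenomena with an uncertain time lag.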

Keywords: data mining, association rules, spatio-temporal data, hierarchical mining, period table

Abstract: The periodicity of geographical phenomena often masks underlying geoscientific regularities, and uncovering them is a major task of geographic data mining. This paper presents PRules-Miner, an algorithm that mines spatio-temporal association rules from a period table. The model reorganizes spatio-temporal data from a sequential dataset into a set of period tables. Spatio-temporal association rules, which describe the teleconnected change patterns of two or more objects, are then mined in three steps: 1) Filtering unordered data in the period table: the frequent spatio-temporal statuses in each row are extracted and stored in a spatio-temporal frequent item set; 2) Matching objects in the item set: based on the downward closure lemma and spatio-temporal topology, objects are matched to create the spatio-temporal association candidate set; 3) Verifying the candidate set under spatio-temporal topology: candidates that satisfy the spatio-temporal support and confidence thresholds are retained as the final spatio-temporal association rules. To validate the algorithm, we use 20 years of AVHRR Product 016 sea surface temperature data, retrieved from remote sensing and provided by PO.DAAC, together with daily precipitation records for Nanjing over the same period, provided by the National Academy of Meteorological Sciences, to mine teleconnection rules between the eastern Indian Ocean and western Pacific Ocean warm pool and Nanjing's precipitation. The results show that the mining model has the following characteristics: 1) The algorithm is object-oriented and describes the status of each geographical object independently. The resulting rules are therefore independent of the spatial and temporal scale. 2) The candidate item set is created by a Cartesian product, so it can represent complicated spatio-temporal topology between objects. The spatio-temporal topology can be set manually in order to find associations between objects that are not adjacent in space or time. Once the topology is set, spatio-temporal association rules are mined and validated from the candidate set. Each final rule combines one object's frequent status with another object's frequent status under the given spatio-temporal topology, so associations between objects with an uncertain time lag can be extracted.

Key words: hierarchical structure, period table, data mining, association rules, spatio-temporal data
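
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' PRules-Miner implementation: all function names, thresholds and the toy data are hypothetical, the spatio-temporal topology is reduced to a single temporal lag threshold (`max_lag`), and step 1 keeps only the single most common status per row as an assumed simplification.

```python
from collections import Counter
from itertools import product


def to_period_table(series, period):
    """Fold a sequential status series into rows, one row per phase of the period."""
    n_periods = len(series) // period  # incomplete trailing periods are dropped
    return [[series[p * period + i] for p in range(n_periods)] for i in range(period)]


def frequent_statuses(table, min_support):
    """Step 1 (assumed form): per row, keep the most common status if frequent enough."""
    frequent = {}
    for phase, row in enumerate(table):
        status, count = Counter(row).most_common(1)[0]
        if count / len(row) >= min_support:
            frequent[phase] = status
    return frequent


def candidate_pairs(freq_a, freq_b, period, max_lag):
    """Step 2 (assumed form): Cartesian product of frequent statuses within the lag threshold."""
    candidates = []
    for (pa, sa), (pb, sb) in product(freq_a.items(), freq_b.items()):
        lag = (pb - pa) % period  # B follows A by `lag` phases
        if lag <= max_lag:
            candidates.append((pa, sa, lag, sb))
    return candidates


def verify(candidates, table_a, table_b, period, min_conf):
    """Step 3 (assumed form): keep candidates whose confidence in the raw tables is high enough.

    Assumes both tables cover the same periods and max_lag < period.
    """
    rules = []
    n_periods = len(table_a[0])
    for pa, sa, lag, sb in candidates:
        pb = (pa + lag) % period
        wrap = 1 if pa + lag >= period else 0  # consequent falls in the next period
        antecedent = both = 0
        for p in range(n_periods - wrap):
            if table_a[pa][p] == sa:
                antecedent += 1
                if table_b[pb][p + wrap] == sb:
                    both += 1
        if antecedent and both / antecedent >= min_conf:
            rules.append((pa, sa, lag, sb, both / antecedent))
    return rules


# Toy data (hypothetical): three periods of four phases each.
PERIOD = 4
sst = ["warm", "warm", "cool", "cool"] * 3
rain = ["dry", "wet", "wet", "dry",
        "dry", "wet", "dry", "dry",
        "dry", "wet", "wet", "dry"]

table_sst = to_period_table(sst, PERIOD)
table_rain = to_period_table(rain, PERIOD)
freq_sst = frequent_statuses(table_sst, min_support=0.6)
freq_rain = frequent_statuses(table_rain, min_support=0.6)
candidates = candidate_pairs(freq_sst, freq_rain, PERIOD, max_lag=1)
rules = verify(candidates, table_sst, table_rain, PERIOD, min_conf=0.9)
```

Each rule is a tuple `(phase, status_a, lag, status_b, confidence)`. Because every object is folded into its own period table before matching, the two series need not share the same raw sampling granularity, as long as both are expressed in phases of the same period.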