Application and Effects of Data Spatial Autocorrelation on Association Rule Mining

  • School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China

Received date: 2010-03-16

Revised date: 2010-10-27

Online published: 2011-02-25


Spatial autocorrelation is a very general statistical property of spatial variables: it describes the correlation of a variable with itself across space. Spatial association rule mining, the discovery of interesting and meaningful rules in spatial databases, currently either ignores the autocorrelation of spatial data or simply generalizes spatial data into attribute data. Most approaches to spatial association rule mining use spatial analysis to convert spatial relations into non-spatial relations, which separates spatial autocorrelation from the mining process. To study the relation between spatial autocorrelation and spatial association rule mining, this paper mines spatial association rules with an improved Apriori algorithm and then performs spatial autocorrelation analysis on the same spatial data set. Many spatial association rule mining methods assume that no a priori information about the spatial attributes is available. In this paper, the results of two-dimensional spatial autocorrelation analysis are used as prior knowledge for spatial association rule mining. The experimental data cover the number of hay fever patients (hay fever being allergic rhinitis caused by pollen) and the related factors, including temperature, precipitation and vegetation type, for each county in the United Kingdom in 2000. The frequent itemsets and spatial association rules obtained show that factors more strongly correlated with hay fever (those with a larger correlation coefficient) co-occur with hay fever more frequently in the spatial database, which confirms that spatial autocorrelation affects spatial association rule mining. The results not only identify the relation between spatial autocorrelation and spatial association rule mining, but also provide prior knowledge for the mining process, making it more targeted.
In addition, spatial autocorrelation analysis obtains the correlation coefficients efficiently, without computing the Cartesian product required by the improved Apriori algorithm, which makes the mining process more efficient. Further work will focus on how to evaluate the effects of spatial autocorrelation on spatial association rule mining and how to derive candidate frequent spatial itemsets from the results of spatial autocorrelation analysis in practical applications.
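The two quantities the abstract relates can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the standard univariate global Moran's I rather than the two-dimensional analysis described in the paper, and the plain support and confidence definitions rather than the improved Apriori algorithm. The four-region chain, the attribute values, and the item names such as `temp_high` are hypothetical and chosen only for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of regions.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical example: four regions in a row, neighbours share an edge.
CHAIN_WEIGHTS = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
SMOOTH = [1, 2, 3, 4]        # similar values sit next to each other
ALTERNATING = [1, 4, 1, 4]   # neighbours always differ

# Hypothetical discretized records, one set of items per region.
RECORDS = [
    {"temp_high", "hay_fever_high"},
    {"temp_high", "hay_fever_high", "grassland"},
    {"temp_low"},
    {"temp_high"},
]

i_smooth = morans_i(SMOOTH, CHAIN_WEIGHTS)            # positive autocorrelation
i_alternating = morans_i(ALTERNATING, CHAIN_WEIGHTS)  # negative autocorrelation
rule_support = support(RECORDS, {"temp_high", "hay_fever_high"})
rule_confidence = confidence(RECORDS, {"temp_high"}, {"hay_fever_high"})
```

In this toy setting, a positive Moran's I indicates that similar values cluster in space and a negative value indicates that neighbours tend to differ, while support and confidence measure how often the discretized items co-occur. The abstract's claim is that, on the real data, factors with larger correlation coefficients also yield more frequent co-occurrence with hay fever.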

Cite this article

CHEN Jiangping, HUANG Bingjian. Application and Effects of Data Spatial Autocorrelation on Association Rule Mining[J]. Journal of Geo-information Science, 2011, 13(1): 109-117. DOI: 10.3724/SP.J.1047.2011.00109

