Service and Application of Grid Based Distributed Spatial Outliers Mining

  • 1. Key Lab of Spatial Data Mining and Information Sharing of MOE, Spatial Information Research Center of Fujian, Fuzhou University, Fuzhou 350002, China;
    2. Fujian Economic Information Center, Fuzhou 350001, China

Received date: 2010-11-01

Revised date: 2011-03-29

Online published: 2011-06-15


A spatial outlier is a spatial object whose non-spatial attribute values deviate significantly from those of the other objects in the dataset. Identifying spatial outliers can reveal unexpected knowledge and has a number of practical applications. Massive volumes of spatial data are maintained at geographically distributed sites across wide area networks (WANs), so analysing and processing them requires a high-performance distributed parallel processing system. Grid computing is one of the most effective approaches to meeting this requirement. The geographical knowledge grid platform (GeoKS-Grid) established by our research group applies the knowledge grid to geo-information science. It integrates grid computing, web services, WebGIS, data mining, information visualization, ontology-based knowledge bases and knowledge reasoning, online analytical processing, decision analysis, data warehousing, and workflow technologies to form a geographical problem-solving environment. In this paper, a grid-based distributed framework and the corresponding strategy for a distributed spatial data mining system are discussed, and a distributed algorithm for spatial outlier mining is designed and implemented. In general, the process of distributed spatial outlier mining can be viewed as a series of services, both atomic and composite. Following the principle of web service reuse and composability, the distributed spatial outlier mining algorithm is decomposed into several atomic grid services. Distributed spatial outlier mining, comprising local and global spatial outlier mining, is then realized through a grid workflow that discovers and composes the atomic knowledge grid services provided by the knowledge grid.
Finally, a demonstration application is carried out on soil geochemistry data from the Ecological Geochemistry Survey of the Fujian Coastal Economic Belt, which verifies the efficiency and validity of the distributed spatial outlier mining service and system.
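
The decomposition into atomic and composite services can be illustrated with a minimal Python sketch. It is not the algorithm implemented in the paper or in GeoKS-Grid: it assumes a generic neighbourhood-difference z-score test, treats each site as a list of hypothetical `(x, y, value)` tuples, ignores neighbours that lie on other sites, and models every atomic service as a plain function. All function names, the parameter `k`, and the `threshold` value are illustrative assumptions.

```python
import math
from statistics import mean

# Illustrative sketch only; names and the z-score test are assumptions,
# not the services defined in the paper.

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean distance)."""
    xi, yi = points[i][0], points[i][1]
    others = (j for j in range(len(points)) if j != i)
    ranked = sorted(
        others,
        key=lambda j: math.hypot(points[j][0] - xi, points[j][1] - yi),
    )
    return ranked[:k]

def neighbourhood_differences(points, k):
    """Atomic service (local): attribute value minus the mean of its k neighbours."""
    diffs = []
    for i, (_, _, value) in enumerate(points):
        neighbour_mean = mean(points[j][2] for j in knn(points, i, k))
        diffs.append(value - neighbour_mean)
    return diffs

def local_summary(diffs):
    """Atomic service (local): count, sum and sum of squares, cheap to ship over a WAN."""
    return len(diffs), sum(diffs), sum(d * d for d in diffs)

def merge_summaries(summaries):
    """Atomic service (global): combine site summaries into a global mean and std."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mu = total / n
    variance = max(total_sq / n - mu * mu, 0.0)  # guard against rounding below zero
    return mu, math.sqrt(variance)

def flag_outliers(diffs, mu, sigma, threshold):
    """Atomic service (local): indices whose standardized difference exceeds threshold."""
    if sigma == 0:
        return []
    return [i for i, d in enumerate(diffs) if abs(d - mu) / sigma > threshold]

def mine_distributed(sites, k=4, threshold=2.0):
    """Composite service: chain the atomic services in a fixed workflow order."""
    diffs_by_site = {
        name: neighbourhood_differences(points, k) for name, points in sites.items()
    }
    mu, sigma = merge_summaries([local_summary(d) for d in diffs_by_site.values()])
    return {
        name: flag_outliers(diffs, mu, sigma, threshold)
        for name, diffs in diffs_by_site.items()
    }
```

Calling `mine_distributed({"A": site_a, "B": site_b})` returns, for each site name, the indices of the points flagged as outliers against the global statistics. Only the three-number summaries cross site boundaries in this sketch, which is one plausible reason to separate local and global steps; a real grid workflow would replace the direct function calls with service discovery and invocation.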

Cite this article

YAO Minjing, LIN Jiaxiang, CHEN Chongcheng, MA Hengbing. Service and Application of Grid Based Distributed Spatial Outliers Mining[J]. Journal of Geo-information Science, 2011, 13(3): 383-390. DOI: 10.3724/SP.J.1047.2011.00383

