ARTICLES

Research on the Distributed Spatial Topological Query Optimization Algorithm

Expand
  • Key Laboratory of 3D Information Acquisition and Application, Ministry of Education, Capital Normal University, Beijing 100048, China

Received date: 2013-05-15

  Revised date: 2013-08-24

  Online published: 2013-09-29

Abstract

Due to complex data structure, complicated spatial relationship and massive data volume, distributed spatial query is a time-consuming processing, which will cause high transmission and processing cost. Query processing method in traditional distributed database cannot satisfy the demands of query in distributed geospatial database. Therefore, new query methods in distributed geospatial database need to be studied. In this paper, the distributed spatial join query processing is deeply studied based on the existing optimizing methods of the conventional query processing in traditional distributed database, and a series of transformation rules of relational algebra expression based on cross-border topological join optimization rules are proposed. The processed query tree is optimized by equivalent transformation after data localization. The global optimized method of distributed spatial join query for different fragments is studied. The global spatial query can be transformed into some local fragments joins effectively. The spatial join query is processed in the local area, avoiding the data transmission of spatial data among data nodes during the processing of query, so that the query performance can be improved. To improve the efficiency of the method, some new concepts were put forward, including query merged tree and execution plan tree, which can optimize the executing path of query plan. For example, by adjusting the executing order, some processes with low cost execute first, and the time-consuming processes execute based on the result set generated by the previous processes so as to reduce the process of time-consuming parts and resolve the problem of high cost of query processing to improve the performance of distributed spatial query. The experiment based on the vector data of China shows our methods can reduce the cost of the spatial join and data transmission among the nodes, and the performance improve 28.5%, which demonstrates that our methods outperform the traditional methods in terms of both algorithm complexity and the running time.

Cite this article

YANG Dian-Hua . Research on the Distributed Spatial Topological Query Optimization Algorithm[J]. Journal of Geo-information Science, 2013 , 15(5) : 643 -648,679 . DOI: 10.3724/SP.J.1047.2013.00643

References

[1] 邵佩英.分布式数据库系统及其应用[M].北京:科学出版 社,2003.

[2] Özsu M T, Valduriez P. Principles of distributed database systems[M]. 2nd ed.. New York: Prentice-Hall Inc.,1999.

[3] 周玉科,马廷,周成虎,等. MySQL集群与MPI的并行空间分析系统设计与实验[J].地球信息科学学报,2012,14(4): 448-453.

[4] Ramirez M R, de Souza J M. Distributed processing of spatial join[J/OL]. http://cronos.cos.ufrj.br/publicacoesrelteces58902. pdf, 2001.

[5] Abel D J, Ooi B C, Tan K L, et al. Spatial join strategies in distributed spatial DBMS[C]. //Egenhofer M J, Herring J R (Eds.). Proc. of the 4th Int'l Symp. Advances in Spatial Databases. London: Springer-Verlag,1995:348-367.

[6] Kian-Lee Tan, Beng Chin Ooi. Exploiting spatial indexes for semijoin-based join processing in distributed spatial databases[J]. IEEE Transactions on Knowledge and Data Engineering, November/December, 2000, 12(6): 920-937.

[7] Ramirez M R, de Souza J M. Distributed processing of spatial join[C]. //Proc.of the Anais do III Workshop Brasileiro de GeoInformática-GeoInfo, 2001:1-8.

[8] 朱欣焰,周春辉,呙维,等.分布式空间数据分片与跨边界拓扑连接优化方法[J].软件学报,2011,22(2):269-284.

[9] 黄舟,彭霞,张珂,等.分布式计算环境中的空间查询语言全局解析机制[J].地理与地理信息科学,2006,22(3):18-21.

[10] 陈波,高秀娥,陈来杰.基于等价变换的分布式查询优化方法研究[J].计算机工程与设计,2006,27(3):390-392.

[11] 刘书,李正凡.基于分布式数据库系统的一种查询优化算 法[J]. 北京联合大学学报(自然科学版),2005,19(1): 51-54.

[12] Clematis A, Mineter M, Marciano R. High performance computing with geographical data[J]. Parallel Computing, 2003(29):1275-1279.

[13] Hawick K A, Coddington P D, James H A. Distributed frameworks and parallel algorithms for processing large-scale geographic data[J]. Parallel Computing, 2003, 29(10):1297-1333.

[14] Armstrong M P, Marciano R J. Local interpolation using a distributed parallel supercomputer[J]. International Journal of Geographical Information Systems,1996,10(6): 713-729.

[15] Ren Y C, Shen L, Yang C J, et al. Parallel map rendering system for massive geospatial data in distributed environment[C]. Proceedings of International Conference on GeoInformatics, 2009.

[16] Sorokine A, Daniel J, Liu C. Parallel visualization for GIS applications. http://www.geocomputation.org/2005/Abstracts/Sorokine.pdf, 2005.

[17] Hoel E G, Samet H. Data-parallel polygonization[J]. Parallel Computing, 2003(29):1381-1401.

[18] Lanthier M, Nussbaum D, Sack J-R. Parallel implementation of geometric shortest path algorithms[J]. Parallel Computing, 2003(29):1445-1479.

[19] Armstrong M P, Densham P J. Domain decomposition for parallel processing of spatial problems[C]. Computers, Environment, and Urban Systems, 1992(16):497-513.

[20] Ware A, Kinder D. Parallel implementation of the Delaunay Triangulation within a transputer environment[C]. Proceedings of the European Geographic Information Systems (EGIS'91) Conference (UtretchtL EGIS), 1991, 1199-1209.

Outlines

/