Journal of Geo-information Science ›› 2013, Vol. 15 ›› Issue (5): 643-648, 679. doi: 10.3724/SP.J.1047.2013.00643

An Optimized Processing Algorithm for Distributed Spatial Topological Join Queries

YANG Dianhua

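  1. Key Laboratory of 3D Information Acquisition and Application, Ministry of Education, Capital Normal University, Beijing 100048, China
  • Received: 2013-05-15 Revised: 2013-08-24 Online: 2013-09-29 Published: 2013-09-29
  • Corresponding author: YANG Dianhua, E-mail: yangdh@lreis.ac.cn
  • About the author: YANG Dianhua (1977-), male, PhD candidate; main research interest: high-performance parallel GIS. E-mail: yangdh@lreis.ac.cn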
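  • Supported by:

    National Natural Science Foundation of China (40971232); National High Technology Research and Development Program of China ("863" Program) (2012AA12A403).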

Research on the Distributed Spatial Topological Query Optimization Algorithm

YANG Dianhua   

  1. Key Laboratory of 3D Information Acquisition and Application, Ministry of Education, Capital Normal University, Beijing 100048, China
  • Received:2013-05-15 Revised:2013-08-24 Online:2013-09-29 Published:2013-09-29

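Abstract (Chinese version):

To address the high transmission and processing costs that arise when traditional distributed database query methods are applied to distributed spatial databases, this paper builds on existing optimization methods for distributed joins of cross-border fragments and studies distributed spatial topological join query processing in depth. It proposes a spatial query optimization algorithm based on cross-border join optimization, which enriches the equivalent transformation rules of relational algebra used in traditional distributed queries. In addition, using global optimization strategies for distributed spatial queries tailored to different fragment join types, it implements distributed spatial query decomposition and data localization, thereby reducing the high cost of data transmission in distributed queries. Finally, it proposes distributed query optimization methods including node merging, the join merge tree, execution nodes, and the execution plan tree. The corresponding merging and optimization algorithms convert a global spatial query into concrete execution plans for the local spatial database at each site, eliminating redundant computation and optimizing the query computation strategy, and thus addressing the high processing cost of distributed spatial queries. Distributed spatial query experiments show that the proposed algorithm improves the performance of distributed spatial queries fairly well.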
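The idea of query decomposition and data localization can be illustrated with a small sketch. It is not the paper's algorithm: it assumes each fragment carries a bounding box and a site name, and the names `Fragment`, `bboxes_intersect`, and `localize_join` are hypothetical. It only shows how a global join between two fragmented layers might be reduced to the fragment pairs that can actually produce results, separating same-site (local) joins from cross-border ones that would need data transfer.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Fragment:
    """Hypothetical description of one spatial fragment stored at one site."""
    name: str
    site: str
    bbox: tuple  # (xmin, ymin, xmax, ymax)


def bboxes_intersect(a, b):
    """Return True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def localize_join(left, right):
    """Split a global join into local and cross-border fragment joins.

    Fragment pairs whose boxes do not intersect are dropped, because no
    geometry in one can be topologically related to a geometry in the other.
    """
    local, cross = [], []
    for l, r in product(left, right):
        if not bboxes_intersect(l.bbox, r.bbox):
            continue  # pruned: this pair cannot contribute to the result
        (local if l.site == r.site else cross).append((l.name, r.name))
    return local, cross


# Illustrative data only (assumed, not from the paper's experiment).
roads = [
    Fragment("roads_w", "site_a", (0, 0, 10, 10)),
    Fragment("roads_e", "site_b", (10, 0, 20, 10)),
]
rivers = [
    Fragment("rivers_w", "site_a", (0, 0, 12, 10)),   # spills over the border
    Fragment("rivers_e", "site_b", (11, 0, 20, 10)),
]

local_joins, cross_joins = localize_join(roads, rivers)
print("local:", local_joins)
print("cross-border:", cross_joins)
```

In this toy setup, only the river fragment that extends past the boundary produces a cross-border join; every other surviving pair can be evaluated at its own site without moving data.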

Key words (Chinese version): query optimization, spatial topological join, distributed spatial database, spatial data query

Abstract:

Because of complex data structures, complicated spatial relationships, and massive data volumes, distributed spatial queries are time-consuming and incur high transmission and processing costs. Query processing methods designed for traditional distributed databases cannot meet the demands of queries in distributed geospatial databases, so new query methods are needed. This paper studies distributed spatial join query processing in depth, building on existing query optimization methods for traditional distributed databases, and proposes a series of transformation rules for relational algebra expressions based on cross-border topological join optimization. After data localization, the query tree is optimized by equivalent transformation. A global optimization method for distributed spatial join queries over different fragments is also studied, so that a global spatial query can be transformed effectively into a set of local fragment joins. Each spatial join is processed locally, which avoids transmitting spatial data among data nodes during query processing and thereby improves query performance. To make the method more efficient, several new concepts are introduced, including the query merged tree and the execution plan tree, which optimize the execution path of the query plan. For example, the execution order is adjusted so that low-cost processes run first and time-consuming processes run on the result set produced by the earlier ones. This reduces the work done by the time-consuming parts, lowers the cost of query processing, and improves the performance of distributed spatial queries.
An experiment based on vector data of China shows that the proposed methods reduce the cost of the spatial join and of data transmission among nodes and improve performance by 28.5%, which demonstrates that they outperform the traditional methods in terms of both algorithm complexity and running time.
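The reordering idea described above (cheap processes first, expensive ones on the reduced result set) can be sketched as follows. This is a generic filter-and-refine illustration, not the paper's execution plan tree: the cost numbers, the circle geometries, and the names `run_cheapest_first`, `bbox_overlap`, and `exact_overlap` are all assumptions made for the example.

```python
import math


def bbox_overlap(pair):
    """Cheap test: do the bounding squares of two circles overlap?"""
    (x1, y1, r1), (x2, y2, r2) = pair
    return abs(x1 - x2) <= r1 + r2 and abs(y1 - y2) <= r1 + r2


def exact_overlap(pair):
    """More expensive test: do the two circles themselves overlap?"""
    (x1, y1, r1), (x2, y2, r2) = pair
    return math.hypot(x1 - x2, y1 - y2) <= r1 + r2


def run_cheapest_first(candidates, steps):
    """Apply filtering steps in ascending order of estimated cost.

    steps is a list of (name, estimated_cost, predicate); each step only
    sees the candidates that survived the cheaper steps before it.
    """
    trace = []
    for name, _cost, predicate in sorted(steps, key=lambda s: s[1]):
        candidates = [c for c in candidates if predicate(c)]
        trace.append((name, len(candidates)))
    return candidates, trace


# Candidate pairs of circles given as (x, y, radius); illustrative data only.
pairs = [
    ((0, 0, 1), (1, 0, 1)),      # boxes overlap, circles overlap
    ((0, 0, 1), (1.9, 1.9, 1)),  # boxes overlap, circles do not
    ((0, 0, 1), (5, 5, 1)),      # boxes do not overlap
]

# Steps are listed in the "wrong" order on purpose; costs are assumed values.
steps = [
    ("exact_test", 100, exact_overlap),
    ("bbox_filter", 1, bbox_overlap),
]

result, trace = run_cheapest_first(pairs, steps)
print(trace)
```

Here the exact test runs on two candidate pairs instead of three; with realistic data volumes the cheap filter would typically discard a much larger share before the costly step.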

Key words: query optimization, spatial data query, spatial topological join, distributed spatial database