Journal of Geo-information Science ›› 2017, Vol. 19 ›› Issue (4): 457-466. doi: 10.3724/SP.J.1047.2017.00457

• Theory and Methods of Geo-information Science •

A Parallel Computing Method for Landscape Metrics

LIU Yang, GUAN Qingfeng*

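  1. National Engineering Research Center for Geographic Information System, China University of Geosciences (Wuhan), Wuhan 430074, China
  2. School of Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China
  • Received: 2016-07-19 Revised: 2016-11-01 Published: 2017-04-20 Released: 2017-04-20
  • Corresponding author: GUAN Qingfeng E-mail: youngliu2@163.com; guanqf@cug.edu.cn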
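  • About the author: LIU Yang (1991-), female, from Xi'an, Shaanxi Province, master's student; her research focuses on high-performance geocomputation. E-mail: youngliu2@163.com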

  • Funding:
    Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20130145120013)

A Parallel Algorithm for Landscape Metrics

LIU Yang, GUAN Qingfeng*

  1. National Engineering Research Center of GIS, China University of Geosciences, Wuhan 430074, China
  2. School of Information Engineering, China University of Geosciences, Wuhan 430074, China
  • Received: 2016-07-19 Revised: 2016-11-01 Online: 2017-04-20 Published: 2017-04-20
  • Contact: GUAN Qingfeng E-mail: youngliu2@163.com; guanqf@cug.edu.cn

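Abstract (Chinese version):

With the development of geographic information science and systems, the spatiotemporal resolution and volume of GIS data are growing explosively. Traditional landscape metric software running on personal computers struggles to complete spatial analysis of massive datasets quickly and effectively. To address this problem, this paper proposes an efficient parallel method for computing landscape metrics. First, the original union-find connected component labeling algorithm is improved in two ways: (1) during the second pass over the data, the basic attributes of each patch, such as area and perimeter, are computed to supply the parameters needed for the landscape metrics; (2) also during the second pass, patches are relabeled with consecutive IDs, which avoids the extra pass over the data that the original algorithm needs because merge operations leave gaps in the ID sequence. On this basis, the parallel computation of landscape metrics is implemented with the MPI parallel programming library, using data partitioning and cooperation between master and worker processes. Experiments show that, while preserving the correctness of the results, the parallel algorithm greatly improves the performance of landscape metric computation and provides an effective means of rapidly analyzing landscape form and pattern in large-scale data.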

Keywords: landscape metrics, connected component algorithm, MPI, parallel computing
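
The two modifications to the union-find labeling algorithm described in the abstract can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the names `PatchInfo`, `LabelResult`, `findRoot`, `unite`, and `labelPatches` are hypothetical, and the sketch assumes a row-major grid of integer class codes, 4-neighbor connectivity, and a perimeter counted in cell edges that face another class or the grid border. The second pass resolves each provisional label to its root, hands out consecutive patch IDs on first sight of a root, and accumulates area and perimeter in the same sweep.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Basic per-patch information gathered during the second pass.
struct PatchInfo {
    int cls;         // class code of the patch
    long area;       // number of cells
    long perimeter;  // cell edges facing another class or the grid border
};

struct LabelResult {
    std::vector<int> labels;         // consecutive 0-based patch ID per cell
    std::vector<PatchInfo> patches;  // indexed by patch ID
};

// Union-find "find" with path halving.
static int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Union-find "union": the smaller root becomes the representative.
static void unite(std::vector<int>& parent, int a, int b) {
    a = findRoot(parent, a);
    b = findRoot(parent, b);
    if (a < b) parent[b] = a;
    else if (b < a) parent[a] = b;
}

// Assumption: 4-connectivity, row-major grid of class codes.
LabelResult labelPatches(const std::vector<int>& grid, int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    assert(grid.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<int> provisional(grid.size());
    std::vector<int> parent;

    // Pass 1: provisional labels; equivalences go into the union-find.
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int i = r * cols + c;
            bool up = r > 0 && grid[i - cols] == grid[i];
            bool left = c > 0 && grid[i - 1] == grid[i];
            if (up && left) {
                provisional[i] = provisional[i - cols];
                unite(parent, provisional[i - cols], provisional[i - 1]);
            } else if (up) {
                provisional[i] = provisional[i - cols];
            } else if (left) {
                provisional[i] = provisional[i - 1];
            } else {
                provisional[i] = static_cast<int>(parent.size());
                parent.push_back(provisional[i]);
            }
        }
    }

    // Pass 2: resolve roots, assign consecutive IDs, accumulate area/perimeter.
    LabelResult out;
    out.labels.resize(grid.size());
    std::vector<int> remap(parent.size(), -1);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int i = r * cols + c;
            int root = findRoot(parent, provisional[i]);
            if (remap[root] < 0) {
                remap[root] = static_cast<int>(out.patches.size());
                out.patches.push_back({grid[i], 0, 0});
            }
            int id = remap[root];
            out.labels[i] = id;
            PatchInfo& p = out.patches[id];
            p.area += 1;
            if (r == 0 || grid[i - cols] != grid[i]) ++p.perimeter;
            if (r == rows - 1 || grid[i + cols] != grid[i]) ++p.perimeter;
            if (c == 0 || grid[i - 1] != grid[i]) ++p.perimeter;
            if (c == cols - 1 || grid[i + 1] != grid[i]) ++p.perimeter;
        }
    }
    return out;
}
```

Calling `labelPatches` on a 2 x 3 grid whose class-1 cells form a U shape around one class-0 cell gives two patches. The two arms of the U receive different provisional labels in the first pass and end up sharing patch ID 0 after the second, so no separate relabeling sweep is needed in this sketch.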

Abstract:

Landscape metrics are widely used to quantify the spatial patterns of landscapes and to analyze their temporal dynamics and effects. However, for massive datasets, calculating landscape metrics requires a large amount of computing time and very large memory, which greatly limits its feasibility in real-world applications. This paper presents a parallel algorithm to improve the performance of landscape metric calculation. First, the classical two-pass connected component labeling (CCL) algorithm is modified in two ways: (1) the calculation of basic geometric metrics of patches, such as area and perimeter, is integrated into the second pass over the data, providing the inputs for the landscape metrics; and (2) consecutive patch IDs are generated during the second pass, to reduce the overhead of re-labeling. Then, a parallel algorithm consisting of a master process and a set of worker processes is designed and implemented in C++ with the Message Passing Interface (MPI). In this algorithm, the whole spatial domain is decomposed into multiple sub-domains, which are assigned to a set of concurrent processes. Each process uses the modified CCL algorithm to identify the patches within its assigned sub-domain and calculates their basic geometric metrics. After gathering and merging the basic metrics from the other processes, the master process calculates the final landscape metrics. Experiments on a land-use dataset of California showed that the computing time of landscape metrics was greatly reduced when multiple processes were used. In conclusion, the parallel algorithm provides a high-performance solution for calculating landscape metrics on massive, high-resolution datasets.

Key words: landscape metrics, connected component labeling algorithm, MPI, parallel computing
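
The decomposition and merge steps summarized in the abstract might look roughly like the following C++ sketch. It makes several assumptions that the abstract does not confirm: sub-domains are contiguous bands of rows, each worker reports one partial record per local patch, and pieces of a patch cut by a band boundary have already been paired up (for example by comparing the rows on either side of the cut). The names `RowRange`, `splitRows`, `Partial`, and `mergePartials` are hypothetical, and the MPI communication itself is omitted so that the logic stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open row range [first, last) assigned to one process.
struct RowRange { int first; int last; };

// Split `rows` rows among `nWorkers` processes as evenly as possible;
// the first (rows % nWorkers) processes get one extra row.
std::vector<RowRange> splitRows(int rows, int nWorkers) {
    assert(rows >= 0 && nWorkers > 0);
    std::vector<RowRange> ranges;
    int base = rows / nWorkers, extra = rows % nWorkers, start = 0;
    for (int w = 0; w < nWorkers; ++w) {
        int len = base + (w < extra ? 1 : 0);
        ranges.push_back({start, start + len});
        start += len;
    }
    return ranges;
}

// Partial geometry of one patch piece as reported by a worker.
struct Partial { long area; long perimeter; };

// Master-side merge. `parts` holds the partial records of all workers,
// flattened into one vector; each pair in `links` names two records that
// are pieces of the same patch. Assumption: workers count only edges that
// face another class or the outer grid border, so perimeters are additive.
std::vector<Partial> mergePartials(
        const std::vector<Partial>& parts,
        const std::vector<std::pair<int, int>>& links) {
    std::vector<int> parent(parts.size());
    for (std::size_t i = 0; i < parent.size(); ++i)
        parent[i] = static_cast<int>(i);
    auto find = [&parent](int x) -> int {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };
    for (const auto& l : links) {
        int a = find(l.first), b = find(l.second);
        if (a < b) parent[b] = a;
        else if (b < a) parent[a] = b;
    }
    // Sum the pieces of each patch and give merged patches consecutive IDs.
    std::vector<int> remap(parts.size(), -1);
    std::vector<Partial> merged;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        int root = find(static_cast<int>(i));
        if (remap[root] < 0) {
            remap[root] = static_cast<int>(merged.size());
            merged.push_back({0, 0});
        }
        merged[remap[root]].area += parts[i].area;
        merged[remap[root]].perimeter += parts[i].perimeter;
    }
    return merged;
}
```

In an MPI program, each rank would presumably read only the rows in its own `RowRange`, and the master would call something like `mergePartials` on the gathered records before computing the final metrics. How the published implementation exchanges boundary information is not stated in the abstract.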