Journal of Geo-information Science ›› 2011, Vol. 13 ›› Issue (5): 707-710. doi: 10.3724/SP.J.1047.2011.00709

• Models and Algorithm Applications •

Parallel IDW Algorithm Based on CUDA and Experimental Analysis

LIU Eryong, WANG Yunjia

  1. School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China
  • Received: 2011-04-19 Revised: 2011-09-22 Online: 2011-10-25 Published: 2011-10-25
  • About the author: LIU Eryong (1978-), male, from Xuzhou, Jiangsu; PhD candidate and lecturer; research interests: spatial analysis and parallel algorithms. E-mail: cumtley@hotmail.com
  • Funding: National Natural Science Foundation of China (40971275, 51174287).

Abstract: In recent years, spatial data acquisition methods have improved significantly. LiDAR, for example, can generate hundreds of millions of points, and datasets of this size pose a great challenge to the processing capacity of computers. Over the same period, the computing capacity of graphics processing units (GPUs) has also improved greatly, and general-purpose computing on GPUs has attracted attention. A GPU is a collection of streaming processors; recent devices have more than 240 of them, and the peak floating-point throughput of GPUs is now more than ten times that of CPUs. A software platform called Compute Unified Device Architecture (CUDA) allows GPU programs to be developed in ANSI C and offers a simple way to write parallel code. The parallel parts of a program run on the GPU as kernels, which are invoked by the CPU. Each kernel runs on a grid of blocks, and each block is an array of threads. At execution time, each block is mapped to a multiprocessor, and each thread is mapped to a streaming processor. A typical CUDA program proceeds as follows. First, the host function allocates one or more buffers in GPU global memory and copies the input data to them. Then the kernel is launched with a specified number of threads per block. Finally, the results are transferred back to CPU memory. GPUs have been applied to many problems in signal processing, computational geometry and other fields, but they have seldom been applied to spatial interpolation. The inverse distance weighting (IDW) algorithm is frequently used in spatial interpolation because it is relatively easy to compute. However, as the dimensions of the problem increase, reducing the running time becomes important. In this study, we propose a parallel IDW algorithm based on CUDA, the platform developed by NVIDIA. The main objective is to compare the running times of the CPU and GPU implementations under the same conditions. The numerical experiments show that the CUDA-based algorithm runs about 6 times as fast as the CPU-based one.

Key words: IDW, parallel algorithm, CUDA, CPU 
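
The decomposition described in the abstract, where each GPU thread handles one interpolation point, can be illustrated with a minimal sketch. It is not the authors' kernel, which this page does not show. It is written in plain Python and emulates a one-dimensional CUDA launch sequentially, so it demonstrates the indexing and the per-point IDW computation but not the speedup. The function names `idw_at` and `idw_launch`, the distance power of 2, and the block size of 256 are all assumptions made for illustration.

```python
def idw_at(qx, qy, samples, power=2.0):
    """IDW estimate at one query point; this is the work one thread would do.

    samples is a list of (x, y, value) tuples. The power of 2 is an assumed
    default, not a value taken from the paper.
    """
    num = 0.0
    den = 0.0
    for x, y, v in samples:
        d2 = (qx - x) ** 2 + (qy - y) ** 2
        if d2 == 0.0:
            return v  # query coincides with a sample: return its value
        w = 1.0 / d2 ** (power / 2.0)  # weight = 1 / distance**power
        num += w * v
        den += w
    return num / den


def idw_launch(queries, samples, threads_per_block=256, power=2.0):
    """Sequentially emulate a 1D grid of blocks, one thread per query point."""
    n = len(queries)
    # Round up so that the last, possibly partial, block is included.
    blocks = (n + threads_per_block - 1) // threads_per_block
    out = [0.0] * n
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            # Global index, as in blockIdx.x * blockDim.x + threadIdx.x.
            i = block_idx * threads_per_block + thread_idx
            if i < n:  # guard: surplus threads in the last block do nothing
                qx, qy = queries[i]
                out[i] = idw_at(qx, qy, samples, power)
    return out


if __name__ == "__main__":
    samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    queries = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(idw_launch(queries, samples, threads_per_block=2))
```

Because every query point is computed independently of the others, the two loops in `idw_launch` carry no dependencies between iterations, which is the property that lets a GPU run them as concurrent threads.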