Journal of Geo-information Science, 2017, Vol. 19, Issue (10): 1346-1354. doi: 10.3724/SP.J.1047.2017.01346

A Model of Parallel Mosaicking for Massive Remote Sensing Images Based on Self-defined RDD

JING Weipeng*, HUO Shuaiqi

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Received: 2017-02-28  Revised: 2017-06-09  Online: 2017-10-20  Published: 2017-10-20
  • Contact: JING Weipeng  E-mail: nefujwp@163.com

Abstract:

Image mosaicking is an important part of remote sensing image processing and plays a vital role in the analysis of trans-regional remote sensing images. To address the low node utilization and frequent data I/O of traditional parallel algorithms for remote sensing images, we propose a parallel mosaicking algorithm based on a self-defined RDD (Resilient Distributed Dataset), built on the Spark distributed in-memory computing framework. We take advantage of Spark's suitability for iterative data processing and build a parallel mosaicking model for remote sensing images from operations on Spark RDDs. First, exploiting the logical separability and data independence of the Fourier transform and inverse Fourier transform in the phase correlation method, we improve the traditional phase correlation method by executing a single instruction on multiple nodes in parallel across the cluster, so that the overlapping-region estimation step of the algorithm is computed on multiple nodes in parallel. Then, we override the compute and getPartitions methods of RDD to define an RDD for remote sensing image processing. The three key steps of image mosaicking (overlapping region estimation, image registration, and image fusion) are implemented as transformation-type operators of the self-defined RDD. These operators perform no computation during parallel mosaicking until the final mosaicked image must be written to disk or to the file system, which reduces the time consumed by parallel mosaicking. Finally, parallel image mosaicking is realized by calling the operators of the self-defined RDD through implicit conversion, and the result is compared with a parallel mosaicking algorithm based on MPI.
The experimental results show that the parallel mosaicking algorithm for massive remote sensing images based on the self-defined RDD effectively improves mosaicking efficiency for large data volumes while preserving mosaicking quality.
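
The phase correlation step used for overlapping region estimation can be illustrated with a minimal single-machine sketch in Python. It is not the authors' Spark implementation: the function names (dft_1d, dft_2d, phase_correlation, cyclic_shift, make_demo_image) are hypothetical, the images are small lists of lists, the displacement is assumed to be a pure cyclic translation, and a naive DFT from the standard library stands in for the FFT that a real system would use. The sketch does show the property the abstract relies on: each row transform and each column transform is independent of the others, so these calls are natural units of work to distribute across nodes.

```python
import cmath
import random


def dft_1d(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform of one row or column."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalisation.
        out.append(acc / n if inverse else acc)
    return out


def dft_2d(image, inverse=False):
    """Separable 2-D DFT: transform every row, then every column.

    Each dft_1d call is independent of the others, which is what makes
    the transform easy to split across workers.
    """
    rows = [dft_1d(row, inverse) for row in image]
    cols = [dft_1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phase_correlation(reference, moved):
    """Estimate (dy, dx) with moved[y][x] == reference[y - dy][x - dx] (cyclic)."""
    f_ref = dft_2d(reference)
    f_mov = dft_2d(moved)
    cross = []
    for row_ref, row_mov in zip(f_ref, f_mov):
        out_row = []
        for a, b in zip(row_ref, row_mov):
            product = b * a.conjugate()
            magnitude = abs(product)
            # Keep only the phase; skip frequencies with no energy.
            out_row.append(product / magnitude if magnitude > 1e-12 else 0j)
        cross.append(out_row)
    # The inverse transform of the normalised cross-power spectrum
    # peaks at the displacement between the two images.
    surface = dft_2d(cross, inverse=True)
    height, width = len(surface), len(surface[0])
    _, dy, dx = max(
        (surface[y][x].real, y, x) for y in range(height) for x in range(width)
    )
    return dy, dx


def cyclic_shift(image, dy, dx):
    """Shift an image by (dy, dx) with wrap-around (illustration only)."""
    height, width = len(image), len(image[0])
    return [
        [image[(y - dy) % height][(x - dx) % width] for x in range(width)]
        for y in range(height)
    ]


def make_demo_image(height, width, seed=0):
    """Build a small reproducible pseudo-random test image."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(width)] for _ in range(height)]


if __name__ == "__main__":
    reference = make_demo_image(8, 8)
    moved = cyclic_shift(reference, 3, 5)
    print(phase_correlation(reference, moved))
```

Real image pairs overlap only partially and contain noise, so the correlation peak is less sharp than in this idealised cyclic case, and the recovered displacement would normally be refined by the subsequent registration step.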

Key words: remote sensing images, parallel mosaicking, Spark, phase correlation method, self-defined RDD