Journal of Geo-information Science
A Model of Parallel Mosaicking for Massive Remote Sensing Images Based on Self-defined RDD
Received date: 2017-02-28
Request revised date: 2017-06-09
Online published: 2017-10-20
Image mosaicking is an important part of remote sensing image processing and plays a vital role in the analysis of trans-regional remote sensing images. To address the low node utilization and frequent data I/O of traditional parallel algorithms for remote sensing images, we propose a parallel mosaicking algorithm based on a self-defined RDD (Resilient Distributed Dataset), built on the Spark distributed in-memory computing framework. We take advantage of Spark's suitability for iterative data processing and build a parallel mosaicking model for remote sensing images from operations on Spark RDDs. First, exploiting the logical separability and data independence of the Fourier transform and inverse Fourier transform in the phase correlation method, we improve the traditional phase correlation method by executing a single instruction on multiple nodes in parallel across the cluster, so that the overlapping-region estimation step of the algorithm runs as a multi-node parallel computation. Then, we override the compute and getPartitions methods of RDD to define an RDD for remote sensing image processing. The three key steps of image mosaicking (overlapping region estimation, image registration, and image fusion) are implemented as transformation-type operators of the self-defined RDD. These operators perform no computation during parallel mosaicking until the final mosaicked image must be written to disk or to a file system, which reduces the time consumed by parallel mosaicking. Finally, parallel image mosaicking is realized by calling the operators of the self-defined RDD through implicit conversion. Compared with the parallel mosaicking algorithm based on MPI, the experimental results show that the proposed algorithm for massive remote sensing images effectively improves mosaicking efficiency for large data volumes while preserving mosaicking quality.
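The overlapping-region estimation step rests on phase correlation, which can be illustrated on a single machine. The Scala sketch below is not the cluster implementation described above: it assumes two equally sized grayscale images related by a pure circular translation, and it uses a naive DFT in place of an FFT to stay dependency-free. All names (`PhaseCorrelationSketch`, `Complex`, `dft1`, `dft2`, `estimateShift`) are illustrative. The 2-D transform is computed row by row and then column by column; each 1-D transform is independent of the others, which is the kind of data independence a parallel version could exploit.

```scala
// Minimal phase-correlation sketch (naive DFT, illustration only).
object PhaseCorrelationSketch {
  // Small complex-number helper; hypothetical, not taken from the paper.
  final case class Complex(re: Double, im: Double) {
    def +(o: Complex): Complex = Complex(re + o.re, im + o.im)
    def *(o: Complex): Complex = Complex(re * o.re - im * o.im, re * o.im + im * o.re)
    def conj: Complex = Complex(re, -im)
    def abs: Double = math.hypot(re, im)
    def scale(s: Double): Complex = Complex(re * s, im * s)
  }

  type Grid = Vector[Vector[Complex]]

  // Unnormalised 1-D DFT; sign = -1 is the forward transform, +1 the inverse.
  def dft1(xs: Vector[Complex], sign: Int): Vector[Complex] = {
    val n = xs.length
    Vector.tabulate(n) { k =>
      xs.indices.foldLeft(Complex(0.0, 0.0)) { (acc, j) =>
        val angle = sign * 2.0 * math.Pi * k * j / n
        acc + xs(j) * Complex(math.cos(angle), math.sin(angle))
      }
    }
  }

  // 2-D DFT by separability: transform every row, then every column.
  def dft2(g: Grid, sign: Int): Grid =
    g.map(row => dft1(row, sign)).transpose.map(col => dft1(col, sign)).transpose

  // Returns (dy, dx) such that b(y, x) == a(y - dy, x - dx) with wrap-around.
  def estimateShift(a: Vector[Vector[Double]], b: Vector[Vector[Double]]): (Int, Int) = {
    val fa = dft2(a.map(row => row.map(v => Complex(v, 0.0))), -1)
    val fb = dft2(b.map(row => row.map(v => Complex(v, 0.0))), -1)
    // Normalised cross-power spectrum: keep only the phase difference.
    val cross: Grid = fa.zip(fb).map { case (rowA, rowB) =>
      rowA.zip(rowB).map { case (x, y) =>
        val p = y * x.conj
        if (p.abs < 1e-12) Complex(0.0, 0.0) else p.scale(1.0 / p.abs)
      }
    }
    // The inverse transform peaks at the translation between the two images.
    val corr = dft2(cross, 1)
    val peaks = for (y <- corr.indices; x <- corr(y).indices) yield (corr(y)(x).re, y, x)
    val (_, dy, dx) = peaks.maxBy(_._1)
    (dy, dx)
  }
}
```

For two images related by a circular shift, `estimateShift(a, b)` returns that shift as a `(dy, dx)` pair; real scenes that only partially overlap would additionally need windowing and peak validation, which this sketch omits.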
JING Weipeng, HUO Shuaiqi. A Model of Parallel Mosaicking for Massive Remote Sensing Images Based on Self-defined RDD[J]. Journal of Geo-information Science, 2017, 19(10): 1346-1354. DOI: 10.3724/SP.J.1047.2017.01346
Fig. 1 Architecture of the Spark cluster
Fig. 2 Implementation details of the self-defined RDD
算法1: |
初始化:创建SparkConf对象conf,将conf作为SparkContext构造函数的参数创建SparkContext对象sc,调用sc的textFile 方法创建初始RDD |
阶段1:在自定义RDD中添加操作方法 Iterator[BufferedImage]←compute(split: Partition,context: TaskContext)//调用父RDD的iterator方法,返回一个内部 //元素类型为bufferImage的迭代器对象 Array[Partition]←firstParent[BufferedImage].partitions//调用父RDD的partitions方法,返回父RDD的分区 RDD[BufferedImage]←Image overlap region estimation//重叠区域估计方法 RDD[BufferedImage]←Image registration//图像配准方法 RDD[BufferedImage]←Image fusion//图像融合方法 阶段2:调用隐式转换的处理方法 self-definedRDD[rdd]←exchange(rdd:RDD[String])//转换类中的exchange方法由implicit关键字修饰,RDD为方法参数,//自定义RDD作为返回值 import RDDtoSelf-defiendRDD.exchange//在程序中导入声明的隐式转换的方法 阶段3:生成自定义RDD对象. imageRDD ←fileRDD.exchange//初始RDD调用exchange方法生成自定义RDD对象 |
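The three stages of Algorithm 1 can be sketched in Scala against the Spark core API. This is a minimal sketch under stated assumptions, not the authors' code: it assumes Spark is on the classpath, that each line of the input text file is the path of one image readable by `ImageIO`, and that each operator works on one image at a time. The names `ImageRDD`, `MosaicSteps`, `RDDToImageRDD`, and `MosaicSketch` are hypothetical, and the three step functions are identity placeholders because the estimation, registration, and fusion logic is not given in the text.

```scala
import java.awt.image.BufferedImage
import java.io.File
import javax.imageio.ImageIO

import scala.language.implicitConversions
import scala.reflect.ClassTag

import org.apache.spark.{Partition, SparkConf, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Placeholder step functions (hypothetical): each returns its input unchanged.
object MosaicSteps {
  def estimateOverlap(img: BufferedImage): BufferedImage = img
  def register(img: BufferedImage): BufferedImage = img
  def fuse(img: BufferedImage): BufferedImage = img
}

// Self-defined RDD: wraps a parent RDD[P] and yields images.
class ImageRDD[P: ClassTag](parent: RDD[P], f: Iterator[P] => Iterator[BufferedImage])
    extends RDD[BufferedImage](parent) {

  // Stage 1: reuse the parent's partitions.
  override protected def getPartitions: Array[Partition] = firstParent[P].partitions

  // Stage 1: pull the parent's iterator for this partition and convert it lazily.
  override def compute(split: Partition, context: TaskContext): Iterator[BufferedImage] =
    f(firstParent[P].iterator(split, context))

  // Each operator returns a new ImageRDD, so calls chain without running a job.
  private def step(g: BufferedImage => BufferedImage): ImageRDD[BufferedImage] =
    new ImageRDD[BufferedImage](this, (imgs: Iterator[BufferedImage]) => imgs.map(g))

  def estimateOverlap(): ImageRDD[BufferedImage] = step(MosaicSteps.estimateOverlap)
  def register(): ImageRDD[BufferedImage] = step(MosaicSteps.register)
  def fuse(): ImageRDD[BufferedImage] = step(MosaicSteps.fuse)
}

// Stage 2: implicit conversion from an RDD of paths to the self-defined RDD.
object RDDToImageRDD {
  implicit def exchange(rdd: RDD[String]): ImageRDD[String] =
    new ImageRDD[String](rdd,
      (paths: Iterator[String]) => paths.map(p => ImageIO.read(new File(p))))
}

object MosaicSketch {
  import RDDToImageRDD.exchange

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("mosaic-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val fileRDD: RDD[String] = sc.textFile(args(0)) // assumed: one image path per line
      // Stage 3: the compiler applies exchange here; no image is read yet.
      val mosaic = fileRDD.estimateOverlap().register().fuse()
      // Only this action triggers the whole chain of transformations.
      val sizes = mosaic.map(img => (img.getWidth, img.getHeight)).collect()
      sizes.foreach(println)
    } finally sc.stop()
  }
}
```

Because `BufferedImage` is not `Serializable`, the sketch keeps images inside a task and collects only their dimensions. A fusion step that combines several images into one mosaic would need a different operator shape (for example, one working on a whole partition), which is left open here.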
Fig. 3 Parallel mosaicking algorithm based on the self-defined RDD
Fig. 4 Directed acyclic graph of parallel mosaicking of remote sensing images
Fig. 5 Mosaicking results of the parallel mosaicking algorithm based on Spark
Fig. 6 Speedup comparison (with increasing number of processes)
Fig. 7 Running time comparison (with increasing data size)
Fig. 8 Throughput comparison (with increasing data size)
The authors have declared that no competing interests exist.