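A Fast Compression Approach of Geo-raster Data for Network Transmission
About the author: JIANG Ling (1987-), male, born in Lu'an, Anhui; PhD, lecturer; research interests: digital terrain modeling and high-performance geo-computation. E-mail: jiangling_xs@163.com
Received date: 2015-04-05
Revision requested date: 2016-05-26
Online published: 2016-07-15
Funding
National Natural Science Foundation of China (41501445)
Natural Science Research Project of Anhui Higher Education Institutions (KJ2015A171)
Open Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (14I02)
Chuzhou University Research Start-up Fund (2014qd028)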
JIANG Ling, WANG Chun, ZHAO Mingwei, YANG Cancan. A Fast Compression Approach of Geo-raster Data for Network Transmission[J]. Journal of Geo-information Science, 2016, 18(7): 894-901. DOI: 10.3724/SP.J.1047.2016.00894
Geo-raster data is a principal form of representing geographical information and carries abundant geographical knowledge. With the rapid development of earth observation technology, high-resolution geo-raster data has been widely applied in fields such as geomorphology, soil science, environmental science and hydrology. As data volumes grow, storing and transferring massive geo-raster data over channels of limited capacity has become an increasingly prominent problem, and data compression offers a way to address it. Using gridded DEMs as the test case, this paper studies the compression of geo-raster data for online transmission of massive datasets. After analyzing the characteristics of geo-raster data, it proposes a two-phase compression method that combines conversion compression with coding compression, following the principles of data fidelity and real-time compression. It also establishes an evaluation method for the two-phase approach from the perspectives of accuracy and efficiency. To verify data fidelity and compression performance, experiments were conducted with gridded DEMs of different sizes on a 10-node server cluster running Linux. The results show that the proposed method preserves accuracy in both elevation values and surface morphology, and therefore has high data fidelity. The compression ratio is generally above 50%, and compression and decompression run at near real-time speed, so the method can significantly reduce the time spent on data transmission and improve network transmission efficiency.
Overall, the two-phase compression method is broadly applicable and can provide technical support for applications of geo-raster data such as high-performance parallel geo-computation.
Key words: data compression; two-phase compression; geo-raster data; DEM; parallel computation
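
The two-phase idea summarized in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the source, that the conversion phase stores floating-point elevations as scaled integers (`SCALE = 100`, i.e. 0.01 m precision, is a hypothetical choice) and it uses Python's `zlib` as a stand-in for the coding phase. The function names are illustrative; this is not the authors' implementation.

```python
import struct
import zlib

SCALE = 100  # assumption: keep elevations to 0.01 m; not taken from the paper


def compress_two_phase(elevations, scale=SCALE):
    """Phase 1: quantize floats to integers. Phase 2: lossless coding."""
    # Conversion phase: scale and round each elevation to a 32-bit integer.
    quantized = [round(z * scale) for z in elevations]
    raw = struct.pack(f"<{len(quantized)}i", *quantized)
    # Coding phase: zlib (level 1, fastest) stands in for a fast LZ-family codec.
    return zlib.compress(raw, 1)


def decompress_two_phase(blob, scale=SCALE):
    """Invert the coding phase, then map integers back to elevations."""
    raw = zlib.decompress(blob)
    count = len(raw) // 4  # four bytes per 32-bit integer
    quantized = struct.unpack(f"<{count}i", raw)
    return [q / scale for q in quantized]


# Synthetic, highly repetitive elevation profile (illustrative data only).
sample = [500.0 + (i % 50) * 0.25 for i in range(10000)]
blob = compress_two_phase(sample)
restored = decompress_two_phase(blob)
max_error = max(abs(a - b) for a, b in zip(sample, restored))
```

In this sketch the only loss comes from rounding in the conversion phase, so the reconstruction error is bounded by half a quantization step (`0.5 / SCALE`); the coding phase itself is lossless.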
Fig.1 Diagram of byte redundancy
Fig.2 DEMs of the study area
Tab.1 Performance of different lossless compression algorithms

Data group 1: 1821 rows × 2134 columns; data group 2: 2001 rows × 2285 columns; data group 3: 2645 rows × 2759 columns.

Algorithm | Group 1 CE/% | Group 1 CT/10⁻² s | Group 1 UCT/10⁻² s | Group 2 CE/% | Group 2 CT/10⁻² s | Group 2 UCT/10⁻² s | Group 3 CE/% | Group 3 CT/10⁻² s | Group 3 UCT/10⁻² s
---|---|---|---|---|---|---|---|---|---
LZO | -0.36 (39.03) | 0.35 (2.93) | 0.27 (2.31) | 13.87 (46.11) | 0.47 (3.10) | 0.36 (2.35) | 45.49 (62.75) | 0.58 (3.45) | 0.51 (2.58)
QUICKLZ | 0.00 (48.68) | 1.66 (3.50) | 0.24 (2.97) | 0.00 (53.93) | 2.24 (3.65) | 0.33 (3.14) | 40.35 (67.43) | 3.49 (4.20) | 4.27 (3.71)
LZ4 | 0.08 (38.91) | 0.70 (3.64) | 0.28 (1.05) | 14.90 (45.63) | 0.92 (3.79) | 0.34 (1.14) | 46.49 (61.80) | 1.07 (4.10) | 0.51 (1.31)
LZFX | -1.96 (39.62) | 6.42 (5.45) | 0.61 (3.76) | 13.08 (46.04) | 6.74 (5.79) | 0.85 (3.86) | 45.26 (62.16) | 7.14 (6.69) | 1.91 (4.03)
SNAPPY | 0.29 (38.02) | 0.44 (5.46) | 0.30 (1.39) | 14.33 (44.36) | 0.75 (5.53) | 0.35 (1.49) | 44.53 (59.85) | 0.80 (5.54) | 0.61 (1.83)
FASTLZ | -1.66 (32.40) | 5.02 (5.69) | 0.67 (3.10) | 13.40 (40.43) | 5.22 (6.04) | 0.73 (3.14) | 45.73 (59.44) | 5.32 (6.96) | 0.86 (3.31)
LZW | -39.79 (34.14) | 36.13 (25.50) | 7.55 (8.35) | -19.11 (16.00) | 37.52 (30.91) | 8.50 (8.80) | 24.85 (46.12) | 36.92 (37.56) | 9.65 (11.25)
RLE | -0.19 (0.08) | 2.14 (2.28) | 1.19 (1.36) | 14.76 (0.07) | 2.68 (2.67) | 1.53 (1.55) | 46.59 (0.05) | 4.18 (4.01) | 1.94 (2.26)
HUFFMAN | 1.68 (7.58) | 30.72 (29.35) | 32.30 (29.17) | 8.82 (12.40) | 33.60 (33.15) | 34.17 (32.86) | 35.28 (33.00) | 34.83 (39.22) | 35.69 (31.96)
SFANO | 1.22 (7.05) | 31.26 (28.56) | 24.71 (23.74) | 8.50 (12.02) | 31.78 (29.19) | 25.69 (26.54) | 30.99 (28.47) | 36.99 (37.34) | 31.50 (31.57)

Note: values in parentheses are the results for the corresponding integer data groups.
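
The three quantities reported for each codec can be measured with a small harness like the one below. It is a sketch under assumptions: CE is read as the percentage of space saved (which would explain the negative entries, where output is larger than input), and CT and UCT as compression and decompression times. `measure_codec` is a hypothetical helper name, and `zlib` is used only because it ships with Python; it is not one of the codecs in the table.

```python
import time
import zlib


def measure_codec(data, compress, decompress):
    """Return (CE in %, CT in s, UCT in s) for one lossless codec."""
    start = time.perf_counter()
    packed = compress(data)
    ct = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    uct = time.perf_counter() - start

    # A lossless codec must reproduce the input byte for byte.
    if restored != data:
        raise ValueError("codec is not lossless")

    # Assumed definition: share of the original size that was saved.
    ce = (1.0 - len(packed) / len(data)) * 100.0
    return ce, ct, uct


payload = b"\x00\x01" * 5000  # illustrative, highly redundant input
ce, ct, uct = measure_codec(payload, zlib.compress, zlib.decompress)
```

Timings from a single run are noisy; repeating the measurement and averaging would give steadier values.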
Fig.3 Error analysis of DEMs before and after the conversion compression
Tab.2 Net execution speeds (NCV) of different lossless compression algorithms

Data group 1: 1821 rows × 2134 columns; data group 2: 2001 rows × 2285 columns; data group 3: 2645 rows × 2759 columns.

Algorithm | Group 1 float NCV/(MB/s) | Group 1 integer NCV/(MB/s) | Group 2 float NCV/(MB/s) | Group 2 integer NCV/(MB/s) | Group 3 float NCV/(MB/s) | Group 3 integer NCV/(MB/s)
---|---|---|---|---|---|---
LZO | -4.26 | 55.18 | 147.37 | 73.79 | 576.93 | 145.00
QUICKLZ | 0.00 | 55.75 | 0.00 | 69.28 | 72.36 | 118.64
LZ4 | 0.62 | 61.48 | 103.15 | 80.80 | 409.46 | 158.93
LZFX | -2.06 | 31.90 | 15.02 | 41.63 | 69.61 | 80.76
SNAPPY | 2.89 | 41.15 | 114.11 | 55.08 | 438.75 | 112.94
FASTLZ | -2.16 | 27.34 | 19.64 | 38.42 | 103.00 | 80.50
LZW | -6.75 | 7.47 | -3.62 | 3.51 | 7.43 | 13.15
RLE | -0.42 | 0.17 | 30.54 | 0.14 | 105.92 | 0.10
HUFFMAN | 0.20 | 0.96 | 1.14 | 1.64 | 6.96 | 6.45
SFANO | 0.16 | 1.00 | 1.29 | 1.88 | 6.30 | 5.75
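
NCV is not defined in this excerpt. The tabulated values appear consistent with the volume of data saved by compression divided by the combined compression and decompression time, and the sketch below encodes that reading as an assumption. The function name `net_speed` and its parameters are hypothetical.

```python
def net_speed(size_mb, ce_percent, ct_s, uct_s):
    """Assumed NCV: megabytes saved per second of codec time.

    size_mb     original data size in MB
    ce_percent  space saving in percent (negative if the data grew)
    ct_s        compression time in seconds
    uct_s       decompression time in seconds
    """
    saved_mb = size_mb * ce_percent / 100.0
    return saved_mb / (ct_s + uct_s)


# Integer results for LZO on data group 3 from Tab.1; the 13.9 MB input
# size is an assumption, since the source does not state it.
example = net_speed(13.9, 62.75, 0.0345, 0.0258)
```

With these inputs the function returns about 145 MB/s, close to the 145.00 MB/s listed for LZO on the integer form of data group 3. A negative CE yields a negative speed under this reading, which matches the sign pattern in the table.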
Tab.3 Performance of the two-phase compression method

Data group | CE/% | CT/10⁻² s | UCT/10⁻² s
---|---|---|---
Data group 1 (1821 rows × 2134 columns) | 49.82 (69.51) | 1.59 (3.85) | 3.05 (4.65)
Data group 2 (2001 rows × 2285 columns) | 56.93 (73.05) | 1.56 (4.16) | 2.83 (5.11)
Data group 3 (2645 rows × 2759 columns) | 72.75 (81.38) | 2.15 (5.21) | 3.63 (6.92)

Note: values in parentheses are the results for the corresponding integer data groups.
Fig.4 Data transmission efficiency based on the two-phase compression method
Tab.4 Compression-transmission ratio of the two-phase compression method

Columns give the number of processes.

Dataset | 2 | 4 | 8 | 16 | 32 | 64
---|---|---|---|---|---|---
Sampled data 1 (13 225 rows × 13 795 columns) | 2.50 | 2.49 | 2.49 | 2.47 | 2.64 | 2.69
Sampled data 2 (33 063 rows × 34 488 columns) | - | 2.54 | 2.60 | 2.76 | 2.73 | 2.75
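
The ratio reported in Tab.4 is not defined in this excerpt. One plausible reading, assumed here purely for illustration, is the time to send the raw data divided by the time to compress, send the smaller payload, and decompress. The sketch below models that trade-off; `transfer_speedup` and the bandwidth parameter are hypothetical and do not come from the paper.

```python
def transfer_speedup(size_mb, ce_percent, ct_s, uct_s, bandwidth_mb_s):
    """Ratio of raw transfer time to compress-send-decompress time.

    A value above 1 means compression shortens the end-to-end transfer.
    """
    raw_time = size_mb / bandwidth_mb_s
    compressed_mb = size_mb * (1.0 - ce_percent / 100.0)
    pipeline_time = ct_s + compressed_mb / bandwidth_mb_s + uct_s
    return raw_time / pipeline_time


# Hypothetical link and codec figures, chosen only to show the behaviour.
ideal = transfer_speedup(100.0, 50.0, 0.0, 0.0, 10.0)
with_overhead = transfer_speedup(100.0, 50.0, 1.0, 1.0, 10.0)
```

Under this model a 50% space saving can at most double the effective transfer speed, and codec time erodes that gain, which is why fast compression and decompression matter as much as the compression ratio.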
The authors have declared that no competing interests exist.