
A Stream Tilling Approach to Surface Area Estimation for Large Scale Spatial Data in a Shared Memory System

Jiping Liu, Xiaochen Kang, Chun Dong and Shenghua Xu
From the journal Open Geosciences

Abstract

Surface area estimation is a widely used tool for resource evaluation in the physical world. When processing large scale spatial data, the input/output (I/O) can easily become the bottleneck in parallelizing the algorithm due to the limited physical memory and the very slow disk transfer rate. In this paper, we proposed a stream tilling approach to surface area estimation that first decomposed a spatial data set into tiles with topological expansions. With these tiles, the one-to-one mapping relationship between the input and the computing process was broken. Then, we realized a streaming framework for scheduling the I/O processes and the computing units. Herein, each computing unit encapsulated an identical copy of the estimation algorithm, and multiple asynchronous computing units could work individually in parallel. Finally, the experiments demonstrated that our stream tilling estimation can efficiently alleviate the heavy pressure of I/O-bound work, and the measured speedups after optimization greatly outperform the directly parallelized versions in shared memory systems with multi-core processors.

1 Introduction

In the context of Spatial Big Data (SBD), various data-intensive and computation-intensive operations often place heavy pressure on the input and output (I/O). This exposes the slow I/O processes and impedes performance improvement when processing a large scale spatial data set. According to the well-known Amdahl's law, the part of the program that cannot be parallelized constrains the achievable performance gain [1]. The law states that

$$ S = \frac{1}{(1 - a) + a/n} \qquad (1) $$

where a is the proportion of the computing time that can be parallelized and n is the number of processors. Obviously, if (1 − a) equals zero, i.e., the algorithm is fully parallelized, the maximum speedup S = n is obtained. If a equals zero, i.e., the algorithm cannot be parallelized at all, the minimum speedup S = 1 is obtained. For example, if a equals 50%, the final speedup will be less than 2. As noted by Hill and Marty, more attention should be paid to alleviating the serial bottleneck when optimizing algorithms in the multi-core era [2].
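Substituting a = 0.5 into Eq. (1) makes the 50% example explicit: the serial half of the program bounds the speedup below 2 no matter how many processors are used.

$$ S = \frac{1}{(1 - 0.5) + 0.5/n} = \frac{1}{0.5 + 0.5/n} < \frac{1}{0.5} = 2 \quad \text{for any } n \geq 1. $$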

Surface area provides a more accurate estimation of land area, and there are many reasons to know the true surface area of a landscape, especially in landscape analysis [3]. For example, Bowden et al. found that ratio estimators of Mexican spotted owl population size were more precise when using a version of this surface area ratio than when using the planimetric area [4]. In the real world, the topographic surface is undulating, and the surface area of a region is related not only to the size of its projected area but also to the terrain and slope of the region; it is therefore necessary to incorporate elevation data, i.e., Digital Elevation Model (DEM) data. Measuring the surface area of land covers plays an important role in precisely monitoring land cover changes to reveal erosion and evolution, ecological transitions, the process of industrialization, micro-environmental variations and so on.

However, the computational complexity and intensity of the estimation are very high due to the heavy loads on both the CPU resources and the disk I/O processes. To boost the performance of the surface area estimation, this paper designed a novel spatial decomposition method that partitioned the vector data into spatial tiles with topological expansions. In this way, the spatial locality of the vector features was well preserved in computation. A streaming computing framework was further designed and implemented, which overlapped the I/O processes with multiple computing units encapsulating copies of the surface area estimation. Finally, two experiments were performed individually in order to systematically compare the conventional parallel approach and the proposed streaming version.

2 Background

When processing large volumes of data, the most popular way to exploit parallelism is to decompose the computational process into multiple independent sub-processes, each of which reads and processes only a small part of the entire dataset. In other words, the data parallelism can be exploited in a divide and conquer way [5]. However, when multiple sub-processes simultaneously read or write the data, the computing resources often lie in an idle state. Typically, on the widely used high-end workstations equipped with multi-core processors, developers often fail to coordinate the I/O and the computing resources. In most cases, these machines provide limited physical memory compared with the volume of the spatial dataset (from dozens of gigabytes to terabytes, or even more). Obviously, loading the entire dataset into the main memory is infeasible. At the same time, the I/O processes often consume a large proportion of the entire processing time. In brief, slow data access does not match the rapidly growing computing power, and this performance constraint has become a common problem in spatial computing.

2.1 Stream Computing

Moore’s law suggests that the performance of chips roughly doubles every two years [6]. Over the past half century, computing has, famously, increased in potency. However, power and cooling requirements have limited further increases in clock frequency, and the growth of single-core CPU performance has begun to slow down. Fortunately, the emergence of multi-core processors has to some degree delayed the end of Moore’s law. Unfortunately, hard disk speed has gained only moderate improvements in the past several years, and this leads to a huge gap between high performance computing and sequential I/O performance [7]. Although the emergence of solid-state disks and phase change memory has alleviated this problem, the gap has not been fundamentally closed. In other words, a newly designed storage system targeted at the data processing procedure is urgently demanded [8]. In such a situation, stream computing provides a new way to conquer this problem at the software level.

As suggested by Stephens [9], a stream processing system is composed of a collection of separate but communicating processes that receive stream data as input and produce stream data as output. In contrast with batch processing, stream computing views the data as a sequence of elements made available over time, allowing elements to be processed one by one rather than in large batches [10, 11], and this mode may be a good fit for modern general-purpose processors [12]. Stream computing is a potential way to solve the problem above, and four requirements should be met when designing a streaming algorithm on multi-core architectures [13, 14] (a minimal sketch illustrating requirements 1 and 4 follows the list):

  1. Sequential data access: the data should be organized and accessed as a stream of sequential data elements.

  2. Linear execution: the operations on the input data stream should be orderly organized, and driven to work in the pipeline mode.

  3. Locality: the operating range for each data element should only cover a limited set of elements, i.e., data elements located in remote areas should not be accessed.

  4. Memory recycling: the memory space consumed by each data element must be recycled and reused for the next element, so that a large dataset can be handled within the limited main memory.
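A minimal C++ sketch of requirements 1 and 4, under assumed details (the file name "huge_dataset.bin" and the 4 MB chunk size are placeholders, not part of the original work): the data is accessed strictly sequentially in fixed-size chunks, and a single buffer is recycled for every chunk, so memory consumption stays constant regardless of the dataset size.

```cpp
// Minimal sketch of sequential access with memory recycling (placeholder file
// name and chunk size): one fixed-size buffer is reused for every chunk, so
// the memory footprint does not grow with the file size.
#include <cstddef>
#include <fstream>
#include <iostream>
#include <vector>

int main() {
    const std::size_t chunkBytes = 4 * 1024 * 1024;   // 4 MB per stream element
    std::vector<char> buffer(chunkBytes);             // recycled for each chunk
    std::ifstream in("huge_dataset.bin", std::ios::binary);  // placeholder path

    long long checksum = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();      // bytes actually read
        for (std::streamsize i = 0; i < got; ++i)     // "process" the element
            checksum += static_cast<unsigned char>(buffer[i]);
    }
    std::cout << "checksum: " << checksum << "\n";
    return 0;
}
```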

2.2 Related Spatial Processing

Some related work has demonstrated the advantages of using streaming computation to exploit or create spatial coherence when processing large scale spatial data sets. Isenburg et al. used stream computing to redesign existing incremental Delaunay triangulation implementations [15]. Before triangulating the huge set of points, the space was first decomposed into small regions. Then the stream of input points was augmented with finalization tags, each of which indicated whether a point in the preceding region could still be used in the future computation. Furthermore, TIN (Triangulated Irregular Network) streaming was also used in Digital Elevation Model (DEM) generation [16]. In addition, Wu et al. further optimized the streaming triangulation by partitioning the input points into non-overlapping blocks and then triangulating each block with a divide and conquer Delaunay triangulation, instead of the incremental algorithm [13]. Guan et al. further exploited the thread-level parallelism of multi-core platforms in generating DEMs from massive airborne LiDAR point clouds [17]. Herein, the raw point clouds were partitioned into overlapping blocks; then these discrete blocks were interpolated concurrently on parallel pipelines. During the interpolation run, intermediate results were sorted and finally merged into an integrated DEM. Besides, stream computation was also used in seamless building reconstruction from huge aerial LiDAR point sets by storing the data as stream files on hard disk [18]. Kang et al. decomposed the point clouds into overlapping blocks with distance expansions, and then took advantage of multi-core computing facilities to speed up the progressive TIN densification (PTD) filter [19].

The above work focused on how to decompose a huge point dataset into an organized data stream and then process each data element continuously. The point type is the simplest category of vector data: each point only represents a spatial location without a complex shape. As a matter of fact, more sophisticated data structures are frequently used in spatial computing. In most cases, vector objects maintain not only the location feature but also arbitrary shapes and their topological relations. Therefore, using only the locations and distance relations among points is insufficient for decomposing spatial data, and the complex vector objects (such as lines and polygons) and their topological relations should also be considered in optimizing the overlay-based surface area estimation.

In our proposed method, the key lies in how to decompose the input data into an organized stream of elements and then efficiently coordinate the data accesses with the estimation computation.

3 Methods

Generally, parallelizing a spatial algorithm involves two steps: decomposing the data (spatial decomposition) and scheduling the tasks (task scheduling) [20, 21, 22].

3.1 Overlay-Based Surface Area Estimation

There are a variety of methods in the literature for measuring terrain irregularity. Hobson described some early computational methods for estimating surface area and discussed the concept of surface area ratios [23]. Beasom et al. described a method for estimating land surface ruggedness based on the intersections of sample points and contour lines [24], and Jenness described a similar method based on measuring the density of contour lines [3]. Mandelbrot described the concept of the “fractal dimension”, in which the dimension of an irregular surface lies between 2 and 3 [25], and a series of publications discussed a variety of methods for estimating the fractal dimension of a landscape [26, 27]. Surface area derived with this method would, therefore, be unduly influenced by adjacent cells. In addition, Xue et al. used a continuous TIN to calculate the surface area, and further reduced the bias in the estimation that resulted from random errors in the DEM data [28]. The above methods can provide an approximate value for a regional area from the DEM data. However, to obtain the exact value of an area enclosed by a polygon, a spatial overlay of the DEM layer and the polygon layer should be performed. Moreover, to cope with large scale datasets characterized by millions of polygons and a high precision DEM, high performance computing (HPC) must be used.

An overlay-based method was proposed in this paper, which overlaid the vector polygons with the DEM dataset and then calculated the exact surface area of each polygon from the DEM cells it covered. For the polygon layer, the calculation of the surface area contained four steps: (1) to read the coordinates that constituted a polygon and the raster cells within the bounding box of this polygon; (2) for each polygon, to find the intersecting DEM cells and calculate the partial area from each cell; (3) to sum up the partial areas to form the total area of each polygon; and (4) to repeat steps 1-3 until all the polygons had been processed. The details of calculating the surface area of a polygon covered the input, computation and output, and the algorithm is listed in Table 1.

Table 1

Overlay-based surface area estimation

1. Initialize the range variable, polygon_box, according to the polygon coordinates.
   Initialize the cells variable, box_raster, according to polygon_box.
   Calculate the vertical positions of the horizontal scanning lines within polygon_box and form scanline_list, each element of which is aligned with a horizontal line of the raster data.

2. For each pair of adjacent scanning lines in scanline_list (from top to bottom):

  1. Construct the horizontal rectangle, scanline_rect, with the two adjacent parallel scanning lines.

  2. Cut the polygon with scanline_rect; herein a group of horizontal stripping polygons is formed, i.e., scanline_polygon_list (Figure 1b).

   For each scanning-line polygon in scanline_polygon_list:

  1. Project the coordinates on the upper line to the lower line; this constitutes some triangles, rectangles and trapeziums (Figure 1c).

  2. Decompose the above triangles, rectangles and trapeziums with the vertical raster lines (Figure 1d); this further brings many small triangles and rectangles stored in region_list, each of which has three or four points.

  3. Calculate the partial area of each region in region_list by referencing the DEM data.

  4. Sum the partial areas computed from the above regions.

3. Output the result by summing all the elements in scanline_list.

Figure 1 shows the steps of overlaying a vector polygon with the corresponding DEM data set. The overlay operation is performed by repeating the above steps for all the polygons in the vector layer. For the focused row (the third line from the top), the grid regions and black regions produced by the sub-division should be included and excluded, respectively.

Figure 1: Surface area calculation (grid regions: to be added; black regions: to be subtracted)
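The paper does not spell out how the partial area of each region in region_list is computed from the DEM, so the following C++ sketch shows one plausible realization as an assumption rather than the authors' implementation: each DEM cell of an axis-aligned rectangular region is split into two 3D triangles spanned by its corner elevations, and the triangle areas are summed. The tiny demo grid and the 10 m cell size are invented for illustration.

```cpp
// Minimal sketch (not the authors' implementation): estimate the 3D surface
// area over a rectangular block of a DEM by splitting every cell into two 3D
// triangles spanned by its corner elevations and summing their areas.
#include <cmath>
#include <iostream>
#include <vector>

struct Point3 { double x, y, z; };

// Area of a 3D triangle from its three vertices (half the cross-product norm).
static double triangleArea3D(const Point3& a, const Point3& b, const Point3& c) {
    const double ux = b.x - a.x, uy = b.y - a.y, uz = b.z - a.z;
    const double vx = c.x - a.x, vy = c.y - a.y, vz = c.z - a.z;
    const double cx = uy * vz - uz * vy;
    const double cy = uz * vx - ux * vz;
    const double cz = ux * vy - uy * vx;
    return 0.5 * std::sqrt(cx * cx + cy * cy + cz * cz);
}

// Surface area over the cells [r0,r1) x [c0,c1) of a DEM stored row-major,
// where dem[r][c] is the elevation at the cell's lower-left corner.
double surfaceArea(const std::vector<std::vector<double>>& dem,
                   int r0, int r1, int c0, int c1, double cell) {
    double area = 0.0;
    for (int r = r0; r < r1; ++r) {
        for (int c = c0; c < c1; ++c) {
            const double z00 = dem[r][c],     z10 = dem[r][c + 1];
            const double z01 = dem[r + 1][c], z11 = dem[r + 1][c + 1];
            const double x = c * cell, y = r * cell;
            // Two triangles covering the cell.
            area += triangleArea3D({x, y, z00}, {x + cell, y, z10},
                                   {x, y + cell, z01});
            area += triangleArea3D({x + cell, y, z10},
                                   {x + cell, y + cell, z11},
                                   {x, y + cell, z01});
        }
    }
    return area;
}

int main() {
    // Tiny 3x3 grid of corner elevations (i.e., 2x2 cells) at 10 m resolution.
    std::vector<std::vector<double>> dem = {
        {100.0, 101.0, 102.0},
        {100.5, 101.5, 102.5},
        {101.0, 102.0, 103.0}};
    std::cout << "surface area: " << surfaceArea(dem, 0, 2, 0, 2, 10.0)
              << " m^2\n";
    return 0;
}
```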

3.2 Spatial Decomposition

Problems involving massive amounts of geometric data are ubiquitous in spatial databases [29]. To deal with a very huge dataset in an HPC environment, a strategy that partitions the data into sub-areas should first be used. Herein, each sub-area only contains a small number of geometries, and different sub-areas can then be processed simultaneously. Moreover, the geometries within the same sub-area are closely related, because these geometries often partly share the same DEM ranges. In addition, the geometries crossing the sub-area boundaries should also be handled.

3.2.1 Spatial Locality in Surface Area Estimation

Locality is a fundamental characteristic in the field of geography: a certain location is influenced by its surrounding environment, which can be theoretically understood from the first law of geography (i.e., everything is related to everything else, but close things are more related than distant things [30]). As further suggested by Yang et al., this fundamental spatial principle can be used to optimize distributed computing [31]. Herein, locality is the fundamental factor that determines whether a decomposing method is reasonable, and the computation in each partitioned domain should be provided with enough data to complete a subtask. The overlay-based surface area estimation assumes that each polygon and the DEM layer are both provided. As a matter of fact, the DEM ranges of topologically adjacent polygons overlap each other, as shown in Figure 2. Therefore, calculating adjacent polygons independently would inevitably cause repeated reading of the DEM data.

Figure 2: Different polygons with overlapping boundaries

In general, most decomposing methods partition the 2-D or 3-D space into multiple sub-spaces, and each sub-space can be processed individually. For vector data, topological relations are often the basis for data processing, and adjacent vector objects are often closely related and should thus be grouped into the same subsets. To partition the spatial domain covered by vector objects, various decomposing methods have been used. Specifically, the related methods can be divided into four classes [32]: (1) approaches based on minimum bounding rectangles, such as the B-tree [33] and R-tree [34]; (2) disjoint decompositions, such as the R+-tree [35] and cell tree [36]; (3) uniform grid approaches [37]; and (4) quadtree-based approaches [38]. The first two classes of approaches depend more on the data and have been widely used in spatial queries, but they are not applicable to grouping and spatial operations such as the union operation [32]. Instead, the last two classes of methods can be used in set-theoretic operations, such as union, intersection and difference. The third class of methods uses a regular grid to decompose the spatial domain and thus produces a series of equal-sized blocks. The fourth class of methods continuously partitions the space into four equal quadrants until each quadrant contains only a certain amount of geometries. In comparison, the uniform grid decomposition is easier to implement and enables us to focus on handling the boundary geometries. More importantly, the produced equal-sized blocks are more suitable for a streaming framework.
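The uniform grid decomposition can be illustrated with a short C++ sketch (the helper names, grid origin and demo bounding box are assumptions, not the authors' code): a polygon's bounding box is mapped to the range of equal-sized tiles it overlaps, so the polygon can be assigned to every tile it touches.

```cpp
// Minimal sketch of uniform-grid decomposition (assumed helper, not the
// authors' code): map a polygon's bounding box to the equal-sized tiles it
// intersects, so the polygon can be assigned to every overlapping tile.
#include <cmath>
#include <iostream>
#include <vector>

struct BoundingBox { double minX, minY, maxX, maxY; };
struct TileIndex   { int col, row; };

// Return the indices of all tiles of size `tileSize` (same unit as the
// coordinates) that a bounding box overlaps, relative to a grid origin.
std::vector<TileIndex> overlappingTiles(const BoundingBox& box,
                                        double originX, double originY,
                                        double tileSize) {
    const int c0 = static_cast<int>(std::floor((box.minX - originX) / tileSize));
    const int c1 = static_cast<int>(std::floor((box.maxX - originX) / tileSize));
    const int r0 = static_cast<int>(std::floor((box.minY - originY) / tileSize));
    const int r1 = static_cast<int>(std::floor((box.maxY - originY) / tileSize));
    std::vector<TileIndex> tiles;
    for (int r = r0; r <= r1; ++r)
        for (int c = c0; c <= c1; ++c)
            tiles.push_back({c, r});
    return tiles;
}

int main() {
    // A polygon whose bounding box straddles two 5000 m x 5000 m tiles.
    BoundingBox box{4200.0, 1200.0, 6100.0, 2300.0};
    for (const TileIndex& t : overlappingTiles(box, 0.0, 0.0, 5000.0))
        std::cout << "tile (" << t.col << ", " << t.row << ")\n";
    return 0;
}
```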

When the data were decomposed into numerous tiles, a low physical memory footprint could be maintained. In other words, processing the large scale data set became feasible within the limited memory space of a commodity workstation. In addition, if different spatial tiles had no mutual dependence, multiple computing processes could be executed asynchronously in parallel. Therefore, there was no need for the processes to wait for each other.

3.2.2 Topology-Preserving Assignment Based on Spatial Index

For spatial operations based on topological relationships, the objects located at the boundaries need special handling, i.e., boundary object handling. In spatial join queries, the commonly used methods are Multiple Assignment (MA) and Multiple Matching (MM) [39, 40]. MA assigns a boundary-crossing object to every decomposing domain related to this object, while MM assigns the object to only one domain, so that one object may still take part in matches across multiple domains in a spatial join [41]. Obviously, MA can be easily used in spatial join queries, but the locality might be destroyed; therefore, the matching relations should first be established in stream processing in order to ensure the integrity of the data.

In the proposed approach, the domain was first decomposed into a series of adjacent tiles, and each tile corresponded to a block of DEM data. Then, an R-tree spatial index was used to accelerate the search for the geometries within a tile range, and an open source spatial indexing package called libspatialindex (http://libspatialindex.github.io/) was integrated. For each tile range bounded by the DEM block, a spatial range query could quickly locate the intersecting polygons. All the resulting polygons from the query should be recorded, as shown in Figure 3. Herein, MA was adopted to expand the data tiles, and these expansions could preserve the topological locality of the objects near the tile boundaries. Figure 3 shows four decomposing domains (P1, P2, P3, and P4). To build the matching relations, each object was assigned to at least one tile. For example, the object b was simultaneously assigned to tiles P1 and P4, while the object f was simultaneously assigned to tiles P2, P3 and P4. In summary, there were four groups of objects: P1 (a, b, c, d), P2 (d, e, f), P3 (f, g), P4 (b, c, h, i).

Figure 3: Topological expansion

The polygons topologically contained in a tile could be calculated directly, such as polygon a located in tile P1. Otherwise, only the partial area topologically within each tile should be considered. For the polygons located on tile boundaries, each tile could only be used to calculate a part of the corresponding area, such as polygon b. Therefore, the surface area of polygon b should be obtained by summing the partial areas computed in tiles P1 and P4.
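The summation of partial areas by polygon key, which the writer described in Section 3.3 performs, can be sketched as follows (the polygon identifiers and area values are invented for illustration):

```cpp
// Minimal sketch (illustrative values, not the authors' data): accumulate the
// partial surface areas that different tiles report for the same polygon key,
// so boundary-crossing polygons end up with their full area.
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

int main() {
    // (polygon id, partial area) pairs as emitted by the computing units.
    std::vector<std::pair<std::string, double>> tileResults = {
        {"a", 1250.0},          // polygon a: fully inside tile P1
        {"b", 830.0},           // polygon b: part computed in tile P1
        {"b", 415.0},           // polygon b: part computed in tile P4
    };

    std::unordered_map<std::string, double> totalArea;
    for (const auto& kv : tileResults)
        totalArea[kv.first] += kv.second;  // sum partial areas per polygon key

    for (const auto& kv : totalArea)
        std::cout << "polygon " << kv.first << ": " << kv.second << " m^2\n";
    return 0;
}
```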

3.3 Streaming Schedule Framework

With the above decomposing method, a streaming schedule algorithm was designed that could overlap the computing processes with the I/O accesses. The streaming schedule process involved three modules: a reader, a writer, and multiple computing units that could be executed simultaneously. The reader took charge of continuously reading data from the data source and organizing it into a stream of data tiles. These tiles were then transferred to the subsequent computing units for processing. After that, the result for each tile was directed to the writer and then dumped to disk. The whole process worked in a fixed order, i.e., reader → computing units → writer, as shown in Figure 4.

Figure 4: Streaming schedule model

With the above decomposing method, each tile corresponded to a block of DEM data and a group of polygons. The DEM data and the polygons were considered as the input of the streaming schedule framework, and the overlay-based surface area estimation was encapsulated into a series of computing units. Each polygon completely contained in a tile could be used for calculation independently, while the boundary-crossing polygons would be used by multiple tiles. Each computing unit maintained a list of results in the form of key-value pairs, which labelled each polygon and its partial or entire area in the outputting process. In the computation, the reader played the role of producer: it took charge of producing data tiles and pushing them into the input data queue, which worked in a First In First Out (FIFO) manner. Each computing unit played the double role of both consumer and producer: (1) getting the data elements from the input queue and processing each one, and (2) pushing the tile results into the output queue. Finally, the writer played the role of consumer: it took charge of summing the results according to the polygon keys. Before executing the procedure, the entire dataset should be regularly partitioned and organized into a continuous data stream.
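A minimal C++17 sketch of the producer-(consumer/producer)-consumer schedule described above (the Tile and Result structs, the dummy per-tile work and the tile and unit counts are assumptions; the bounded queue capacity of 2 mirrors the setting reported in Section 4.2):

```cpp
// Minimal C++17 sketch of the reader -> computing units -> writer schedule
// (assumed simplification: Tile/Result are placeholder structs and the per-tile
// "work" is a dummy formula; the queue capacity of 2 mirrors Section 4.2).
#include <condition_variable>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class BoundedQueue {                        // FIFO queue shared by the stages
public:
    explicit BoundedQueue(std::size_t cap) : cap_(cap) {}
    void push(std::optional<T> item) {      // std::nullopt marks end of stream
        std::unique_lock<std::mutex> lk(m_);
        notFull_.wait(lk, [&] { return q_.size() < cap_; });
        q_.push(std::move(item));
        notEmpty_.notify_one();
    }
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lk(m_);
        notEmpty_.wait(lk, [&] { return !q_.empty(); });
        std::optional<T> item = std::move(q_.front());
        q_.pop();
        notFull_.notify_one();
        return item;
    }
private:
    std::queue<std::optional<T>> q_;
    std::size_t cap_;
    std::mutex m_;
    std::condition_variable notEmpty_, notFull_;
};

struct Tile   { int id; };                  // stands in for a DEM block + polygons
struct Result { int id; double area; };

int main() {
    const int numTiles = 8, numUnits = 3;
    BoundedQueue<Tile>   input(2);          // reader -> computing units
    BoundedQueue<Result> output(2);         // computing units -> writer

    std::thread reader([&] {                // producer: a stream of data tiles
        for (int i = 0; i < numTiles; ++i) input.push(Tile{i});
        for (int u = 0; u < numUnits; ++u) input.push(std::nullopt);
    });

    std::vector<std::thread> units;         // consumer/producer stages
    for (int u = 0; u < numUnits; ++u)
        units.emplace_back([&] {
            while (auto tile = input.pop()) // stop on the end-of-stream marker
                output.push(Result{tile->id, 100.0 * tile->id});  // dummy work
        });

    std::thread writer([&] {                // consumer: collects tile results
        int finishedMarkers = 0;
        while (finishedMarkers < numUnits) {
            auto res = output.pop();
            if (!res) { ++finishedMarkers; continue; }
            std::cout << "tile " << res->id << " -> area " << res->area << "\n";
        }
    });

    reader.join();
    for (auto& t : units) t.join();
    // All real results are queued once the units have joined, so the writer can
    // now be told to stop with one end-of-stream marker per computing unit.
    for (int u = 0; u < numUnits; ++u) output.push(std::nullopt);
    writer.join();
    return 0;
}
```

The bounded queues keep the reader only a couple of tiles ahead of the computing units, so the I/O is overlapped with the computation while the memory footprint stays limited.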

4 Results and Discussion

The overlay-based surface area estimation has been widely used in the Geographical Conditions Monitoring Project [42] in China. About 260 million land cover polygons that completely cover the country are calculated against the 10-meter DEM data once a year, and this huge time cost needs to be reduced. For example, in a cluster computing environment with 30 computers, each computer would consume about 13 days with the serial estimation. In other words, the accumulated processing time would be more than 400 days. To validate the proposed method, we conducted two individual experiments in two different environments to investigate the adaptability, the performance improvement and the memory utilization. The first experiment measured the adaptability of the approach on polygons with extreme heterogeneity, while the second one tested the performance on a very large dataset. In the near future, we will also explore ways to adapt the approach to a cluster environment.

4.1 Experiment on the Adaptability

4.1.1 Experiment Environment and Dataset

The first experiment was conducted on a workstation running Microsoft Windows 7 (x64) with a six-core Intel Xeon E5-1650 (3.50 GHz per core), 16 GB of DDR4-2133 RAM and a 1 TB hard disk. Moreover, hyper-threading technology made each physical core appear as two logical processors, and OpenMP was used for thread-level parallelism. The program was compiled with the Intel C++ compiler.

An extremely complex vector dataset from Huazhou City, Guangdong Province, China (Figure 5) was selected. The data layer covered about 2356 square kilometers, and many kinds of land parcels, such as farm lands, garden lands, grass lands, buildings and roads, were randomly distributed in the area. The total number of these parcels was 198,002. In terms of geometry, most of these parcels were very complex polygons with distorted boundaries and inner holes. In terms of spatial distribution, these parcels were extremely fragmented and skewed. To support the surface area estimation, a DEM dataset with 10-meter resolution was used.

Figure 5: Land cover parcels from Huazhou City, Guangdong Province, China

4.1.2 Results

In this experiment, we compared the streaming version and the traditional parallel version of the algorithm on each class of land parcels. To test the adaptability, the tile size was set to 5000 m × 5000 m and all 12 threads were used. Thus, each tile corresponded to a block of DEM data with 500 rows × 500 columns. To ensure accuracy, each observed value was taken as the mean of three repeated tests. Table 2 shows the computing time for the different classes of land parcels.

Table 2

Computing time for different classes of land parcel (s)

Classification | Non-streaming serial (s) | Non-streaming parallel (s) | Speedup | Streaming serial (s) | Streaming parallel (s) | Speedup | Serial improvement | Parallel improvement | Total speedup
1 Farm lands | 287.76 | 90.46 | 3.18 | 156.67 | 43.71 | 3.58 | 45.56% | 51.68% | 6.58
2 Garden lands | 201.70 | 62.86 | 3.21 | 122.43 | 34.05 | 3.60 | 39.30% | 45.83% | 5.92
3 Wood lands | 799.25 | 195.85 | 4.08 | 328.00 | 126.50 | 2.59 | 58.96% | 35.41% | 6.32
4 Grass lands | 126.06 | 51.3 | 2.46 | 77.39 | 29.82 | 2.60 | 38.61% | 41.87% | 4.23
5 Buildings | 166.56 | 107.06 | 1.56 | 88.82 | 23.29 | 3.81 | 46.67% | 78.25% | 7.15
6 Roads | 535.79 | 96.97 | 5.53 | 351.04 | 63.15 | 5.56 | 34.48% | 34.88% | 8.48
7 Constructed bodies | 44.07 | 26.95 | 1.64 | 27.65 | 20.58 | 1.34 | 37.26% | 23.64% | 2.14
8 Digging lands | 8.92 | 4.87 | 1.83 | 8.76 | 2.66 | 3.29 | 1.79% | 45.38% | 3.35
9 Desert lands | 2.91 | 1.92 | 1.52 | 2.82 | 1.88 | 1.50 | 3.09% | 2.08% | 1.55
10 Waters | 265.75 | 98.71 | 2.69 | 148.48 | 25.16 | 5.90 | 44.13% | 74.51% | 10.56
Total/Average | 2,438.77 | 736.95 | 3.31 | 1,312.06 | 370.80 | 3.54 | 46.20% | 49.68% | 6.58

Herein, the speedup for each class of land parcels in the non-streaming and streaming processing was calculated as the ratio of the serial time to the parallel time, while the efficiency improvement was calculated as the ratio of the saved time to the time used in the non-streaming processing. Furthermore, the total speedup was calculated as the ratio of the serial time in the non-streaming manner to the parallel time in the streaming manner. Obviously, in the non-streaming manner, the parallel efficiencies for the different classes of land parcels were different, which was mainly due to the differences in polygon complexity. As the wood lands, roads and waters usually have arbitrary shapes constituted by very dense coordinates, their processing time was mainly consumed by the computation instead of the I/O processes. Therefore, a higher speedup ratio could easily be obtained in a parallel environment. In comparison, the buildings, constructed bodies, desert lands and other land parcels were often simple polygons with sparse coordinates. In the non-streaming manner, the lower computational loads brought very limited efficiency improvements through parallelization, considering the huge I/O consumption. In the streaming manner, however, the I/O cost could be further reduced by overlapping it with the computation. Therefore, a further improvement could be realized. In addition, a comprehensive test over all the land parcels was also conducted to measure the entire efficiency in the streaming manner, as shown in Table 3.

Table 3

Computing time for the entire dataset in streaming manner (s)

Non-streaming serial (s) | Non-streaming parallel (s) | Speedup | Threads | Streaming parallel (s) | Speedup | Efficiency improvement
2,438.77 | 736.95 | 3.31 | 1 | 1,285.53 | 1.90 | 47.3%
– | – | – | 2 | 894.81 | 2.73 | –
– | – | – | 4 | 613.36 | 3.98 | –
– | – | – | 6 | 463.67 | 5.26 | –
– | – | – | 12 | 316.29 | 7.71 | 57.1%

When all the land parcels were merged into a complete polygon layer, the spatial decomposition could be accomplished once for all the land parcels. In this situation, repeated I/O could be further reduced. Therefore, a nearly ideal speedup was realized when only a few threads were allocated, as shown in Table 3. Physically, the overlapping of CPU and I/O also exploited the parallelism in the computation. In comparison with the non-streaming serial version, the time used could be reduced by more than 47 percent, while the streaming parallel version could reduce the time by more than 57 percent in comparison with the non-streaming parallel version.

4.2 Experiment on the Performance

4.2.1 Experiment Environment and Dataset

The second experiment was conducted on a workstation running Microsoft Windows 7 (x64) with two six-core Intel Xeon E5645 processors (2.40 GHz per core), 64 GB of DDR3-1333 RAM and a 2 TB hard disk. The test data consisted of the vector slope data and the DEM data of Zhuzhou City, Hunan Province, China (Figure 6), whose extent ranged from 112.962°E to 113.359°E in longitude and from 27.207°N to 28.028°N in latitude. The polygon layer contained 8,933,891 polygons. The DEM data was resampled from 10 meters to 1 meter in order to expose the I/O bottleneck, and the column and row counts of the DEM dataset were 79,375 and 167,613, respectively, occupying about 49.5 GB. In Figure 6, the left panel shows the overlaid layers, and the right panel is an enlarged view of a small window of the left one. In general, most polygons had very complex shapes, and a river even crossed the entire range.

Figure 6: Vector slope polygons from Zhuzhou City, Hunan Province, China

4.2.2 Results

From the perspective of time and memory usage, this test compared the streaming version and the traditional parallel version of the algorithm. Table 4 shows the time usage for tiles of different sizes, and Table 5 shows the corresponding physical memory usage. According to Table 4, in the non-streaming manner the speedup of the parallel version was about 6.8 at most. Due to the low efficiency of the I/O operations, the observed speedup could hardly be improved any further.

Table 4

Computing time comparison (s)

Non-streaming serial (s) | Threads | Non-streaming parallel (s) | Speedup | Tile size | Streaming serial (s) | Streaming parallel (s) | Speedup | Serial improvement | Parallel improvement | Total speedup
102733 | 2 | 61349 | 1.67 | 200^2 | 48226 | 8817 | 5.47 | 53.1% | 41.5% | 11.65
– | 3 | 46012 | 2.23 | 500^2 | 49451 | 8778 | 5.63 | 51.9% | 41.7% | 11.70
– | 6 | 25456 | 4.04 | 1000^2 | 48225 | 8903 | 5.42 | 53.1% | 40.9% | 11.54
– | 12 | 15242 | 6.74 | 2000^2 | 48374 | 8117 | 5.96 | 52.9% | 46.1% | 12.66
– | 24 | 15060 | 6.82 | 5000^2 | 50938 | 8444 | 6.03 | 50.4% | 43.9% | 12.17

Table 5

Memory used in computation (MB)

Non-streaming serial | Non-streaming parallel | Tile size | Streaming serial | Streaming parallel
2357 | 16391 | 200^2 | 318 | 433
– | – | 500^2 | 164 | 335
– | – | 1000^2 | 171 | 486
– | – | 2000^2 | 218 | 933
– | – | 5000^2 | 547 | 4296
– | – | 10000^2 | 722 | 16235

Under the same hardware environment, the streaming version of the algorithm achieved obviously better results. With the proposed streaming schedule, both a serial version and a parallel version were realized and tested. For the serial version, the streaming engine allocated three independent threads to take charge of reading the data, calculating the surface area, and writing (summing) the results. Herein, the size of the schedule queue was set to 2 in order to avoid waiting between the CPU and the I/O resources. Physically, the overlapping of CPU and I/O also exploited the parallelism in the computation. In comparison with the non-streaming serial version, the time used could be reduced by about 50 percent, while the streaming parallel version could reduce the time by about 40 percent in comparison with the non-streaming parallel version.

A further test was conducted to analyze the data throughput in the two manners; Table 6 shows the reduction in disk reading. In comparison, the amount of data accessed in the non-streaming manner was many times the size of the input DEM dataset, while in the streaming manner the accessed data was only slightly larger than the input DEM data. In addition, the amount of data accessed showed a decreasing trend as the tile size increased. On the whole, a very low disk throughput was observed in the streaming manner, and this large reduction in data reading contributed to the significant improvement in efficiency shown in Table 4.

Table 6

Disk reading reduction (GB)

Non-streaming serial | Non-streaming parallel | Block size | Streaming serial | Reduction | Streaming parallel | Reduction
320.01 | 319.55 | 200^2 | 56.61 | 82.31% | 57.45 | 82.02%
– | – | 500^2 | 53.37 | 83.32% | 52.49 | 83.57%
– | – | 1000^2 | 51.62 | 83.87% | 51.88 | 83.76%
– | – | 2000^2 | 50.39 | 84.25% | 51.03 | 84.03%
– | – | 5000^2 | 49.56 | 84.51% | 49.37 | 84.55%

4.3 Discussions

According to the results, non-streaming parallelism provides an effective way of boosting the performance of the surface area estimation. Comparing the different hardware environments, it can be concluded that more computing threads contribute to a better performance improvement. However, the speedup obtained in a multi-core environment is very limited due to the intractable I/O consumption; the deeper reason lies in the slow I/O accesses. Moreover, this problem gets worse because of the random distribution of the irregular polygons, which easily incurs a heterogeneous distribution of the computational loads across polygons and repeated accesses to the same ranges of the DEM dataset. In addition, huge memory consumption and even allocation failures may occur when processing large scale datasets. On the whole, the slow I/O processes make up a large portion of the total time, which is consistent with Amdahl's law.

To solve these problems in a streaming manner, the experimental datasets were first decomposed into tiles, and the topological relations between the tile boundaries and the polygons were built. Herein, each tile range corresponded to a small block of DEM data and a group of intersecting polygons, which together constituted the input of a computing unit. Through a streaming schedule based on the producer-(consumer/producer)-consumer pattern, the processing time was reduced sharply in comparison with the non-streaming approach. Moreover, the parallel efficiency gained a further improvement, and the speedup was nearly double that of the non-streaming manner. In addition, the experimental results also indicated that an adjustable tile size could keep the memory usage very low in the streaming manner. With an increase in the tile size, the scheduling costs could be reduced and the speedup would increase accordingly; however, the memory usage would also increase sharply. Therefore, an appropriate tile size should be set by referencing the physical memory space of the workstation.

According to the test results, using the streaming schedule to optimize the surface area estimation has three obvious advantages. First, better adaptability can be achieved when processing complex polygons in different hardware environments. Second, the streaming estimation has a more satisfactory serial efficiency, parallel efficiency and total speedup in comparison with the non-streaming manner. Third, the otherwise very high memory usage in computation can be reduced to an affordable level.

5 Conclusions

In the field of GIS, surface area is a widely used measure for describing various kinds of geographical phenomena. However, the extremely high computational complexity has impeded the wide application of this measure. In this paper, a stream tilling approach, which involves a decomposing strategy that preserves spatial locality and a streaming framework that overlaps the computation with the I/O accesses, was designed and implemented. The overlay-based surface area estimation was then transformed into a streaming version, which can well coordinate the input data stream, the computation and the output data stream. According to the experiments, the streaming estimation has a more satisfactory computing efficiency in comparison with the non-streaming version. At the same time, the memory usage in computation is very low, so the approach can be used for processing large scale spatial data in a shared memory system.

Amdahl’s law governs the speedup of using parallel processors on a problem versus using only one serial processor. The proposed stream tilling approach presents a novel way of conquering such data-intensive applications. There is no doubt that other spatial analysis algorithms will also benefit from the proposed approach in the near future.

Author Contributions: All the authors contributed extensively to the work presented in this paper. Jiping Liu and Xiaochen Kang made the main contributions to the programming, performing the experiments and writing the manuscript. Chun Dong and Shenghua Xu revised the paper extensively.

Acknowledgement

This work was financially supported by the National Natural Science Foundation of China (Grant No. 41701461) and the National Key Research and Development Program of China (Grant No. 2016YFC0803100). We thank Haowen Yan for the English correction, and we also thank the editors and the anonymous reviewers for their insightful comments, which have helped to improve the quality of the paper.

References

[1] Gustafson J. L., Reevaluating Amdahl's law. Communications of the ACM, 1988, 31(5), 532-533. DOI: 10.1145/42411.42415

[2] Hill M. D., Marty M. R., Amdahl's law in the multicore era. Computer, 2008(7), 33-38. DOI: 10.1109/HPCA.2008.4658638

[3] Jenness J. S., Calculating landscape surface area from digital elevation models. Wildlife Society Bulletin, 2004, 32(3), 829-839. DOI: 10.2193/0091-7648(2004)032[0829:CLSAFD]2.0.CO;2

[4] Bowden D. C., White G. C., Franklin A. B., Ganey J. L., Estimating population size with correlated sampling unit estimates. The Journal of Wildlife Management, 2003, 67(1), 1-10. DOI: 10.2307/3803055

[5] Gordon M. I., Thies W., Amarasinghe S., Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. ACM SIGOPS Operating Systems Review, 2006, 40(5), 151-162. DOI: 10.1145/1168917.1168877

[6] Moore G. E., Cramming more components onto integrated circuits. In: Hill M. D., Jouppi N. P., Sohi G. S. (Eds.), Readings in Computer Architecture. Morgan Kaufmann Publishers, San Francisco, California, 2000, 56-59

[7] Chen C. P., Zhang C. Y., Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 2014, 275(11), 314-347. DOI: 10.1016/j.ins.2014.01.015

[8] Labrinidis A., Jagadish H., Challenges and opportunities with big data. Proceedings of the VLDB Endowment, 2012, 5(12), 2032-2033. DOI: 10.14778/2367502.2367572

[9] Stephens R., A survey of stream processing. Acta Informatica, 1997, 34(7), 491-541. DOI: 10.1007/s002360050095

[10] Feigenbaum J., Kannan S., Strauss M. J., Viswanathan M., An approximate L1-difference algorithm for massive data streams. SIAM Journal on Computing, 2002, 32(1), 131-151. DOI: 10.1109/SFFCS.1999.814623

[11] Neumeyer L., Robbins B., Nair A., Kesari A., S4: Distributed stream computing platform. In: Fan W., Hsu W., Webb G. I., Liu B., Zhang C. Q., Gunopulos D., Wu X. D. (Eds.), Data Mining Workshops (ICDMW), 2010 IEEE International Conference on, Sydney, Australia, 2010, 170-177

[12] Gummaraju J., Rosenblum M., Stream processing in general purpose processors. In: Mukherjee S. (Ed.), Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 04'), Boston, MA, USA, 2004, 9-13

[13] Wu H., Guan X., Gong J., ParaStream: A parallel streaming Delaunay triangulation algorithm for LiDAR points on multicore architectures. Computers & Geosciences, 2011, 37(9), 1355-1363. DOI: 10.1016/j.cageo.2011.01.008

[14] Guan X., Liesmars H. W., Lin L. L., A Parallel Framework for Processing Massive Spatial Data with a Split-and-Merge Paradigm. Transactions in GIS, 2012, 16(6), 829-843. DOI: 10.1111/j.1467-9671.2012.01347.x

[15] Isenburg M., Liu Y., Shewchuk J., Snoeyink J., Streaming computation of Delaunay triangulations. ACM Transactions on Graphics (TOG), 2006, 25(3), 1049-1056. DOI: 10.1145/1179352.1141992

[16] Isenburg M., Liu Y., Shewchuk J., Snoeyink J., Thirion T., Streaming computation of Delaunay triangulations. In: Raubal M., Miller H. J., Frank A. U., Goodchild M. F. (Eds.), 4th International Conference, GIScience 2006, Münster, Germany, 2006, 186-198. DOI: 10.1145/1179352.1141992

[17] Guan X., Wu H., Leveraging the power of multi-core platforms for large-scale geospatial data processing: Exemplified by generating DEM from massive LiDAR point clouds. Computers & Geosciences, 2010, 36(10), 1276-1282. DOI: 10.1016/j.cageo.2009.12.008

[18] Zhou Q. Y., Neumann U., A streaming framework for seamless building reconstruction from large-scale aerial lidar data. In: Flynn P., Mortensen E. (Eds.), Computer Vision and Pattern Recognition, 2009, Miami, FL, USA, 2009, 2759-2766. DOI: 10.1109/CVPR.2009.5206760

[19] Kang X., Liu J., Lin X., Streaming Progressive TIN Densification Filter for Airborne LiDAR Point Clouds Using Multi-Core Architectures. Remote Sensing, 2014, 6(8), 7212-7232. DOI: 10.3390/rs6087212

[20] Wang S., Armstrong M. P., A theoretical approach to the use of cyberinfrastructure in geographical analysis. International Journal of Geographical Information Science, 2009, 23(2), 169-193. DOI: 10.1080/13658810801918509

[21] Wang S., A CyberGIS framework for the synthesis of cyberinfrastructure, GIS, and spatial analysis. Annals of the Association of American Geographers, 2010, 100(3), 535-557. DOI: 10.1080/00045601003791243

[22] Wang S., Anselin L., Bhaduri B., Crosby C., Goodchild M. F., Liu Y., Nyerges T. L., CyberGIS software: a synthetic review and integration roadmap. International Journal of Geographical Information Science, 2013, 27(11), 2122-2145. DOI: 10.1080/13658816.2013.776049

[23] Hobson R. D., Surface roughness in topography: a quantitative approach. Anesthesia and Analgesia, 1972, 90(3), 51-64. DOI: 10.4324/9780429273346-8

[24] Beasom S. L., Wiggers E. P., Giardino J. R., A technique for assessing land surface ruggedness. Journal of Wildlife Management, 1983, 47(4), 1163-1166. DOI: 10.2307/3808184

[25] Mandelbrot B. B., The Fractal Geometry of Nature. W. H. Freeman, New York, NY, USA, 1983. DOI: 10.1119/1.13295

[26] Polidori L., Description of terrain as a fractal surface, and application to digital elevation model quality assessment. Photogrammetric Engineering & Remote Sensing, 1991, 57(10), 1329-1332

[27] Lorimer N. D., Haight R. G., Leary R. A., The fractal forest: fractal geometry and applications in forest science. General Technical Report NC-170, US Department of Agriculture, Forest Service, North Central Forest Experiment Station, St. Paul, MN, USA, 1994

[28] Xue S., Dang Y., Liu J., Mi J., Dong C., Cheng Y., Wang X. Q., Wan J., Bias estimation and correction for triangle-based surface area calculations. International Journal of Geographical Information Science, 2016, 30(11), 2155-2170. DOI: 10.1080/13658816.2016.1162795

[29] Vitter J. S., External memory algorithms and data structures: Dealing with massive data. ACM Computing Surveys (CSUR), 2001, 33(2), 209-271. DOI: 10.1145/384192.384193

[30] Tobler W. R., A computer movie simulating urban growth in the Detroit region. Economic Geography, 1970, 234-240. DOI: 10.2307/143141

[31] Yang C., Wu H., Huang Q., Li Z., Li J., Using spatial principles to optimize distributed computing for enabling the physical science discoveries. Proceedings of the National Academy of Sciences, 2011, 108(14), 5498-5503. DOI: 10.1073/pnas.0909315108

[32] Samet H., Spatial data structures. In: Kim W. (Ed.), Modern Database Systems. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1995, 361-385

[33] Comer D., Ubiquitous B-tree. ACM Computing Surveys (CSUR), 1979, 11(2), 121-137. DOI: 10.1145/356770.356776

[34] Guttman A., R-Trees: A Dynamic Index Structure for Spatial Searching. In: Proceedings of the International Conference on Management of Data, Boston, MA, USA, 1984, 47-57

[35] Sellis T., Roussopoulos N., Faloutsos C., The R+-tree: A dynamic index for multi-dimensional objects. In: Stocker P. M., Kent W., Hammersley P. (Eds.), Proceedings of the 13th International Conference on Very Large Databases, Brighton, UK, 1987, 507-518

[36] Günther O., Efficient Structures for Geometric Data Management. Springer, Berlin, Heidelberg, 1988. DOI: 10.1007/BFb0046097

[37] Franklin W. R., Part 4: Mathematical, Algorithmic and Data Structure Issues: Adaptive Grids for Geometric Operations. Cartographica: The International Journal for Geographic Information and Geovisualization, 1984, 21(2-3), 160-167. DOI: 10.3138/Q722-7681-3K17-JR08

[38] Samet H., Webber R. E., Storing a collection of polygons using quadtrees. ACM Transactions on Graphics (TOG), 1985, 4(3), 182-222. DOI: 10.1145/282957.282966

[39] Lo M. L., Ravishankar C. V., Spatial hash-joins. ACM SIGMOD Record, 1996, 25(2), 247-258. DOI: 10.1145/233269.233337

[40] Zhou X., Abel D. J., Truffet D., Data partitioning for parallel spatial join processing. GeoInformatica, 1998, 2(2), 175-204. DOI: 10.1007/3-540-63238-7_30

[41] Aji A., Wang F., Vo H., Lee R., Liu Q., Zhang X., Saltz J., Hadoop GIS: a high performance spatial data warehousing system over MapReduce. Proceedings of the VLDB Endowment, Riva del Garda, Italy, 2013, 1009-1020. DOI: 10.14778/2536222.2536227

[42] Zhang J., Li W., Zhai L., Understanding geographical conditions monitoring: a perspective from China. International Journal of Digital Earth, 2015, 8(1), 38-57. DOI: 10.1080/17538947.2013.846418

Published Online: 2017-12-13

© 2017 J. Liu et al.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
