Abstract
The design of compression codes for LIDAR sensors faces many challenges that are related to the large amounts of data that are generated, the high rates of data acquisition, the dynamic structure of the data streams, the need to provide rapid access to subsets without decoding the whole collection, varying requirements on data fidelity, the need to perform multiple levels of processing to derive the end-user data stream and the desire to integrate compression into different processing levels. The wide range and difficulty of the challenges and the computational challenges implied by the size and data rates associated with the collections speak for the use of technology that is modular and flexible as well as founded on fundamental information theory and practical real-world success in other challenging applications. In this proposal we describe a research program that is based on a number of experimental results in our laboratory that show great potential for LIDAR compression, where we have achieved a compression ratio of more than 100:1 for point cloud elevation data when properly framed and then encoded using prediction and transformation algorithms, quantization and lossless coding. This proposed research is intended to systematically explore the multitude of compression options within a general structure that uses these techniques, to develop a prototype compression/decompression system, and to lay the foundation for standards-based LIDAR compression techniques that will support a broad range of applications. The development process follows a cyclic rapid-prototyping and application-oriented evaluation methodology marked by frequent workshops with the sponsor, developers and application experts. Each workshop will include application experts from areas such as path/route planning, line-of-sight analysis, detection of vertical obstructions, cross-country trafficability (vehicles) analysis, and helicopter landing zones/point identification and will integrate their perspectives into testing and evaluation process and planning of application dependent functionality. Our experimental results and analysis indicates that the prospects for achieving the research goals of this topic are excellent.
Research Plan
Objectives
The objectives of this research are to study the underlying structure of LIDAR data from an informational perspective, to determine a natural structure that is amenable to compression algorithms, to construct mathematical models of those structures and to implement and test a complete encoding and decoding system with real LIDAR data. The objectives also include an assessment of usage factors that are important in applications so that the overall structure of the encoding and decoding system can accommodate them. An example is the ability to extract subsets of the data based on location without the need to decompress large amounts of data.
Current compression approaches in the literature use approaches that are not likely to be fully successful. The most successful approach appears to be that of Martin Isenburg and Jonathan Shewchuk who have developed a suite of LIDAR tools that are available on the Internet [1]. Included in their tools is a lossless compression program that can achieve approximately 11:1 compression. They have also published papers on computation of elevation models from streaming data, which would be an initial step toward a compression algorithm based on encoding of surfaces.
An approach to LIDAR compression based on representing surfaces is also described in the work of Pradhan et. al. [2, 3]. Wavelet encoding is used in this work to represent various LIDAR surfaces using triangular irregular networks.
There are basic problems, although different, with both the lossless compression and surface representation approaches. Lossless compression is good in that no information is lost but at the cost of limited compression power. The powerful tools of rate-distortion theory provide a means to achieve much greater compression while controlling the type and amount of distortion. By employing these tools we can achieve many multiples of compression over lossless coding while still maintaining the utility of the data. The approaches based on surface representation have the problem that the surface has to be first represented with a fairly complex model and then the surface is described by some kind of transform such as a wavelet. Not all data of interest is elevation or intensity data, which presents the need for another type of representation, as in the need to represent multiple returns. Furthermore, the method is inherently incapable of being extended to real-time operation.
Our proposed research is based on the natural sequential structure that is provided by the scanning operation. It enables us to divide the data into frames that can be encoded using transform coding and predictive coding combined with lossless coding. These techniques are well-established in video compression, which also has high-volume and complex data. There is little that can be transferred from video coding in any specific way, but we can apply the inspiration and the principles. We have constructed prototypes of the critical frame-based encoding elements and have found that very high compression ratios are possible. It is possible to provide a smooth rate-distortion trade-off that can be under user control. This proposal is to develop the theoretical structure, prototype implementation, testing and evaluation, and, finally, a working encoding & decoding algorithm together with a proposed format description.
The natural sequential structure can be directly extracted from LIDAR data that is arranged by flightline. For point clouds that have been constructed by combining several flightlines the scanning pattern can be extracted from point timing information or, absent even that, from embedded patterns in the point coordinates.
Anticipated Results
We expect to deliver a fully functional prototype of a LIDAR compression system and a proposed format for compressed LIDAR data as a part of this research. The prototype will be as described in Approach (below) and will be tested and evaluated using real LIDAR data for a set of user applications. The evaluation will include workshops with end users. The underlying mathematical framework will be described from an information-theoretic and signal processing perspective. The algorithms and their implementation will be documented and the computational performance evaluated with a variety of tests on real LIDAR data from test scenes. Results will be reported in an annual technical report and demonstration code will be delivered.
Applicability
This research addresses the compression of LIDAR point cloud data and production of data files that are accessible by a variety of indexing methods. The goal is the availability of compression across a range from low to high with a managed trade-off with loss of fidelity. The method is inherently capable, through the transforms and predictive filters used in the compression process, of being adapted to rejection of outliers and spurious responses. The computational demands will be assessed in connection with various compression options and when applied to different types of retrieval. The methods will be tested on real point cloud data and on data that has had controlled spurious anomalies embedded, such as scene clutter, noise and objects of different size, shape and structure. The evaluations will include performance assessments for applications such as trafficability studies and route planning, line-of-sight analysis, detection of obstructions, and forest and terrain studies. It is planned that the assessments will be conducted in connection with regular workshops with application experts and that their evaluations will be reported and used for refinement of the processes.
Approach
Scientific and Technical Approach
LIDAR presents a very challenging compression problem. In examining possible solutions, it is worthwhile to consider successful approaches to compression that have been applied in other challenging situations. LIDAR compression has important analogies to video compression, including the large volume of data and high natural redundancy combined with scene-dependent randomness. Video compression has gone through a long period of research and development that has led to systems that achieve substantial compression and are general enough to be embodied in standards so that they can be deployed in many applications. The fundamental elements of the video compression algorithms are framing, prediction, transformation, quantization and coding. Each element is designed to do a particular part of the compression task and is highly-tailored to the video media. We propose similar elements for LIDAR compression that are tailored to its characteristics and use.
Let us look at the fundamental elements individually.
Framing is used to select basic signal entities to form the foundation of the compression process. In video the frames are short sequences of images that are encoded together. Each frame is processed with algorithms that find the internal relationships in the data and describe them in a compressed form. One key to LIDAR compression is finding similar entities that occur naturally in LIDAR signals and then finding their internal relationships, deriving a mathematical framework and expressing it as an algorithmic process.
Prediction is used to capture much of the structure in video frames. By using information that is provided by preceding images it is possible to make a prediction about the current image and then encode the difference between the prediction and the actual content. Clever coding techniques specific to the medium, such as block coding and motion estimation are used in making the predictions. Different techniques would be required for LIDAR, but such are possible and are described below.
Transformation is used to convert the data to a domain that enables essential components to be isolated and preserved while others are discarded. The transformation used in video is the discrete cosine transform (DCT), which provides a match between video characteristics and elements that are important for human perception. Different transformations, such as wavelets, are likely to be more appropriate for LIDAR applications. Our research, described below, indicates that wavelet compression, when applied to correctly chosen elements of the LIDAR data, can be very effective. It is critical that the correct data elements are used for this to be effective.
Quantization can be applied to the transformation coefficients and to the prediction residuals to restrict the set of possible values to be stored or transmitted. Individual values can be quantized or combinations of elements can be quantized using vector quantization. Vector quantization can be significantly more efficient but at a substantial cost in complexity. The quality-cost-efficiency trade-offs are worth considering as a part of this research.
Lossless coding of the resulting wavelets uses standard techniques such as Huffman coding, arithmetic coding, Burrows-Wheeler block transforms, Lempel-Ziv-Welch (LZW) coding, the Lempel-Ziv-Markov chain-algorithm (LZMA), or others to efficiently encode the quantized data by exploiting statistical qualities of the data. The most appropriate technique for LIDAR will depend upon an analysis of the data produced by earlier steps and is worth considering as a later part of this research.
A compression approach for LIDAR can be built upon similar elements that are matched to the characteristics of that data. We will briefly describe structural and statistical characteristics of LIDAR that are important to the compression problem and then describe our approach to LIDAR compression.
A full-resolution uncompressed video signal requires 124.416 Mbps. MPEG-2 can squeeze this down to about 4 Mbps with little or no loss in perceived quality; this is a true compression ratio of 124:4 or 31:1. A goal of our LIDAR compression approach is to achieve similar results. We have shown this to be feasible in our experimental work using an approach that is described in Compression Methodology and Experimental Results below.
Characteristics of LIDAR Data
The first step in compression is to define a suitable framework. This means finding a structure of related entities in the data that can be treated as a group and described in a compact data structure.
LIDAR acquisition is fundamentally a sampling process that provides a distribution of point measurements from reflecting objects in a 3D spatial domain. We are particularly interested in LIDAR collections made from airborne platforms and the ability to both quickly exploit the data and to compress it for storage or communication in a manner that does not introduce damaging artifacts, is reliable, and does not have undesirable side-effects on algorithms for target detection or other applications.
Our basic approach is to identify and extract LIDAR data elements that can be formed into frames of data with a high internal structure. This makes it possible to tailor mathematical techniques for transformation and prediction that can remove the majority of data redundancy. Algorithms based on the mathematical structure can then be constructed to calculate descriptive data structures for efficient storage or transmission. This approach is based on proven techniques in the waveform and image processing domains but which, to our knowledge, have not been investigated in connection with LIDAR. The method that we use to construct data frames must be general and must support a broad range of LIDAR systems and applications.
LIDAR point data has a rather random appearance when displayed as a point cloud, as illustrated in Figure 1(a). After geo-rectification the points are in a coordinate system that enables spatial relationships to be analyzed, but the points are not on a regular grid and cannot therefore be accessed by a pixel or voxel scheme. They can be interpolated to a coordinate grid, but that risks degradation and mixing of the data. Using spatial position as the primary access method presents significant computational challenges to data compression. The difficulty can be reduced by using regularities in the data that are seen when the points are examined in their collection order. The regularities can be exploited to achieve a high degree of compression while preserving accessibility and data fidelity.
Regularities in the data arise because of the physical design of LIDAR sensors. The system samples the scene by projecting a beam and collecting returns. The beam is steered by a scanning mirror that follows a periodic motion as the sensor platform moves through space. This periodic motion provides samples of the point positions that follow a nearly periodic path. This is illustrated in Figure 1(b) where the x and y positions have been plotted for a small portion of a flightline. The sequences clearly have periodicities and high redundancy that can be exploited for data compression.
Other types of data sequences have less structure than the (x,y) location sequences, but still have redundancy that can be exploited. The elevations from the primary returns are shown in Figure 1(c), where it is evident that the data has an almost-periodic structure that is related to the scanning action of the sensor. While this regularity will not be present in all portions of a scene, and will change with position in the scene, it is a structural feature of the data that is available for exploitation for compression. The same comment applies to the intensity and x position data shown in Figure 1(d). Examination of other components of LIDAR data from different scenes consistently shows that this type of structure is present and can provide a natural and directly accessible framework for data compression algorithms.
The regularity of the LIDAR data components and their synchronicity with the scanning action presents two important elements to be exploited for data compression. Regularity offers the opportunity to frame several scanning cycles into each frame and then calculate relationships among the scans to form compressed descriptions. The relationships can be based on transformations, such as wavelet transforms, and inter and intra-scan predictions. The opportunities provided by this method of organizing the data for compression offer a broad range of opportunities We have achieved compression ratios on the order of 100:1 in describing elevation data and even higher ratios for position data. The purpose of this proposal is to do a systematic examination of compression approaches starting with this data organization, with the expectation of substantially improving our initial results and constructing an end-to-end compression and decompression process.
The position data has a particularly tight relationship, but also a number of challenging characteristics. Some characteristics can be seen by analyzing the sample-to-sample changes, Δx and Δy. The sawtooth character of the sequences seen in Figure 1(b) indicate that Δx and Δy will be nearly constant over the linear regions. A plot of Δx and Δy in Figure 2(a) shows that this is essentially true, but that each waveform has scattered points that are shifted above or below the curves.
The scattering in the Δx and Δy values is primarily caused by missing points in the LIDAR returns. When n points are missed in sequence then the deltas step by a multiple of n. However, the (Δx, Δy) values are still highly correlated as shown in Figure 2(b) and (c). The difference Δx-Δy, shown in Figure 2(b), is consistently a multiple of a basic unit and the correlation, shown in Figure 2(c) follows a very tight 2D histogram. The slope of the histogram is a function of the angle of flight, but is always very linear. A joint (Δx, Δy) histogram is shown in Figure 2(d). This histogram will hold over a major section of a given flightline and offers the opportunity to encode the track information with high compression.
Our initial experiments with encoding of (x,y) point positions using only sweep pairs has achieved a compression ratio of more than 30:1. We expect this ratio to be increased by an order of magnitude when encoding sweep position data over large portions of the flightline. This is due to the highly repetitive character of the scanning motion. In Figure 3(a) the variation of the x-coordinate of forward and backward sweeps over a sequence of 196 cycles. Predictive encoding will make it possible to describe each sweep by encoding small changes in the sweep parameters, leading to extremely high compression.