Lidar DEM Error Analyses and Topographic Depression Identificationin a HummockyLandscape in the Prairie Region of Canada

Sheng Lia[*],R.A. MacMillanb, David A. Lobba,Brian G. McConkeyc, Alan Moulind, Walter R. Fraserde

a. Department of Soil Science, University of Manitoba, Winnipeg, MB, Canada

b. ISRIC – World Soil Information, Wageningen, the Netherlands

c. Semiarid Prairie Agricultural Research Centre, Agriculture and Agri-FoodCanada, Swift Current, SK, Canada

d. Brandon Research Centre, Agriculture and Agri-Food Canada, Brandon, MB, Canada

e. Land Resource Unit, Agriculture and Agri-FoodCanada, Winnipeg, MB, Canada

Abstract

Topographic depressions are abundant in topographically complex landscapes. A common practice with earlier, low resolution Digital Elevation Models (DEMs) was to remove all depressions to ensure that water flowed continuously to the edge of the DEM domain. The assumption was that most depressions were created due to errors in the DEMs. This practice is no longer justified with the increasing availability of high accuracy DEMs. However, very few studies have addressed how DEM processing options such as smoothing and coarsening and setting area and depth thresholds can affect depression identification.

In this study, a site located in the Prairie Region of Canada was examined. The site is a hummocky glaciated landscape with many in-field wetlands. Lidar topographic data were collected and were used to generate a 1 m by 1 m square-grid DEM. Detailed error analyses of the lidar DEM were conducted. A set of DEMs were generated after different degrees of smoothing and coarsening. FlowMapR, an established terrain analysis tool, was used to identify depressions in each DEM with various user-defined area and depth thresholds. The results were validated against a field wetland survey.

We determined that the problems associated with depression identification using a lidar DEM are two-fold. On one hand, artefactual depressions created due to DEM errors need to be eliminated, for which the raw lidar DEM need to be smoothed. On the other hand, it is also desirable to remove those topographic depressions that do not function as closed basins at the spatial or temporal scale of the processes of interest. Setting area and depth thresholds appeared to be the preferred choice for this. We suggested using the un-autocorrelated lidar DEM error as the criterion for DEM smoothing and considering depression connections in the selection of area and depth thresholds. Using lidar data on a hummocky landscape with loamy soils in the Prairie Region of Canada, 10 to 20 times smoothing operations with an area threshold of 200 m2 and a depth threshold of 0.1 m were recommended as guidelines for depression identification.

Keywords: Terrain analysis; Fine resolution DEM; Lidar data error; Depression connection, Canadian Prairies

1. Introduction

Regularly-spaced grid DEMs, especially square-grid DEMs, havebecomethe dominant form of topographic data nowadays and have been used in a wide range of hydrological, geomorphological and ecological applications (Wilson and Gallant, 2000). Raw DEMs, especially remotely sensed DEMs,were often processed to meetvarious requirements of different applications (Hutchinson and Gallant, 2000). Common processing steps include filtering the raw DEM to create a smoother surface and coarsening the DEMs to improve the representation of surface shape. Furthermore, hydrologic conditioning isoften applied to DEMs, especially in hydrological applications,to ensure that water flows continuously across the DEM to its edge (Maune et al., 2007). The purpose of hydrologic conditioning is to remove all artificialartefactualtopographic depressions (also termed sinks or pits)while retaining real ones. In practice, however, the standard procedure is to remove every topographic depression, which in fact assumes that all topographic depressions are artifacts (Wilson and Gallant, 2000; Lindsay and Creed, 2006; Wechsler, 2007).This assumption is frequently untrue and is especially suspect unlikely when applied to inverted topographic surfaces, in which each peak can be treated as a depression in the inverted DEM. To remove topographic depressions, the raw DEMs are typically modified either by raising the elevations in topographic depressions to match surrounding points (termed filling) or less commonly by lowering the elevations of surrounding points to route flow (termed breaching) (Jenson and Domingue, 1988; Tribe, 1992; Rieger, 1998; Martz and Garbrecht, 1998; Lindsay and Creed, 2005a). Hydrologic conditioning, especially depression filling,modifies the DEM and inevitably will affect parameters derived from DEMs and the subsequent hydrologic modeling (Lindsay and Creed, 2005b; Wechsler, 2007). Therefore, the practice of removing all depressions in DEMs is questionable if wherever some of these depressions are real topographic features. Indeed, we argue that hydrologic conditioning has led some researchers to throw out the baby with the bathwater.

Many early workers of hydrologic modeling using DEMs have noted that topographic depressions are real features in topographically complex landscapes, most abundant in glaciated, karst or aeolian landscapes and volcanically active areas (e.g., Mark, 1988; Martz and de Jong, 1988; Tribe, 1992; MacMillan et al., 1993 and Muehrcke and Muehrcke, 1998). These depressions, although mostly small in size, may have profound impacts on local hydrology and, therefore,on other soil landscape processes such as soil erosion, sediment delivery and nutrient cycling. A typical example is the glaciated hummocky glaciated landscape. About 84 %% of Canada’s agricultural land is in the prairie region (the provinces of Manitoba, Saskatchewan and Alberta) and about 30 %% of the agricultural land in this region is dominated by hummocky landscapes, characterized by closed (or internally drained) basins (AAFC, 2007). The centers of these closed basins are often covered by water, permanently, seasonally or temporarily, which are referred to as shallow wetlands or potholes. The hydrology of these wetlands has been described as driven by groundwater (e.g., Euliss et al., 2004). However, Leibowitz and Vining (2003) reported strong connections between wetlands through surface water overspill. Cook and Hauer (2007) determined that despite the important role of groundwater recharge and discharge in the water balance of wetlands, the hydrology of wetlands is driven by temporal temporary surface-water. Major sources of water in the wetlands, excluding in-situ precipitation, are surface runoff from upland higher areas during snowmeltmelting in the spring and rainfall events. The hydrological regime drives the biological and pedological processes along the topo-sequence. As a result, under native conditions, flora and fauna species and soil types vary substantially along the topo-sequence from the upslope down to the center of the each wetlands.Clearly, the importance of topographic depressions in hummocky landscapes is undisputable indisputable and, therefore, it is inappropriate to remove all depressions in the DEMs of hummocky landscapes (MacMillan et al., 1993).

A major reason for hydrologic conditioning (i.e., removing depressions by filling or breaching) is the errors in DEM error. Research on DEM errors continues to be very active (see reviews about DEM errors in Fisher and Tate, 2006 and Wechsler, 2007). Error structure of DEMs is determined by the data acquisition and processing procedures. However, inIn general, errors in DEMs can be grouped into three categories: systematic errors, blunders and random errors (Wise, 2000; Fisher and Tate, 2006). Systematic errors and blunders are usually corrected prior to the release of the DEM. Random errors come from various sources, e.g. data collection and interpolation.The exact sources and magnitudes of random errors are usually unknown. Therefore, random errors are not corrected for and persist in the final DEM.

Another reason for removing all depressions is that the resolution and accuracy of DEMs limit its their ability to describe real depressions, most of which are small scale features. On the other hand, artefactual depressions could be generated even from exact DEM where the channel width is less than grid mesh (i.e. channels escape between two grid points that correctly have higher values). For example, publiclyavailable DEMs provided by government agencies, such as the Unites States Geological Survey (USGS) and the Canadian Council on Geomatics (CCOG),generally have a horizontal resolution of > 25-30 m and a vertical uncertainty of ≥ 1m. These DEMs are often too coarse to identify smallchannels and small in-field depressions (e.g., smaller than 2 - 3 times of the grid resolution in diameter), particularly those ofin irregular shapes. However, recent advances in remote sensing technology (e.g., Lidar─Light Detection And Ranging, and IFSAR ─Interferometric Synthetic Aperture Radar) have made finer resolution DEMs available at reasonable prices(Maune et al., 2007). For example, DEMs produced using lidarusually have a horizontal resolution of <5 m and a vertical accuracy of about 0.3 m or better (Maune et al., 2007). These high resolution and high accuracy DEMs are able to capture small scale depressionsfeatures. However, artificialartefactual depressions arising from noise in these DEMs are also abundant. Correct identification of depressions requires a method to distinguish artificial artefactual topographic depressions from real ones. Unfortunately, developing such a method has received little attention in the literature (Lindsay and Creed, 2006).

MacMillan et al. (1993) useda knowledge knowledge-based approach and applied heuristic rules to distinguish real depressions fromartificialartefactual ones. This approach usesareaand depth thresholdsbased on knowledge of the site and the DEM data source. Depressions that were smaller than the area threshold or shallower than the depth threshold were considered artifacts. Lindsay and Creed (2006) proposed a modeling approach based on the simulation of DEM errors. This approach usesthe Monte Carlo method and estimates the likelihood of a digital depression actually occurring in the landscape given the degree of uncertainty (errors) in local topography.The authors found that this modeling approach was slightly better than the knowledge knowledge-based approach in their case study. The modeling approach has the advantage of handling the DEM uncertainty explicitly.Although this modeling approachis automated, it is alsocomputationally intensive, as it requires hundredsof runs until the results converge (i.e., until additional runs make negligible improvement).As a result, use of the modeling approach is probably impractical for large data sets. In contrast, the knowledge knowledge-based approach is conceptually and computationally simple. It has been noted that the performance of this approach is strongly affected by the quality of the DEM (errors), the DEM processing steps (e.g., smoothing and coarsening) and thethreshold values(MacMillan et al., 1993). However, no study has been carried out to examine the sensitivity of this approach to these various conditions and settings.

In this study, we examined depressions identified using a knowledge knowledge-based approach (the FlowMapR program) using a lidar DEM of a hummocky site. Errors in the lidar DEM were assessed using more accurate topographic data obtained from a kinematic ground survey. The variation of the depression identification was investigated based onmultiple runs with four levels of smoothing, four levels of coarseningand a wide range of threshold values. The results were compared against a field wetland survey to evaluate the performance of the program under given conditions. The objective of this study is to provide guidelines on how the knowledge knowledge-based approach can be used most effectively in depression identification.

2. Materials and methods

2.1. Thefield site

The field site is the Manitoba Zero Tillage Research Association (MZTRA) Research Farm located 17.6 kilometers (12 miles) north of Brandon, Manitoba, Canada, at the northeast corner of the junction of Highway-10 and Road-353 (Fig. 1a). The climate in this area is cool to moderately cool subhumid (Podolsky and Schindler, 1994). Mean annual precipitation recorded at four Long long-term climatic records from four weather stations in the area indicate total precipitationstations ranges from 426 to 490 mm. Growing season precipitation is variable due to the local occurrence of storm events which account for much of the summer rainfall. Mean annual air temperature at the four climatic stations ranges from 0.8 to 1.7 °C, while the long-term average length of the frost-free season varies from 90 to 115 days at the four stations.

The research farm covers the entire section of land (about 259 ha). Topography of the farm is considered irregular undulating to hummocky, with permanent and ephemeral wetlands of various sizes (Fig. 1a). Elevation in the farm ranges from 495 m to 512 m, with an overall down slope trend from the northwest corner to the southeast corner. Soils are dominantly well- and imperfectly imperfectly-drained Black Chernozemic soils (68 %% of the land area) developed on fine loamy, till deposits (Podolsky and Schindler, 1994). Poorly to very poorly drained Gleysols, occupying depressions, account for the remaining 32 %% of the land area. All the soils have high organic matter content and good moisture holding capacity. Surface drainage on the farm is quite variable, ranging from well good to rapid on the upper slopes to prolonged inundation of in the poorly drained pothole areas.

2.2. Lidar data collection and error analyses

Lidar data for the farm were collected on October 31, 2002 by Aeroscan International Inc. The system was mounted on a helicopter that flew about 250 m above ground. Typical flying speed was 35 ms-1 and the laser measured approximately 10,000 useable points per second. The typical spacing between measured points on the ground was 1 m across track and 1.5 m along track(Fig. 1b). The original ground value datalaser data werecloud was filtered for trees and buildings by separating laser returns hitting the ground from those hitting vegetations and buildings. The ground hits were thinned, removing points that have an elevation difference of less than 0.05 m compared to the other points within 2 meters horizontally.This ground thinned dataset formed from which a bare earth surface containing irregularly spaced mass raw elevation data points was derived and was delivered to the end user by the service provider as the raw lidar data.A Triangulated Irregular Network (TIN) surface was created on this raw lidar data using the “Create TIN” tool in ArcGIS 9.0. There were a total of 2499515 nodes (data points) in the TIN surface, forming 4998948 triangles. Average area of the triangles was 0.58 m2 and areas of 98% of the triangles were between 0.1 m2 and 1 m2.

As a quality control exercise, a kinematic ground surveywas carried out along the access roads (i.e., Highway-10, Road-353 and in-farm roads, Fig. 1b, 1c, 1d) with point spacing of about 10 m. The accuracy (precision) of the ground survey was checked by comparing elevation data collected along the same roads in opposite directions (Fig. 1c, 1d).To make the paired comparison possible, elevations of points in one direction were linear-interpolated from elevation data collected in the opposite direction, thus the comparison was made on the same points between the measured elevations (Zk) and the interpolated elevations (Zk’). The XY coordinates of the kinematic survey points were transposed onto the TIN surface derived from the raw lidar dataat the intersection of the triangle planes to obtain their elevation values.The errors of the lidar data (El) were approximated by the differences between the interpolated lidar elevations (Zl) and the measured kinematic survey elevations (Zk), assuming the kinematic survey had negligible errors. Root Mean Square Error (RMSE) of the lidar data was then calculated.

The calculated lidar data errors for Highway-10 and Road-353 (in-farm roads not included) were imported into GS+ plus, a geostatistics program, for semivariance (γ) analyses. Variograms were computed for the two roads separately, representing the spatial structure in the two cardinal directions (North-South and West-East direction for Highway-10 and Road-353, respectively). The variograms were fitted with models that had the lowest values of Residual Sums of Squares among the five models provided by the GS plus + program. To examine the relationship between lidar data errors and topography, semivariance analyses were also conducted for the kinematic survey elevation (Zk) and slope gradient (SGk) data, following the same procedures as those for the lidar data errors.

The un-autocorrelated random errors of the lidar data (“white noise”) can be estimated as the square root of the nugget value of the variogram model (termed Nugget-method, herein). This method is crude since the nugget value is determined by the model fitting, which is highly variable. Therefore, in this study, the un-autocorrelated random errors were assessed directly fromthe lidar points along Highway-10 and Road-353 (termed the Segment-method, herein). Firstly, aA straight line was drawn in the middle of each road and a 0.5 m buffer zone (on both sides of the line) was created for each line. Since both roads werestraight and paved,roads and the lines were drawn in the middle of the roads, it was assumed that there was no elevation difference perpendicular to the lines within the buffer zones (1 m wide). Mass Ppoints in the raw lidar data within the buffer zoneswere extracted and were snapped to their respective lines orthogonally.These points were sorted along the line and were grouped into segments, each with10 points.Assuming all segments were level (recall that both roads are paved roads and slope gradients along the roads are very low), the standard deviation of elevations of the 10 points was used to represent the un-autocorrelated random error for that segment.The average of the standard deviations for all segments then gave a measure of the average un-autocorrelated error.