Lab Exercise #3: Spatial Interpolation: Inverse Distance Weighting and Kriging of Several Datasets including Ozone concentration.

John Doe

Specific Instructions for Tasks 1-4

Task #1: Inverse Distance Weighting by hand

The table below represents x and y coordinates of something called ‘Value’. Estimate the value of the fourth record (indicated by a ‘?’) using inverse distance weighting. Draw a crude scatter plot of the data in the table to the right of the table.

X coord / Y coord / Value
1 / 3 / 11
3 / 5 / 7
2 / 7 / 3
3 / 3 / 8.3
7 / 4 / 9

Now, enter this table into Excel or some text editor and read it in as an ‘event theme’ to ArcView. Convert that shapefile to an Arc/INFO point coverage and do an IDW on that point coverage. How did the output of Grid’s IDW command differ from your simple hand calculation? Explain. It had the possibility to be much more accurate. In arc, I got 8.431223 as the value. But that value was dependent on getting the cursor to the right place.

Task #2: Kriging and IDWing without a ‘truth’ or reference grid

If you look at the Point Attribute table for the point coverage ‘usozonesites’ (usozonesites.pat) you will find that there are numerous attributes. We will work with the attribute: ‘O3apr2oct_’ (don’t forget the underscore ‘_‘at the end). First we will create an interpolated grid of the ‘O3apr2oct_’ values using the Inverse Distance Weighting command. Get into the right workspace and get into Grid. At the Grid prompt type:

Grid: idwozone = IDW(usozonesites, o3apr2oct_,. #, 2, Sample, 11, 1000000, 10000)

Take a look at the grid you created ‘idwozone’ in ArcView. Now create another interpolated filed using the ‘kriging’ command in grid. At the grid prompt type:

Grid: krigozone = kriging(usozonesites, o3apr2oct_,. #, both, varozone, linear, sample, 11, 1000000, 10000)

This command produced three things: 1) a kriged estimate of the ‘3apr2oct_’ attribute of the usozonesites point coverage, 2) A semi-variogram of that attribute of the point coverage, and 3) a variance of the kriged estimate grid. To look at these outputs read the interpolated grid ‘krigozone’ into ArcView. Also read the interpolated grid ‘varozone’ into ArcView. To see the semi-variogram run Arcplot, do a disp 9999, then type semivariogram krigozone.svg. This should display a semi-variogram in the ArcPlot display. Use Snag-it to save this image somewhere (This is shown in figure 1.)

Figure 1. Semivariogram krigozone

Knifing’ often show IDW to outperform kriging because the assumptions needed fro kriging are not met.

A key assumption is that of ‘Stationarity’ which essentially means that the auto-correlation function is the same throughout the space being interpolated. You will see how this fails soon.

Anyway, BTW, you should have obtained a variogram for the ozone data that looked like this:

Deleated to save space

We chose to fit a linear model to this data. The nugget was estimated ab about 33.

By putting ‘Spherical, Gaussian, Exponential’, etc. into the kriging command instead of ‘Linear’ Arc/INFO would have calculated a different function to characterize the autocorrelation.

Questions for Task #2:

1)  Read the three grids (idwozone, krigozone, and varozone) and a co-registered state boundary coverage into ArcView. Compare and contrast the appearance of these three grids. These are shown, with their legend, as Figure 2. varozone, Figure3. krigozone, and Figure 4. Idwzone. Idwozone looked very similar to krigozone in the areas that it showed, but the Idw seems to have more intense zones, than kreiging does. Looking at the numbers for variance, shows there is a lot of errors.

Figure 2. varozone

Figure3. krigozone

Figure 4. varozone

2)  Where are the areas of high ‘ozone’ concentration in idwozone and krigozone? Southern to mid. California, upper Arizona, lower Utah, a large group consisting of: western Wyoming, southeast Idaho. Lastly there was a belt running through Tennesse, Virginia, and North Carolina. Do the two images agree in general? Yes!

3) What is the IDW and Kriged estimate of o3apr2oct_ at the Four Corners location? Idw-41.733036, Kreiged-39.314422. What is the value of varozone at Four Corners 37.323338? If the varozone value is the standard error of the estimate of the kriged ozone estimate, what is your 95% confidence interval for the o3apr2oct_ for Four Corners? 37.323338 +/- 1.96 * sqrt(37.323334) = 95% C.I. = (49.2975228, 25.3491452)

Task #3: Spatial interpolation with a known full valid coverage to validate estimates against

In this task you will be IDWing and Kriging interpolated grids from data you already have ‘truth’ for. To do this you will artificially blind yourself by getting elevation, population density, and landcover information at only the usozonesites points and then interpolating them. To do this you will need to do a ‘summarize zones’ command with ArcView using usozonesites and each of the following three grids: nademlza, uspopden, and naigbplza. This will generate several tables which you will have to read into excel, delete most columns, (keep ‘Monitor’ and ‘Median’), change name of median to elevation, popden, and landcover as appropriate, and join these three tables back to usozonesites.pat. Now you are ready for spatial interpolation.

The Elevation dataset

Get an IDW and a Kriged interpolated grid from the usozonesites pointcoverage from grid

Grid: idwelevation=IDW(usozonesites, elevation,. #, 2, Sample, 11, 1000000, 10000)

Grid: krigelevation = kriging(usozonesites, elevation,. #, both, varelevation, linear, sample, 11, 1000000, 10000)

Questions

1)  Look at these interpolated fields. How well did IDW or Kriging spatially interpolate elevation? Not well at all. The estimates was off by good 2000 feet, the higher elevations were not in the correct places (ie Colorado was not represented as being high while Wyoming and Montana were. Figure 5 represents Varelevation, Figure 6 represents krigelevation, and Figure 7 represents Idwelevation.

Figure 5. Varelevation

Figure 6. Krielevation

Figure 7. Idwelevation

2) Does the varelevation grid seem honest with respect to characterizing the error of estimate? No it does not, because it does not correctly identify the error where it is (i.e. Colorado)

3) Create two error grids with simple map algebra, from grid the commands would be:

Grid: idweleverror = nademlza – idwelevation

Grid: krigerror = nademlza – krigelevation

Display the grids idweleverror and krigeleverror in ArcView using standard deviation as a classification and color scheme. How similar are the error patterns? The error patterns look pretty similar, where there is error. Figure 8 represents idweleverror, Figure 9 represents krigeleverror.

Figure 8. idweleverror

Figure 9. Krigeleverror

4) Now create the absolute error (so over and underestimates do not cancel each other out when you calculate the mean). To do this use the absolute value function in grid:

Absidweleverror = abs(idweleverror) Shown in Figure. 10

Abskrigeleverror = abs(krigeleverror) Shown in Figure. 11

Figure 10. Absidweleverror

Figure 11. Abskrigeleverror

Which interpolator had the lowest Mean Absolute Deviation? Krig

5) Comment on the usefulness of IDW and Kriging for interpolating Elevation data with point coverages of the density of usozonesites. It is not very useful at all! As was shown earlier, the elevations are majorly off, and are represented in the wrong areas.

The Population Density Dataset

Using the population density dataset for the conterminous United States that you used in the nighttime satellite imagery lab repeat the same routine you did for elevation for population density.

Questions 1-B thru 5-B same as elevation dataset

The Landcover Dataset

Using the landcover dataset naigbplza dataset for the conterminous United States was in the nighttime satellite imagery workspace repeat the same routine you did for elevation for landcover.

Note: smart students will not do this and explain to me that it is an inappropriate exercise.

6) Print the variograms for elevation and population density on a sheet of paper side by side. How do they compare and how can they be interpreted in light of what you know about the spatial variability of population and elevation in the conterminous United States.

Task #4: De-Trended Kriging of Ozone in Rocky Mountain National Park

Space is not the cause of variability in measurements of elevation, ozone concentration, etc. Kriging, IDW, and other spatial interpolaters merely take advantage of the fact that spatial auto-correlation exists. It is even better to build a model using spatially explicit measure of dependent variables, apply them to extensive spatial coverages, look at the errors at known points and only krig the errors. Approaches like this are sometimes called ‘de-trended’ kriging. For example, let’s consider ozone concentration in Great Smoky Mountain National Park. Suppose you have a point coverage of ozone measurements that you want an interpolated field of. Suppose you have a DEM, a land-cover database, and slope. By intersecting the point coverage with these grids (summarize zones in ArcView and a bunch of table joins) you will have a table from which you can build a single or multiple regression to predict ozone from elevation and/or landcover and/or slope. Then apply this regression to those grids. Compare this ‘model’ to the point values and get ‘errors’ at those point values. Krig the ‘error’ values at those points and add the interpolated field of errors to your ‘model’.

Compare this to a simple ‘kriging’ and simple ‘IDWing’ of the original point data.

Procedures:

Produce a simple kriged estimate of ozone:

Grid: smpkriggsmnp = kriging(ozonelocs, adj_intsv_, # , both, varsmpkrig, linear, sample, 5, 500000, 200)

Produce a simple IDW estimate of ozone:

Grid: smpkriggsmnp = IDW(ozonelocs, adj_intsv_, #, 2, Sample, 5, 500000, 10000)

Now you need to gather the data needed to build a model that uses more than simply the spatial relationships of the ozone measurements at the ozone monitoring sites. To do this you need to read in the ozonelocs point coverage, and the following grids: gsmnpelev, gsmnp_slope, gsmnp_landcover. Do the GIS manipulations necessary to get a table with the following fields and appropriate values in the records: Code4_site, Elevation, Adj_intsv_, gsmnpelev, gsmnp_slope, gsmnp_landcover. Read that table into JMP. How does the Elevation variable correlate with the gsmnpelev variable? This should be a strong relationship. It’s not perfect because the elevation measurements were made with different devices. Now build a univariate or multivariate model that predicts Adj_intsv_ using the other variables. Build a grid of that model estimate with ArcView or Arc/INFO’s Grid module. I used the formulae Adj_intsv_ = 17.787131 +.0239096 GSMPelev

My R^2 = .549453

Questions:

1) How do these models compare visually? Figure 12 represents Adj_invst_ , and Figure 13 represents gsmnpelev. They are very similar to each other in appearance, but the scales for each are different.

2) How could you quantitatively compare the accuracy of these interpolations? When you are in .jmp, switch the axis for each of the variables and use that equation in grid to create map to compare it with.

Figure 12. Adj_invst_

Figure 13. GSMPelev

LAST EXERCISE:

Use JMP to generate 3 random variables. 300 records with the following fields:

X coordinate 100*random Uniform Y Coord 100*random Uniform Z value Normal 20, 3

Create shapefile of points convert to point coverage. Krig these points and attach print-out of variogram and interpret. Figure 14 is the variogram, and Figure 15 is the Arcview plot view of it. Sill = 1.6, nugget = approx..79, thus the range = .81

Figure 14. Variogram of 3 variables

Figure 15. ArcView plot of 3 varibales