Implementation of Neural Network Interpolation in ArcGIS and Case Study for Spatial-Temporal Interpolation of Temperature

Master project

POEC 6389

Xiaogang Yang

GIS Program

The University of Texas at Dallas

Instructor: Dr. Fang Qiu

July, 2005

Introduction

Interpolation is an important feature of a Geographic Information System (GIS): it is the procedure for estimating values at unsampled locations within the area covered by existing observations. The topic has been researched extensively, and various applications and tools are available on the market. For example, ESRI provides two ArcGIS extensions, “Spatial Analyst” and “Geostatistical Analyst”, which implement several interpolation algorithms, including IDW, Spline, Polynomial, Kriging and RBF. Another vendor, MapInfo, provides interpolation tools called IDW Interpolator and TIN Interpolator. Not surprisingly, the available tools focus on traditional interpolation techniques, and IDW is still the most widely implemented algorithm. These traditional techniques mainly deal with two-dimensional (X, Y) GIS datasets; few of them can process three-dimensional (X, Y and Z) interpolation, and none of them can process spatial-temporal interpolation, even though spatial-temporal problems are quite common in GIS.

In this project, we develop an interpolation technique based on a neural network algorithm. Neural network interpolation can handle input of different dimensions: 2D, 3D, spatial-temporal (2D-temporal, 3D-temporal), and possibly higher (“X”-dimensional) input. The major output of this project is an interpolation application based on the Back-Propagation (BP) neural network model, integrated with ESRI ArcGIS (ArcMap).

Another part of this project is the case study: spatial-temporal interpolation of temperature, which utilizes the neural network interpolation application.

This project emphasizes the software implementation of the neural network algorithm as an extension of ESRI ArcGIS. The outline of this report is listed below:

Literature review: a review of the algorithms used in geographic data interpolation.
Neural network algorithm: the theory behind neural network interpolation.
Data source: a description of the GIS data for the case study.
Neural network simulator design and implementation.
Case study.
Conclusion.
Appendix and references.

Literature review

Spatial interpolation is a very important feature of a Geographic Information System and is already frequently used. There are many spatial interpolation algorithms for spatial data sets. Shepard (1968) discusses inverse distance weighting (IDW) in detail, and Goodman and O'Rourke (1997) discuss Spline. Both IDW and Spline are referred to as deterministic interpolation methods because they are directly based on the surrounding measured values or on specified mathematical formulas that determine the smoothness of the resulting surface. Another family of interpolation methods consists of geostatistical methods, which are based on statistical models that include autocorrelation (the statistical relationship among the measured points). Because of this, these techniques can not only produce a prediction surface but also provide some measure of the certainty or accuracy of the prediction. The best-known statistical interpolation method is Kriging; Deutsch and Journel (1998) describe the theory of kriging interpolation in detail. There are other special interpolation algorithms, such as trend surface (Zurflueh, 1967) and Fourier series (Harbaugh and Preston, 1968).

Some of these algorithms have been integrated into the software products of major GIS vendors. For example, ESRI ArcGIS has the Spatial Analyst and Geostatistical Analyst extensions, which provide tools for IDW, Spline, Kriging, and global and local polynomial interpolation. MapInfo has the IDW Interpolator and TIN Interpolator.

The algorithms above are purely spatial (2D or 3D) interpolation methods; however, GIS applications often require spatial-temporal interpolation of an input data set. Spatial-temporal interpolation requires estimating the unknown values at unsampled location-time pairs with a satisfactory level of accuracy. For example, suppose that we know the temperatures recorded at different weather stations at different instants of time. Spatial-temporal interpolation would then estimate the temperature at unsampled locations and times. Another example is predicting (interpolating) housing prices: because the sale price of a house varies with both location and time, this task requires spatial-temporal interpolation techniques.

There are surprisingly few papers that consider the topic of spatial-temporal interpolation in GIS. Most researchers assume that spatial-temporal interpolation is reducible to a sequence of spatial interpolations. This may be true for datasets with consistent records. A good example is temperature data: normally, the distributed stations provide temperature records at exactly the same times, so it is reasonable to treat the spatial-temporal interpolation as a series of time-independent spatial interpolations. However, GIS interpolation often has to deal with irregular data, such as the records of houses sold within an area. Unlike temperature, such irregular data cannot be handled by the reduction method.

Recently, more research has focused on spatial-temporal interpolation. Li and Revesz (2004) use finite element methods to predict housing prices, and Miler (1997) utilizes kriging for spatial-temporal interpolation. Masoud Hessami (2004) uses a neural network for post-calibration of weather radar rainfall estimation. In this project, a neural network based spatial-temporal interpolation method is introduced and implemented.

Neural network algorithm

A neural network is a mathematical model of human cognition that can be trained to perform a specific task based on available experiential knowledge. It is typically composed of three parts: an input layer, one or more hidden layers, and an output layer. The hidden and output layers combine weights, biases and a transfer function. The weights are the connections between neurons, while the transfer functions are linear or non-linear algebraic functions. When a pattern is presented to the network, the weights and biases are adjusted so that a particular output is obtained; the network's learning rule governs how they are modified. Once a neural network is trained to a satisfactory level, it can be applied to novel data.

Training techniques can be either supervised or unsupervised. Supervised training methods are suited to interpolation and extrapolation problems. In this project, the most commonly used supervised algorithm, Back-Propagation (BP), is used.

Back-propagation (BP) algorithm

A Back-Propagation (BP) neural network is a multi-layer architecture with full connections between adjacent layers. Typically, one or more hidden layers are included to enable the network to learn complex tasks. Training is based on an error-correction learning rule, a generalization of the Least-Mean-Square (LMS) algorithm (Rumelhart and McClelland, 1986). Each neuron in the network may employ a nonlinear activation function at its output end, producing smooth signals to other neurons. One of these nonlinear functions is the sigmoid transfer function defined by the logistic function (Hagan, 1996):
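In its standard form, the logistic function can be written as follows. The symbols are assumptions made here for illustration: v denotes the net input of a neuron and φ(v) its output.

```latex
\varphi(v) = \frac{1}{1 + e^{-v}}, \qquad
\varphi'(v) = \varphi(v)\,\bigl(1 - \varphi(v)\bigr)
```

The output lies strictly between 0 and 1, and the derivative is expressed through the output itself, which is what makes the local gradients in the backward pass inexpensive to compute.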

Initialized with all the synaptic weights and thresholds set to small random numbers, the network is fed with training input-output pairs. Each learning iteration of the network consists of two passes: a forward pass and a backward pass. In the forward pass, the net input of the jth neuron of layer l is calculated as:

and the output is computed using the activation (transfer) function:

The difference between the desired response and the actual output of neuron j at the output layer is obtained by:

In the backward pass, a local error gradient, δ, is computed layer by layer:

for neuron j in the output layer

for neuron j in hidden layer l

With this information, the synaptic weights in layer l can be adjusted according to the generalized Least-Mean-Square rule (Hagan, 1996):

(2-11)

where η is the learning rate and α is the momentum rate, which speeds up learning without running the risk of oscillation. The adjustment of the weights is proportional to the error gradient. This process is called "error back-propagation" because the error correction starts from the output layer and is propagated backward to the previous layers. The patterns inherent in the training data are learned through repeated forward and backward passes. To prevent over-training, learning should be stopped when the root-mean-square error for the test data set has been reduced to a minimum or an acceptable level and further training will not improve performance, a state that is often referred to as the convergence of the network.
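For reference, the forward and backward passes described above can be written out in their standard textbook form. The notation is an assumption chosen to match the text, not a copy of the report's own equations: w_{ji}^{(l)} is the weight from neuron i in layer l-1 to neuron j in layer l, y_j^{(l)} is the output of neuron j in layer l, o_j is the output of neuron j in the output layer L, d_j is the desired response, n is the iteration index, and φ is the logistic function. Thresholds are treated as weights on a constant input.

```latex
\begin{aligned}
v_j^{(l)} &= \sum_i w_{ji}^{(l)}\, y_i^{(l-1)}
  && \text{net input of neuron } j \text{ in layer } l \\
y_j^{(l)} &= \varphi\bigl(v_j^{(l)}\bigr)
  && \text{output of neuron } j \\
e_j &= d_j - o_j
  && \text{error at the output layer} \\
\delta_j^{(L)} &= e_j\, o_j\, (1 - o_j)
  && \text{neuron } j \text{ in the output layer} \\
\delta_j^{(l)} &= y_j^{(l)}\bigl(1 - y_j^{(l)}\bigr) \sum_k \delta_k^{(l+1)}\, w_{kj}^{(l+1)}
  && \text{neuron } j \text{ in hidden layer } l \\
w_{ji}^{(l)}(n+1) &= w_{ji}^{(l)}(n)
  + \alpha \bigl[ w_{ji}^{(l)}(n) - w_{ji}^{(l)}(n-1) \bigr]
  + \eta\, \delta_j^{(l)}\, y_i^{(l-1)}
  && \text{weight update with momentum}
\end{aligned}
```

The factor o_j (1 - o_j) is the derivative of the logistic function, so these gradient expressions hold only for that choice of transfer function.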

The training process resembles a highly non-linear polynomial regression, and it can represent highly complex relationships. A well-tuned network captures the relationship between input and output, and this knowledge is stored in the network as the values of the weights and biases. The trained network can then be used to interpolate or predict observations at unsampled locations and times. Neural network interpolation is very flexible in the number of inputs, so it is an ideal method for multi-dimensional interpolation, from 2D and 3D to 2D-temporal and 3D-temporal. In this project, the BP model is implemented in the application.
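The training procedure described in this section can be illustrated with a small sketch. It is written in Python with NumPy for brevity and is not the VB.NET code of the application. The class name `BPNetwork`, the 3-8-1 layer sizes, the values of `eta` and `alpha`, and the synthetic training data are all assumptions made for the example.

```python
import numpy as np


def sigmoid(v):
    """Logistic transfer function."""
    return 1.0 / (1.0 + np.exp(-v))


class BPNetwork:
    """Fully connected network of logistic units trained by
    back-propagation with momentum (illustrative helper only)."""

    def __init__(self, layer_sizes, seed=0):
        rng = np.random.default_rng(seed)
        # Small random initial weights; the last column of each matrix is the bias.
        self.weights = [
            rng.uniform(-0.5, 0.5, size=(n_out, n_in + 1))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
        ]
        self.prev_step = [np.zeros_like(w) for w in self.weights]

    def forward(self, x):
        """Forward pass; returns the outputs of every layer, input first."""
        outputs = [np.asarray(x, dtype=float)]
        for w in self.weights:
            v = w @ np.append(outputs[-1], 1.0)  # net input including bias
            outputs.append(sigmoid(v))
        return outputs

    def predict(self, x):
        return self.forward(x)[-1]

    def train_pattern(self, x, d, eta=0.5, alpha=0.5):
        """One forward and one backward pass for a single input-output pair."""
        outputs = self.forward(x)
        o = outputs[-1]
        # Local gradient at the output layer: e * o * (1 - o).
        delta = (np.asarray(d, dtype=float) - o) * o * (1.0 - o)
        for l in reversed(range(len(self.weights))):
            y_prev = outputs[l]
            # Gradient for the layer below, computed with the weights as they
            # were before this update (unused once the input layer is reached).
            delta_prev = y_prev * (1.0 - y_prev) * (self.weights[l][:, :-1].T @ delta)
            # Momentum term plus learning-rate term.
            step = alpha * self.prev_step[l] + eta * np.outer(delta, np.append(y_prev, 1.0))
            self.weights[l] += step
            self.prev_step[l] = step
            delta = delta_prev


def mean_squared_error(net, inputs, targets):
    preds = np.array([net.predict(x)[0] for x in inputs])
    return float(np.mean((preds - targets) ** 2))


# Synthetic data: three inputs in [0, 1] standing in for scaled (x, y, day),
# and a smooth made-up target kept inside the range of the logistic output.
rng = np.random.default_rng(1)
inputs = rng.uniform(0.0, 1.0, size=(40, 3))
targets = 0.2 + 0.6 * (0.5 * inputs[:, 0] + 0.3 * inputs[:, 1] + 0.2 * inputs[:, 2])

net = BPNetwork([3, 8, 1])
mse_before = mean_squared_error(net, inputs, targets)
for epoch in range(200):
    for x, d in zip(inputs, targets):
        net.train_pattern(x, d)
mse_after = mean_squared_error(net, inputs, targets)
print(mse_before, mse_after)
```

Changing the first entry of the layer list is all that is needed to move from 2D to 3D or spatial-temporal input, which is the flexibility noted above. A practical run would also track the error on a separate test set in order to decide when to stop training.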

Neural network simulator design and implementation

The main objective of this project is to develop a new GIS tool that can interpolate GIS datasets in multiple dimensions (2D, 3D, or higher), especially spatial-temporal datasets. Software design and implementation took almost 80% of the overall project time.

The major features of this software are listed below:

The simulator is integrated with ESRI ArcGIS (ArcMap) as an extension, and the final product is released as a dynamic link library (DLL).
Programming is based on VB.NET and ESRI ArcObjects.
Supported input formats are ESRI shapefile, personal geodatabase, SDE and raster data; the interpolation output can be a text file or raster data.
The simulator should provide a user-friendly interface that is easy to learn and whose environment is easy to set up.

Major challenges:

Integration between .NET (VB.NET) and ArcObjects: ArcObjects is a COM-based component library, while .NET cannot handle COM objects directly, and the wrapper option provided degrades performance.
Training can take a large amount of computation time, so the code must be written carefully to reduce it, for example through learning-rate acceleration.
ESRI upgraded its product from version 8.3 to 9.0 and 9.1, which caused compatibility issues. One example is that many library names changed from version 8.3 to version 9. Another is that some raster objects are not fully implemented in version 9, although they are in version 8.3. A third is that the wrapped COM objects do not work properly.

The following are screenshots of the software:

(a). Once the provided DLL is registered, the “Neural Network Extension” can be enabled from the tool menu.

(b). The extension includes two modules: an “Interpolation” module and an “Image Classification” module. The “Image Classification” module is an implementation of neural-fuzzy image classification, which the author developed previously. This master project focuses only on the “Interpolation” module.

(c). Interface for “Neural Network Interpolation”

The panel on the left side is used to set up the input dataset, environment, cell size, etc. The panel on the right side is used to set up the neural network structure and training parameters.

The toolbar at the top of the window is used to save, load and view the network information. The trained network can be saved and stored in net files.

Case study: spatial-temporal interpolation of temperature

In this case study, the application is tested for spatial-temporal interpolation of temperature.

Data Source

Three years (1997, 1998 and 1999) of daily temperature records from 26 stations around Los Angeles in southern California are used. According to geographic location, the 26 stations are divided into two groups: a training group of 20 stations and a verification group of 6 stations. The training group is used to train the neural network, and the verification group is used to verify the interpolation results. The location of the study area and the stations is shown on the map below:

The geographic coordinate system for this study is GCS_North_American_1983, and the units are decimal degrees. The extent of the area (in decimal degrees) is: West: -118.483300, East: -116.250000, North: 34.583300, South: 33.483300. The area is approximately 9284.4 square miles. Information about the stations and their locations is listed in the appendix.

The original temperature records are in text format; each station is associated with X, Y coordinates and 365 daily temperature records. We converted the text files into ESRI point feature datasets (feature classes) in either shapefile or personal geodatabase format, with the daily temperature records stored as attributes of the feature class.
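One way to picture how such station records could feed a spatial-temporal network is to flatten them into one row per station and day. The sketch below is only an assumption about a possible layout, not the data structure used by the application. The dictionary keys, the function names, the coordinate and temperature values, and the scaling range of 0.1 to 0.9 are all made up for illustration.

```python
import numpy as np


def build_samples(stations):
    """Flatten station records into rows of (x, y, day, temperature).

    `stations` is a list of dicts with keys 'x', 'y' and 'temps', where
    'temps' holds one temperature per day (hypothetical layout).
    """
    rows = []
    for station in stations:
        for day, temp in enumerate(station["temps"], start=1):
            rows.append((station["x"], station["y"], day, temp))
    return np.array(rows, dtype=float)


def minmax_scale(column, lo=0.1, hi=0.9):
    """Rescale a column linearly into [lo, hi]; assumes the column is not constant."""
    cmin, cmax = column.min(), column.max()
    return lo + (hi - lo) * (column - cmin) / (cmax - cmin)


# Made-up records for two stations and three days.
stations = [
    {"x": -118.2, "y": 34.0, "temps": [55.0, 57.0, 60.0]},
    {"x": -117.5, "y": 33.9, "temps": [50.0, 52.0, 58.0]},
]
samples = build_samples(stations)
scaled = np.column_stack([minmax_scale(samples[:, k]) for k in range(samples.shape[1])])
print(samples.shape)
```

Scaling matters for a network with logistic output units, because the target values must fall inside the range that the output neuron can actually produce.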

Besides the 2D (X, Y) data, elevation information is needed for 3D and 3D-temporal interpolation. The elevation data is stored in a single raster dataset. The elevation map below shows that elevation varies markedly across the study area: the highest elevation is 3443 feet and the lowest is -41 feet. The interpolation study will show that elevation has a marked effect on temperature.

The interpolation simulations are divided into different groups according to the scenario of input layers, and only the training group stations are used to train the neural networks:

Only the longitude and latitude of the stations are used as input information; the units of longitude and latitude are decimal degrees.

In addition, we use the interpolation tools available in ESRI Spatial Analyst and Geostatistical Analyst to interpolate the temperature. These methods are IDW, Spline and polynomial interpolation, and their results are compared with the neural network interpolation.
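For context on the IDW baseline, the method estimates a value as a distance-weighted average of the sampled values, and its power parameter controls how quickly a station's influence decays with distance. The following is a minimal sketch of that idea only. It is not the ESRI implementation; the function name `idw` and the sample values are assumptions made for illustration.

```python
import numpy as np


def idw(points, values, query, power=2.0):
    """Inverse distance weighted estimate at `query` from sampled points."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    dist = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    if np.any(dist == 0.0):
        # The query coincides with a sample: return the sampled value itself.
        return float(values[np.argmin(dist)])
    weights = 1.0 / dist ** power
    return float(np.sum(weights * values) / np.sum(weights))


# Two made-up stations on a line, with temperatures 10 and 20.
stations = [(0.0, 0.0), (2.0, 0.0)]
temps = [10.0, 20.0]
print(idw(stations, temps, (1.0, 0.0)))           # midway between the stations
print(idw(stations, temps, (0.5, 0.0), power=2))  # nearer the first station
print(idw(stations, temps, (0.5, 0.0), power=4))  # higher power: nearest station dominates
```

Because the estimate is always a weighted average of the sampled values, IDW cannot produce values outside the observed range, unlike the polynomial methods.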

The interpolation results are listed below:

Comparison of the conventional interpolation methods with neural network interpolation: results for day 1 (Jan. 1, 1997).

Clearly, different methods give different results (temperature distributions). As stated above, IDW is a deterministic method, while the neural network (NN) is a statistical method: its result depends on the training parameters (network structure, learning rate and number of training loops), and even when the training parameters are the same, the result may vary slightly with the initial weights and biases. The two polynomial methods generated some negative values, which is not correct. Kriging methods were also tried on this data set but were not successful. The accuracy of the interpolation result can be partly judged by the R-squared value (R2), which measures how closely the interpolated result matches the actual temperature. Here, R2 is calculated separately for all 26 stations and for the 6 verification stations.
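The exact formula behind R2 is not spelled out here. A minimal sketch, assuming R2 is the usual coefficient of determination between observed and interpolated temperatures, is given below. The function name and the temperature values are made up for illustration, and if R2 was instead computed as the squared correlation coefficient, the numbers would differ.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up observed and interpolated temperatures at six verification stations.
observed = [52.0, 55.0, 49.0, 60.0, 58.0, 51.0]
interpolated = [53.0, 54.0, 50.0, 59.0, 57.0, 52.0]
print(r_squared(observed, interpolated))
```

Under this definition a perfect interpolation gives 1, and predicting the mean observed temperature at every station gives 0.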

The last two graphics show how the training parameters affect the interpolation accuracy and the temperature distribution pattern:

When the number of training loops is increased from 30,000 to 150,000, the overall accuracy (26 stations) increases from 0.3686 to 0.9675, and the accuracy for the 6 verification stations increases from 0.4122 to 0.9867. The network reaches a convergent state in the second interpolation. However, the temperature distribution pattern also changes considerably.

1. 2D

IDW, power 2 (Jan. 1, 1997) / IDW, power 4 (Jan. 1, 1997)
Spline, regularized (Jan. 1, 1997) / Spline, tension (Jan. 1, 1997)
Global polynomial (Jan. 1, 1997), R2 = 0.0015 / Local polynomial (Jan. 1, 1997), R2 = 0.0015
2D NN (Jan. 1, 1997), 30,000 loops, R2 = 0.3686 / 2D NN (Jan. 1, 1997), 150,000 loops, R2 = 0.9675

2D neural network interpolation: six separate simulations were run on six different days (day 1, 50, 100, 150, 200 and 250). For comparison, all six training runs use exactly the same training parameters (loops = 30,000, learning rate = 1.5), so the R2 value also indicates how well the neural network was trained.

NN 2D: Day 1, 1997, R2 = 0.3686 / NN 2D: Day 50, 1997, R2 = 0.1055
NN 2D: Day 100, 1997, R2 = 0.7271 / NN 2D: Day 150, 1997, R2 = 0.3898
NN 2D: Day 200, 1997, R2 = 0.5891 / NN 2D: Day 250, 1997, R2 = 0.5891

From the results, each individual interpolation can distinguish the general trend of variation over time, but the temperature distribution pattern varies markedly from day to day. We also notice that the overall temperature on day 100 is lower than that on day 50; we will see whether the 2D-temporal and 3D-temporal interpolations can capture this.

In this series of interpolations, elevation is added as another input along with latitude and longitude. The elevation varies from -40.5 feet to 3441 feet. The network structure is 3-8-8-1.
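Read in the usual way, a 3-8-8-1 structure means three input neurons (longitude, latitude and elevation), two hidden layers of eight neurons each, and one output neuron for temperature. The short sketch below counts the adjustable weights and biases of such a network, assuming full connections between adjacent layers and one bias per non-input neuron. The function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed-forward network."""
    return sum(
        (n_in + 1) * n_out  # n_in weights plus one bias for each receiving neuron
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )


structure_3d = [3, 8, 8, 1]
print(count_parameters(structure_3d))  # (3+1)*8 + (8+1)*8 + (8+1)*1 = 113
```

With only 20 training stations per single-day run, a network of this size has far more free parameters than training samples, which is one reason results can vary with the initial weights.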

As with the 2D interpolation, six separate simulations were run on different days (day 1, 50, 100, 150, 200 and 250). The training parameters for the six days are the same (loops = 30,000, learning rate = 1.5).

NN 3D: Day 1, 1997, R2 = 0.4964 / NN 3D: Day 50, 1997, R2 = 0.6842
NN 3D: Day 100, 1997, R2 = 0.7355 / NN 3D: Day 150, 1997, R2 = 0.72041
NN 3D: Day 200, 1997, R2 = 0.3207 / NN 3D: Day 250, 1997, R2 = 0.6118

Compared with 2D interpolation, we observe that:

Surprisingly, the temperature distribution follows the same pattern on all days. Comparing it with the elevation distribution shows that elevation clearly affects temperature. The reason is that, of the three inputs, elevation changes more abruptly between nearby locations than latitude and longitude do: the highest elevation is 3441 feet, while the lowest is -40.5 feet. The network training reflects this very well, which is also why the temperature distributions on different days follow the same pattern. In general, the temperature at high-elevation locations is lower than at low-elevation locations, which is consistent with the natural phenomenon.