Vegetation Science – MSc Remote Sensing UCL Lewis

Lumped Parameter Modelling

P. Lewis RSU, Dept. Geography, University College London, 26 Bedford Way, London WC1H 0AP, UK.


1. Introduction

The aim of these notes is to introduce the concepts, models, and applications of ‘simple’ lumped parameter models of canopy reflectance and scattering.

In the previous sections of this course, we introduced the radiative transfer (RT) equation as a framework for calculating (optical) reflectance as a function of canopy and soil biophysical variables (leaf dimensions and density or LAI, leaf and soil moisture content, leaf biochemistry, etc.). Formulating the RT equation and solving it for a given canopy description allows us to investigate (through analytical or numerical means) the relationship between these ‘fundamental’ descriptions of canopy state and the remote sensing signal. It allows us to investigate issues such as the sensitivity of a form of signal (e.g. Landsat TM waveband reflectance) to canopy properties, so that we can make decisions on the type of data we might be able to use to ‘solve the remote sensing problem’ – i.e. to derive estimates of biophysical quantities from a remote sensing signal.

In this section, we will investigate how more generalised/approximate forms of canopy reflectance/scattering models can be developed and exploited for a range of practical tasks.

2. Linear Models

An important concept in many of the models we will be dealing with is that of the ‘linear model’. We can define a linear model as follows: for some set of independent variables x = {x_0, x_1, x_2, …, x_n}, we have a model of a dependent variable y which can be expressed as a linear combination of the independent variables.

The following are examples of linear models:

y = a_0 + a_1 x                                                    (2.1)

y = a_0 + a_1 x_1 + a_2 x_2                                        (2.2)

y = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n = \sum_{i=0}^{n} a_i x_i   (with x_0 = 1)    (2.3)

Equation 2.1 is the form of a ‘standard’ linear regression you will have come across (‘y = mx + c’). Equation 2.2 is a ‘multi-linear’ form. Equation 2.3 is the general form for n+1 terms. As far as we are concerned here, one of the major features of a linear model is that we can use matrix inversion to solve for the model parameters a = {a_0, a_1, a_2, …, a_n}. We will see how to do this later, but for the moment we can note that:

y = \sum_{i=0}^{n} a_i f_i(x)                                      (2.4)

where the f_i are known functions of the independent variables: the model is linear in the parameters a_i, not necessarily in x.

Many other equations can be expressed in this form, given a suitable transformation, so that the following are also linear models:

y = a_0 + \sum_{i=1}^{n} ( a_i \cos(i\theta) + b_i \sin(i\theta) )      (2.5)

y = A_0 + \sum_{i=1}^{n} A_i \cos(i\theta - \phi_i)                     (2.6)

y = a_0 + a_1 x + a_2 x^2 + a_3 x^3                                     (2.7)

y = a x^b                                                               (2.8)

Equation 2.5 is a Fourier series – it has two terms per ‘i’, but is otherwise directly in the form of equation 2.4. Clearly, equation 2.6, a magnitude-phase representation of a Fourier series, can be expressed in an equivalent form to 2.5. Equation 2.7 sometimes confuses people: a polynomial, such as a cubic, is a linear model – it is linear in the parameters a_i, even though it is non-linear in x. Equation 2.8 requires transformation to a linear form:

\ln y = \ln a + b \ln x                                            (2.9)

since ln(ab) = ln(a)+ln(b).
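As an illustrative sketch (not part of the original notes; the data and parameter values below are invented), the following Python/numpy fragment fits equation 2.8 by first applying the log transformation of equation 2.9 and then solving the resulting linear model by least squares:

    import numpy as np

    # Synthetic data from y = a * x**b (equation 2.8), with invented a = 2.5, b = 0.7
    a_true, b_true = 2.5, 0.7
    x = np.linspace(1.0, 10.0, 20)
    y = a_true * x**b_true

    # Transform to the linear form of equation 2.9: ln(y) = ln(a) + b ln(x)
    X = np.column_stack([np.ones_like(x), np.log(x)])  # design matrix {1, ln x}
    coeffs, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

    a_est, b_est = np.exp(coeffs[0]), coeffs[1]
    print(a_est, b_est)  # recovers ~2.5 and ~0.7

Because the transformed model is linear in ln(a) and b, a single matrix solve recovers both parameters.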

2.1 Linear Mixture Modelling

As an example, we can consider the ‘linear mixture model’ (or ‘spectral mixture modelling’) often used in remote sensing[1],[2]. We can consider a remote sensing measure, e.g. the spectral reflectance r of a pixel, as a summation or integration of a set of n ‘component’ measures (reflectances) r_i, weighted by their relative proportions (‘fractional covers’) F_i, where \sum_{i=1}^{n} F_i = 1. These components will vary according to the application and the area being studied, but may, for instance, be different mineral soils, or soil and vegetation. Then, we can write:

r = \sum_{i=1}^{n} F_i r_i                                         (2.10)

We can express this as:

r = r^T F                                                          (2.11)

where r is a vector of n reflectance terms (e.g. in a given waveband), and F is a vector of n fractional cover terms.

This model assumes that only first-order interactions are important in defining r. We know that multiple scattering can be dominant in some cases – e.g. vegetation canopy reflectance in the near infrared, so in this case, the component reflectance ri would be specified as an equivalent ‘canopy’ reflectance, rather than using a leaf-level measure. Even so, equation 2.10 misses out multiple interactions between components – e.g. scattering interactions between the canopy and soil, which are assumed small. We shall return to further develop this model later.

One of the main uses of equation 2.10 is to attempt to derive estimates of fractional cover from a multispectral measurement of r(\lambda), assuming that a set of ‘end-member’ reflectance spectra r_i(\lambda) are known:

r(\lambda) = \sum_{i=1}^{n} F_i r_i(\lambda)                       (2.12)

The remote sensing problem in this case is to derive an estimate of F from a set of measurements of r. We can express equation 2.12 for all wavebands considered in vector-matrix form:

r = M F                                                            (2.13)

where r = {r_{\lambda_1}, r_{\lambda_2}, …, r_{\lambda_m}, 1.0} is a vector of length m+1 (the spectral reflectance measurements over m wavebands, with a 1.0 appended), F is, as above, the proportions (fractional cover) vector of length n, and M is an (m+1)×n matrix, the columns of which are the end-member reflectance spectra r_i(\lambda) with a 1.0 appended to each:

M = [ r_1(\lambda_1)  r_2(\lambda_1)  …  r_n(\lambda_1) ]
    [ r_1(\lambda_2)  r_2(\lambda_2)  …  r_n(\lambda_2) ]
    [       …               …         …        …        ]
    [ r_1(\lambda_m)  r_2(\lambda_m)  …  r_n(\lambda_m) ]
    [      1.0             1.0        …       1.0       ]          (2.14)

The role of the 1.0 terms on the end of each vector is (one way of) expressing the constraint \sum_i F_i = 1. If n = m+1, M is a square matrix, and we can solve for F from:

F = M^{-1} r                                                       (2.15)

where M^{-1} is the inverse of M. So, if we have a measurement of spectral reflectance, e.g. in 2 wavebands (m=2), and a set of 3 (n=3) ‘end-member’ spectra in these wavebands, we can determine the proportion of those spectra in the measurement.

Figure 2.1 Linear Mixture Model

Figure 2.1 shows an example mixture model, demonstrating the method graphically for m=2, n=3. End-members r_1 to r_3 are shown. The measured reflectance r is a combination of its proportionate distance from r_1 to r_3 (around 0.5 here) and its proportionate distance from r_1 to r_2 (around 0.2 here). So r = r_1 + 0.5(r_3 − r_1) + 0.2(r_2 − r_1) = 0.3 r_1 + 0.2 r_2 + 0.5 r_3. The figure also demonstrates one important feature of the linear mixture model: the end-member spectra must define vertices of the convex hull of all measurements.
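As a minimal numerical sketch of equations 2.13–2.15 (the end-member reflectances below are invented for illustration), we can reproduce the m=2, n=3 example of figure 2.1 by appending the 1.0 constraint row and inverting the square matrix:

    import numpy as np

    # Columns are end-member spectra over 2 wavebands (invented values),
    # with a 1.0 appended to each to express sum(F) = 1 (equation 2.14)
    M = np.array([[0.05, 0.10, 0.04],   # band 1 reflectances of r1, r2, r3
                  [0.15, 0.45, 0.30],   # band 2 reflectances of r1, r2, r3
                  [1.00, 1.00, 1.00]])  # constraint row

    # Pixel measurement over the 2 bands, with 1.0 appended (equation 2.13)
    r = np.array([0.055, 0.285, 1.0])

    F = np.linalg.solve(M, r)  # equation 2.15: F = M^-1 r
    print(F)                   # -> [0.3, 0.2, 0.5], matching figure 2.1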

Such methods can be used for practical applications, but one must be aware of the following issues:

1.  The method, as described, is not robust to error in measurement or end-member spectra;

2.  Proportions must be constrained to lie in the interval [0, 1] – this is effectively the convex hull constraint described above;

3.  Only m+1 end-member spectra can be considered in the mixing;

4.  The method is dependent on the prior definition of end-member spectra;

5.  The method cannot directly take into account any variation in component reflectances (e.g. due to topographic effects).

Of these limitations, number 1 is often the most serious, particularly in the presence of noisy data (e.g. noise introduced by atmospheric correction, on top of any sensor/quantisation noise).

2.2 Linear Mixture Modelling in the presence of Noise

The linear mixture model developed above allowed an estimation of m+1 parameters (proportions) from m observations (m wavebands), because the sum-to-one constraint gives us m+1 linear simultaneous equations from these m observations. If there are fewer than m independent observations, the model is under-determined for m+1 parameters, and we cannot determine the parameters uniquely. If there are more than m independent observations, we have an over-determined problem (i.e. redundancy), which allows us to use statistical methods to reduce uncertainty in the derived parameters.

The simplest way to do this is using the Method of Least Squares (MLS). To employ this for a linear model, we consider that our model may have an error associated with it:

r = M F + e                                                        (2.16)

where e is a vector of residuals (the discrepancy between model and measurement), with elements e_\lambda = r(\lambda) - \sum_j F_j r_j(\lambda). The MLS attempts to minimise the sum of the squares of these residuals, e^2 = \sum_\lambda e_\lambda^2. This is achieved by setting the partial derivatives of e^2 with respect to each of the model parameters F_i to zero. Thus:

\partial e^2 / \partial F_i = 2 \sum_\lambda e_\lambda \, \partial e_\lambda / \partial F_i = 0

If the model is linear, \partial e_\lambda / \partial F_i = -r_i(\lambda), so:

\sum_\lambda r(\lambda) r_i(\lambda) = \sum_{j=1}^{n} F_j \sum_\lambda r_j(\lambda) r_i(\lambda)      (2.17)

Equation 2.17, for i = 1, 2, …, n, gives a set of n simultaneous linear equations in the n unknowns F, which can be used to solve for F.

We can write this in matrix form, b = A F, following the pattern:

[ \sum_\lambda r r_1 ]   [ \sum_\lambda r_1 r_1   \sum_\lambda r_2 r_1  …  \sum_\lambda r_n r_1 ] [ F_1 ]
[ \sum_\lambda r r_2 ] = [ \sum_\lambda r_1 r_2   \sum_\lambda r_2 r_2  …  \sum_\lambda r_n r_2 ] [ F_2 ]
[         …          ]   [          …                      …            …            …          ] [  …  ]
[ \sum_\lambda r r_n ]   [ \sum_\lambda r_1 r_n   \sum_\lambda r_2 r_n  …  \sum_\lambda r_n r_n ] [ F_n ]      (2.18)

(writing r_i for r_i(\lambda) for compactness).

where b is an ‘observations’ vector, A is the ‘model’ matrix (it contains only terms associated with the end-member spectra – the model), and F is the ‘parameter’ vector – the model parameters we wish to solve for.

This can be solved by matrix inversion as above. Note the ‘patterns’ in the matrix and vectors – remembering these will allow you to formulate the equations for any similar linear case. Each element of the vector on the LHS comprises a summation over all observations (wavebands here) of the product of the observation and the model term (end-member spectrum here) associated with that location in the vector. The matrix is solely a function of model terms (end-member spectra), with a clear pattern. Note that we have not dealt with applying any constraints here – for the mixture modelling case, we must constrain[3][4] the proportions to lie in the correct interval (0 to 1) and also constrain their sum to 1, as above.

The solution to this is a considerable improvement over equation 2.13 in providing a robust estimate, as it provides a ‘best fit’ (in the least squares sense) of the model parameters for an over-determined case.
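As a sketch of the over-determined case (the end-member spectra and noise level below are synthetic, invented for illustration), numpy’s lstsq solves the normal equations of 2.18 for us, so we do not need to build the patterned matrix by hand:

    import numpy as np

    rng = np.random.default_rng(0)

    # 3 end-member spectra over 6 wavebands (invented), constraint row appended
    spectra = rng.uniform(0.02, 0.6, size=(6, 3))
    M = np.vstack([spectra, np.ones(3)])

    F_true = np.array([0.3, 0.2, 0.5])
    r = M @ F_true
    r[:6] += rng.normal(0.0, 0.01, 6)  # add measurement noise to the bands

    # Least-squares solution (solves M^T M F = M^T r, cf. equation 2.18)
    F, residuals, rank, sv = np.linalg.lstsq(M, r, rcond=None)
    print(F)  # close to [0.3, 0.2, 0.5] despite the noise

Note that, as the text above warns, this enforces the sum-to-one constraint only softly (through the appended row) and does not bound the proportions to [0, 1]; a constrained solver would be needed for that.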

2.3 Best-fit Line

An example of this approach that you may have come across before is providing the ‘best fit’ of a line to a set of data points. In this case, the model is:

y = mx + c                                                         (2.19)

We can use the work we did above to go straight to the solution for this. Following the pattern of equation 2.18:

[ \sum_i y_i     ]   [ n            \sum_i x_i   ] [ c ]
[ \sum_i x_i y_i ] = [ \sum_i x_i   \sum_i x_i^2 ] [ m ]

which we may more conveniently write:

[ \bar{y}  ]   [ 1         \bar{x}   ] [ c ]
[ \bar{xy} ] = [ \bar{x}   \bar{x^2} ] [ m ]                       (2.20)

where the bar represents a mean over the observation set (we have divided both sides by n). The inverse of the matrix can be written as:

A^{-1} = (1 / \sigma_x^2) [ \bar{x^2}   -\bar{x} ]
                          [ -\bar{x}       1     ]                 (2.21)

where \sigma_x^2 is the variance in x, \sigma_x^2 = \bar{x^2} - \bar{x}^2. For larger matrices, we will use alternative methods to solve for the inverse. It is left as an exercise to show that:

m = \sigma_{xy} / \sigma_x^2 ,     c = \bar{y} - m \bar{x}

where \sigma_{xy} = \bar{xy} - \bar{x}\,\bar{y} is the covariance between x and y.

Using this example, we will examine a number of issues relevant to more general linear modelling. First, recall from above that the sum of the squares of the errors (residuals) is:

e^2 = \sum_i ( y_i - ( m x_i + c ) )^2

We can clearly calculate this once the model parameters (m and c) have been worked out (it can also be calculated more directly). This is an important term in assessing how appropriate the model (a straight line here) is to the dataset. Sometimes a similar term is used as an estimate of noise in the data, although such a measure tends to make the noise estimate highly variable. Rather than e^2, the sum of the squared errors, we tend to use the normalised measure, the Root Mean Squared Error (RMSE):

RMSE = \sqrt{ e^2 / (n - m) }                                      (2.22)

where n is the number of observations and m here is the number of model parameters (2 here) – (n-m) is the number of degrees of freedom (DOF) of the system of equations.
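The pattern of equations 2.20–2.22 translates directly into code. The sketch below (synthetic data with invented slope, intercept, and noise level) recovers m and c from the covariance and variance, and computes the RMSE with n − 2 degrees of freedom:

    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 10.0, 25)
    y = 1.8 * x + 3.0 + rng.normal(0.0, 0.5, x.size)  # noisy line

    var_x = np.mean(x**2) - np.mean(x)**2           # variance of x
    cov_xy = np.mean(x*y) - np.mean(x)*np.mean(y)   # covariance of x and y

    m = cov_xy / var_x              # slope (the exercise result above)
    c = np.mean(y) - m*np.mean(x)   # intercept

    e2 = np.sum((y - (m*x + c))**2)    # sum of squared residuals
    rmse = np.sqrt(e2 / (x.size - 2))  # equation 2.22, with n - m DOF
    print(m, c, rmse)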

Figure 2.2 Linear Regression

You may be used to placing ‘error bars’ on graphs to which you have fitted a straight line (figure 2.2). A more general concept is the idea of the ‘weight of determination’, which allows you to predict the uncertainty in any of the parameters of a linear model, or a linear combination thereof. This is discussed in more detail by Lucht and Lewis (2000)[5].

We can write the model vector at some location x as

u(x) = { 1, x }^T                                                  (2.23)

and the weight of determination as

w(x) = u(x)^T A^{-1} u(x)                                          (2.24)

where T denotes the transpose operation and A is the matrix of the normal equations (equation 2.20 before division by n). Plugging 2.23 into 2.24:

w(x) = (1/n) ( 1 + (x - \bar{x})^2 / \sigma_x^2 )                  (2.25)

The uncertainty (‘mean error to be feared’) associated with a prediction of y(x), e_{y(x)}, is:

e_{y(x)} = e \sqrt{ w(x) }                                         (2.26)

where e is the uncertainty associated with the measurements. We typically use the RMSE to approximate e, as noted above. Lucht and Lewis (2000) describe the weight of determination as a ‘noise inflation factor’, as it relates noise in the data to uncertainty in model prediction (equation 2.26).

We can see from equation 2.25 that the minimum error in the prediction of y from a straight-line fit is obtained at x = \bar{x}; the error (‘uncertainty’) will increase as we depart from this value, quadratically in x. So uncertainty in a model prediction is determined by the noise in the data and by the sampling over x – for example, the sampling of x in figure 2.2 will allow a better prediction of y(x_2) (effectively an interpolation) than of y(x_1) (effectively an extrapolation). We can also see from equations 2.25 and 2.26 that the higher the variance of x, the lower the uncertainty.
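To make this behaviour concrete, the following sketch (the same kind of invented synthetic data as the previous example) evaluates w(x) from equation 2.25 and the prediction uncertainty of equation 2.26 at the mean of x, at an interpolated location, and at an extrapolated one:

    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 10.0, 25)
    y = 1.8 * x + 3.0 + rng.normal(0.0, 0.5, x.size)

    n = x.size
    var_x = np.mean(x**2) - np.mean(x)**2
    m = (np.mean(x*y) - np.mean(x)*np.mean(y)) / var_x
    c = np.mean(y) - m*np.mean(x)
    rmse = np.sqrt(np.sum((y - (m*x + c))**2) / (n - 2))  # approximates e

    def weight_of_determination(x0):
        # Equation 2.25: minimum at x0 = mean(x), quadratic growth away from it
        return (1.0 + (x0 - np.mean(x))**2 / var_x) / n

    for x0 in (np.mean(x), 5.0, 15.0):   # interpolation vs extrapolation
        e_pred = rmse * np.sqrt(weight_of_determination(x0))  # equation 2.26
        print(x0, e_pred)

The printed uncertainties grow as x0 moves away from the mean of the sample, illustrating why extrapolation is less reliable than interpolation.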

3 Lumped Canopy Reflectance/Scattering Models

In the previous lecture, we developed a description of the reflectance or scattering from a canopy driven by what we might term ‘fundamental’ biophysical parameters – these are the terms that we generally consider to drive physically-based models. There are many applications in Earth surface remote sensing where we do not need to know these terms – instead, a much simpler parameterisation may often suffice, particularly if an alternative parameterisation provides a more robust measure than a ‘full’ description of the surface.

Examples of this are: (i) when considering shortwave energy budgets, where generally a measure of albedo will suffice; (ii) for atmospheric correction; (iii) in tracking changes in the remote sensing signal to detect change, where a parameterisation of the dynamics of spectral reflectance may suffice; (iv) in erosion modelling, where a measure of canopy cover is a much more important requirement than any details of leaf angle distribution or other ‘fundamental’ terms; (v) when we have sufficient ground observations to consider using a calibrated model. Examples (i) to (iii) are, as we shall see, somewhat related – they require a method of interpolating and extrapolating an observed signal. Example (iv) is a slightly different form of problem, requiring some generalisation of canopy description. Example (v) is used widely in remote sensing to estimate some quantity from a signal which may contain variation due to system/satellite effects (e.g. varying viewing/illumination angles over a scene) that needs to be accounted for in an otherwise essentially empirical relationship.