Simultaneous Equations Systems 4-1

Revised Chapter 4 in Specifying and Diagnostically Testing Econometric Models (Edition 3)

© by Houston H. Stokes, 17 October 2010. All rights reserved. Preliminary Draft

Chapter 4

Simultaneous Equations Systems

4.0 Introduction

4.1 Estimation of Structural Models

Table 4.1 Matlab Program to obtain Constrained Reduced Form

Table 4.2 Edited output from running Matlab Program in Table 4.1

4.2 Estimation of OLS, LIML, LS2, LS3, and ILS3

4.3 Examples

Table 4.3 Setup for ols, liml, ls2, ls3, and ils3 commands

Table 4.4 SAS Implementation of the Kmenta Model

Table 4.5 RATS Implementation of the Kmenta Model

4.4 Exactly identified systems

Table 4.6 Exactly Identified Kmenta Problem

4.5 Analysis of OLS, 2SLS and 3SLS using Matrix Command

Table 4.7 Matrix Command Implementation of OLS, 2SLS, 3SLS and FIML

4.6 LS2 and GMM Models and Specification tests

Table 4.8 LS2 and General Method of Moments estimation routines

Table 4.9 Estimation of LS2 and GMM Models using B34S, Stata and Rats

4.8 Conclusion

Simultaneous Equations Systems

4.0 Introduction

In section 4.1, after first discussing the basic simultaneous equations model, the constrained reduced form, the unconstrained reduced form and the final form are introduced. The MATLAB symbolic capability is used to illustrate how the constrained reduced form relates to the structural parameters of the model. In section 4.2 the theory behind the QR approach to simultaneous equations modeling, as developed by Jennings (1980), is discussed in some detail. The simeq command performs estimation of systems of equations by the methods of OLS, limited information maximum likelihood (LIML), two-stage least squares (2SLS), three-stage least squares (3SLS), iterative three-stage least squares (I3SLS), seemingly unrelated regression (SUR) and full information maximum likelihood (FIML), using code developed by Les Jennings (1973, 1980). The Jennings code is unique in that it implements the QR approach to estimate systems of equations, which results in both substantial savings in time and increased accuracy.[1] The estimation methods are well known and covered in detail in such books as Johnston (1963, 1972, 1984), Kmenta (1971, 1986), and Pindyck and Rubinfeld (1976, 1981, 1990) and will only be sketched here. What will be discussed are the contributions of Jennings and others. The discussion of these techniques follows closely material in Jennings (1980) and Strang (1976).

Section 4.3 illustrates estimation of variants of the Kmenta model using RATS, B34S and SAS, while section 4.4 illustrates an exactly identified model. Section 4.5 shows how OLS, 2SLS, 3SLS and FIML can be estimated using the matrix command. The code there is for illustration and benchmarking purposes, not production use. Section 4.6 discusses the matrix command subroutines LS2 and GAMEST, which respectively estimate single-equation 2SLS and GMM models. This code is of production quality.

4.1 Estimation of Structural Models

Assume a system of G equations with K exogenous variables[2]

$\sum_{g=1}^{G} \beta_{jg}\, y_{ig} + \sum_{k=1}^{K} \gamma_{jk}\, x_{ik} = e_{ij}, \qquad j = 1, \dots, G$    (4.1-1)

where $x_{ik}$ is the kth exogenous variable for the ith period, $y_{ij}$ is the jth endogenous variable for the ith period, and $e_{ij}$ is the jth equation error term for the ith period. If we define $Y_i = (y_{i1},\dots,y_{iG})'$, $X_i = (x_{i1},\dots,x_{iK})'$, $E_i = (e_{i1},\dots,e_{iG})'$, $B = \{\beta_{jg}\}$ and $\Gamma = \{\gamma_{jk}\}$,

equation (4.1-1) can be written as

$B Y_i + \Gamma X_i = E_i$    (4.1-2)

If all observations $i = 1,\dots,T$ are included, then we may define the G by T matrix $Y = [Y_1,\dots,Y_T]$, the K by T matrix $X = [X_1,\dots,X_T]$ and the G by T matrix $E = [E_1,\dots,E_T]$,

and equation (4.1-2) can be written as

$BY + \Gamma X = E$    (4.1-3)

From equation (4.1-3), the constrained reduced form can be calculated as

$Y = -B^{-1}\Gamma X + B^{-1}E = \Pi X + V, \qquad \Pi = -B^{-1}\Gamma$    (4.1-4)

If $\Pi$ is estimated directly with OLS, then it is called the unconstrained reduced form. The B34S simeq command estimates B and $\Gamma$ using either OLS, 2SLS, LIML, 3SLS, I3SLS, or FIML. For each estimated coefficient vector, the associated reduced form coefficient matrix $\Pi$ can be optionally calculated.[3] If B and $\Gamma$ are estimated by OLS, the coefficients will be biased, since the key OLS assumption that the right-hand-side variables are orthogonal with the error term is violated. Model (4.1-3) can be normalized such that the diagonal coefficients $\beta_{jj} = 1$. The necessary condition for identification of each equation is that the number of included endogenous variables minus 1 be less than or equal to the number of excluded exogenous variables. The reason for this restriction is that otherwise it would not be possible to solve for the elements of B and $\Gamma$ uniquely in terms of the elements of $\Pi$. A short, self-documented MATLAB example based on Greene (2003) illustrates this problem.

Table 4.1 Matlab Program to obtain Constrained Reduced Form

% Greene (2003) Chapter 15 Problem # 1

% y1= g1*y2 + b11*x1 + b21*x2 + b31*x3

% y2= g2*y1 + b12*x1 + b22*x2 + b32*x3

%

% We know BY+GX=E

syms g1 g2 b11 b21 b31 b12 b22 b32

B =[ 1, -g1;

-g2, 1]

G =[-b11,-b21,-b31;

-b12,-b22,-b32]

a= -1*inv(B)*G

p11=a(1,1)

p12=a(1,2)

p13=a(1,3)

p21=a(2,1)

p22=a(2,2)

p23=a(2,3)

% Hopeless. Have 6 equations BUT more than 6 variables

' Now impose restrictions'

' b21=0 b32=0'

G =[-b11, 0, -b31;

-b12,-b22, 0 ]

B,G

a= -1*inv(B)*G

' Here 6 equations and six unknowns g1 g2 b11 b31 b12 b22 '

p11=a(1,1)

p12=a(1,2)

p13=a(1,3)

p21=a(2,1)

p22=a(2,2)

p23=a(2,3)

Table 4.2 Edited output from running Matlab Program in Table 4.1

p11 = -1/(-1+g1*g2)*b11+g1/(-1+g1*g2)*b12

p12 = -1/(-1+g1*g2)*b21+g1/(-1+g1*g2)*b22

p13 = -1/(-1+g1*g2)*b31+g1/(-1+g1*g2)*b32

p21 = -g2/(-1+g1*g2)*b11+1/(-1+g1*g2)*b12

p22 = -g2/(-1+g1*g2)*b21+1/(-1+g1*g2)*b22

p23 = -g2/(-1+g1*g2)*b31+1/(-1+g1*g2)*b32

Here 6 equations and six unknowns g1 g2 b11 b31 b12 b22

p11 = -1/(-1+g1*g2)*b11+g1/(-1+g1*g2)*b12

p12 = -g1/(-1+g1*g2)*b22

p13 = -1/(-1+g1*g2)*b31

p21 = -g2/(-1+g1*g2)*b11+1/(-1+g1*g2)*b12

p22 = -1/(-1+g1*g2)*b22

p23 = -g2/(-1+g1*g2)*b31
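The counting behind the identification condition in this example can be sketched in a few lines (a hypothetical helper, shown in Python rather than MATLAB):

```python
def order_condition(n_endog_included, n_exog_total, n_exog_included):
    # Necessary (order) condition for identification of one equation:
    # (included endogenous - 1) <= (excluded exogenous).
    excluded = n_exog_total - n_exog_included
    return (n_endog_included - 1) <= excluded

# Unrestricted Greene (2003) system: each equation contains both
# endogenous variables and all three exogenous variables, so no
# exogenous variables are excluded and neither equation is identified.
print(order_condition(2, 3, 3))   # False

# After imposing b21 = 0 and b32 = 0, each equation excludes one
# exogenous variable and is exactly identified.
print(order_condition(2, 3, 2))   # True
```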

If the excluded exogenous variables of the ith equation are not significant in any other equation, then the ith equation will not be identified, even if it is correctly specified. We note that $V = B^{-1}E$ and $E(V) = 0$. The reduced form disturbance V is not correlated with the exogenous variables X, from which we deduce that

$\Omega \equiv E(VV') = B^{-1}\Sigma (B^{-1})'$    (4.1-5)

In summary, $\Gamma$ is the G by K exogenous variable coefficient matrix, B is the G by G nonsingular endogenous variable coefficient matrix, $\Sigma$ is the G by G symmetric positive definite structural covariance matrix, $\Pi$ is the G by K constrained reduced form coefficient matrix and $\Omega$ is the G by G reduced form covariance matrix. The importance of this is that since $\Pi$ and $\Omega$ can be estimated consistently by OLS, following Greene (2003, 387), if B were known we could obtain $\Gamma$ from (4.1-4) and $\Sigma$ from (4.1-5). If there are no endogenous variables on the right, yet a number of equations are estimated where there is covariance in the error term across equations, the seemingly unrelated regression model (SUR) can be estimated as

$y_j = X_j \beta_j + e_j, \qquad j = 1,\dots,G, \qquad E(e_j e_k') = \sigma_{jk} I$    (4.1-6)

Elements of $\Sigma$ can be estimated if OLS is used on each of the G equations and

$\hat\sigma_{jk} = \hat e_j' \hat e_k / T$    (4.1-7)

For more detail see Greene (2003) or other advanced econometrics texts. Pindyck and Rubinfeld (1976, 1981, 1990) provide a particularly good treatment that is consistent with the notation in this chapter.
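A small numeric sketch shows how the reduced form quantities follow from the structural ones; the matrices below are made-up values, not estimates from any model in this chapter:

```python
import numpy as np

# Structural form B Y + Gamma X = E with made-up coefficient values.
B = np.array([[1.0, -0.5],
              [-0.2, 1.0]])            # G x G endogenous coefficients
Gamma = np.array([[-1.0, 0.0, -2.0],
                  [-0.5, -1.5, 0.0]])  # G x K exogenous coefficients
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])         # G x G structural covariance

Binv = np.linalg.inv(B)
Pi = -Binv @ Gamma                 # constrained reduced form, as in (4.1-4)
Omega = Binv @ Sigma @ Binv.T      # reduced form covariance, as in (4.1-5)
```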

From (4.1-4) Theil (1971, 463-468) suggests calculating the final form. First partition the explanatory variables into lagged endogenous, current exogenous and lagged exogenous variables, where identities are used to express lags greater than 1, so that the reduced form becomes

$Y_t = D_1 Y_{t-1} + D_2 X_t + v_t$    (4.1-8)

Theil (1971) shows that (4.1-8) can be expressed as the final form

$Y_t = \sum_{s=0}^{\infty} D_1^s D_2 X_{t-s} + \sum_{s=0}^{\infty} D_1^s v_{t-s}$    (4.1-9)

where $D_2$ is the impact multiplier. If there are no lagged endogenous variables in the system, $D_1 = 0$ and the constrained reduced form and the final form are the same. In this case $\Pi = D_2$. The interim multipliers are $D_1^s D_2$ which, when summed, form the total multiplier

$\sum_{s=0}^{\infty} D_1^s D_2 = (I - D_1)^{-1} D_2$    (4.1-10)

Goldberger (1959) and Kmenta (1971, 592) provide added detail. The importance of (4.1-9) is that it shows the effect on all endogenous variables of a change in any exogenous variable after all effects have had a chance to work themselves out in the system.
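The multiplier arithmetic is easy to verify numerically. In this sketch D1 and D2 are made-up lag and impact coefficient matrices (the names are ours), with D1 stable so the infinite sum converges:

```python
import numpy as np

D1 = np.array([[0.5, 0.1],
               [0.0, 0.4]])   # coefficients on lagged endogenous variables
D2 = np.array([[1.0, 0.0],
               [0.5, 2.0]])   # impact multiplier

# Interim multipliers are D1^s @ D2; the total multiplier is their sum,
# which for a stable D1 equals inv(I - D1) @ D2.
total = np.linalg.inv(np.eye(2) - D1) @ D2
truncated = sum(np.linalg.matrix_power(D1, s) @ D2 for s in range(200))
```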

There are several common mistakes made in setting up simultaneous equations systems. These include the following:

- Not fully checking for multicollinearity in the equation system.

- Attempting to interpret the estimated B and Γ coefficients as partial derivatives, rather than looking at the reduced form G by K matrix π.

- Not effectively testing whether excluded exogenous variables are significant in at least one other equation in the system.

- Not building into the solution procedure provisions for taking into account the number of significant digits in the data.

The simeq code has unique design characteristics that allow solutions for some of these problems. In the next sections, we will briefly outline some of these features.

Assume for a moment that X is a T by K matrix of observations of the exogenous variables, Y is a T by 1 vector of observations of the endogenous variable, and β is a K element vector of OLS coefficients. Then the OLS solution for the estimated β from equation (2.1-8) is $\hat\beta = (X'X)^{-1}X'Y$. The problem with this approach is that some accuracy is lost by forming the matrix $X'X$. The QR approach[4] proceeds by operating directly on the matrix X to express it in terms of the upper triangular K by K matrix R and the T by T orthogonal matrix Q. X is factored as

$X = Q \begin{pmatrix} R \\ 0 \end{pmatrix}$    (4.1-11)

Since $Q'Q = I$, if $Q_1$ denotes the first K columns of Q, then

$\hat\beta = R^{-1} Q_1' Y$    (4.1-12)
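A minimal sketch of this solution in Python with NumPy (an illustration, not the Jennings code itself): the coefficients are recovered by back-substitution against R rather than by forming X'X.

```python
import numpy as np

def ols_qr(X, y):
    # Economy-size QR: Q1 is T x K with orthonormal columns, R is K x K
    # upper triangular, and beta solves R beta = Q1'y.
    Q1, R = np.linalg.qr(X)
    return np.linalg.solve(R, Q1.T @ y)
```

For well-conditioned data this agrees with the normal-equations solution; the payoff comes when C(X) is large.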


Following Jennings (1980), we define the condition number of matrix X, C(X), as the ratio of the square root of the largest eigenvalue of $X'X$ to the square root of the smallest eigenvalue of $X'X$:

$C(X) = \left(\lambda_{\max}(X'X) / \lambda_{\min}(X'X)\right)^{1/2}$    (4.1-13)

If $\lVert \cdot \rVert$ is the L2 norm, and X is square and nonsingular, then

$C(X) = \lVert X \rVert \, \lVert X^{-1} \rVert$    (4.1-14)

Throughout B34S, 1/C(X) is checked to test for rank problems. Jennings (1980) notes that C(X) can also be used as a measure of relative error. If μ is a measure of round-off error, then $\mu C(X)^2$ is the bound for the relative error of a solution calculated via the normal equations. On an IBM 370 running double precision, μ is approximately .1E-16. If C(X) is > .1E+8 (1/C(X) is < .1E-8), then $\mu C(X)^2 \geq 1$, meaning that no digits in the reported solution are significant. Jennings (1980) looks at the problem from another perspective. If matrix X has a round-off error of τX such that the actual X used is X+τX, then $\lVert \tau X \rVert / \lVert X \rVert$ must be less than 1/C(X) for a solution to exist. If

$\lVert \tau X \rVert / \lVert X \rVert \geq 1/C(X)$    (4.1-15)

then there exists a τX such that X+τX is singular.[5] The user can inspect the estimate of the condition number and determine the degree of multicollinearity. Most programs only report problems when the matrix is singular. Inspection of C(X) gives warning of the degree of the problem. The simeq command contains the IPR parameter option with which the user can inform the program of the number of significant digits in X. This information is used to terminate the iterative three-stage (ILS3) iterations when the relative change in the solution is within what would be expected, given the number of significant digits in the data.
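The condition number check is easy to reproduce (a sketch; B34S's internal test is based on the QR factorization rather than the full singular value decomposition used here):

```python
import numpy as np

def condition_number(X):
    # Ratio of largest to smallest singular value of X, i.e. the square
    # root of the ratio of the extreme eigenvalues of X'X.
    s = np.linalg.svd(X, compute_uv=False)
    return s[0] / s[-1]

# A nearly collinear pair of columns drives 1/C(X) toward zero,
# warning of multicollinearity long before X'X is exactly singular.
X = np.column_stack([np.ones(6), np.ones(6) + 1e-9 * np.arange(6)])
```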

Jennings (1980) notes that the relative error of the QR solution to the OLS problem given in equation (4.1-12) has the form

$\eta_1 C(X) + \eta_2 C(X)^2 \, \lVert \hat e \rVert / \lVert \hat\beta \rVert$    (4.1-16)

where $\eta_1$ and $\eta_2$ are of the order of machine precision and $\lVert \hat e \rVert$ and $\lVert \hat\beta \rVert$ are the lengths of the estimated residual and estimated coefficient vectors, respectively. (The length or L2NORM of a vector e is defined as $\lVert e \rVert = (\sum_i e_i^2)^{1/2}$.) Equation (4.1-16) indicates that the closer the model fits, the smaller the relative error of the computed solution. An estimate of this relative error is made for the OLS, LIML and 2SLS estimators reported by simeq.

4.2 Estimation of OLS, LIML, LS2, LS3, and ILS3


For OLS estimation of a system of equations, simeq uses the QR approach discussed earlier. If the reduced option is used, once the structural coefficients B and Γ in equation (4.1-3) are known, the constrained reduced form coefficients π from equation (4.1-4) are displayed. If B and Γ are estimated using OLS, and all structural equations are exactly identified, then the constraints on π imposed by the structural coefficients B and Γ are not binding, and π could be estimated directly with OLS or indirectly via (4.1-4). However, if one or more of the equations in the structural equation system (4.1-2) is overidentified, π must be estimated as $\hat\Pi = -\hat B^{-1}\hat\Gamma$.

Although the reduced-form coefficients π exist and may be calculated from any set of structural estimates B and Γ, in practice it is not desirable to report those derived from OLS estimation because in the presence of endogenous variables on the right-hand side of an equation, the OLS assumption that the error term is orthogonal with the explanatory variables is violated. Since OLS imposes this constraint as a part of the estimation process, the resulting estimated B and Γ are biased.

The reason that OLS is often used as a benchmark is that, among the class of all linear estimators, OLS produces the minimum variance. The loss in predictive power of LIML and 2SLS has to be weighed against the fact that OLS produces biased estimates. If reduced-form coefficients are desired, the identities in the system must be entered. The number of identities plus the number of estimated equations must equal the number of endogenous variables in the model. The simeq command requires that the number of model sentences plus identity sentences equal the number of variables listed in the endogenous sentence.

The 2SLS estimator first estimates all endogenous variables as a function of all exogenous variables. This is equivalent to estimating an unconstrained form of the reduced-form equation (4.1-4). Next, in stage 2 the estimated values of the endogenous variables on the right in the jth equation are used in place of the actual values of the endogenous variables Yj on the right to estimate equation (4.1-2). Since the estimated values of the endogenous variables on the right are only a function of exogenous variables, the theory suggests they can be assumed to be orthogonal with the population error, and OLS can be safely used for the second stage. In terms of our prior notation, the two-stage estimator for the first equation is


$\hat\delta_1 = (\hat Z_1' \hat Z_1)^{-1} \hat Z_1' y_1, \qquad \hat Z_1 = [\hat Y_1 \;\; X_1]$    (4.2-1)

where $\hat Y_1$ is the matrix of predicted endogenous variables in the first equation and $X_1$ is the matrix of exogenous variables in the first equation. For further details on this traditional estimation approach, see Pindyck and Rubinfeld (1981, 345-347).
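The two stages can be sketched as follows (an illustration in Python/NumPy with our own argument names, not the simeq implementation; stage 1 is carried out via a QR projection rather than by forming X'X):

```python
import numpy as np

def two_stage_ls(y, Y1, X1, X):
    # Stage 1: fitted values of the RHS endogenous variables Y1 from a
    # regression on all exogenous variables X (projection onto col(X)).
    Q1, _ = np.linalg.qr(X)
    Y1_hat = Q1 @ (Q1.T @ Y1)
    # Stage 2: OLS of y on the fitted endogenous values and the
    # equation's own exogenous variables X1.
    Z = np.hstack([Y1_hat, X1])
    delta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return delta
```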

The QR approach used by Jennings (1980) involves estimating equation (4.2-1) as the solution of

(4.2-2)

for $j = 1,\dots,G$, where $\hat Z_j = XX^+ Z_j$ and $X^+$ is the pseudoinverse[6] of X. $Z_j$ consists of the X and Y variables in the jth equation. $XX^+$ is not calculated directly but is expressed in terms of the QR factorization of X. By working directly on X, and not forming X'X, substantial accuracy is obtained. Jennings proceeds by writing

$XX^+ = Q \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q'$    (4.2-3)

where Ir is the r by r identity matrix and r is the rank of X. Using equation (4.2-3), equation (4.2-2) becomes

(4.2-4)

where and .

The 2SLS covariance matrix can be estimated as

$\widehat{\operatorname{var}}(\hat\delta_1) = (\mathrm{SSE}/df) \, (\hat Z_1' \hat Z_1)^{-1}$    (4.2-5)


where df is the degrees of freedom and SSE is the residual sum of squares (or the square of the L2NORM of the residual). There is substantial controversy in the literature about the appropriate value for df. Since the SEs of the estimated 2SLS coefficients are known only asymptotically, Theil (1971) suggests that df be set equal to T, the number of observations used to estimate the model. Others suggest that df be set to T-K, similar to what is used in OLS. If Theil's suggestion is used, the estimated SEs of the coefficients are smaller; the T-K option is more conservative. The simeq command produces both estimates of the coefficient standard errors to facilitate comparison with other programs and researcher preferences.
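The two conventions differ only in the scale factor applied to SSE (the function name here is ours):

```python
def sigma2_2sls(sse, T, K, theil=True):
    # Residual variance used in (4.2-5): SSE/T under Theil's suggestion,
    # SSE/(T - K) under the more conservative OLS-style convention.
    return sse / T if theil else sse / (T - K)

# With SSE = 50, T = 25, K = 5 the Theil variance is 2.0 versus 2.5
# for T - K, so the T - K standard errors are larger.
```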

Two-stage least squares estimation of an equation with endogenous variables on the right, in contrast with OLS estimation, in theory produces unbiased coefficients at the cost of some loss of efficiency. If a large system is estimated, it is often impossible to use all exogenous variables in the system because of loss of degrees of freedom. The usual practice is to select a subset of the exogenous variables. The greater the number of exogenous variables relative to the degrees of freedom, the closer the predicted Y variables on the right are to the raw Y variables on the right. In this situation, the 2SLS estimator sum of squares of residuals will approach the OLS estimator sum of squares of residuals. Such an estimator will lose the unbiased property of the 2SLS estimator. Usual econometric practice is to use OLS and 2SLS and compare the results to see how sensitive the OLS results are to simultaneity problems.

While 2SLS results are sensitive to the variable that is used to normalize the system, limited information maximum likelihood (LIML) estimation, which can be used in place of 2SLS, is not so sensitive. Kmenta (1971, 568-570) has a clear discussion, which is summarized below. The LIML estimator,[7] which is hard to explain in simple terms, involves selecting values for b and δ for each equation such that L is minimized, where L = SSE1 / SSE. We define SSE1 as the residual variance from regressing a weighted average of the y variables in the equation on all exogenous variables in the equation, while SSE is the residual variance from regressing that weighted average of the y variables on all the exogenous variables in the system. Since SSE ≤ SSE1, L is bounded below by 1. The difficulty in LIML estimation is selecting the weights for combining the y variables in the equation. Assume equation 1 of (4.1-1)

$y_1 = Y_1 B_1^* + X_1 \Gamma_1 + e_1$    (4.2-6)

Ignoring time subscripts, we can define

$y_1^* = y_1 - Y_1 B_1^* = X_1 \Gamma_1 + e_1$    (4.2-7)

If we knew the vector $B_1^*$, we would know $y_1^*$ and could regress $y_1^*$ on all x variables on the right in that equation and call the residual variance SSE1, and next regress $y_1^*$ on all the x variables in the system and call the residual variance SSE. If we define X1 as a matrix consisting of the columns of the x variables on the right in the first equation, and we knew B1*, then we could estimate $\Gamma_1$ as

$\hat\Gamma_1 = (X_1' X_1)^{-1} X_1' y_1^*$    (4.2-8)

However, we do not know B1*. If we define

$W_1 = \tilde Y_1' \, (I - X_1 (X_1' X_1)^{-1} X_1') \, \tilde Y_1$    (4.2-9)

$W = \tilde Y_1' \, (I - X (X' X)^{-1} X') \, \tilde Y_1$    (4.2-10)

where $\tilde Y_1 = [y_1 \;\; Y_1]$ is the matrix of endogenous variables appearing in the first equation and X is the matrix of all X variables in the system, then L can be written as

$L = \alpha' W_1 \alpha \, / \, \alpha' W \alpha, \qquad \alpha = (1, -B_1^{*\prime})'$    (4.2-11)

Minimizing L implies that

$\det(W_1 - \ell W) = 0$    (4.2-12)

The LIML estimator uses eigenvalue analysis to select the vector B1* such that L is minimized. This calculation involves solving the system

$(W_1 - \ell W)\,\alpha = 0, \qquad \alpha = (1, -B_1^{*\prime})'$    (4.2-13)

for the smallest root $\ell$, which we will call $\ell_1$. This root can be substituted back into equation (4.2-13) to get $B_1^*$ and into equation (4.2-8) to get $\Gamma_1$. Jennings shows that equation (4.2-13) can be rewritten as

(4.2-14)

Further factorizations lead to accuracy improvements and speed over the traditional methods of solution outlined in Johnston (1984), Kmenta (1971), and other books. Jennings (1973, 1980) briefly discusses tests made for computational accuracy, given the number of significant digits in the data and various tests for nonunique solutions. One of the main objectives of the simeq code was to be able to inform the user if there were problems in identification in theory and in practice. Since the LIML standard errors are known only asymptotically and are, in fact, equal to the 2SLS estimated standard errors, these are used for both the 2SLS and LIML estimators.
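The eigenvalue calculation can be sketched numerically (our own minimal version, assuming W is nonsingular; Jennings's factorized solution is more accurate than forming the cross-product matrices directly as done here):

```python
import numpy as np

def liml_min_root(y1, Y1, X1, X):
    # W1, W: cross products of the equation's endogenous variables after
    # projecting out X1 (own exogenous) and X (all exogenous variables).
    Yt = np.column_stack([y1, Y1])
    M1 = np.eye(X1.shape[0]) - X1 @ np.linalg.pinv(X1)
    M = np.eye(X.shape[0]) - X @ np.linalg.pinv(X)
    W1 = Yt.T @ M1 @ Yt
    W = Yt.T @ M @ Yt
    # Smallest root of det(W1 - l W) = 0; since projecting out more
    # variables cannot raise the residual variance, the root is >= 1.
    roots = np.linalg.eigvals(np.linalg.solve(W, W1))
    return float(np.min(roots.real))
```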

In the first stage of 2SLS, π is the unconstrained reduced form

Y = πX + V (4.2-15)


and π is estimated to obtain the predicted endogenous variables. 2SLS, OLS, and LIML are all special cases of the Theil (1971) k-class estimators. The general formula for the k-class estimator for the first equation (Kmenta 1971, 565) is

$\begin{pmatrix} \hat B_1 \\ \hat\Gamma_1 \end{pmatrix} = \begin{pmatrix} Y_1'Y_1 - k\hat V_1'\hat V_1 & Y_1'X_1 \\ X_1'Y_1 & X_1'X_1 \end{pmatrix}^{-1} \begin{pmatrix} (Y_1 - k\hat V_1)' y_1 \\ X_1' y_1 \end{pmatrix}$    (4.2-16)

where $\hat V_1$ is the matrix of predicted residuals from estimating all but the 1st y variable in equation (4.2-15), and X1 is the matrix of X variables on the right-hand side of the first equation. (4.2-16) follows directly from (4.2-1). If k=0, equation (4.2-16) is the formula for OLS estimation of the first equation. If k=1, equation (4.2-16) is the formula for 2SLS estimation of the first equation and can be transformed to equation (4.2-1). If k = $\ell_1$, the minimum root of equation (4.2-13), equation (4.2-16) is the formula for the LIML estimator (Theil 1971, 504). Hence, OLS, 2SLS, and LIML are all members of the k class of estimators.
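The k-class family can be written as one small function (a sketch, with our own argument names); k = 0 reproduces OLS and k = 1 reproduces 2SLS:

```python
import numpy as np

def k_class(y, Y1, X1, X, k):
    # V1: reduced form residuals of the RHS endogenous variables Y1
    # from a regression on all exogenous variables X, as in (4.2-15).
    V1 = Y1 - X @ np.linalg.lstsq(X, Y1, rcond=None)[0]
    Z = np.hstack([Y1, X1])
    A = Z.T @ Z
    b = Z.T @ y
    g = Y1.shape[1]
    A[:g, :g] -= k * (V1.T @ V1)   # Y1'Y1 - k V1'V1 block
    b[:g] -= k * (V1.T @ y)        # (Y1 - k V1)'y block
    return np.linalg.solve(A, b)
```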

Three-stage least squares utilizes the covariance of the residuals across equations from the estimated 2SLS model to improve the estimated coefficients B and Γ. If the model has only exogenous variables on the right-hand side (B = 0), the OLS estimates can be used to calculate the covariance of the residuals across equations. The resulting estimator is the seemingly unrelated regression model (SUR). In this discussion, we will look at the 3SLS model only, since the SUR model is a special case. From (4.2-2) we rewrite the 2SLS estimator for the ith equation as