Revised Chapter 7 in Specifying and Diagnostically Testing Econometric Models (Edition 3)
© by Houston H. Stokes 30 October 2013. All rights reserved. Preliminary Draft
Chapter 7
Time Series Analysis Part I: ARIMA and Transfer Function Identification
7.0 Introduction
7.1 Identifying and Estimating an ARIMA Model
Table 7.1 Use of the ACF and PACF to Identify an ARIMA Model
7.2 Identifying and Estimating a Transfer Function Model
7.3 Diagnostic Tests on the Autocorrelations of the Cross Correlations
Table 7.2 Setup to Illustrate CCF Diagnostic Tests
Table 7.3 ACF of Residuals and Cross Correlations for Case 1
Table 7.4 ACF of Residuals and Cross Correlations for Case 2
Table 7.5 ACF of Residuals and Cross Correlations for Case 3
Table 7.6 ACF of Residuals and Cross Correlations for Case 4
7.4 Spectral Analysis
7.6 Automatic ARIMA Model Selection
7.7 Examples of Box-Jenkins Analysis
Table 7.7 Program to Estimate the Effect of M2 on Interest Rates
Table 7.8 Data on M2
Table 7.9 Data on the Commercial Paper Interest Rates
7.8 IACF and ESACF Identification
Table 7.10 Inverse Autocorrelation Subroutine iacf
Table 7.11 Inverse Autocorrelation Subroutine iacfn
Table 7.12 Analysis of the Box-Jenkins-Reinsel Chemical Process Data Using IACF and ESACF
7.9 Conclusion
Time Series Analysis Part I: ARIMA and Transfer Function Identification
7.0 Introduction
The B34S commands bjiden and bjest control identification and estimation of Box-Jenkins (1976) models. The basic code was originally developed by David Pack (1977) and was extensively modified by the author.[1]This code also runs under the matrix command where the autobj command can be used to automatically identify an ARIMA model. The basic reference for the theory that underlies these commands is the classic work of Box and Jenkins (1976), which has been updated as Box, Reinsel and Jenkins (1994). Nelson (1973) provides a simplified treatment of univariate models, while a more modern treatment is found in Enders (1995) and Hamilton (1994). Since these excellent references are available, only a brief discussion of the identification and estimation steps will be given here. Zellner and Palm (1974) outline how the Box-Jenkins univariate and transfer function models are special cases of the more general vector autoregressive moving-average models. The simultaneous equations model is shown to be another special case. The btiden and btest commands to estimate these more complex models are discussed in Chapter 8, while the simultaneous equations command simeq is discussed in Chapter 4. A frequency interpretation of the VAR model is discussed in Chapter 12.
7.1 Identifying and Estimating an ARIMA Model
Assume two jointly distributed, discrete random variables and.[2] Defineas the joint distribution of x and y or the probability at the point x =x* and y = y*. The marginal distribution of x (y) is the probability distribution of x (y) without regard to y (x) and is defined as respectively. The conditional distribution of x, given that , is , where
(7.1-1)
The conditional distribution is the joint distribution divided by the marginal distribution. A series is said to have the property of strict stationarity if
(7.1-2)
which implies that the marginal distribution of each observation is the same , the expected value of each observation is the same , the variances of each observation are the same and the covariances are invariant of any offset in time
(7.1-3)
7-1
ARIMA and Transfer Function Models 7-1
Define B as the backshift operator such that . The general form of the ARIMA model is
(7.1-4)
where is a regular autoregressive coefficient, is a seasonal autoregressive coefficient, d is the order of regular differencing, D is the order of seasonal differencing, s is the seasonal period, is a moving-average coefficient, is a seasonal moving-average coefficient and is a white-noise error process having the following three properties:
E() = 0 for all t (7.1-5)
E() = for all t (7-1-6)
E() = 0 for all t and s 0. (7.1-7)
A correctly specified, autoregressive integrated moving-average (ARIMA) model will filter a series such that the estimatedcovariances of the error term in (7.1-4) vanish as specified in (7.1-7). A GARCH model, relaxes the homoskedasticity assumption (7.1-6) and fits an ARIMA model to the conditional volatility.[3]It is possible to write equation (7.1-4) in more compact notation as
(7.1-8)
or
, (7.1-9)
where and and we assume for simplicity that has been suitably differenced to make it stationary. If is invertible[4] equation (7.1-9) can be written in the inverted form as an autoregressive model
(7.1-10)
while if is invertible, equation (7.1-9) can be written in the random shock form as a moving-average model
,(7.1-11)
where
(7.1-12)
and
7-1
ARIMA and Transfer Function Models 7-1
.(7.1-13)
Assume that the model that prewhitens the series is applied to the series . In general, the error
(7.1-14)
will not be white noise since the appropriate model for prewhitening is
(7.1-15)
and, in general, . Problems concerning the one-filter and two-filter identification technique will be discussed in sections 7.2 and 7.3. The inverted form is easy to understand since it is an AR model, the random shock of MA model is needed for forecasting while the ARMA form is most parsimonious. To aid in the selection of the appropriate ARIMA model, Box and Jenkins (1976) recommend calculation of the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The ACF and the PACF are used in a four-step procedure to identify an ARIMA model. The steps include the following:
a. Make the series stationary so the ACF and PACF can be calculated.
b. Use the ACF and PACF to determine a preliminary ARIMA model.
c. Obtain an estimate of the preliminary model, using ML estimation.
d. Diagnostically check the preliminary model to see if the residual is white noise. If the
residual is white noise, the model is complete. If the residual is not white noise, then
adjust the model based on the ACF and PACF of the residual and repeat steps c through d
as many times as necessary to obtain white noise in the residual.
e. Optionally, use the ARIMA model identified in step d to forecast.
If the series is not stationary, then the variance and covariance terms defined in (7.1-17) do not exist. For further detail on this problem see section 12.4. Define μ as the estimated mean of the series . An estimate of the autocorrelation at lag j, , is defined as
,(7.1-16)
where the estimated covariance is
(7.1-17)
Equation (7.1-17) uses T in the denominator in contrast with the usual covariance formula, which uses the small sample approximation, T-j. Equation (7.1-16) will make the estimated autocorrelation smaller than the formula with the small sample approximation. A series is deemed white noise if there are no significant spikes in the ACF.[5]The Bartlett (1946) formula for the estimated standard error of is
(7.1-18)
Equation (7.1-18) reduces to the large sample approximation if . The Bartlett form of the standard error will be greater than or equal to the large sample approximation. An estimated autocorrelation coefficient for lag k will be deemed significant at approximately the 95% confidence interval if . The first k estimated autocorrelations can be tested with the Box-Pierce (1970) statistic
, (7.1-19)
which is distributed as chi-square with k-p-q degrees of freedom, where p and q are the number of AR and MA coefficients in the multiplied out ARIMA model given in equation (7.1-9), or the Ljung-Box (1978) modified Q statistic
,(7.1-20)
which is thought to be closer to chi-square with k-p-q degrees of freedom in moderate-sized samples. The autocorrelation function in the bjiden and bjest paragraphs provides an estimate of the modified Q for each estimated autocorrelation coefficient, and the Q and the modified Q and the levels of significance for the first NCHI estimated autocorrelation coefficients. If NCHI is not set, it defaults to MIN(24,NAC), where NAC is the number of autocorrelation coefficients.
The estimated partial autocorrelation coefficients , where k is specified by the user with the NPAC parameter, are calculated as the successive solutions to the Yule-Walker equations for given the estimated autocorrelations [6]
(7.1-21)
ARIMA and Transfer Function Models 7-1
Inspection of the PACF indicates the maximum order of the pure AR model given in equation (7.1-10). Table 7.1, which was taken from Box and Jenkins (1976, 79), illustrates how the ACF and the PACF are used to determine the appropriate orders p and q of the AR and MA parts of an ARIMA model.
Table 7.1 Use of the ACF and PACF to Identify an ARIMA Model
______
Model ACF PatternPACF Pattern
Pure MA (7.1-11) Spikes at lags 1 - qTail off
then cut off
Pure AR (7.1-10)Tail off according toSpikes at lags
(7.1-21) 1-p then cut off
ARIMA (7.1-9)Irregular pattern 1-qTail off
then tail off according
to (7.1-21)
______
For the purposes of the table, q is the maximum MA term in Fx (B) for an ARIMA model and p
is the maximum AR term in Hx (B) in equation (7.1-9).
If a series is not stationary, the ACF and PACF are not defined, although they can be calculated. The usual procedure to check for stationarity is to calculate the ACF and see if the coefficients die out. If the coefficients do not die out, more differencing is used. While the above procedure has traditionally been followed, unit root tests such as the Dickey-Fuller and Phillips-Perron, can be employed. These tests, callable from the bispec sentence under the bjiden and bjest sentences and available under the matrix command, are discussed in Chapter 12. The following commands will estimate the ACF and PACF for active variables MONEY and PRICE.
b34sexec bjiden list=(money,price) plot=(money,price)
nac=48 iwtpa $
var money price$
seriesn var=money name=('money supply m2')$
seriesn var=price name=('cpi all items')$
rtrans dif=(2,1)(1,12)$
rauto money price$
b34seend$
There will be six sets of autocorrelations calculated. These are MONEY, (1-B)MONEY,
(1-B)2MONEY, (1-B12)MONEY, (1-B)(1-B12)MONEY and (1-B)2(1-B12)MONEY and the corresponding transformations for PRICE. Both series have been optionally listed and plotted (list and plot parameters) and the autocorrelations have been plotted (iwtpa option). The number of autocorrelations and partial autocorrelations has been set at 48. Assume that (1-B)MONEY is stationary. The ACF of (1-B)2MONEY would show added spikes due to the over differencing. Once the series is deemed stationary, Table 7.1 is used to select the model. Assume a preliminary guess for the correct model is an ARIMA model of the form
, (7.1-22)
where, if there is no confusion, we assume .
The following bjest commands will estimate the model listed in equation (7.1-22) and forecast for 12 periods ahead, starting from the 200th observation.
b34sexec bjest$
model money $
modeln p=12 q=1 dif(1,1)$
forecast nf=12 nt=200$
b34seend$
The bjest command uses a variant of the Meeter (1964a, 1964b) nonlinear routine, GAUSHAUS, to minimize the sum of squares of the residual of the general ARIMA model listed in equation (7.1-9). Since the shocks of the system before observation one are not known, the procedure sets them to their expected value of zero. If the iwbf option is set, the model is run backward through time to generate estimates of the errors prior to the first observation. It is recommended that this option not be used initially because of convergence problems using this "exact" ARIMA model. The SASproc arma attempts to get around the problem of obtaining values for the error terms prior to observation 1 by optionally removing them from the calculated residual sum of squares. This approach has not been implemented inB34S at this time because it appears to be arbitrary.
A major problem in ARIMA model estimation is high parameter correlation. Consider the following alternative models for series .
(7.1-23)
(7.1-24)
Equation (7.1-23), which is specified as P=(1)(12), implies a model of the form of equation (7.1-24), where
(7.1-25)
Model (7.1-24) can be estimated directly as P=(1,12,13) without constraining by equation (7.1-25). If model (7.1-24) is estimated in error, when model (7.1-23) is the correct model, the correlation between and the other two parameters, and , will be high. The bjest command displays a correlation matrix of the parameters to test for this problem. The reciprocal of the condition of the correlation matrix of the parameters is given as a further measure to detect problems of parameter redundancy.
ARIMA and Transfer Function Models 7-1
The bjest command also provides a rank condition on the covariance matrix. If there is extreme parameter collinearity, a message is given and the SEs of the coefficients are not calculated. If the user still wants to proceed, the irisk option can be specified to force inversion. If irisk is set, B34Smay generate a traceback. SEs calculated with irisk set should be used with caution. Rather than proceeding with irisk, the correct procedure is to try and determine whether it is possible to simplify a model of the form of equation (7.1-24) to a form like equation (7.1-23). The covariance of the parameters will indicate which parameters are related. If this is not available due to extreme rank problems, the best strategy is to simplify the model and attempt to "walk" in the parameters one at a time to determine what is causing the problem.
Define as the forecast for x in period t+j, given information in period t. From equation (7.1-11), we can write
(7.1-26)
where, by notational convention, and μ is the mean of the process. Since , then
(7.1-27)
and
(7.1-28)
since the sum converges where and as we get sufficiently far out, all future shocks are set to their expected value of 0 . The bjest command lists the weights if forecasting is requested. Inspection of these weights will indicate how soon the forecast will converge to the mean. If the weights die out for , this means the model cannot forecast anything but the mean of the process more than k periods out. The standard deviation of the error in the forecast j periods in the future, , is defined in terms of weights as
(7.1-29)
In addition to the ACF and PACF, often the inverse autocorrelation function IACF (Wei(2006, 122) and extended sample autocorrelation function , ESACF, Wei(2006, 128-133) and Tiao-Tsay (1983), Tsay-Tiao (1984) ) are employed to provide added insight. Since a model can be written as an model and an model can be written as a model, the idea of the IACF is given an unknownARMA(p,q) model,to calculate the ACF of the ARMA(q,p) model. For a ARMA(0,1) the ACF has one spike. The IACF will trail off since it will be seen as an ARMA(,0). Unlike the ACF and PACF the IACF depends on the number of terms used to model the AR form of the model. The two basic approaches for estimation are shown in equation (7.1-30). Wei (2006, 126) suggests first estimating an AR(p) model using OLS where p is sufficient to capture the process and forming the IACF as
(7.1-30)
for and 0.0 otherwise. Abraham and Ledolter (1984) suggest using the PACF to estimate via the Yule-Walker approach of Box and Jenkins (1976, 82) and forming the IACF using (7.1-30) with PACF values substituted in place of the OLS values for. Using OLS, if an AR(k) model was estimated, will in general differ if an AR(k+j) j>0 is estimated. This is not true if the PACF is used. The maximum order of the AR(k) model must be selected such that the process is fully captured. Using (7.1-30) note that from Wei (2006, 127)
(7.1-31)
Abraham and Ledolter (1984, 610) make the point that contrary to the view of some that the IACF is more informative than the sample PACF, they "find that the informnation in the inverse autocorrelations is rather limited."
The ESACF involves calculation of iterated regressions that attempt to calculate consistent estimates of the AR terms assuming higher and higher order MA terms. Inspection of an ESACF table for p=1,maxar and q=1,maxma, where maxar and maxma are the maximum orders considered, gives insight into the range of models that might be considered, since there no one unique filter. This approach is not as useful for seasonal models.[7] Following Wei (2006) assume a series whose model was ARMA(p,q). If a model is estimated, then will inconsistent and the residuals not white. The idea is to try and detect the appropriate number of AR and MA terms. Define from which we get the new model . This generates a second model of the form or in general
(7.1-32)
where
(7.1-33)
The mth ESACF at lag j, , is the sample autocorrelation of the transformed series
(7.1-34)
The B34S implementation provides a table of ,and a simplified table and tables of for all values. Section 7.8 provides an example.
Once an ARIMA model is tentatively identified, the next task is a preliminary estimation. The residuals of this model are then tested and, if need be, the model is adjusted until the residuals are "clean" or in other words contain no significant spikes. The nonlinear estimation procedure of the ARIMA model in bjest uses a variant of the GAUSHAUS program developed by Meeter (1964a, 1964b). Further discussion of this program is contained in Chapter 11, which discusses nonlinear estimation. The bjest command contains a number of parameters to assist the user in estimating the model. The parameter mit sets the maximum allowable iterations. The default is 20 and is rarely exceeded. The parameter esp1 sets the maximum change in the sum of squares before iteration stops. If this parameter is not given, this stopping rule is not used. The parameter esp2 allows the user to set the maximum allowable percent change in any parameter before iteration stops. The default for esp2 is .004. The user is cautioned against raising eps2. In the next section, ARIMA filters are used to identify transfer functions.
7.2 Identifying and Estimating a Transfer Function Model
Assume a multiple-input transfer function model of the form
(7.2-1)
where are all polynomials in the lag operator B, and there are k input series, where is the tth observation of the ith input series, is the error term for the tth observation and is the dependent variable. If we assume that , then equation (7.2-1) converges to an ARIMA model such as equation (7.1-4). If we assume that , equation (7.2-1) is a rational distributed lag model. If, in addition, we simplify , then equation (7.2-1) converges to a multiple-input OLS model, such as equation (2.1-7). Thus, the transfer function model is a more general case of the OLS and the ARIMA models. Equations (2.5-13) – (2.5-17) showed how (7.2-1) is a more general case of the GLS model, which itself could be written as a constrained Granger Causality model.