(Previously presented at SAS Global Forum, Dallas, 2015)

Adventures in Forecasting

David A. Dickey

NC State University

Learning objectives:

__Understand ARIMA models.

__Interpret ARIMA output from PROC ARIMA

__Forecast with forecast intervals

__Understand when to difference data

__Understand advantages/disadvantages of deterministic vs.

stochastic inputs

__Compare forecasts from deterministic versus stochastic input

models.

__ Incorporate Trends

__ Incorporate Seasonality

__ (optional) Introduce Cointegration

Note: Examples here are run in SAS™

™ SAS and its products are registered trademarks of SAS Institute, Cary, NC, USA

Brief index

Overview

Autoregressive Models

Model checking

Prediction Intervals

Moving Average Models

Model Selection - AIC

Stationarity – Unit Roots

Determining Lag Differences for Unit Root Tests

Models with Inputs (PROC AUTOREG)

Detecting Outliers

Seasonal Models

(optional) Nonlinear Trends

(optional) St. Petersburg Visitor Example

(optional) Seasonal Unit Roots

(optional) Cointegrated Series

Real data examples

Silver, Iron & Steel, Brewers, Corn Yields, Harley, NCSU Energy, Deer Crash, Ice Classic, St. Pete Visits, T-bill rates

Overview of Time Series and Forecasting:

Data taken over time (usually equally spaced)

Yt = data at time t, μ = mean (constant over time)

Simplest model: Yt = μ + et

where et ~ N(0,σ²) independent. Forecast: the mean μ.

Example 0: Yt = μ + Zt , corr(Zt,Zt-1)=0.8

Model: (Yt - μ) = 0.8(Yt-1 - μ) + et , et ~ N(0,σ²) independent

ARIMA Models:

AR(p): “Autoregressive” (Yt - μ) = φ1(Yt-1 - μ) + … + φp(Yt-p - μ) + et

et independent, constant variance: “White Noise”

How to find p? Regress Y on lags.

PACF Partial Autocorrelation Function

(1) Regress Yt on Yt-1 then Yt on Yt-1 and Yt-2

then Yt on Yt-1,Yt-2, Yt-3 etc.

(2) Plot last lag coefficients versus lags.
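The two-step recipe above can be sketched outside SAS. The following pure-Python simulation (illustrative only; the series, seed, and parameters μ = 100, φ = 0.8 are made up) fits the successive lag regressions by normal equations and reads off the last coefficient at each lag:

```python
import random

random.seed(1)
# Simulate an AR(1) series: (Y_t - mu) = 0.8*(Y_{t-1} - mu) + e_t
n, mu, phi = 4000, 100.0, 0.8
y = [mu]
for _ in range(n + 500):
    y.append(mu + phi * (y[-1] - mu) + random.gauss(0, 1))
y = y[500:]                       # drop burn-in
ybar = sum(y) / len(y)
z = [v - ybar for v in y]         # demean, so the regressions need no intercept

def last_coef(p):
    """Last coefficient when z_t is regressed on z_{t-1},...,z_{t-p}."""
    T = len(z)
    A = [[sum(z[t - i] * z[t - j] for t in range(p, T)) for j in range(1, p + 1)]
         for i in range(1, p + 1)]
    b = [sum(z[t] * z[t - i] for t in range(p, T)) for i in range(1, p + 1)]
    for i in range(p):                      # Gaussian elimination
        for j in range(i + 1, p):
            f = A[j][i] / A[i][i]
            A[j] = [a - f * c for a, c in zip(A[j], A[i])]
            b[j] -= f * b[i]
    x = [0.0] * p
    for i in range(p - 1, -1, -1):          # back substitution
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, p))) / A[i][i]
    return x[p - 1]

pacf1, pacf2 = last_coef(1), last_coef(2)
print(round(pacf1, 2), round(pacf2, 2))  # PACF spike near 0.8 at lag 1, near 0 at lag 2
```

For a true AR(1) the PACF cuts off after lag 1, which is exactly the pattern the plots below are read for.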

Example 1: Supplies of Silver in NY commodities exchange:

Getting PACF (and other identifying plots). SAS code:

PROC ARIMA data=silver

plots(unpack) = all;

IDENTIFY var=silver; run;

PACF

“Spikes” outside 2 standard error bands are statistically significant

Two spikes ⇒ p=2

How to estimate μ and the φ's?

PROC ARIMA’s ESTIMATE statement.

Use maximum likelihood (ml option)

PROC ARIMA data=silver plots(unpack) = all;

identify var=silver;

ESTIMATE p=2 ml;

Maximum Likelihood Estimation
Parameter / Estimate / Standard Error / t Value / Approx
Pr > |t| / Lag
MU / 668.29592 / 38.07935 / 17.55 / <.0001 / 0
AR1,1 / 1.57436 / 0.10186 / 15.46 / <.0001 / 1
AR1,2 / -0.67483 / 0.10422 / -6.48 / <.0001 / 2

Backshift notation: B(Yt)=Yt-1, B2(Yt)=B(B(Yt))=Yt-2

SAS output: (uses backshift)

Autoregressive Factors
Factor 1: / 1 - 1.57436 B**(1) + 0.67483 B**(2)

Checks:

(1) Overfit (try AR(3) )

ESTIMATE p=3 ml;

Maximum Likelihood Estimation
Parameter / Estimate / Standard Error / t Value / Approx
Pr > |t| / Lag
MU / 664.88129 / 35.21080 / 18.88 / <.0001 / 0
AR1,1 / 1.52382 / 0.13980 / 10.90 / <.0001 / 1
AR1,2 / -0.55575 / 0.24687 / -2.25 / 0.0244 / 2
AR1,3 / -0.07883 / 0.14376 / -0.55 / 0.5834 / 3

(2) Residual autocorrelations

Residual rt

Residual autocorrelation at lag j: Corr(rt, rt-j) = ρ(j)

Box-Pierce Q statistic: estimate k of these autocorrelations, square them, sum, and multiply by the sample size n. PROC ARIMA reports k in sets of 6. The limit distribution is chi-square if the errors are independent.

Later modification: Box-Ljung statistic for H0:residuals uncorrelated
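The two statistics are easy to compute by hand. A Python sketch (not SAS; the "residuals" here are simulated white noise, so Q should look like an ordinary chi-square(6) draw):

```python
import random

random.seed(2)
n = 200
r = [random.gauss(0, 1) for _ in range(n)]        # residual series (white noise here)
rbar = sum(r) / n
c0 = sum((x - rbar) ** 2 for x in r) / n          # lag-0 autocovariance

def rho(j):
    """Residual autocorrelation at lag j."""
    return sum((r[t] - rbar) * (r[t - j] - rbar) for t in range(j, n)) / n / c0

K = 6
Q_bp = n * sum(rho(j) ** 2 for j in range(1, K + 1))                      # Box-Pierce
Q_lb = n * (n + 2) * sum(rho(j) ** 2 / (n - j) for j in range(1, K + 1))  # Ljung-Box
print(round(Q_bp, 2), round(Q_lb, 2))
```

The Ljung-Box version reweights lag j by (n+2)/(n−j), so it is always slightly larger than Box-Pierce in finite samples.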

SAS output:

Autocorrelation Check of Residuals
To Lag / Chi-Square / DF / Pr > ChiSq / Autocorrelations
6 / 3.49 / 4 / 0.4794 / -0.070 / -0.049 / -0.080 / 0.100 / -0.112 / 0.151
12 / 5.97 / 10 / 0.8178 / 0.026 / -0.111 / -0.094 / -0.057 / 0.006 / -0.110
18 / 10.27 / 16 / 0.8522 / -0.037 / -0.105 / 0.128 / -0.051 / 0.032 / -0.150
24 / 16.00 / 22 / 0.8161 / -0.110 / 0.066 / -0.039 / 0.057 / 0.200 / -0.014

Residuals uncorrelated ⇒ residuals are white noise ⇒ residuals are unpredictable

SAS computes Box-Ljung on original data too.

Autocorrelation Check for White Noise
To Lag / Chi-Square / DF / Pr > ChiSq / Autocorrelations
6 / 81.84 / 6 / <.0001 / 0.867 / 0.663 / 0.439 / 0.214 / -0.005 / -0.184
12 / 142.96 / 12 / <.0001 / -0.314 / -0.392 / -0.417 / -0.413 / -0.410 / -0.393

Data autocorrelated ⇒ predictable!

Note: All p-values are based on an assumption called “stationarity” discussed later.

How to predict?

One step prediction

Two step prediction

Prediction error variance ( σ² = variance(et) )

From prediction error variances, get 95% prediction intervals. Can estimate variance of et from past data. SAS PROC ARIMA does it all for you!
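As a concrete sketch for an AR(1) (Python, with purely hypothetical numbers: μ = 668.3, φ = 0.9, σ² = 25, last observation 700), the h-step forecast is μ + φ^h(Yt − μ) and its error variance is σ²(1 + φ² + … + φ^(2(h−1))):

```python
import math

mu, phi, sigma2, y_last = 668.3, 0.9, 25.0, 700.0   # illustrative values only
for h in (1, 2, 3):
    fc = mu + phi ** h * (y_last - mu)                  # point forecast decays toward mu
    v = sigma2 * sum(phi ** (2 * j) for j in range(h))  # forecast error variance grows
    half = 1.96 * math.sqrt(v)                          # 95% interval half-width
    print(h, round(fc, 1), round(fc - half, 1), round(fc + half, 1))
```

The intervals widen with h and level off at ±1.96σ/√(1−φ²), which is what PROC ARIMA's fan-shaped forecast bands show.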

Moving Average, MA(q), and ARMA(p,q) models

MA(1): Yt = μ + et - θet-1   Variance: (1+θ²)σ²

Yt-1 = μ + et-1 - θet-2   ρ(1) = -θ/(1+θ²)

Yt-2 = μ + et-2 - θet-3   ρ(2) = 0/(1+θ²) = 0

Autocorrelation function “ACF” (ρ(j)) is 0 after lag q for MA(q). PACF is useless for identifying q in MA(q).

PACF drops to 0 after lag 3 ⇒ AR(3), p=3

ACF drops to 0 after lag 2 ⇒ MA(2), q=2

Neither drops ⇒ ARMA(p,q) p= ___ q=____
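These cutoff rules can be verified by simulation. A Python sketch (illustrative, not SAS: an MA(1) with θ = 0.6, for which theory gives ρ(1) = −0.6/1.36 ≈ −0.44 and ρ(2) = 0):

```python
import random

random.seed(3)
n, theta = 20000, 0.6
e = [random.gauss(0, 1) for _ in range(n + 1)]
y = [e[t] - theta * e[t - 1] for t in range(1, n + 1)]   # MA(1) with mu = 0
ybar = sum(y) / n
c0 = sum((v - ybar) ** 2 for v in y) / n

def rho(j):
    """Sample autocorrelation at lag j."""
    return sum((y[t] - ybar) * (y[t - j] - ybar) for t in range(j, n)) / n / c0

r1, r2 = rho(1), rho(2)
print(round(r1, 2), round(r2, 2))   # single ACF spike at lag 1, as for any MA(1)
```

One significant ACF spike and a decaying PACF would identify q = 1 here, matching the MA(q) rule above.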

Example 2: Iron and Steel Exports.

PROC ARIMA plots(unpack)=all;

IDENTIFY var=EXPORT;

ACF: could be MA(1) PACF: could be AR(1)

Spikes at lags 0 and 1 (the lag-0 spike, always 1, is not displayed)

ESTIMATE P=1 ML;

ESTIMATE Q=2 ML;

ESTIMATE Q=1 ML;

Maximum Likelihood Estimation

Approx

Parameter Estimate t Value Pr>|t| Lag

MU 4.42129 10.28 <.0001 0

AR1,1 0.46415 3.42 0.0006 1

MU 4.43237 11.41 <.0001 0

MA1,1 -0.54780 -3.53 0.0004 1

MA1,2 -0.12663 -0.82 0.4142 2

MU 4.42489 12.81 <.0001 0

MA1,1 -0.49072 -3.59 0.0003 1

How to choose? AIC - smaller is better

AIC: -2 ln(Lmax)+2(# parameters)

Lmax = max of likelihood function

AIC 165.8342 (MA(1)) ← smallest

AIC 166.3711 (AR(1))

AIC 167.1906 (MA(2))

FORECAST lead=5 out=out1 id=date interval=year;

Example 3: Brewers’ Proportion Won

Mean of Working Series 0.478444

Standard Deviation 0.059934

Autocorrelations

Lag -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1

Correlation Std Error

0 1.00000 | |********************| 0

1 0.52076 | . |********** | 0.149071

2 0.18663 | . |**** . | 0.185136

3 0.11132 | . |** . | 0.189271

4 0.11490 | . |** . | 0.190720

5 -.00402 | . | . | 0.192252

6 -.14938 | . ***| . | 0.192254

7 -.13351 | . ***| . | 0.194817

8 -.06019 | . *| . | 0.196840

9 -.05246 | . *| . | 0.197248

10 -.20459 | . ****| . | 0.197558

11 -.22159 | . ****| . | 0.202211

12 -.24398 | . *****| . | 0.207537

"." marks two standard errors

Could be MA(1)

Autocorrelation Check for White Noise

To Chi- Pr >

Lag Square DF ChiSq ------Autocorrelations------

6 17.27 6 0.0084 0.521 0.187 0.111 0.115 -0.004 -0.149

12 28.02 12 0.0055 -0.134 -0.060 -0.052 -0.205 -0.222 -0.244

NOT White Noise!

SAS Code:

PROC ARIMA data=brewers;

IDENTIFY var=Win_Pct nlag=12; run;

ESTIMATE q=1 ml;

Maximum Likelihood Estimation

Standard Approx

Parameter Estimate Error t Value Pr > |t| Lag

MU 0.47791 0.01168 40.93 <.0001 0

MA1,1 -0.50479 0.13370 -3.78 0.0002 1

AIC -135.099

Autocorrelation Check of Residuals

To Chi- Pr >

Lag Square DF ChiSq ------Autocorrelations------

6 3.51 5 0.6219 0.095 0.161 0.006 0.119 0.006 -0.140

12 11.14 11 0.4313 -0.061 -0.072 0.066 -0.221 -0.053 -0.242

18 13.54 17 0.6992 0.003 -0.037 -0.162 -0.010 -0.076 -0.011

24 17.31 23 0.7936 -0.045 -0.035 -0.133 -0.087 -0.114 0.015

Estimated Mean 0.477911

Moving Average Factors

Factor 1: 1 + 0.50479 B**(1)

Partial Autocorrelations

Lag Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1

1 0.52076 | . |********** |

2 -0.11603 | . **| . |

3 0.08801 | . |** . |

4 0.04826 | . |* . |

5 -0.12646 | . ***| . |

6 -0.12989 | . ***| . |

7 0.01803 | . | . |

8 0.01085 | . | . |

9 -0.02252 | . | . |

10 -0.20351 | . ****| . |

11 -0.03129 | . *| . |

12 -0.18464 | . ****| . |

OR … could be AR(1)

ESTIMATE p=1 ml;

Maximum Likelihood Estimation

Standard Approx

Parameter Estimate Error t Value Pr > |t| Lag

MU 0.47620 0.01609 29.59 <.0001 0

AR1,1 0.53275 0.12750 4.18 <.0001 1

AIC -136.286 (vs. -135.099)

Autocorrelation Check of Residuals

To Chi- Pr >

Lag Square DF ChiSq ------Autocorrelations------

6 3.57 5 0.6134 0.050 -0.133 -0.033 0.129 0.021 -0.173

12 8.66 11 0.6533 -0.089 0.030 0.117 -0.154 -0.065 -0.181

18 10.94 17 0.8594 0.074 0.027 -0.161 0.010 -0.019 0.007

24 13.42 23 0.9423 0.011 -0.012 -0.092 -0.081 -0.106 0.013

Model for variable Win_pct

Estimated Mean 0.476204

Autoregressive Factors

Factor 1: 1 - 0.53275 B**(1)

Conclusions for Brewers:

Both models have statistically significant

parameters.

Both models are sufficient (no lack of fit)

Predictions from MA(1):

First one uses correlations

The rest are on the mean.

Predictions for AR(1):

Converge exponentially fast toward mean

Not much difference but AIC prefers AR(1)

Stationarity

(1) Mean constant (no trends)

(2) Variance constant

(3) Covariance γ(j) and correlation

ρ(j) = γ(j)/γ(0)

between Yt and Yt-j depend only on j

ARMA(p,q) model:

Stationarity guaranteed whenever solutions of equation (roots of “characteristic polynomial”)

Xp – 1Xp-12Xp-2…p =0

are all <1 in magnitude.

Examples

(1) Yt = .8(Yt-1) + et X-.8=0  X=.8 stationary

(2) Yt = 1.00(Yt-1) + et nonstationary

Note: Yt= Yt-1 + et Random walk

(3) Yt = 1.6(Yt-1)  0.6(Yt-2)+ et

“characteristic polynomial”

X21.6X+0.6=0  X=1 or X=0.6

nonstationary (unit root X=1)

(Yt)(Yt-1) =0.6[(Yt-1) (Yt-2)]+ et

(YtYt-1) =0.6(Yt-1 Yt-2) + et

First differences form stationary AR(1) process!

No mean – no mean reversion – no gravity pulling toward the mean.

(4) Yt = 1.60(Yt-1)  0.63(Yt-1)+ et

X21.60X+0.63=0  X=0.9 or X=0.7

|roots|<1  stationary

(Yt)(Yt-1) = 0.03(Yt-1)

+ 0.63[(Yt-1) (Yt-2)]+ et

YtYt-1 = 0.03(Yt-1) + 0.63(Yt-1Yt-2)+ et *

Unit Root testing (H0: Series has a unit root)

Regress YtYt-1 on Yt-1 and (Yt-1Yt-2)

Look at t test for Yt-1. If it is significantly negative then stationary.

*Note: If X=1 then -(X² - 1.60X + 0.63) = -0.03

(this always equals the lag-Y coefficient, so a unit root ⇒ coefficient 0)

Problem: Distribution of “t statistic” on Yt-1 is not t distribution under unit root hypothesis.

Distribution looks like this histogram:

(1 million random walks of length n=100)

Overlays: N(sample mean & variance) N(0,1)

Correct distribution: Dickey-Fuller test in PROC ARIMA.

-2.89 is the correct (left) 5th %ile

46% of t’s are less than -1.645

(the normal 5th percentile)
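The histogram experiment is easy to reproduce on a smaller scale. A pure-Python sketch (2,000 random walks rather than 1 million; regression of ΔYt on an intercept and Yt-1, matching the single-mean case) shows the same severe departure from the N(0,1) tail:

```python
import random, math

random.seed(4)

def df_t(n=100):
    """t statistic on Y_{t-1} when dY_t is regressed on (1, Y_{t-1}), random-walk data."""
    y = [0.0]
    for _ in range(n):
        y.append(y[-1] + random.gauss(0, 1))
    x = y[:-1]                                   # Y_{t-1}
    d = [y[t + 1] - y[t] for t in range(n)]      # differences
    xbar, dbar = sum(x) / n, sum(d) / n
    sxx = sum((v - xbar) ** 2 for v in x)
    sxd = sum((v - xbar) * (w - dbar) for v, w in zip(x, d))
    b1 = sxd / sxx
    b0 = dbar - b1 * xbar
    sse = sum((w - b0 - b1 * v) ** 2 for v, w in zip(x, d))
    return b1 / math.sqrt(sse / (n - 2) / sxx)

ts = [df_t() for _ in range(2000)]
frac = sum(t < -1.645 for t in ts) / len(ts)
print(round(frac, 2))   # near 0.46, nowhere near the nominal 0.05
```

Using the normal 5th percentile here would reject a true unit root almost half the time, which is why the Dickey-Fuller tables are required.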

Example 3 revisited: Brewers

PROC ARIMA data=brewers;

IDENTIFY var=Wins nlag=12

stationarity=(ADF=0); run;

Dickey-Fuller Unit Root Tests

Type Lags Rho Pr < Rho Tau Pr < Tau

Zero Mean 0 -0.1803 0.6376 -0.22 0.6002

Single Mean 0 -21.0783 0.0039 -3.75 0.0062

Trend 0 -21.1020 0.0287 -3.68 0.0347

Why “single mean?” Series has nonzero mean and no trend.

Conclusion reject H0:unit roots so Brewers series is stationary (mean reverting).

0 lags ⇒ do not need lagged differences in model (just regress Yt-Yt-1 on Yt-1)

Example 1 revisited: Stocks of silver

Needed AR(2) (2 lags) so regress

Yt-Yt-1 (D_Silver) on

Yt-1 (L_Silver) and Yt-1-Yt-2 (D_Silver_1)

PROC REG:

Parameter

Variable DF Estimate t Value Pr>|t|

Intercept 1 75.58073 2.76 0.0082

L_Silver 1 -0.11703 -2.78 0.0079 ← wrong distribution

D_Silver_1 1 0.67115 6.21 <.0001 ← OK

PROC ARIMA:

Augmented Dickey-Fuller Unit Root Tests

Type Lags Rho Pr<Rho Tau Pr<Tau

Zero Mean 1 -0.2461 0.6232 -0.28 0.5800

Single Mean 1 -17.7945 0.0121 -2.78 0.0689 ← OK

Trend 1 -15.1102 0.1383 -2.63 0.2697

Same t statistic, corrected p-value!

Conclusion: Unit root ⇒ difference the series.

1 lag ⇒ need 1 lagged difference in model (regress Yt-Yt-1 on Yt-1 and Yt-1-Yt-2)

PROC ARIMA data=silver;

IDENTIFY var=silver(1)

stationarity=(ADF=(0));

ESTIMATE p=1 ml;

FORECAST lead=24 out=outN

ID=date Interval=month;

Unit root forecast & forecast interval

HOW MANY LAGGED DIFFERENCES?

Regression:

(YtYt1) =

b0 + b1Yt1 + b2(Yt1Yt2) + …+ bp(Ytp1Ytp)

 not.  |  standard distributions for these |

standard

Dickey & Fuller (1979)

  • Lagged difference coefficients b2 … bp have standard (asymptotically normal) distributions. Trust their t test p-values in PROC REG.
  • b0 and b1 have t statistics with same nonstandard limit distributions as in the AR(1) model.
  • Implication: Just use PROC REG to determine appropriate number of lagged differences.
  • Too few => invalid tests
  • Too many => loss of power

Said & Dickey (1984, 1985) prove that methods work even if moving average terms are present.

Chang and Dickey (1993) show that the Inverse Autocorrelation Function (IACF) can be used to check for overdifferencing.

Yt- = (Yt-1-) + et – et-1 (||<1)

Autoregressive Moving Average

“Dual” model:

Yt- =  (Yt-1-) + et – et-1

Definition: Inverse Autocorrelation Function is Autocorrelation Function of dual model.

IACF estimation: (a) Fit long autoregression,

(b) move coefficients to moving average (MA) side, (c) calculate ACF as if estimated MA is true.

Chang (1993): Moving average unit root (e.g. θ=1) ⇒ slow decay in IACF (Inverse AutoCorrelation Function)

Differencing whenever you see a trend is NOT appropriate:

IACF from generated linear trend plus white noise:
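The point can be checked by simulation: differencing a trend-plus-white-noise series manufactures an MA(1) error with θ = 1, since ΔYt = β + et − et−1. A Python sketch (illustrative; it verifies the lag-1 autocorrelation of the differences rather than computing the IACF itself):

```python
import random

random.seed(5)
n = 5000
e = [random.gauss(0, 1) for _ in range(n + 1)]
# Linear trend plus white noise, then difference anyway:
y = [2.0 + 0.1 * t + e[t] for t in range(n + 1)]
d = [y[t] - y[t - 1] for t in range(1, n + 1)]   # = 0.1 + e_t - e_{t-1}: MA(1), theta = 1
dbar = sum(d) / n
c0 = sum((v - dbar) ** 2 for v in d) / n
r1 = sum((d[t] - dbar) * (d[t - 1] - dbar) for t in range(1, n)) / n / c0
print(round(r1, 2))   # near -0.5 = -theta/(1+theta^2) with theta = 1
```

The induced moving average unit root is exactly what the slowly decaying IACF flags as overdifferencing.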

Example 4: Corn yields in the U.S. (bushels per acre 1866-1942 and 1943-2014)

Analysis of post 1942 yields.

Levels data:

PROC ARIMA;

IDENTIFY var=Yield stationarity=(adf=0);

Dickey-Fuller Unit Root Tests

Type Lags Tau Pr<Tau

Zero Mean 0 0.61 0.8454

Single Mean 0 -1.23 0.6559

Trend 0 -8.65 <.0001

Autocorrelation Check of Residuals

(from linear trend plus white noise)

To Chi- Pr >

Lag Square DF ChiSq --Autocorrelations-----

6 4.27 6 0.6403 -0.05 ... 0.20 -0.06

12 10.82 12 0.5445 -0.02 ... -0.21 -0.08

18 15.54 18 0.6247 0.03 ... 0.04 -0.08

24 22.43 24 0.5535 0.19 ... 0.08 0.09

Suppose we difference anyway:

IACF

Example 1 revisited again: Silver Series

DATA CHECK;

SET SILVER;

Lag_silver = LAG(silver);

Diff = silver-Lag_silver;

OUTPUT; RETAIN;

Diff5=Diff4; Diff4=Diff3; Diff3=Diff2;

Diff2=Diff1; Diff1=Diff;

PROC REG;

MODEL Diff = Lag_silver Diff1-Diff5;

REMOVE_2andup: TEST Diff2=0, Diff3=0, Diff4=0,

Diff5=0; run;

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 124.70018 45.46720 2.74 0.0092 X

Lag_silver 1 -0.19292 0.07144 -2.70 0.0102 X

Diff1 1 0.62085 0.14427 4.30 0.0001 OK **

Diff2 1 0.06292 0.17499 0.36 0.7211 OK

Diff3 1 0.00293 0.17224 0.02 0.9865 OK

Diff4 1 0.21501 0.17392 1.24 0.2238 OK

Diff5 1 0.08100 0.16209 0.50 0.6201 OK

Test REMOVE_2andup Results for Dependent Variable Diff

Mean

Source DF Square F Value Pr > F

Numerator 4 803.50398 0.89 0.4796 OK

Denominator 39 903.76157

What actually happened next in Silver series?

(1) Fit stationary (AR(2)) and nonstationary models (differences~ AR(1)) to the data.

(2) Compute forecasts, stationary and nonstationary

PROC AUTOREG

Fits a regression model (least squares)

Fits stationary autoregressive model to error terms

Refits accounting for autoregressive errors.

Example 5-A: AUTOREG Harley-Davidson closing stock prices 2009-present.

PROC AUTOREG data=Harley;

MODEL close=date/ nlag=15 backstep;

run;

One by one, AUTOREG eliminates insignificant lags then:

Estimates of Autoregressive Parameters
Lag / Coefficient / Standard Error / t Value
1 / -0.975229 / 0.006566 / -148.53

Final model:

Parameter Estimates
Variable / DF / Estimate / Standard Error / t Value / Approx
Pr > |t|
Intercept / 1 / -412.1128 / 35.2646 / -11.69 / <.0001
Date / 1 / 0.0239 / 0.001886 / 12.68 / <.0001

In PROC AUTOREG the error model is written Zt + αZt-1 = et rather than Zt - αZt-1 = et, so with the reported coefficient α = -0.975229,

error term Zt satisfies Zt - 0.975Zt-1 = et.

Example 5-B: ARIMA Harley-Davidson closing stock prices 01/01/2009 through 05/13/2013 (vs. AUTOREG)

Apparent upward movement:

Linear trend or nonstationary?

Regress

Yt – Yt-1 on 1, t, Yt-1 (& lagged differences)

H0: Yt = δ + Yt-1 + et “random walk with drift”

H1: Yt = α + βt + Zt with Zt stationary AR(p) *

New distribution for t-test on Yt-1

With trend

* Yt=t + Zt general model so

YtYt-1 =t-(t-1) + Zt –Zt-1

and if Zt=Zt-1+et (unit root) then

YtYt-1 = + et or YtYt-1 + et

“Random walk with drift ”

(Histograms of the simulated t statistic: without trend, with trend)

1 million simulations - runs in 7 seconds!

SAS code for Harley stock closing price

PROC ARIMA data=Harley;

IDENTIFY var=close stationarity=(adf)

crosscor=(date) noprint;

ESTIMATE input=(date) p=1 ml;

FORECAST lead=120 id=date interval=weekday

out=out1; run;

Stationarity test (0,1,2 lagged differences):

Augmented Dickey-Fuller Unit Root Tests
Type / Lags / Rho / Pr < Rho / Tau / Pr < Tau
Zero Mean / 0 / 0.8437 / 0.8853 / 1.14 / 0.9344
1 / 0.8351 / 0.8836 / 1.14 / 0.9354
2 / 0.8097 / 0.8786 / 1.07 / 0.9268
Single Mean / 0 / -2.0518 / 0.7726 / -0.87 / 0.7981
1 / -1.7772 / 0.8048 / -0.77 / 0.8278
2 / -1.8832 / 0.7925 / -0.78 / 0.8227
Trend / 0 / -27.1559 / 0.0150 / -3.67 / 0.0248
1 / -26.9233 / 0.0158 / -3.64 / 0.0268
2 / -29.4935 / 0.0089 / -3.80 / 0.0171

Conclusion: stationary around a linear trend.

Estimates: trend + AR(1)

Maximum Likelihood Estimation
Parameter / Estimate / Standard Error / t Value / Approx
Pr > |t| / Lag / Variable / Shift
MU / -412.08104 / 35.45718 / -11.62 / <.0001 / 0 / Close / 0
AR1,1 / 0.97528 / 0.0064942 / 150.18 / <.0001 / 1 / Close / 0
NUM1 / 0.02391 / 0.0018961 / 12.61 / <.0001 / 0 / Date / 0
Autocorrelation Check of Residuals
To Lag / Chi-Square / DF / Pr > ChiSq / Autocorrelations
6 / 3.20 / 5 / 0.6694 / -0.005 / 0.044 / -0.023 / 0.000 / 0.017 / 0.005
12 / 6.49 / 11 / 0.8389 / -0.001 / 0.019 / 0.003 / -0.010 / 0.049 / -0.003
18 / 10.55 / 17 / 0.8791 / 0.041 / -0.026 / -0.022 / -0.023 / 0.007 / -0.011
24 / 16.00 / 23 / 0.8553 / 0.014 / -0.037 / 0.041 / -0.020 / -0.032 / 0.003
30 / 22.36 / 29 / 0.8050 / 0.013 / -0.026 / 0.028 / 0.051 / 0.036 / 0.000
36 / 24.55 / 35 / 0.9065 / 0.037 / 0.016 / -0.012 / 0.002 / -0.007 / 0.001
42 / 29.53 / 41 / 0.9088 / -0.007 / -0.021 / 0.029 / 0.030 / -0.033 / 0.030
48 / 49.78 / 47 / 0.3632 / 0.027 / -0.009 / -0.097 / -0.026 / -0.074 / 0.026

What actually happened?

Example 6 (with inputs): NCSU Energy Demand

Type of day

Class Days

Work Days (no classes)

Holidays & weekends.

Temperature

Season of Year

Step 1: Make some plots of energy demand vs. temperature and season. Use type of day as color.

Seasons: S = A sin(2πt/365) , C = B cos(2πt/365)

(Plots: demand vs. temperature; demand vs. season of year)
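The S and C regressors are built in a DATA step in the actual analysis; a language-neutral sketch (Python here, with the amplitudes A and B absorbed into the fitted regression coefficients):

```python
import math

def season_terms(t):
    """Annual harmonic pair for day-of-year t (period 365 days)."""
    s = math.sin(2 * math.pi * t / 365)
    c = math.cos(2 * math.pi * t / 365)
    return s, c

s0, c0 = season_terms(0)          # start of year: sin = 0, cos = 1
s_full, c_full = season_terms(365)  # one full cycle later: back to (0, 1)
print(round(s0, 6), round(c0, 6), round(s_full, 6), round(c_full, 6))
```

One sine-cosine pair gives a smooth annual cycle with the peak date and amplitude estimated from the data, using only two regression parameters.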

Step 2: PROC AUTOREG with all inputs:

PROC AUTOREG data=energy;

MODEL demand = temp tempsq class work s c

/nlag=15 backstep dwprob;

output out=out3

predicted = p predictedm=pm

residual=r residualm=rm;

run;

Estimates of Autoregressive Parameters
Lag / Coefficient / Standard Error / t Value
1 / -0.559658 / 0.043993 / -12.72
5 / -0.117824 / 0.045998 / -2.56
7 / -0.220105 / 0.053999 / -4.08
8 / 0.188009 / 0.059577 / 3.16
9 / -0.108031 / 0.051219 / -2.11
12 / 0.110785 / 0.046068 / 2.40
14 / -0.094713 / 0.045942 / -2.06

Autocorrelation at 1, 7, 14, and others.

After autocorrelation adjustments, trust t tests etc.

Parameter Estimates
Variable / DF / Estimate / Standard Error / t Value / Approx
Pr > |t|
Intercept / 1 / 6076 / 296.5261 / 20.49 / <.0001
TEMP / 1 / 28.1581 / 3.6773 / 7.66 / <.0001
TEMPSQ / 1 / 0.6592 / 0.1194 / 5.52 / <.0001
CLASS / 1 / 1159 / 117.4507 / 9.87 / <.0001
WORK / 1 / 2769 / 122.5721 / 22.59 / <.0001
S / 1 / -764.0316 / 186.0912 / -4.11 / <.0001
C / 1 / -520.8604 / 188.2783 / -2.77 / 0.0060

Residuals from regression part.

Large residual on workday January 2.

Add dummy variable.

Same idea: PROC ARIMA

Step 1: Graphs

Step 2: Regress on inputs, diagnose residual autocorrelation:

Not white noise (bottom right)

Activity (bars) at lag 1, 7, 14

Step 3: Estimate resulting model from diagnostics plus trial and error:

e input = (temp tempsq class work s c)

p=1 q=(1,7,14) ml;

Maximum Likelihood Estimation
Parameter / Estimate / Standard Error / t Value / Approx
Pr > |t| / Lag / Variable / Shift
MU / 6183.1 / 300.87297 / 20.55 / <.0001 / 0 / DEMAND / 0
MA1,1 / 0.11481 / 0.07251 / 1.58 / 0.1133 / 1 / DEMAND / 0
MA1,2 / -0.18467 / 0.05415 / -3.41 / 0.0006 / 7 / DEMAND / 0
MA1,3 / -0.13326 / 0.05358 / -2.49 / 0.0129 / 14 / DEMAND / 0
AR1,1 / 0.73980 / 0.05090 / 14.53 / <.0001 / 1 / DEMAND / 0
NUM1 / 26.89511 / 3.83769 / 7.01 / <.0001 / 0 / TEMP / 0
NUM2 / 0.64614 / 0.12143 / 5.32 / <.0001 / 0 / TEMPSQ / 0
NUM3 / 912.80536 / 122.78189 / 7.43 / <.0001 / 0 / CLASS / 0
NUM4 / 2971.6 / 123.94067 / 23.98 / <.0001 / 0 / WORK / 0
NUM5 / -767.41131 / 174.59057 / -4.40 / <.0001 / 0 / S / 0
NUM6 / -553.13620 / 182.66142 / -3.03 / 0.0025 / 0 / C / 0

Note: class days get class effect 913 plus work effect 2972.

Note 2: Lags are sensible.

Step 4: Check model fit (stats look OK):

Autocorrelation Check of Residuals
To Lag / Chi-Square / DF / Pr > ChiSq / Autocorrelations
6 / 2.86 / 2 / 0.2398 / -0.001 / -0.009 / -0.053 / -0.000 / 0.050 / 0.047
12 / 10.71 / 8 / 0.2188 / 0.001 / -0.034 / 0.122 / 0.044 / -0.039 / -0.037
18 / 13.94 / 14 / 0.4541 / -0.056 / 0.013 / -0.031 / 0.048 / -0.006 / -0.042
24 / 16.47 / 20 / 0.6870 / -0.023 / -0.028 / 0.039 / -0.049 / 0.020 / -0.029
30 / 24.29 / 26 / 0.5593 / 0.006 / 0.050 / -0.098 / 0.077 / -0.002 / 0.039
36 / 35.09 / 32 / 0.3239 / -0.029 / -0.075 / 0.057 / -0.001 / 0.121 / -0.047
42 / 39.99 / 38 / 0.3817 / 0.002 / -0.007 / 0.088 / 0.019 / -0.004 / 0.060
48 / 43.35 / 44 / 0.4995 / -0.043 / 0.043 / -0.027 / -0.047 / -0.019 / -0.032

Looking for “outliers” that can be explained

PROC ARIMA, OUTLIER statement

Available types

(1) Additive (single outlier)

(2) Level shift (sudden change in mean)

(3) Temporary change (level shift for k contiguous time points – you specify k)

NCSU energy: tested every point – 365 tests.

Adjust for multiple testing

Require p < 0.05/365 = .0001369863 (Bonferroni)

OUTLIER type=additive alpha=.0001369863 id=date;

FORMAT date weekdate.; run;

/*****************************************

January 2, 1980 Wednesday: Hangover Day :-)

March 3,1980 Monday:

On the afternoon and evening of March 2, 1980, North Carolina experienced a major winter storm with heavy snow across the entire state and near blizzard conditions in the eastern part of the state. Widespread snowfall totals of 12 to 18 inches were observed over Eastern North Carolina, with localized amounts ranging up to 22 inches at Morehead City and 25 inches at Elizabeth City, with unofficial reports of up to 30 inches at Emerald Isle and Cherry Point (Figure 1). This was one of the great snowstorms in Eastern North Carolina history. What made this storm so remarkable was the combination of snow, high winds, and very cold temperatures.

May 10,1980 Saturday. Graduation!

****************************************/;

Outlier Details
Obs / Time ID / Type / Estimate / Chi-Square / Approx Prob>ChiSq
186 / Wednesday / Additive / -3250.9 / 87.76 / <.0001
315 / Saturday / Additive / 1798.1 / 28.19 / <.0001
247 / Monday / Additive / -1611.8 / 22.65 / <.0001
Outlier Details
Obs / Time ID / Type / Estimate / Chi-Square / Approx Prob>ChiSq
186 / 02-JAN-1980 / Additive / -3250.9 / 87.76 / <.0001
315 / 10-MAY-1980 / Additive / 1798.1 / 28.19 / <.0001
247 / 03-MAR-1980 / Additive / -1611.8 / 22.65 / <.0001

Outliers: Jan 2 (hangover day!), March 3 (snowstorm), May 10 (graduation day).

AR(1) produces 3 ‘rebound’ outlying next-day residuals.

Add dummy variables for explainable outliers

data next; merge outarima energy; by date;

hangover = (date="02Jan1980"d);

storm = (date="03Mar1980"d);

graduation = (date="10May1980"d);

PROC ARIMA data=next;

IDENTIFY var=demand crosscor=(temp tempsq

class work s c hangover graduation

storm) noprint;

ESTIMATE input = (temp tempsq class work

s c hangover graduation storm) p=1

q=(7,14) ml;

FORECAST lead=0 out=outARIMA2 id=date

interval=day; run;

Maximum Likelihood Estimation
Parameter / Estimate / Standard Error / t Value / Approx
Pr > |t| / Lag / Variable / Shift
MU / 6127.4 / 259.43918 / 23.62 / <.0001 / 0 / DEMAND / 0
MA1,1 / -0.25704 / 0.05444 / -4.72 / <.0001 / 7 / DEMAND / 0
MA1,2 / -0.10821 / 0.05420 / -2.00 / 0.0459 / 14 / DEMAND / 0
AR1,1 / 0.76271 / 0.03535 / 21.57 / <.0001 / 1 / DEMAND / 0
NUM1 / 27.89783 / 3.15904 / 8.83 / <.0001 / 0 / TEMP / 0
NUM2 / 0.54698 / 0.10056 / 5.44 / <.0001 / 0 / TEMPSQ / 0
NUM3 / 626.08113 / 104.48069 / 5.99 / <.0001 / 0 / CLASS / 0
NUM4 / 3258.1 / 105.73971 / 30.81 / <.0001 / 0 / WORK / 0
NUM5 / -757.90108 / 181.28967 / -4.18 / <.0001 / 0 / S / 0
NUM6 / -506.31892 / 184.50221 / -2.74 / 0.0061 / 0 / C / 0
NUM7 / -3473.8 / 334.16645 / -10.40 / <.0001 / 0 / hangover / 0
NUM8 / 2007.1 / 331.77424 / 6.05 / <.0001 / 0 / graduation / 0
NUM9 / -1702.8 / 333.79141 / -5.10 / <.0001 / 0 / storm / 0

Model looks fine.

Comparison:

AUTOREG - regression with AR(p) errors

versus

ARIMA – regression with differencing,

ARMA(p,q) errors.

SEASONALITY

Many economic and environmental series show seasonality.

(1) Very regular (“deterministic”) or

(2) Slowly changing (“stochastic”)

Example 7: NC accident reports involving deer.

Method 1: regression

PROC REG data=deer;

MODEL deer=date X11; run;

(X11: 1 in Nov, 0 otherwise)

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 1181.09091 78.26421 15.09 <.0001

X11 1 2578.50909 271.11519 9.51 <.0001

Looks like December and October need dummies too!

PROC REG data=deer;

MODEL deer=date X10 X11 X12;

run;

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 929.40000 39.13997 23.75 <.0001

X10 1 1391.20000 123.77145 11.24 <.0001

X11 1 2830.20000 123.77145 22.87 <.0001

X12 1 1377.40000 123.77145 11.13 <.0001

Average of Jan through Sept. is 929 crashes per month.

Add 1391 in October, 2830 in November, 1377 in December.

Try dummies for all but one month (need “average of rest” so must leave out at least one)

PROC REG data=deer;

MODEL deer=X1-X11;

OUTPUT out=out1 predicted=p residual=r; run;

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 2306.80000 81.42548 28.33 <.0001

X1 1 -885.80000 115.15301 -7.69 <.0001

X2 1 -1181.40000 115.15301 -10.26 <.0001

X3 1 -1220.20000 115.15301 -10.60 <.0001

X4 1 -1486.80000 115.15301 -12.91 <.0001

X5 1 -1526.80000 115.15301 -13.26 <.0001

X6 1 -1433.00000 115.15301 -12.44 <.0001

X7 1 -1559.20000 115.15301 -13.54 <.0001

X8 1 -1646.20000 115.15301 -14.30 <.0001

X9 1 -1457.20000 115.15301 -12.65 <.0001

X10 1 13.80000 115.15301 0.12 0.9051

X11 1 1452.80000 115.15301 12.62 <.0001

Average of rest is just December mean 2307. Subtract 886 in January, add 1452 in November. October (X10) is not significantly different from December.

Residuals for Deer Crash data:

Looks like a trend – add trend (date):

PROC REG data=deer;

MODEL deer=date X1-X11;

OUTPUT out=out1 predicted=p residual=r;

run;

Parameter Standard

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 -1439.94000 547.36656 -2.63 0.0115

X1 1 -811.13686 82.83115 -9.79 <.0001

X2 1 -1113.66253 82.70543 -13.47 <.0001

X3 1 -1158.76265 82.60154 -14.03 <.0001

X4 1 -1432.28832 82.49890 -17.36 <.0001

X5 1 -1478.99057 82.41114 -17.95 <.0001

X6 1 -1392.11624 82.33246 -16.91 <.0001

X7 1 -1525.01849 82.26796 -18.54 <.0001

X8 1 -1618.94416 82.21337 -19.69 <.0001

X9 1 -1436.86982 82.17106 -17.49 <.0001

X10 1 27.42792 82.14183 0.33 0.7399

X11 1 1459.50226 82.12374 17.77 <.0001

date 1 0.22341 0.03245 6.88 <.0001

Trend is 0.22 more accidents per day (1 per 5 days) and is significantly different from 0.

What about autocorrelation?

Method 2: PROC AUTOREG

PROC AUTOREG data=deer;

MODEL deer=date X1-X11/nlag=13 backstep;

run;

Backward Elimination of

Autoregressive Terms

Lag Estimate t Value Pr > |t|

6 -0.003105 -0.02 0.9878

11 0.023583 0.12 0.9029

4 -0.032219 -0.17 0.8641

9 -0.074854 -0.42 0.6796

5 0.064228 0.44 0.6610

13 -0.081846 -0.54 0.5955

12 0.076075 0.56 0.5763

8 -0.117946 -0.81 0.4205

10 -0.127661 -0.95 0.3489

7 0.153680 1.18 0.2458

2 0.254137 1.57 0.1228

3 -0.178895 -1.37 0.1781

Preliminary MSE 10421.3

Estimates of Autoregressive Parameters

Standard

Lag Coefficient Error t Value

1 -0.459187 0.130979 -3.51

Parameter Estimates

Standard Approx

Variable DF Estimate Error t Value Pr > |t|

Intercept 1 -1631 857.3872 -1.90 0.0634

date 1 0.2346 0.0512 4.58 <.0001

X1 1 -789.7592 64.3967 -12.26 <.0001

X2 1 -1100 74.9041 -14.68 <.0001

X3 1 -1149 79.0160 -14.54 <.0001

X4 1 -1424 80.6705 -17.65 <.0001

X5 1 -1472 81.2707 -18.11 <.0001

X6 1 -1386 81.3255 -17.04 <.0001

X7 1 -1519 80.9631 -18.76 <.0001

X8 1 -1614 79.9970 -20.17 <.0001

X9 1 -1432 77.8118 -18.40 <.0001

X10 1 31.3894 72.8112 0.43 0.6684

X11 1 1462 60.4124 24.20 <.0001

Method 3: PROC ARIMA

PROC ARIMA plots=(forecast(forecast));

IDENTIFY var=deer crosscor=

(date X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11);

ESTIMATE p=1 ML input=

(date X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11);

FORECAST lead=12 id=date interval=month;

run;

Maximum Likelihood Estimation

Standard Approx

Parameter Estimate Error t Value Pr>|t| Lag Variable

MU -1640.9 877.41683 -1.87 0.0615 0 deer

AR1,1 0.47212 0.13238 3.57 0.0004 1 deer

NUM1 0.23514 0.05244 4.48 <.0001 0 date

NUM2 -789.10728 64.25814 -12.28 <.0001 0 X1

NUM3 -1099.3 74.93984 -14.67 <.0001 0 X2

NUM4 -1148.2 79.17135 -14.50 <.0001 0 X3

NUM5 -1423.7 80.90397 -17.60 <.0001 0 X4

NUM6 -1471.6 81.54553 -18.05 <.0001 0 X5

NUM7 -1385.4 81.60464 -16.98 <.0001 0 X6

NUM8 -1518.9 81.20724 -18.70 <.0001 0 X7

NUM9 -1613.4 80.15788 -20.13 <.0001 0 X8

NUM10 -1432.0 77.82871 -18.40 <.0001 0 X9

NUM11 31.46310 72.61873 0.43 0.6648 0 X10

NUM12 1462.1 59.99732 24.37 <.0001 0 X11

Autocorrelation Check of Residuals

To Chi- Pr >

Lag Square DF ChiSq ------Autocorrelations------

6 5.57 5 0.3504 0.042 -0.175 0.146 0.178 0.001 -0.009

12 10.41 11 0.4938 -0.157 -0.017 0.102 0.115 -0.055 -0.120

18 22.18 17 0.1778 0.158 0.147 -0.183 -0.160 0.189 -0.008

24 32.55 23 0.0893 -0.133 0.013 -0.095 0.005 0.101 -0.257

Autoregressive Factors

Factor 1: 1 - 0.47212 B**(1)

Method 4: Differencing

Compute and model Dt = Yt-Yt-12

Removes seasonality

Removes linear trend

Use (at least) q=(12): et - θet-12

(A) if θ near 1, you’ve overdifferenced

(B) if 0 < θ < 1 this is the seasonal exponential smoothing model.

Forecast is a weighted (exponentially smoothed) average of past values.
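The claim that Dt = Yt − Yt-12 removes both the seasonal pattern and a linear trend can be seen on a noise-free toy series (a Python sketch with made-up trend and seasonal values):

```python
import math

# Monthly seasonal pattern plus linear trend (no noise, illustrative only):
season = [100 * math.sin(2 * math.pi * m / 12) for m in range(12)]
y = [50 + 2.0 * t + season[t % 12] for t in range(48)]   # 4 years of data
d = [y[t] - y[t - 12] for t in range(12, 48)]            # span-12 difference D_t
print(min(d), max(d))   # every D_t = 2.0 * 12 = 24: season and trend both gone
```

The seasonal term cancels exactly (same month a year apart) and the trend survives only as the constant 12β, which becomes the MU estimate in the differenced model.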

IDENTIFY var=deer(12) nlag=25;

ESTIMATE P=1 Q=(12) ml; run;

Maximum Likelihood Estimation

Standard Approx

Parameter Estimate Error t Value Pr>|t| Lag

MU 85.73868 16.78380 5.11 <.0001 0

MA1,1 0.89728 0.94619 0.95 0.3430 12

AR1,1 0.46842 0.11771 3.98 <.0001 1

Autocorrelation Check of Residuals

To Chi- Pr >

Lag Square DF ChiSq ------Autocorrelations------

6 4.31 4 0.3660 0.053 -0.161 0.140 0.178 -0.026 -0.013

12 7.47 10 0.6801 -0.146 -0.020 0.105 0.131 -0.035 -0.029

18 18.02 16 0.3226 0.198 0.167 -0.143 -0.154 0.183 0.002

24 24.23 22 0.3355 -0.127 0.032 -0.083 0.022 0.134 -0.155

Lag 12 MA 0.897 somewhat close to 1 with large standard error, model OK but not best. Variance estimate 15,122 (vs. 13,431 for dummy variable model).

Forecasts are similar 2 years out.

OPTIONAL (time permitting) Trend Breaks

Accounting for changes in trend

Example 8: Nenana Ice Classic data (trend break)

Exact time (day and time) of thaw of the Tanana river in Nenana Alaska:

1917 Apr 30 11:30 a.m.

1918 May 11 9:33 a.m.

1919 May 3 2:33 p.m.

(more data)

2010 Apr 29 6:22 p.m.

2011 May 04 4:24 p.m.

2012 Apr 23 7:39 p.m.

When the tripod moves downstream, that is the unofficial start of spring.

Get “ramp” with PROC NLIN ____/

X= 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 8 …
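The ramp regressor mirrors the `X = (year-point)*(year>point)` line in the PROC NLIN step; a minimal Python sketch (hypothetical breakpoint 1965):

```python
# Ramp regressor: 0 up to the breakpoint, then rising 1, 2, 3, ...
def ramp(year, point):
    return (year - point) * (year > point)   # bool multiplies as 0/1

xs = [ramp(y, 1965) for y in range(1960, 1971)]
print(xs)   # [0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
```

PROC NLIN treats the breakpoint itself as a parameter, so it estimates where the slope change begins along with the intercept and slope.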

PROC NLIN data=all;

PARMS point=1960 int=126 slope=-.2;

X = (year-point)*(year>point);

MODEL break = int + slope*X;

OUTPUT out=out2 predicted=p residual=r;

Approx Approximate 95% Confidence

Parameter Estimate Std Error Limits

point 1965.4 11.2570 1943.0 1987.7

int 126.0 0.7861 124.5 127.6

slope -0.1593 0.0592 -0.2769 -0.0418

PROC SGPLOT data=out2;

SERIES Y=break X=year;

SERIES Y=p X=year/

lineattrs = (color=red thickness=2);