Example Fitting a Seasonal ARIMA Model


Monthly electricity usage data (from Homework 8)

[1] The plot of y_t doesn’t show a pronounced trend, so regular differencing will probably not be necessary. To check this, we generate the ACF of the raw data:

ACF of y_t

-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0

+----+----+----+----+----+----+----+----+----+----+

1 0.524 XXXXXXXXXXXXXX

2 -0.144 XXXXX

3 -0.476 XXXXXXXXXXXXX

4 -0.322 XXXXXXXXX

5 0.021 XX

6 0.210 XXXXXX

7 0.057 XX

8 -0.299 XXXXXXXX

9 -0.474 XXXXXXXXXXXXX

10 -0.164 XXXXX

11 0.441 XXXXXXXXXXXX  “near seasonal spike”

12 0.774 XXXXXXXXXXXXXXXXXXXX  seasonal spike

13 0.442 XXXXXXXXXXXX  “near seasonal” spike

14 -0.148 XXXXX

15 -0.441 XXXXXXXXXXXX

16 -0.296 XXXXXXXX

17 0.003 X

18 0.150 XXXXX

19 0.026 XX

20 -0.276 XXXXXXXX

21 -0.423 XXXXXXXXXXXX

22 -0.107 XXXX

23 0.422 XXXXXXXXXXXX  “near seasonal spike”

24 0.695 XXXXXXXXXXXXXXXXXX  seasonal spike
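The values listed above are sample autocorrelations. As a minimal sketch of how such values can be computed, the Python below uses the usual estimator r_k = sum (y_t - ybar)(y_{t+k} - ybar) / sum (y_t - ybar)^2. The function name `sample_acf` and the synthetic 12-month cycle are illustrative assumptions; the homework data are not reproduced here.

```python
import math

def sample_acf(series, max_lag):
    """Return [r_1, ..., r_max_lag], the sample autocorrelations of series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (the denominator)
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(ck / c0)
    return acf

# Synthetic stand-in for a monthly series: a pure 12-month cycle, 106 points.
# These values are hypothetical, not the electricity usage data.
y = [100 + 30 * math.cos(2 * math.pi * t / 12) for t in range(106)]
r = sample_acf(y, 24)
print(f"r_6 = {r[5]:.3f}, r_12 = {r[11]:.3f}, r_24 = {r[23]:.3f}")
```

For a strongly seasonal series like this one, the autocorrelations at lags 12 and 24 stay large and positive, which is the same slow decay at seasonal lags seen in the plot.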

[2] The ACF of y_t has spikes at seasonal lags (12, 24, 36, etc.) that don’t die out rapidly, so seasonal differencing appears to be necessary. The ACF and PACF of the seasonally differenced series ∇_12 y_t are shown below:

ACF of ∇_12 y_t

-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0

+----+----+----+----+----+----+----+----+----+----+

1 0.327 XXXXXXXXX 

2 0.182 XXXXXX 

3 0.033 XX  ACF dies out quickly

4 -0.027 XX  at ‘regular’ lags

5 -0.064 XXX :

6 0.075 XXX :

7 0.130 XXXX

8 0.047 XX

9 -0.058 XX

10 -0.214 XXXXXX

11 -0.210 XXXXXX

12 -0.412 XXXXXXXXXXX  ACF at lags 12, 24,…

13 -0.105 XXXX cuts off after lag 1

14 -0.098 XXX 

15 0.123 XXXX

16 0.004 X

17 0.002 X

18 -0.078 XXX

19 -0.067 XXX

20 -0.045 XX

21 0.094 XXX

22 0.225 XXXXXXX

23 0.119 XXXX

24 0.017 X 

PACF of 12yt

-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0

+----+----+----+----+----+----+----+----+----+----+

1 0.327 XXXXXXXXX

2 0.085 XXX

3 -0.055 XX

4 -0.042 XX

5 -0.043 XX

6 0.133 XXXX

7 0.099 XXX

8 -0.059 XX

9 -0.112 XXXX

10 -0.193 XXXXXX

11 -0.060 XX

12 -0.322 XXXXXXXXX  PACF at lags 12, 24,…

13 0.124 XXXX dies out

14 -0.065 XXX

15 0.196 XXXXXX

16 -0.078 XXX

17 0.009 X

18 -0.009 X

19 0.021 XX

20 -0.036 XX

21 0.054 XX

22 0.064 XXX

23 -0.050 XX PACF at lags 12, 24,…

24 -0.225 XXXXXXX dies out
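The series behind the two plots above is taken here to be the usual seasonal difference, ∇_12 y_t = y_t - y_{t-12}. A minimal Python sketch of that operation follows; the helper name `seasonal_difference` and the repeating pattern are made up for illustration. It also shows why a series of 106 observations leaves 94 values after one seasonal difference of order 12.

```python
def seasonal_difference(series, period=12):
    """Return y_t - y_{t-period} for t = period, ..., n-1."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Hypothetical monthly series: one fixed 12-month pattern repeated for 106 points.
pattern = [50, 40, 30, 20, 10, 20, 30, 45, 35, 20, 30, 45]
y = [pattern[t % 12] for t in range(106)]

w = seasonal_difference(y, period=12)
print(len(y), len(w))  # the first 12 observations have no value 12 months earlier
```

Because this toy series repeats exactly every 12 months, its seasonal difference is identically zero. Real data leave a remainder, and it is the ACF and PACF of that remainder that are examined.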

[3] The seasonal spikes in the ACF and PACF seem clear: they die out in the PACF and cut off after lag 1 in the ACF, which is the sign of a seasonal MA(1) model. So (P,D,Q) = (0,1,1) seems a good choice to try for the seasonal part of the model.

The patterns at the ‘regular’ (non-seasonal) lags of the ACF and PACF are not as clear. They could indicate either an AR or an MA type of model; that is, the ‘regular’ part of the model could be either (1,0,0) or (0,0,1), so we will try both.

Putting the regular and seasonal parts together, the models that we will try to fit are:

(1,0,0)(0,1,1)_12 and (0,0,1)(0,1,1)_12
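For reference, the two candidate models can be written out in backshift notation. The symbols are assumptions introduced here, not taken from the handout: B is the backshift operator, a_t is the white-noise error, φ and θ are the regular AR and MA parameters, and Θ is the seasonal MA parameter. The sketch also assumes the Box-Jenkins sign convention for moving-average terms and no constant term.

```latex
\begin{align*}
(1,0,0)(0,1,1)_{12}:&\quad (1 - \phi B)(1 - B^{12})\,y_t = (1 - \Theta B^{12})\,a_t \\
(0,0,1)(0,1,1)_{12}:&\quad (1 - B^{12})\,y_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t
\end{align*}
```

In both forms the factor (1 - B^{12}) is the single seasonal difference, and each model has two parameters to estimate.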

[4] Fitting the (1,0,0)(0,1,1)_12 model gives the following result:

Estimates at each iteration

Iteration SSE Parameters

0 233079 0.100 0.100

1 204134 0.189 0.250

2 186089 0.259 0.400

3 174483 0.319 0.550

4 168117 0.348 0.660

5 164008 0.359 0.736

6 161123 0.366 0.792

7 159813 0.368 0.826

8 159523 0.367 0.840

9 159483 0.366 0.846

10 159478 0.365 0.847

11 159477 0.365 0.848

Relative change in each estimate less than 0.0010

Final Estimates of Parameters

Type       Coef    StDev      T      P
AR   1   0.3652   0.0998   3.66  0.000
SMA 12   0.8479   0.0743  11.42  0.000

Differencing: 0 regular, 1 seasonal of order 12

Number of observations: Original series 106, after differencing 94

Residuals: SS = 123338 (backforecasts excluded)

MS = 1341 DF = 92

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36     48
Chi-Square    9.1   22.8   29.0   42.2
DF             10     22     34     46
P-Value     0.524  0.414  0.713  0.630

Note that all diagnostics are good: both the AR(1) and SMA(1) parameters are significant (P = 0.000); there were no error messages; and the Ljung-Box Q test of the residuals is non-significant at every lag reported (all P-values exceed 0.4). This model appears to fit the data.
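As a rough illustration of the Ljung-Box statistic, the sketch below computes Q = n(n+2) sum_{k=1..K} r_k^2 / (n - k) from a list of residual autocorrelations, with degrees of freedom K minus the number of estimated parameters (12 - 2 = 10 at lag 12, matching the DF row of the output). The function name and the residual autocorrelations are hypothetical, since the fitted residuals are not listed in the handout. The P-value step, which needs a chi-square table or library, is left out.

```python
def ljung_box_q(residual_acf, n_obs, n_params):
    """Return (Q, df) for the Ljung-Box test using lags 1..len(residual_acf).

    residual_acf: residual autocorrelations r_1, ..., r_K
    n_obs:        number of residuals (94 after seasonal differencing here)
    n_params:     number of estimated ARMA parameters (2 in this model)
    """
    q = n_obs * (n_obs + 2) * sum(
        r * r / (n_obs - k) for k, r in enumerate(residual_acf, start=1)
    )
    df = len(residual_acf) - n_params
    return q, df

# Hypothetical residual autocorrelations, NOT taken from the fitted model.
r_hat = [0.05, -0.08, 0.02, 0.10, -0.04, 0.01,
         -0.09, 0.03, 0.06, -0.02, 0.07, -0.05]
q, df = ljung_box_q(r_hat, n_obs=94, n_params=2)
print(f"Q = {q:.2f} on {df} degrees of freedom")
```

A small Q relative to its chi-square reference distribution gives a large P-value, meaning the residuals show no evidence of leftover autocorrelation.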

Fitting the (0,0,1)(0,1,1)_12 model gives the following result:

ARIMA model for kwhours

Estimates at each iteration

Iteration SSE Parameters

0 266986 0.100 0.100

1 221638 -0.050 0.219

2 193116 -0.191 0.369

3 178260 -0.282 0.519

4 169195 -0.332 0.654

5 164065 -0.353 0.740

6 160947 -0.365 0.799

7 159914 -0.368 0.830

8 159761 -0.366 0.840

9 159744 -0.365 0.844

10 159742 -0.364 0.845

11 159742 -0.364 0.845

Relative change in each estimate less than 0.0010

Final Estimates of Parameters

Type       Coef    StDev      T      P
MA   1  -0.3643   0.0954  -3.82  0.000
SMA 12   0.8451   0.0760  11.12  0.000

Differencing: 0 regular, 1 seasonal of order 12

Number of observations: Original series 106, after differencing 94

Residuals: SS = 124186 (backforecasts excluded)

MS = 1350 DF = 92

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag            12     24     36     48
Chi-Square    9.9   25.0   30.9   45.5
DF             10     22     34     46
P-Value     0.446  0.296  0.622  0.492

Again, all diagnostics are good: both the MA(1) and SMA(1) parameters are significant (P = 0.000); there were no error messages; and the Ljung-Box Q test of the residuals is non-significant at every lag reported (the smallest P-value is 0.296). This model also appears to fit the data.
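Several numbers in the two outputs can be checked by hand: each T value is Coef / StDev, the residual degrees of freedom are the 94 observations left after differencing minus the 2 estimated parameters, and MS is SS / DF. The short Python check below copies the values from the output above; the variable names are our own.

```python
# Coefficients and standard deviations copied from the two parameter tables.
estimates = {
    "AR 1 (model 1)":   (0.3652, 0.0998),
    "SMA 12 (model 1)": (0.8479, 0.0743),
    "MA 1 (model 2)":   (-0.3643, 0.0954),
    "SMA 12 (model 2)": (0.8451, 0.0760),
}
t_ratios = {name: coef / sd for name, (coef, sd) in estimates.items()}

n_after_diff = 106 - 12      # observations left after one seasonal difference
df_resid = n_after_diff - 2  # two estimated parameters in each model
ms_model1 = 123338 / df_resid
ms_model2 = 124186 / df_resid

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.2f}")
print(f"DF = {df_resid}, MS = {ms_model1:.0f} and {ms_model2:.0f}")
```

A recomputed ratio can differ from the printed T in the last digit, because the coefficients and standard deviations shown in the output are themselves rounded.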

[5] Notice the similarities between the parameters of these two models:

(1,0,0)(0,1,1)_12            (0,0,1)(0,1,1)_12
AR parameter  = 0.3652       MA parameter  = -0.3643
SMA parameter = 0.8479       SMA parameter = 0.8451
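One way to read the near-match between the AR and MA estimates, offered as our interpretation and not something stated in the handout, is to compare what each regular part implies. The sketch assumes the MA factor is written as (1 - θB), so that θ = -0.3643 puts a weight of +0.3643 on the previous error. It lists the first few ψ-weights (the weights on past errors) and the implied lag-1 autocorrelation of each regular part, ignoring the seasonal factor.

```python
phi = 0.3652     # AR(1) estimate from the first model
theta = -0.3643  # MA(1) estimate from the second model; factor assumed to be (1 - theta*B)

# psi-weights: coefficients on a_{t-j} when the regular part is written as a pure MA.
psi_ar = [phi ** j for j in range(1, 5)]  # AR(1): phi, phi^2, phi^3, ...
psi_ma = [-theta, 0.0, 0.0, 0.0]          # MA(1): -theta at lag 1, zero afterwards

# Lag-1 autocorrelation implied by each regular part on its own.
rho1_ar = phi
rho1_ma = -theta / (1 + theta ** 2)

for j, (a, m) in enumerate(zip(psi_ar, psi_ma), start=1):
    print(f"psi_{j}: AR = {a:.4f}, MA = {m:.4f}")
print(f"rho_1: AR = {rho1_ar:.4f}, MA = {rho1_ma:.4f}")
```

Under these assumptions the two models agree closely at lag 1 and differ only in small higher-order weights (φ² is about 0.13), which would explain why the data cannot easily tell them apart.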