Example – Fitting a seasonal ARIMA model
Monthly electricity usage data (from Homework 8)
[1] The plot of y_t doesn’t show a pronounced trend, so regular differencing will probably not be necessary. To check this, we generate the ACF of the raw data:
ACF of y_t
-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0
+----+----+----+----+----+----+----+----+----+----+
1 0.524 XXXXXXXXXXXXXX
2 -0.144 XXXXX
3 -0.476 XXXXXXXXXXXXX
4 -0.322 XXXXXXXXX
5 0.021 XX
6 0.210 XXXXXX
7 0.057 XX
8 -0.299 XXXXXXXX
9 -0.474 XXXXXXXXXXXXX
10 -0.164 XXXXX
11 0.441 XXXXXXXXXXXX “near seasonal” spike
12 0.774 XXXXXXXXXXXXXXXXXXXX seasonal spike
13 0.442 XXXXXXXXXXXX “near seasonal” spike
14 -0.148 XXXXX
15 -0.441 XXXXXXXXXXXX
16 -0.296 XXXXXXXX
17 0.003 X
18 0.150 XXXXX
19 0.026 XX
20 -0.276 XXXXXXXX
21 -0.423 XXXXXXXXXXXX
22 -0.107 XXXX
23 0.422 XXXXXXXXXXXX “near seasonal” spike
24 0.695 XXXXXXXXXXXXXXXXXX seasonal spike
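For readers who want to reproduce a plot like this one, the sample autocorrelation at lag k is the sum of (y_t - ybar)(y_{t-k} - ybar) over t = k+1..n, divided by the sum of (y_t - ybar)^2 over all t. The following is a minimal Python sketch of that formula. The function name `sample_acf` and the synthetic 12-period cosine series are illustrative assumptions; the series is not the Homework 8 electricity data.

```python
import math

def sample_acf(y, max_lag):
    """Return the sample autocorrelations r_1 .. r_max_lag of the sequence y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (the denominator)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

# Synthetic monthly series with a pure 12-month cycle (10 full years).
y = [100 + 20 * math.cos(2 * math.pi * t / 12) for t in range(120)]
r = sample_acf(y, 24)

print(round(r[5], 3))    # lag 6:  -0.95 (half a cycle out of phase)
print(round(r[11], 3))   # lag 12:  0.9  (seasonal spike)
```

As in the electricity data, a strong seasonal pattern produces a large positive autocorrelation at lag 12 that decays only slowly at lags 24, 36, and so on.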
[2] The ACF of y_t has spikes at the seasonal lags (12, 24, 36, etc.) that don’t die out rapidly, so seasonal differencing appears to be necessary. The ACF and PACF of the seasonally differenced series ∇_12 y_t = y_t - y_{t-12} are shown below:
ACF of ∇_12 y_t
-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0
+----+----+----+----+----+----+----+----+----+----+
1 0.327 XXXXXXXXX
2 0.182 XXXXXX
3 0.033 XX ACF dies out quickly
4 -0.027 XX at ‘regular’ lags
5 -0.064 XXX :
6 0.075 XXX :
7 0.130 XXXX
8 0.047 XX
9 -0.058 XX
10 -0.214 XXXXXX
11 -0.210 XXXXXX
12 -0.412 XXXXXXXXXXX ACF at lags 12, 24,…
13 -0.105 XXXX cuts off after lag 1
14 -0.098 XXX
15 0.123 XXXX
16 0.004 X
17 0.002 X
18 -0.078 XXX
19 -0.067 XXX
20 -0.045 XX
21 0.094 XXX
22 0.225 XXXXXXX
23 0.119 XXXX
24 0.017 X
PACF of ∇_12 y_t
-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0
+----+----+----+----+----+----+----+----+----+----+
1 0.327 XXXXXXXXX
2 0.085 XXX
3 -0.055 XX
4 -0.042 XX
5 -0.043 XX
6 0.133 XXXX
7 0.099 XXX
8 -0.059 XX
9 -0.112 XXXX
10 -0.193 XXXXXX
11 -0.060 XX
12 -0.322 XXXXXXXXX PACF at lags 12, 24,…
13 0.124 XXXX dies out
14 -0.065 XXX
15 0.196 XXXXXX
16 -0.078 XXX
17 0.009 X
18 -0.009 X
19 0.021 XX
20 -0.036 XX
21 0.054 XX
22 0.064 XXX
23 -0.050 XX PACF at lags 12, 24,…
24 -0.225 XXXXXXX dies out
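The seasonal differencing behind these two plots is ∇_12 y_t = y_t - y_{t-12}, which drops the first 12 observations. A minimal Python sketch follows; the function name `seasonal_difference` and the toy input are assumptions for illustration, not the electricity series.

```python
def seasonal_difference(y, period=12):
    """Return the seasonally differenced series y[t] - y[t - period]."""
    return [y[t] - y[t - period] for t in range(period, len(y))]

# Toy input: a straight line plus a repeating 12-month pattern, 106 points long
# (the same length as the series in this example, but NOT the real data).
pattern = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 6, 8]
toy = [0.5 * t + pattern[t % 12] for t in range(106)]

d12 = seasonal_difference(toy)
print(len(toy), len(d12))   # 106 94
print(set(d12))             # {6.0}: the seasonal pattern cancels, leaving 12 * 0.5
```

The repeating pattern is removed entirely by the lag-12 difference, which is why the slowly decaying seasonal spikes in the ACF of the raw data no longer appear after differencing.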
[3] The seasonal pattern in the ACF and PACF seems clear: at the seasonal lags (12, 24, ...) the PACF dies out and the ACF cuts off after the first seasonal lag (lag 12), which is the sign of a seasonal MA(1) model. So (P,D,Q) = (0,1,1) is a good choice to try for the seasonal part of the model.
The patterns in the ACF and PACF at the ‘regular’ (non-seasonal) lags are not as clear: they could indicate either an AR or an MA type of model. That is, the ‘regular’ part of the model could be either (1,0,0) or (0,0,1), so we will try both.
Putting the regular and seasonal parts together, the models that we will try to fit are:
(1,0,0)(0,1,1)_12 and (0,0,1)(0,1,1)_12
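Written out in backshift notation, the two candidate models take the following form. This is a sketch under assumed notation that does not appear in the original: B is the backshift operator (B y_t = y_{t-1}), a_t is white noise, and φ, θ and Θ denote the regular AR, regular MA and seasonal MA parameters. The MA factors are written with the Box-Jenkins minus sign, and no constant term is included.

```latex
\begin{align*}
(1,0,0)(0,1,1)_{12}:&\quad (1-\phi B)(1-B^{12})\,y_t = (1-\Theta B^{12})\,a_t \\
(0,0,1)(0,1,1)_{12}:&\quad (1-B^{12})\,y_t = (1-\theta B)(1-\Theta B^{12})\,a_t
\end{align*}
```

Both models share the seasonal difference (1 - B^12) and the seasonal MA factor; they differ only in whether the regular lag-1 dependence is placed on the AR side or the MA side.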
[4] Fitting the (1,0,0)(0,1,1)_12 model gives the following result:
Estimates at each iteration
Iteration SSE Parameters
0 233079 0.100 0.100
1 204134 0.189 0.250
2 186089 0.259 0.400
3 174483 0.319 0.550
4 168117 0.348 0.660
5 164008 0.359 0.736
6 161123 0.366 0.792
7 159813 0.368 0.826
8 159523 0.367 0.840
9 159483 0.366 0.846
10 159478 0.365 0.847
11 159477 0.365 0.848
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type        Coef   StDev      T      P
AR   1    0.3652  0.0998   3.66  0.000
SMA 12    0.8479  0.0743  11.42  0.000
Differencing: 0 regular, 1 seasonal of order 12
Number of observations: Original series 106, after differencing 94
Residuals: SS = 123338 (backforecasts excluded)
           MS = 1341  DF = 92
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag            12     24     36     48
Chi-Square    9.1   22.8   29.0   42.2
DF             10     22     34     46
P-Value     0.524  0.414  0.713  0.630
Note that all diagnostics are good: both the AR(1) and SMA(1) parameters are significant (P = 0.000); there were no error messages; and the Ljung-Box Q test of the residuals is non-significant at every lag shown (smallest P-value 0.414). This model appears to fit the data.
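The P-values in the Ljung-Box table can be checked by hand. Each DF equals the lag minus the two estimated parameters (12 - 2 = 10, and so on), and the P-value is the upper tail of a chi-square distribution with that DF. The sketch below uses the closed-form tail that holds for even DF, so it needs only the standard library; the function name `chi2_upper_tail_even_df` is an illustrative assumption. Because the printed Chi-Square values are rounded to one decimal, the results should agree with the table only approximately.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with even df (Poisson-sum identity)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j      # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total

n_params = 2  # AR 1 and SMA 12
table = {12: 9.1, 24: 22.8, 36: 29.0, 48: 42.2}  # lag -> Chi-Square from the output
for lag, q in table.items():
    df = lag - n_params
    print(lag, df, round(chi2_upper_tail_even_df(q, df), 3))
```

A large P-value at every lag means there is no evidence of autocorrelation left in the residuals, which is the outcome wanted from this test.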
Fitting the (0,0,1)(0,1,1)_12 model gives the following result:
ARIMA model for kwhours
Estimates at each iteration
Iteration SSE Parameters
0 266986 0.100 0.100
1 221638 -0.050 0.219
2 193116 -0.191 0.369
3 178260 -0.282 0.519
4 169195 -0.332 0.654
5 164065 -0.353 0.740
6 160947 -0.365 0.799
7 159914 -0.368 0.830
8 159761 -0.366 0.840
9 159744 -0.365 0.844
10 159742 -0.364 0.845
11 159742 -0.364 0.845
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type        Coef   StDev      T      P
MA   1   -0.3643  0.0954  -3.82  0.000
SMA 12    0.8451  0.0760  11.12  0.000
Differencing: 0 regular, 1 seasonal of order 12
Number of observations: Original series 106, after differencing 94
Residuals: SS = 124186 (backforecasts excluded)
           MS = 1350  DF = 92
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag            12     24     36     48
Chi-Square    9.9   25.0   30.9   45.5
DF             10     22     34     46
P-Value     0.446  0.296  0.622  0.492
Again, all diagnostics are good: both the MA(1) and SMA(1) parameters are significant (P = 0.000); there were no error messages; and the Ljung-Box Q test of the residuals is non-significant at every lag shown (smallest P-value 0.296). This model also appears to fit the data.
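The summary lines of both outputs are tied together by simple arithmetic: seasonal differencing at lag 12 removes 12 observations, the residual DF is the number of differenced observations minus the number of estimated parameters, and MS = SS / DF. A short Python check using only numbers printed in the two outputs (the variable names are illustrative):

```python
n_original = 106
period = 12
n_params = 2  # one regular parameter plus one seasonal MA parameter

n_after = n_original - period   # observations left after seasonal differencing
df = n_after - n_params         # residual degrees of freedom

residual_ss = {"(1,0,0)(0,1,1)12": 123338, "(0,0,1)(0,1,1)12": 124186}
residual_ms = {model: ss / df for model, ss in residual_ss.items()}

print(n_after, df)  # 94 92
for model, ms in residual_ms.items():
    print(model, round(ms))  # 1341 for the first model, 1350 for the second
```

The residual mean squares differ by less than 1%, so this measure gives little basis for preferring one model over the other.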
[5] Notice the similarities between the parameters of these two models:
Model           (1,0,0)(0,1,1)_12        (0,0,1)(0,1,1)_12
Regular part    AR parameter  = 0.3652   MA parameter  = -0.3643
Seasonal part   SMA parameter = 0.8479   SMA parameter = 0.8451
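One way to see why the two fits are so close is to compare how each regular part weights past shocks (the psi-weights). This sketch assumes the output reports MA coefficients under the Box-Jenkins convention, in which the MA(1) factor is (1 - θB), so θ = -0.3643 corresponds to a weight of +0.3643 on the previous shock. That assumption is consistent with the positive lag-1 autocorrelation of the differenced series (0.327). The function names are illustrative.

```python
def ar1_psi_weights(phi, n):
    """Psi-weights of an AR(1): psi_j = phi**j for j = 1..n."""
    return [phi ** j for j in range(1, n + 1)]

def ma1_psi_weights(theta, n):
    """Psi-weights of an MA(1) written as (1 - theta*B): psi_1 = -theta, then zeros."""
    return [-theta] + [0.0] * (n - 1)

def ma1_lag1_autocorr(theta):
    """Theoretical lag-1 autocorrelation of an MA(1) written as (1 - theta*B)."""
    return -theta / (1.0 + theta ** 2)

ar = ar1_psi_weights(0.3652, 4)    # AR estimate from the first model
ma = ma1_psi_weights(-0.3643, 4)   # MA estimate from the second model
for j, (a, m) in enumerate(zip(ar, ma), start=1):
    print(j, round(a, 4), round(m, 4))

# Implied lag-1 autocorrelation under the MA(1) fit, to compare with the sample 0.327.
print(round(ma1_lag1_autocorr(-0.3643), 3))
```

The two sets of weights nearly coincide at lag 1 and differ by about 0.13 at lag 2, after which the AR(1) weights become small. This suggests the two models describe almost the same short-lag dependence, which would explain the nearly identical SMA estimates and residual sums of squares.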