Extensions for
Forecasting Models
- Testing for trend (Daniel’s Test)
- Daniel’s non-parametric test
- Testing when the error is normal
- Forecasting Linear Trend:
- Linear regression,
- Holt’s,
- Double moving average
- Double Exponential Smoothing
- Forecasting Curvilinear trend
- Simple exponential Growth
- Modified Exponential Growth
- Using Linear Regression to forecast non-linear time series
- Seasonal time series
- Auto-Regressive time series
Introduction
In this set of notes we discuss different types of time series and the forecasting of their future values. It is assumed that the student is familiar with forecasting models for stationary time series such as moving average and exponential smoothing, with models for trended time series such as Holt’s model and linear regression, and with basic statistical hypothesis testing and confidence intervals.
We look at time series with trend, seasonality, and autocorrelation components, for which two types of forecasts are developed:
- Single forecasts scheme – where the forecasting model is based on the whole set of data available and all the data points have the same influence on the forecast (e.g. linear regression).
- The updating scheme – where the model’s parameters are changing with the added information, and recent data influence the forecast more than older data (e.g. exponential smoothing).
The general structure of these notes is as follows: First we test to detect a component of interest (such as trend), then we present a single forecast model(s) and updating model(s). We use examples to demonstrate the concepts, and Excel to solve them.
1. Testing for the Presence of a Trend
A time series can exhibit non-stationary behavior in various ways. For example it may develop temporary “drift” which creates a cycle, or it may be trended, or seasonal. In this note we focus on the trend, which can be caused by demand changes due to population growth, improving technology that improves productivity, changes in social behavior (divorce rates, TV time), etc. Identifying the presence of trend will prevent the erroneous use of stationary models. Recall that a stationary time series can be formulated by
$Y_t = \beta_0 + \varepsilon_t$, while a time series with trend is formulated by $Y_t = T_t + \varepsilon_t$, where $T_t$ is the trend component (for example, for linear trend $T_t = \beta_0 + \beta_1 t$). Stationary forecasting models produce horizontal forecasts of the form $F_{t+p} = b_0$; this means that the forecast for any period in the future remains the same: the current estimate of $\beta_0$. Thus a stationary model does not account for time effects.
To demonstrate the possible “damage” of using a stationary model to produce forecasts for a trended time series, assume an m-period moving average was used to forecast a linear trend time series. Let $MA_t(m)$ represent the moving average value as of time t. The average age of the information used in calculating the moving average is: Average Age = [0+1+2+…+(m-1)]/m = (m-1)/2. Then t-(m-1)/2 can be interpreted as the point in time where the moving average is centered, and a forecast for period t+p as of period t for the trended time series should be $MA_t(m) + \beta_1[(m-1)/2 + p]$. This demonstrates the possible error when using only $MA_t(m)$ to forecast a trended time series.
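The lag argument above can be checked numerically. The following is a minimal sketch on a hypothetical noise-free linear series (the values of `beta0`, `beta1`, `m` and `p` are illustrative assumptions, not data from these notes). It compares the plain moving average forecast with the lag-corrected one.

```python
def moving_average(series, m):
    """m-period moving average as of the last period in `series`."""
    return sum(series[-m:]) / m

# Hypothetical noise-free linear trend: Y_t = 100 + 5t, t = 1..20
beta0, beta1 = 100.0, 5.0
y = [beta0 + beta1 * t for t in range(1, 21)]

m, p = 4, 3                  # window length and forecast horizon (illustrative)
t_now = len(y)               # current period
ma = moving_average(y, m)

# Stationary-model forecast: just the moving average itself
naive_forecast = ma
# Forecast corrected for the lag of (m-1)/2 periods plus the horizon p
corrected_forecast = ma + beta1 * ((m - 1) / 2 + p)

actual = beta0 + beta1 * (t_now + p)
naive_error = actual - naive_forecast
corrected_error = actual - corrected_forecast
print(naive_error, corrected_error)
```

On a noise-free series the naive error equals the slope times [(m-1)/2 + p], and the corrected forecast reproduces the trend exactly; with random errors the same correction holds only on average.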
1.1 Daniel’s Test for the Trend of a Time Series
Daniel’s test is designed specifically to identify trend. The test is built on the differences between the time index and the rank of the time series value at each point; specifically, if there is an upward trend, then the ranks of the data should form a series similar to the time periods themselves. The following graph demonstrates the situation in a “perfect world”:
For the case shown above the difference between ‘t’ and the rank Rt assigned to Yt (that is (t – Rt)) is always zero. Since the time series is random, its behavior is not ‘perfect’ and therefore sometimes the differences are positive, other times negative, and of course zeros too. Observe the following example:
We conclude that the values of $(t - R_t)^2$ should be small when the time series has a mostly increasing trend (and large when the trend is decreasing). Thus, a statistic built on these differences can detect trend. More specifically, the statistic is Spearman’s correlation coefficient, calculated as $r_s = 1 - [\text{a function of } (t - R_t)^2]$, so that $r_s$ is close to 1 for an increasing trend and close to -1 for a decreasing trend.
Because of the way the statistic is calculated, the null hypothesis is rejected if $|r_s|$ is sufficiently large. More formally,
H0: $\rho_s = 0$ (There is no trend)
H1: $\rho_s \neq 0$ (There is trend)
Two cases need to be considered.
Case 1: $n \le 30$
The testing procedure depends on the existence or non-existence of ties in the time series.
No ties exist in the time series:
- Sort the data from smallest to largest and rank them from 1 through n.
- Sort the ranks of $Y_t$ by the time t when $Y_t$ took place.
- Calculate Spearman’s coefficient for the ranks:
$$r_s = 1 - \frac{6\sum_{t=1}^{n} d_t^2}{n(n^2-1)}$$
Definitions:
$R_t$ = the rank of the data point that belongs to time period t.
$d_t = t - R_t$
n = the sample size (how many data points were recorded).
- Reject the null hypothesis if $|r_s| > r_{cr}$, where $r_{cr}$ is a critical value taken from the Spearman table provided below (it depends on n and $\alpha$).
There are ties in the time series.
After ranking the time series from smallest to largest and assigning ranks, average the ranks of each tie separately and assign the averaged rank of each tie to all the tie members. To calculate the correlation coefficient estimate, use the Pearson’s correlation definition for the ranks.
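Pearson’s correlation coefficient, applied to the pairs $(t, R_t)$:
$$r = \frac{\sum_{t=1}^{n}(t-\bar{t})(R_t-\bar{R})}{\sqrt{\sum_{t=1}^{n}(t-\bar{t})^2\,\sum_{t=1}^{n}(R_t-\bar{R})^2}}$$
We’ll use Excel to determine r, then test whether $\rho = 0$. If this hypothesis is rejected then trend is present. To test we use Fisher’s transformation, $F(r) = .5\ln[(1+r)/(1-r)]$.
F(r) is approximately normally distributed with $\mu_F = .5\ln[(1+\rho)/(1-\rho)]$ and $\sigma_F = 1/\sqrt{n-3}$. So under H0 ($\rho = 0$) we have $\mu_F = 0$, and the Z statistic becomes:
$$Z = \frac{F(r)}{1/\sqrt{n-3}} = .5\sqrt{n-3}\,\ln\!\left[\frac{1+r}{1-r}\right]$$
H0 is rejected if $|Z| > Z_{\alpha/2}$.
The following two examples demonstrate the procedures when $n \le 30$. The large sample case is discussed later.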
Example 1(a)
Test whether or not the following time series of sales exhibits any trend.
Time t / Sales / Sorted Sales / Time of Sorted Sales / Ranks / $d_t^2$
1 / 9 / 5 / 11 / 1 / 100
2 / 6 / 5.5 / 13 / 2 / 121
3 / 13 / 6 / 2 / 3 / 1
4 / 11 / 7 / 5 / 4 / 1
5 / 7 / 7.5 / 7 / 5 / 4
6 / 8 / 8 / 6 / 6 / 0
7 / 7.5 / 9 / 1 / 7 / 36
8 / 10 / 10 / 8 / 8 / 0
9 / 14 / 11 / 4 / 9 / 25
10 / 11.5 / 11.5 / 10 / 10 / 0
11 / 5 / 12 / 12 / 11 / 1
12 / 12 / 13 / 3 / 12 / 81
13 / 5.5 / 14 / 9 / 13 / 16
No ties were observed. We use Spearman’s coefficient with $\sum d_t^2 = 386$: $r_s = 1 - 6(386)/[13(13^2-1)] = -.06$
The critical value from Spearman’s table is r(n=13, .05) = .5549. Since $|r_s| = .06 < .5549$, there is insufficient evidence to support the presence of trend at the 5% significance level.
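The steps of Example 1(a) can be reproduced with a short script. This is a minimal sketch for the no-ties case only; the function names are hypothetical, and the critical value .5549 is simply the table value quoted in the example.

```python
# Sales data from Example 1(a), periods t = 1..13
sales = [9, 6, 13, 11, 7, 8, 7.5, 10, 14, 11.5, 5, 12, 5.5]

def ranks_no_ties(values):
    """Rank values from smallest (1) to largest (n); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman_rs(values):
    """Spearman coefficient between time t = 1..n and the ranks R_t."""
    n = len(values)
    r = ranks_no_ties(values)
    sum_d2 = sum((t - r[t - 1]) ** 2 for t in range(1, n + 1))
    rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return rs, sum_d2

rs, sum_d2 = spearman_rs(sales)
r_critical = 0.5549            # value quoted in the example for n = 13
reject_h0 = abs(rs) > r_critical
print(sum_d2, round(rs, 4), reject_h0)
```

The script returns the same sum of squared differences (386) and coefficient (about -.06) as the table, so H0 is not rejected.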
Spearman critical values
One-Tailed : .001 .005 .010 .025 .050 .100
Two-Tailed : .002 .010 .020 .050 .100 .200 n
Example 1(b): There are ties in the time series.
Time t / Sales / Sorted Sales / Ranks / Tie Breaking Avg Ranks / Time of Sorted Sales
1 / 1 / 1 / 1 / 1 / 1
2 / 4 / 3 / 2 / 2.5 / 3
3 / 3 / 3 / 3 / 2.5 / 6
4 / 6 / 4 / 4 / 5.5 / 2
5 / 6 / 4 / 5 / 5.5 / 7
6 / 3 / 4 / 6 / 5.5 / 13
7 / 4 / 4 / 7 / 5.5 / 14
8 / 5 / 5 / 8 / 10.5 / 8
9 / 5 / 5 / 9 / 10.5 / 9
10 / 9 / 5 / 10 / 10.5 / 12
11 / 8 / 5 / 11 / 10.5 / 16
12 / 5 / 5 / 12 / 10.5 / 22
13 / 4 / 5 / 13 / 10.5 / 26
14 / 4 / 6 / 14 / 15 / 4
15 / 7 / 6 / 15 / 15 / 5
16 / 5 / 6 / 16 / 15 / 17
17 / 6 / 7 / 17 / 18 / 15
18 / 7 / 7 / 18 / 18 / 18
19 / 11 / 7 / 19 / 18 / 20
20 / 7 / 8 / 20 / 21 / 11
21 / 8 / 8 / 21 / 21 / 21
22 / 5 / 8 / 22 / 21 / 24
23 / 10 / 9 / 23 / 23.5 / 10
24 / 8 / 9 / 24 / 23.5 / 27
25 / 12 / 10 / 25 / 25 / 23
26 / 5 / 11 / 26 / 27 / 19
27 / 9 / 11 / 27 / 27 / 28
28 / 11 / 11 / 28 / 27 / 29
29 / 11 / 12 / 29 / 29 / 25
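For the tied case, the procedure described above (average the tied ranks, compute Pearson’s r between time and the ranks, then apply Fisher’s transformation) can be sketched as follows. This is an illustrative sketch, not the Excel calculation used in the notes; the function names are hypothetical, and 1.96 is the standard normal critical value for a two-sided 5% test.

```python
import math

# Sales data from Example 1(b), periods t = 1..29
sales = [1, 4, 3, 6, 6, 3, 4, 5, 5, 9, 8, 5, 4, 4, 7, 5, 6, 7, 11, 7,
         8, 5, 10, 8, 12, 5, 9, 11, 11]

def average_ranks(values):
    """Rank smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

n = len(sales)
time = list(range(1, n + 1))
ranks = average_ranks(sales)
r = pearson(time, ranks)

# Fisher's transformation under H0: rho = 0
z = 0.5 * math.sqrt(n - 3) * math.log((1 + r) / (1 - r))
reject_h0 = abs(z) > 1.96
print(round(r, 4), round(z, 4), reject_h0)
```

Trend is indicated at the 5% level when the printed flag is True.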
Case 2: n > 30
In cases where the number of observations is greater than 30, $r_s$ is approximately normally distributed.
When no ties occur the test statistic is calculated as follows:
$$z = r_s\sqrt{n-1}$$
H0 is rejected if $|z| > z_{\alpha/2}$.
With ties we determine r as before by Excel, and run a t-test where
$$t = r\sqrt{\frac{n-2}{1-r^2}}$$
H0 is rejected if $|t| > t_{\alpha/2,\,n-2}$.
1.2 Testing for trend when $\varepsilon_t$ is normal and independent
No assumption with respect to $\varepsilon_t$ was necessary when applying Daniel’s test to the data. But when there is justification to believe that the error terms form a set of independent random variables, normally distributed with a mean of zero and some standard deviation $\sigma$, more powerful tests can be utilized for testing the presence of a trend. Under this assumption linear regression becomes a very attractive tool to help determine whether trend is present.
Testing linear relationship – the linear regression approach
A t-test is utilized when testing the validity of a linear regression model, that is, when testing the existence of a linear relationship between the time variable t and the time series $Y_t$. Recall: linear regression is a model fitting procedure that determines the “best estimates” $b_0$ and $b_1$ for the coefficients $\beta_0$ and $\beta_1$ of the linear trend model $Y_t = \beta_0 + \beta_1 t + \varepsilon_t$. Once the estimated model has been determined from the available data set, a test is needed to verify that a linear relationship exists between the independent variable t and the dependent variable $Y_t$. A linear relationship exists when the slope ($\beta_1$) is not zero. So we test:
H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
T statistic: $t = b_1 / S_{b_1}$, where $S_{b_1}$ is the standard error of the slope estimate.
Rejection rule: $|t| > t_{\alpha/2,\,n-2}$
If H0 is rejected we have sufficient evidence at the $\alpha$ significance level to claim that a linear relationship exists between t and $Y_t$, which means that linear trend is present. Note that most of the time this test rejects H0 even when the trend present is non-linear!
Example 2:
A large retailer is happy to experience an increase in sales over the last 16 quarters. Quarterly sales are shown:
Yt / 177 / 187 / 211 / 237 / 220 / 255 / 300 / 351 / 401 / 360 / 520 / 575 / 545 / 558 / 560 / 580
t / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10 / 11 / 12 / 13 / 14 / 15 / 16
Observing the graph that describes this time series we recognize an upward trend.
We want to statistically verify there is a linear trend present. We run regression analysis, and obtain the Excel output shown (assume the test for normality of the regression line errors (residuals) reveals the required normality):
SUMMARY OUTPUT
Regression Statistics
Multiple R / 0.9652374
R Square / 0.9316833
Adjusted R Square / 0.9268035
Standard Error / 42.23754
Observations / 16
ANOVA
df / SS / MS / F / Significance F
Regression / 1 / 340617.3 / 340617.3 / 190.92793 / 1.5E-09
Residual / 14 / 24976.137 / 1784.0098
Total / 15 / 365593.44
Coefficients / Standard Error / t Stat / P-value / Lower 95% / Upper 95%
Intercept / 108.275 / 22.149553 / 4.8883606 / 0.0002394 / 60.768934 / 155.78107
t / 31.651471 / 2.2906522 / 13.817667 / 1.5E-09 / 26.73851 / 36.564431
Now we test whether the slope (trend) could be equal to zero.
H0: $\beta_1 = 0$; H1: $\beta_1 \neq 0$. If the p-value < $\alpha$, the null hypothesis (H0) is rejected in favor of H1, and we conclude that there is sufficient evidence (at the $\alpha$ significance level) that there is linear trend present in the time series. (This is just another approach to performing the t-test explained above.) In this case, p-value = $1.5 \times 10^{-9}$, clearly less than any practical $\alpha$ value, so we conclude there is overwhelming evidence for linear trend. Moreover, since $b_1 = 31.65 > 0$ the trend is positive.
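The key figures of the Excel output for Example 2 can be reproduced with ordinary least squares formulas. The sketch below is one way to do it in plain Python; the variable names are hypothetical, and the critical value 2.145 is the standard t-table value for 14 degrees of freedom at a two-sided 5% level (an assumption about the chosen $\alpha$, which the example leaves open).

```python
import math

# Quarterly sales from Example 2, t = 1..16
y = [177, 187, 211, 237, 220, 255, 300, 351,
     401, 360, 520, 575, 545, 558, 560, 580]
n = len(y)
t = list(range(1, n + 1))

t_bar = sum(t) / n
y_bar = sum(y) / n
sxx = sum((ti - t_bar) ** 2 for ti in t)
sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))

b1 = sxy / sxx                      # slope estimate
b0 = y_bar - b1 * t_bar             # intercept estimate

# Standard error of the regression and of the slope
sse = sum((yi - (b0 + b1 * ti)) ** 2 for ti, yi in zip(t, y))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(sxx)

t_stat = b1 / se_b1                 # test statistic for H0: beta1 = 0
t_critical = 2.145                  # t(.025, 14) from a standard t table
reject_h0 = abs(t_stat) > t_critical
print(round(b0, 3), round(b1, 4), round(s, 4), round(t_stat, 4), reject_h0)
```

The slope, intercept, standard error and t statistic agree with the corresponding cells of the output above.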
Once the presence of trend has been established, a forecasting model needs to be fitted to the data.
2.1 Forecasting Linear Trend with Linear Regression
Linear regression with ‘time’ as the independent variable is a popular single forecast linear trend model. The model formulation is $Y_t = \beta_0 + \beta_1 t + \varepsilon_t$. The parameters $\beta_0$ and $\beta_1$ are estimated by $b_0$ and $b_1$.
- Forecasting p periods into the future
(6) Ft+p = b0 + b1(t+p), p = 1, 2, ...
- Prediction interval
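Recall that the error variable for the regression model is assumed to be normally distributed and independent. When $\beta_0$, $\beta_1$ and $\sigma$ are all known, the future value $Y_{t+p}$ is normally distributed with $\mu = \beta_0 + \beta_1(t+p)$ and standard deviation $\sigma$. So the prediction interval for $Y_{t+p}$ is $F_{t+p} \pm Z_{\alpha/2}\,\sigma$. However, the more common case is when none of these parameters is known, so they need to be estimated from the sample. To estimate $\sigma$ we use the unbiased estimator $S = \sqrt{SSE/(n-2)}$. The prediction interval is calculated for the regression as follows:
(7) $F_{t+p} \pm t_{\alpha/2,\,n-2}\; S\,\sqrt{1 + \frac{1}{n} + \frac{(t+p-\bar{t})^2}{\sum_{i=1}^{n}(i-\bar{t})^2}}$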
2.2 Forecasting Linear Trend with Holt’s method
Holt’s method is an updating scheme, which can be considered an extension of the simple exponential smoothing model. It adds a slope component to the intercept estimate (of the stationary model), since it assumes the presence of linear trend. Thus in each step two estimates are made:
(i) T(t) the trend estimate as of time t: T(t)=b0+b1t.
(ii) b1(t) estimates the slope at time ‘t’.
Details are left for the student’s review.
- Forecasting p periods into the future:
(8) $F_{t+p} = T_t + b_1(t)\,p$
- Prediction interval
(9a) $F_{t+1} \pm Z_{\alpha/2}\,S$
For p = 2, 3, …
2.3 Forecasting Linear Trend with Double Moving Average
This method is based on the central location of the average.
Notation
- MA(t,k) denotes the k-period moving average as of time t for the original time series.
- MA'(t,k) denotes the k-period moving average as of time t for the series of moving averages.
- b1(t) denotes the estimated slope of the time series as of period t.
- T(t) denotes the estimated level (trend value) of the time series as of period t.
In order to build a forecast for period t+p as of time‘t’, we need to estimate the slope and the series level as of time t. This is done next.
The estimated slope of the trended model as of time t
MA(t,k) is centered (k-1)/2 periods behind t, at t-(k-1)/2.
Explanation: A ‘k-period’ moving average contains the periods t, t-1, …, t-k+1. The center of this sequence is (t + t-k+1)/2 = t-(k-1)/2
MA'(t,k) is centered (k-1) periods behind t, at t-(k-1).
Explanation: The information included in the k-period moving average of the k-period moving averages belong to periods t, t-1, … , t-2(k-1), centered at t-(k-1).
On an average basis, and assuming a linear trend, MA(t,k) - MA'(t,k) = [(k-1)/2]b1(t). This is so because the centers of the two averages are (k-1)/2 periods apart and the average is changing by b1(t) units per period. Therefore,
(10) $b_1(t) = \frac{2}{k-1}\,[MA(t,k) - MA'(t,k)]$
The estimated level of the trended model as of time t (T(t) = b0+b1t)
T(t) is the expected value of the time series as of time t. On an average basis, the value of MA(t,k) lags T(t) by [(k-1)/2]b1(t) units. Thus,
T(t) = MA(t,k) + [(k-1)/2]b1(t) = MA(t,k) + [MA(t,k) - MA'(t,k)]. This leads to
(11) $T(t) = 2\,MA(t,k) - MA'(t,k)$
With these two coefficient estimates we can now build a forecast as follows:
(12) Ft+p = T(t) + b1(t)p
The following example demonstrates the implementation of this model.
Example3:
Find the forecast for periods 14 and 15 of the following time series:
t / Yt
1 / 532
2 / 546
3 / 557
4 / 559
5 / 554
6 / 573
7 / 574
8 / 604
9 / 611
10 / 626
11 / 636
12 / 649
13 / 643
Calculating the model coefficients:
b1(13) = 2/(k-1)[MA(13,6) – MA’(13,6)] = 2/(6-1)[628.1667 – 598.08] = 12.03468
T(13) = 2MA(13,6) – MA’(13,6) = 2(628.1667) – 598.08 = 658.2534
Forecasting the time series for t +1= 14 and t +2= 15: (Note that currently t = 13).
F(14) = F(13+1) = T(13) + 1b1(13) = 658.2534 + 12.03468 = 670.29
F(15) = F(13+2) = T(13) + 2b1(13) = 658.2534 + 2(12.03468) = 682.32
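The calculation in Example 3 can be sketched in a few lines. The function names below are hypothetical. Because the script keeps full precision for MA'(13,6) (598.0833 rather than the rounded 598.08), its slope and level differ from the hand calculation in the second decimal place.

```python
# Time series from Example 3, periods t = 1..13
y = [532, 546, 557, 559, 554, 573, 574, 604, 611, 626, 636, 649, 643]

def moving_average(series, t, k):
    """k-period moving average as of period t (t is 1-based)."""
    return sum(series[t - k:t]) / k

def double_moving_average(series, t, k):
    """Return (level T(t), slope b1(t)) from the double moving average."""
    ma = moving_average(series, t, k)
    # Moving average of the last k moving averages
    ma_of_ma = sum(moving_average(series, s, k)
                   for s in range(t - k + 1, t + 1)) / k
    slope = 2 / (k - 1) * (ma - ma_of_ma)
    level = 2 * ma - ma_of_ma
    return level, slope

level, slope = double_moving_average(y, t=13, k=6)
f14 = level + 1 * slope     # forecast for period 14
f15 = level + 2 * slope     # forecast for period 15
print(round(level, 4), round(slope, 4), round(f14, 2), round(f15, 2))
```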
2.4 Double Exponential Smoothing
One problem with the double moving average (DMA) model is that the first 2k-1 forecasts are lost. Double Exponential Smoothing (DES) overcomes this problem. Much the same way as we did with the DMA model, here too we’ll formulate the level and the slope estimates as follows:
Define ES(t) as the smoothed value of the time series as of time t, and ES’(t) as the smoothed value of ES(t) as of time t. Specifically,
$ES_t = \alpha Y_t + (1-\alpha)ES_{t-1}$;
$ES'_t = \alpha ES_t + (1-\alpha)ES'_{t-1}$.
The expected time series value at time t is:
$T(t) = 2ES_t - ES'_t$
The estimated slope as of time t is:
$b_1(t) = \frac{\alpha}{1-\alpha}(ES_t - ES'_t)$
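A minimal sketch of double exponential smoothing follows. It assumes the standard Brown form, in which the slope estimate carries the factor α/(1-α), and it initializes both smoothed series with the first observation (the notes do not state an initialization, so this is an assumption). The test series is a hypothetical noise-free linear trend.

```python
def double_exponential_smoothing(series, alpha):
    """Return (level T(t), slope b1(t)) as of the last period.

    Assumption: ES and ES' are both initialized with the first observation.
    """
    es = es2 = series[0]
    for y in series[1:]:
        es = alpha * y + (1 - alpha) * es        # first smoothing
        es2 = alpha * es + (1 - alpha) * es2     # smoothing of the smoothed series
    level = 2 * es - es2
    slope = alpha / (1 - alpha) * (es - es2)
    return level, slope

def forecast(level, slope, p):
    """p-period-ahead forecast."""
    return level + slope * p

# Hypothetical noise-free linear trend: Y_t = 10 + 2t, t = 1..200
y = [10 + 2 * t for t in range(1, 201)]
level, slope = double_exponential_smoothing(y, alpha=0.3)
f205 = forecast(level, slope, 5)
print(round(level, 4), round(slope, 4), round(f205, 4))
```

Once the start-up effect has died out, the estimates recover the true level and slope of a noise-free linear series, which is a quick sanity check on the two formulas.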
3. Forecasting Curvilinear Time Series
Curvilinear time series are trended in a non-linear fashion. The rate at which the time series changes from period to period can be the same, can increase or decrease in a linear fashion, or can change over time. In what follows we observe two non-linear time series. The first one is quadratic, where we apply Holt’s updating approach to update the coefficients that generate the quadratic behavior, and the second one changes exponentially over time, where a single forecast approach is implemented. Additional such models are listed at the end of this section without details.
3.1 Quadratic Trend – an Updating Scheme
In this section we present a modified Holt’s model, designed to forecast quadratic trend time series in an updating fashion. To do this we’ll need to estimate the trend component of the model as well as the quadratic function coefficients.
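Note that $Y_t = T_t + \varepsilon_t$, where $T_t = \beta_0 + \beta_1 t + \beta_2 t^2$.
First we exponentially smooth the time series three times:
$A_1 = A'_1 = A''_1 = Y_1$
(13a) $A_t = \alpha Y_t + (1-\alpha)A_{t-1}$, t = 2, 3, …
(13b) $A'_t = \alpha A_t + (1-\alpha)A'_{t-1}$
(13c) $A''_t = \alpha A'_t + (1-\alpha)A''_{t-1}$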
Secondly we compute the trend estimate and the coefficient estimates as of time ‘t’ using the following relationships:
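The relationships themselves did not survive in this copy of the notes. For reference, the standard textbook relationships for triple (Brown’s quadratic) exponential smoothing are given below. They are stated here on the assumption that the notes follow that standard form; they are consistent with the numerical forecasts of Example 4, where the quadratic coefficient enters the forecast with a factor of 1/2.

```latex
\begin{align*}
T(t)   &= 3A_t - 3A'_t + A''_t \\
b_1(t) &= \frac{\alpha}{2(1-\alpha)^2}\Big[(6-5\alpha)A_t - (10-8\alpha)A'_t + (4-3\alpha)A''_t\Big] \\
b_2(t) &= \frac{\alpha^2}{(1-\alpha)^2}\Big[A_t - 2A'_t + A''_t\Big] \\
F_{t+p} &= T(t) + b_1(t)\,p + \tfrac{1}{2}\,b_2(t)\,p^2
\end{align*}
```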
Finally, the p-period-ahead forecast can be determined by: $F_{t+p} = T(t) + b_1(t)\,p + \frac{1}{2}\,b_2(t)\,p^2$
Example 4
There has been an impressive growth in the transportation sector of the economy from the 1960s on. This was reflected by personal consumption expenditure. The following table summarizes this observation (in billions of dollars).
Expenditure / 44.8 / 47.4 / 49.5 / 54.3 / 58.4 / 60.4 / 63.3 / 69.3 / 75.5 / 80.6 / 92.3 / 105.4
Year / 1973 / 1974 / 1975 / 1976 / 1977 / 1978 / 1979 / 1980 / 1981 / 1982 / 1983 / 1984
Expenditure / 114.6 / 117.9 / 129.4 / 155.2 / 179.3 / 198.1 / 219.4 / 236.6 / 261.5 / 267.3 / 291.9 / 319.5
Draw the time series and fit a quadratic function to the data.
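We assume the time series is $Y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \varepsilon_t$.
$\alpha$ = .4
$A_1 = A'_1 = A''_1 = Y_1$ = 44.8
$A_2 = \alpha Y_2 + (1-\alpha)A_1$ = .4(44.8) + (1-.4)44.8 = 44.8
$A'_2 = \alpha A_2 + (1-\alpha)A'_1$ = .4(44.8) + (1-.4)44.8 = 44.8
$A''_2 = \alpha A'_2 + (1-\alpha)A''_1$ = .4(44.8) + (1-.4)44.8 = 44.8
$b_1(2)$ = 0
$b_2(2)$ = 0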
A3 = .4Y3 + (1-.4)A2 = .4(57.2) + (1-.4)(44.8) = 50.904
So on.
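At t=24 we have: T(24) = 317.8359; b1(24) = 22.9511; b2(24) = .7115
So
$F_{24+1} = 317.8359 + 22.9511(1) + \frac{1}{2}(.7115)(1)^2 \approx 341.1428$
$F_{24+2} = 317.8359 + 22.9511(2) + \frac{1}{2}(.7115)(2)^2 \approx 365.1612$
$F_{24+3} = 317.8359 + 22.9511(3) + \frac{1}{2}(.7115)(3)^2 \approx 389.8911$
The alpha was optimized with Solver by minimizing MSE: $\alpha^*$ = .2873.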
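The updating scheme of this section can be sketched as below. The coefficient formulas are the standard Brown triple-smoothing relationships, assumed here rather than taken from the notes; the function names are hypothetical. To keep the check independent of the example data, the sketch is run on a hypothetical noise-free quadratic series, for which the scheme should recover the curve once the start-up effect has died out.

```python
def quadratic_smoothing(series, alpha):
    """Triple exponential smoothing; returns (T(t), b1(t), b2(t)) at the last period.

    Assumption: standard Brown relationships, with A, A', A'' initialized to Y_1.
    """
    a1 = a2 = a3 = series[0]
    for y in series[1:]:
        a1 = alpha * y + (1 - alpha) * a1
        a2 = alpha * a1 + (1 - alpha) * a2
        a3 = alpha * a2 + (1 - alpha) * a3
    level = 3 * a1 - 3 * a2 + a3
    b1 = alpha / (2 * (1 - alpha) ** 2) * (
        (6 - 5 * alpha) * a1 - (10 - 8 * alpha) * a2 + (4 - 3 * alpha) * a3)
    b2 = (alpha / (1 - alpha)) ** 2 * (a1 - 2 * a2 + a3)
    return level, b1, b2

def forecast(level, b1, b2, p):
    """p-period-ahead forecast; b2 enters with a factor of 1/2."""
    return level + b1 * p + 0.5 * b2 * p ** 2

def true_value(t):
    """Hypothetical noise-free quadratic trend used for the check."""
    return 2 + 3 * t + 0.5 * t ** 2

y = [true_value(t) for t in range(1, 301)]
level, b1, b2 = quadratic_smoothing(y, alpha=0.4)
f302 = forecast(level, b1, b2, 2)
print(round(level, 4), round(b1, 4), round(b2, 4), round(f302, 4))
```

With these conventions b2(t) estimates twice the quadratic coefficient, which is why the forecast uses one half of b2(t).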
3.2 Simple Exponential growth model – a single forecast scheme
There are applications (economic, financial) where growth is exponential with a constant growth rate. The mathematical model that describes the behavior of such a time series is:
$Y_t = \beta_0\,\beta_1^{\,t}\,\varepsilon_t$
Typical examples are:
$\beta_1$ is the rate of change of this time series (note: in terms of average behavior, the trend ratio $T_t/T_{t-1} = \beta_1$, which represents the time series changes not including the errors).
A logarithmic transformation on both sides of the equation provides a linear model in t.
$\ln(Y_t) = \ln(\beta_0) + [\ln(\beta_1)]\,t + \ln(\varepsilon_t)$, which can be re-written as $Y'_t = \beta'_0 + \beta'_1 t + \varepsilon'_t$.
We estimate the coefficients by using any selected procedure running on the transformed time series (such as linear regression), and obtain a logarithmic linear trend prediction equation.
To generate forecasts for the original time series we perform the anti-log transformation. See example next:
Example 5(a)
A credit union is planning its investment strategy, and is trying to forecast the monthly loan requests for the next year. It collected loan request data over the last two years. Calculate the monthly forecasts for the next quarter using both the ratio and the logarithmic transformations.
The data
Year 1 / 297 / 249 / 310 / 305 / 307 / 381 / 449 / 465 / 485 / 500 / 550 / 580
Year 2 / 610 / 688 / 702 / 801 / 965 / 958 / 977 / 1078 / 1165 / 1255 / 1248 / 1344
The time series graph is provided:
Logarithmic transformation: The transformed model is $\ln(Requests_t) = \ln\beta_0 + (\ln\beta_1)\,t + \ln\varepsilon_t$.
Linear relationship is quite clear (although tests were performed to verify this). After running regression analysis the normality of the error term was verified. The regression model is now used to perform the forecasts. The prediction equation is:
LnRequests(t) = 5.4927 + .073844t.
LnRequests(24+1) = 5.4927 + .073844(24+1) = 7.3388
Confidence limits (95%) for LnRequests are:
LCL(24+1) = LnRequests(24+1) – $t_{.025,\,24-2}\; S\,\sqrt{1 + \frac{1}{24} + \frac{(25-12.5)^2}{1150}}$ = 7.3388 – 2.07(.0669)(1.0851) = 7.188. The value of S was found on the Excel regression output under "Standard Error". The upper confidence limit for LnRequests was found in a similar manner:
UCL(24+1) = LnRequests(24+1) + $t_{.025,\,24-2}\; S\,\sqrt{1 + \frac{1}{24} + \frac{(25-12.5)^2}{1150}}$ = 7.3388 + 2.07(.0669)(1.0851) = 7.489
The loan request forecasts and their confidence limits were calculated as follows:
Request25 = exp(7.3388) = 1538.864, LCL25 = exp(7.188) = 1323.738,
UCL25 = exp(7.489) = 1788.95. (Note that antiLn X = exp(x)). The following graph demonstrates these results:
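Example 5(a) can be reproduced as follows. This is a sketch in plain Python rather than the Excel regression used in the notes; the variable names are hypothetical, and the critical value 2.074 is the standard t-table value for 22 degrees of freedom (the notes round it to 2.07). Small differences from the figures above are due to rounding.

```python
import math

# Monthly loan requests from Example 5(a), t = 1..24
requests = [297, 249, 310, 305, 307, 381, 449, 465, 485, 500, 550, 580,
            610, 688, 702, 801, 965, 958, 977, 1078, 1165, 1255, 1248, 1344]
n = len(requests)
t = list(range(1, n + 1))
ln_y = [math.log(v) for v in requests]      # logarithmic transformation

# Least squares fit of ln(Y_t) on t
t_bar = sum(t) / n
ln_bar = sum(ln_y) / n
sxx = sum((ti - t_bar) ** 2 for ti in t)
b1 = sum((ti - t_bar) * (li - ln_bar) for ti, li in zip(t, ln_y)) / sxx
b0 = ln_bar - b1 * t_bar
sse = sum((li - (b0 + b1 * ti)) ** 2 for ti, li in zip(t, ln_y))
s = math.sqrt(sse / (n - 2))

# Forecast and 95% prediction limits for period 25 on the log scale
T_CRIT = 2.074                              # t(.025, 22) from a standard t table
t_new = n + 1
ln_forecast = b0 + b1 * t_new
half_width = T_CRIT * s * math.sqrt(1 + 1 / n + (t_new - t_bar) ** 2 / sxx)

# Anti-log transformation back to the original scale
forecast = math.exp(ln_forecast)
lcl = math.exp(ln_forecast - half_width)
ucl = math.exp(ln_forecast + half_width)
print(round(b0, 4), round(b1, 5), round(forecast, 1), round(lcl, 1), round(ucl, 1))
```

Note that the limits are symmetric around the forecast on the log scale but not on the original scale, because the anti-log transformation is non-linear.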
Simple Exponential Growth – an Updating Scheme
Much the same way the logarithmic transformation was used above to produce forecasts based on the single forecast approach with linear regression, we can take advantage of the linearity of the transformed model, and apply (say) the Holt’s method to obtain Ft+p based on an updating scheme. Using the same example above, let us demonstrate a few steps of the Holt’s procedure:
Example 5(b): Let $\alpha$ = .9 (the smoothing constant for the level); $\gamma$ = .5 (the smoothing constant for the slope).
T2 = Y’2 = LnY2 = Ln249 = 5.517
b1(2)= Y’2 – Y’1 = Ln249 – Ln297 =5.517 – 5.693 = -.1763
F3= T2 + b1(2) = 5.341
T3 = .9Y’3 + (1-.9)F3 = 5.697
b1(3) = .5(T3 – T2) + (1-.5)b1(2) = .00165
F4= T3 + b1(3) = 5.698
And so on.
At t=24 we have: T24 = 7.2006; b1(24) = .0509.
So F’24+1 = 7.2006 + .0509 = 7.2515; F’24+2 = 7.2006 + 2(.0509) = 7.3024, and the time series predicted values are:
F24+1 = exp(7.2515) = 1410.22; F24+2 = exp(7.3024) = 1483.86.
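The Holt steps of Example 5(b) can be sketched as follows. The initialization (level equal to the second transformed observation, slope equal to the first difference) follows the example; the function name and the name `gamma` for the slope smoothing constant are hypothetical.

```python
import math

# Monthly loan requests from Example 5(a), t = 1..24
requests = [297, 249, 310, 305, 307, 381, 449, 465, 485, 500, 550, 580,
            610, 688, 702, 801, 965, 958, 977, 1078, 1165, 1255, 1248, 1344]
ln_y = [math.log(v) for v in requests]

def holt(series, alpha, gamma):
    """Holt's method started at t = 2; returns (level, slope, history)."""
    level = series[1]                       # T2 = Y'2
    slope = series[1] - series[0]           # b1(2) = Y'2 - Y'1
    history = [(2, level, slope)]
    for t in range(3, len(series) + 1):
        one_step = level + slope            # F_t made at t-1
        new_level = alpha * series[t - 1] + (1 - alpha) * one_step
        slope = gamma * (new_level - level) + (1 - gamma) * slope
        level = new_level
        history.append((t, level, slope))
    return level, slope, history

level, slope, history = holt(ln_y, alpha=0.9, gamma=0.5)

# Forecasts on the log scale, then the anti-log transformation
f25 = math.exp(level + 1 * slope)
f26 = math.exp(level + 2 * slope)
_, level3, slope3 = history[1]              # estimates as of t = 3
print(round(level3, 3), round(slope3, 5), round(level, 4), round(slope, 4))
```

The values printed for t = 3 match the hand calculation above (T3 = 5.697, b1(3) = .00165).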
Observe the graph:
Prediction intervals can now be constructed using the prediction interval formulas (9) provided above for Holt’s model. Let us require a 95% confidence level. S is estimated from the data as follows:
95% Prediction Interval for period 25=
95% Prediction Interval for period 26=
The antilog procedure (exp(.)) will provide the confidence limits for the future forecast values of the time series F24+1, and F24+2.
There is some criticism of the simple exponential growth model regarding the unlimited growth it allows. A correction for this problem is suggested by the following model.
3.3 The Modified Exponential Growth Model
The following model provides asymptotic growth, which can model the behavior of real processes over time: $T_t = \beta_0 + \beta_1\beta_2^{\,t}$
As opposed to the simple exponential growth model, where the time series itself changes at a constant rate, here the first differences are changing at a constant rate: $(T_{t+1} - T_t)/(T_t - T_{t-1}) = \beta_2$
Here are a few illustrations:
$\beta_1 < 0$, $\beta_2 < 1$; $\beta_1 > 0$, $\beta_2 < 1$; $\beta_1 < 0$, $\beta_2 > 1$