DS 303
Spring 2004
Final Exam
Name: ____KEY______
Show All your Work
1. Decision Science Associates has been asked to do a feasibility study for a proposed destination resort to be located within half a mile of the Grand Coulee Dam. Mark Craze is not happy with the regression model that used the price of a regular gallon of gasoline to predict the number of visitors to the Grand Coulee Visitor Center. After plotting the data, Mark decides to use a dummy variable to represent significant celebrations in the general area. Mark uses a 1 to represent a celebration and a 0 to represent no celebration. Mark also decides to use time as a predictor variable. Mark runs these data on the computer using Excel. The partial output is given below.
SUMMARY OUTPUT
Regression Statistics
Multiple R / 0.904651648
R Square / 0.818394605
Adjusted R Square / 0.763912986
Standard Error / 70006.05694
Observations / 14
ANOVA
df / SS / MS / F
Regression / 3 / 2.20854E+11 / 73617995859 / 15.02148113
Residual / 10 / 49008480079 / 4900848008
Total / 13 / 2.69862E+11
Coefficients / Standard Error / t Stat / P-value
Intercept / 309899.41 / 59495.89 / 5.209 / 0.00040
Time index / 24430.90 / 7240.12 / 3.374 / 0.0071
Price of gasoline / -193330.79 / 97705.70 / -1.979 / 0.076
Celebration / 217138.10 / 47412.24
a) Write the estimated least-squares regression line.
ŷ = 309899.41 + 24430.9 Time – 193330.79 Gas + 217138.1 Celeb.
b) Should we keep all the variables in the model? If no, which one do you suggest to drop and why?
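Yes, keep all three. Time index and Celebration are statistically significant predictors of the number of visitors at the 5% level (for Celebration, t = 217138.10/47412.24 = 4.58). Price of gasoline is borderline (p-value = 0.076), but since it is not far from 5% I still keep it.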
c) How are b2 and b3 interpreted here?
b2: For every dollar increase in the price of gas, with all other variables kept the same, the number of visitors goes down by about 193,331.
b3: With all other variables kept the same, the number of visitors is about 217,138 higher at times when there is a significant celebration in the area.
d) What is the estimated number of visitors if the Time index is 16, price of Gasoline is 1.45, and there is celebration in the general area.
ŷ = 309899.41 + 24430.9(16) – 193330.79(1.45) + 217138.1(1)
ŷ = 637602.26
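The substitution in part d) can be checked with a short Python sketch. The function name `predict_visitors` and the dictionary keys are illustrative choices, not part of the Excel output; the coefficients are copied from the table above.

```python
# Coefficients copied from the Excel regression output in problem 1.
coef = {
    "intercept": 309899.41,
    "time": 24430.90,
    "gas": -193330.79,
    "celebration": 217138.10,
}


def predict_visitors(time_index, gas_price, celebration):
    """Evaluate the fitted equation; celebration is the 0/1 dummy variable."""
    return (coef["intercept"]
            + coef["time"] * time_index
            + coef["gas"] * gas_price
            + coef["celebration"] * celebration)


# Part d): Time index 16, gasoline at 1.45, celebration present.
y_hat = predict_visitors(16, 1.45, 1)
print(round(y_hat, 2))
```

Setting `celebration` to 0 in the same call shows the dummy variable's effect directly: the prediction drops by exactly the Celebration coefficient.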
e) Give a 90% confidence interval for b2.
b2 ± t* s(b2), where t* = t.05, 10 = 1.812
-193330.79 ± 1.812 (97705.70)
-193330.79 ± 177042.73
(-370373.52, -16288.06)
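As a quick arithmetic check of part e), here is a minimal sketch. The critical value 1.812 is taken from the t table as in the solution rather than computed, and the variable names are illustrative.

```python
# Estimate and standard error for the gasoline coefficient (Excel output).
b2 = -193330.79
se_b2 = 97705.70

# t(.05, 10): 90% two-sided interval, 14 - 3 - 1 = 10 degrees of freedom.
t_crit = 1.812

margin = t_crit * se_b2
lower = b2 - margin
upper = b2 + margin
print(round(lower, 2), round(upper, 2))
```

Both endpoints come out negative, so the 90% interval excludes zero. That agrees with the reported P-value of 0.076, which is below 10% but above 5%.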
f) Test the overall fit of the model (state the null and alternative hypothesis, test statistic, the decision criteria, and your conclusion). Use α = 5%.
Ho: β1 = β2 = β3 = 0
Ha: Not all βi = 0
Decision criteria: Reject Ho if F > F.05, 3, 10 = 3.71
Test statistic: F = 15.02
Conclusion: F = 15.02 > 3.71, so reject Ho; at least one of these explanatory variables is a significant predictor.
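The F statistic and the adjusted R Square in the Excel output can be reproduced from the other entries. This is a sketch only: the critical value 3.71 is read from the F table as in the solution, and the variable names are illustrative.

```python
n, k = 14, 3                # observations and predictors

msr = 73617995859.0         # MS Regression from the ANOVA table
mse = 4900848008.0          # MS Residual from the ANOVA table

f_stat = msr / mse          # overall F statistic
f_crit = 3.71               # F(.05; 3, 10) from the F table
reject_h0 = f_stat > f_crit

# Adjusted R Square penalizes R Square for the number of predictors.
r2 = 0.818394605
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(f_stat, 2), reject_h0, round(adj_r2, 4))
```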
2. An ANOVA table is
Source / DF / SS / MS / F
Regression / 1 / 50 / 50 / 2.556
Error / 23 / 450 / 19.56
Total / 24 / 500
a. Complete the table.
b. How large was the sample?
25
c. Determine the coefficient of determination.
R² = SSR/SST = 50/500 = .10
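The remaining entries of the ANOVA table follow from a few identities. The sketch below assumes the regression SS, the total SS and the regression and total degrees of freedom are the given entries; which cells were blank on the original exam is an assumption.

```python
# Assumed given entries.
ssr, sst = 50.0, 500.0
df_reg, df_total = 1, 24

# Derived entries.
sse = sst - ssr                 # error sum of squares
df_err = df_total - df_reg      # error degrees of freedom
msr = ssr / df_reg              # mean square regression
mse = sse / df_err              # mean square error
f_stat = msr / mse              # F = MSR / MSE

n = df_total + 1                # sample size
r2 = ssr / sst                  # coefficient of determination

print(sse, df_err, round(mse, 2), round(f_stat, 3), n, r2)
```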
3. A sample of 25 mayoral campaigns in cities with population larger than 50,000 showed that the correlation between the percent of the vote received and the amount spent on the campaign by the candidate was .34. At the 5% significance level, is there a positive correlation between the variables? State the null and the alternative hypothesis, the test statistic, the decision criteria, and your conclusion.
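n = 25, r = .34, α = 5%
Ho: ρ = 0
Ha: ρ > 0
Decision criteria: Reject Ho if t > t.05, 23 = 1.714
Test statistic: t = r√(n – 2)/√(1 – r²) = .34√23/√(1 – .1156) = 1.73
Since t = 1.73 > 1.714, reject Ho: there is a positive correlation between the % vote received and the amount spent on the campaign.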
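The test statistic for a correlation coefficient is t = r√(n – 2)/√(1 – r²) with n – 2 degrees of freedom. A minimal sketch of the calculation follows; the critical value is the table value used in the solution and the variable names are illustrative.

```python
import math

n = 25
r = 0.34

# t statistic for Ho: rho = 0, with n - 2 = 23 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_crit = 1.714              # t(.05, 23), one-tailed, from the t table
reject_h0 = t_stat > t_crit

print(round(t_stat, 3), reject_h0)
```

The statistic only just clears the critical value, so the conclusion would change at a stricter significance level.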
4. You test for serial correlation, at the .05 level with 32 residuals from a regression with two independent variables. If the calculated Durbin-Watson statistic is equal to 1.0, what is your conclusion? State the null, and the alternative hypothesis, the decision criteria, and your conclusion.
α = .05, DW = 1, n = 32, k = 2
Ho: ρ = 0
Ha: ρ ≠ 0
From the table: L = 1.31, U = 1.57
Since DW = 1 < L = 1.31, reject Ho.
There is serial correlation.
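The Durbin-Watson decision rule can be written as a small function. This is a sketch of the usual bounds test, including the mirrored regions 4 – U and 4 – L for negative serial correlation; the function name and the wording of the return values are illustrative.

```python
def durbin_watson_decision(dw, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using tabulated bounds L and U."""
    if dw < d_lower:
        return "positive serial correlation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4 - d_upper:
        return "no serial correlation"
    if dw <= 4 - d_lower:
        return "inconclusive"
    return "negative serial correlation"


# Problem 4: DW = 1.0 with L = 1.31 and U = 1.57 (n = 32, k = 2).
print(durbin_watson_decision(1.0, 1.31, 1.57))
```

Because DW = 1.0 falls below L, the evidence here points to positive serial correlation.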
5. A tanning parlor located in a major shopping center near a large New England city has the following history of customers over the last four years (data are in hundreds of customers):
Year / Quarter / Number of Customers / Moving Average / Centered Moving Average / CMA Trend / Seasonal Factor / Seasonal Index / Cycle Factor
1 / 1 / 3.50 / -- / -- / -- / -- / -- / --
1 / 2 / 2.90 / -- / -- / -- / -- / -- / --
1 / 3 / 2.00 / 2.9 / 2.975 / 2.90 / .672 / .73 / 1.026
1 / 4 / 3.20 / 3.05 / 3.113 / 3.10 / 1.028 / .98 / 1.004
2 / 1 / 4.10 / 3.175 / 3.288 / 3.29 / 1.247 / 1.27 / .999
2 / 2 / 3.40 / 3.4 / 3.45 / 3.49 / .986 / 1.01 / .989
2 / 3 / 2.90 / 3.5 / 3.638 / 3.68 / .797 / -- / .989
2 / 4 / 3.60 / 3.775 / 3.913 / 3.88 / .920 / -- / 1.009
3 / 1 / 5.20 / 4.05 / 4.075 / 4.08 / 1.276 / -- / .999
3 / 2 / 4.50 / 4.1 / 4.213 / 4.27 / 1.068 / -- / .987
3 / 3 / 3.10 / 4.325 / 4.438 / 4.47 / .699 / -- / .993
3 / 4 / 4.50 / 4.55 / 4.613 / 4.66 / .976 / -- / .999
4 / 1 / 6.10 / 4.675 / 4.838 / 4.86 / 1.261 / -- / .995
4 / 2 / 5.00 / 5 / 5.188 / 5.05 / .964 / -- / 1.027
4 / 3 / 4.40 / 5.375 / -- / 5.25 / -- / -- / --
4 / 4 / 6.00 / -- / -- / 5.45 / -- / -- / --
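The moving-average, centered-moving-average and seasonal-factor columns in the table above can be reproduced with the sketch below. The seasonal indexes are computed here by averaging the factors for each quarter and rescaling so the four indexes average 1; that normalization is an assumption about the method, and the variable names are illustrative.

```python
customers = [3.5, 2.9, 2.0, 3.2, 4.1, 3.4, 2.9, 3.6,
             5.2, 4.5, 3.1, 4.5, 6.1, 5.0, 4.4, 6.0]

# Four-period moving averages; ma[0] is listed against year 1, quarter 3.
ma = [sum(customers[i:i + 4]) / 4 for i in range(len(customers) - 3)]

# Centered moving averages: mean of two adjacent moving averages.
# cma[j] lines up with customers[j + 2].
cma = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]

# Seasonal factor = actual value / centered moving average.
sf = [customers[j + 2] / cma[j] for j in range(len(cma))]

# Group the factors by quarter (0 = Q1, ..., 3 = Q4) and average them.
by_quarter = {q: [] for q in range(4)}
for j, factor in enumerate(sf):
    by_quarter[(j + 2) % 4].append(factor)
raw_index = [sum(by_quarter[q]) / len(by_quarter[q]) for q in range(4)]

# Rescale so the four seasonal indexes average to 1 (assumed normalization).
scale = 4 / sum(raw_index)
seasonal_index = [value * scale for value in raw_index]

print([round(v, 3) for v in cma])
print([round(v, 2) for v in seasonal_index])
```

Two observations are lost at each end of the series when centering a four-period moving average, which is why the centered column covers only 12 of the 16 quarters.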
a) Find a four period moving average for each quarter.
b) Find the centered moving average for the sample.
c) Find the seasonal factors and the seasonal indexes.
d) Find the cycle factors.
e) Use the multiplicative decomposition method to forecast the number of customers
for each quarter of year 4.
ŷ1 = 6.09
ŷ2 = 5.24
ŷ3 = 3.81
ŷ4 = 5.29
f) Evaluate the RMSE for year 4.
e1 = 6.10 – 6.09 = .01
e2 = 5 – 5.24 = -.24
e3 = 4.40 – 3.81 = .59
e4 = 6 – 5.29 = .71
Σ(y – ŷ)² = .9099
RMSE = √(.9099/4) = √.227 = .48
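The RMSE arithmetic in part f) can be verified with a few lines of Python. The forecasts are the year 4 values from part e); the variable names are illustrative.

```python
import math

actual = [6.10, 5.00, 4.40, 6.00]      # year 4 customers, quarters 1 to 4
forecast = [6.09, 5.24, 3.81, 5.29]    # decomposition forecasts from part e)

errors = [a - f for a, f in zip(actual, forecast)]
sum_sq = sum(e * e for e in errors)            # sum of squared errors
rmse = math.sqrt(sum_sq / len(errors))         # root mean squared error

print(round(sum_sq, 4), round(rmse, 2))
```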
Multiple Choice Questions
Select the best answer
1. Seasonal components
A) cannot be predicted.
B) are regular repeated patterns.
C) are long runs of observations above or below the trend line.
D) reflect a shift in the series over time.
2. Forecast errors
A) are the difference in successive values of a time series
B) are the differences between actual and forecast values
C) should all be nonnegative
D) should be summed to judge the goodness of a forecasting model
3. To select a value for α for exponential smoothing
A) use a small α when the series varies substantially.
B) use a large α when the series has little random variability.
C) use any value between 0 and 1
D) All of the alternatives are true.
4. Quantitative forecasting methods do not require that patterns from the past will necessarily continue in the future.
A) True B) False
5. All quarterly time series contain seasonality.
A) True B) False
6. You are given a time series of sales data with 10 observations. You construct forecasts according to last period’s actual level of sales. How many data points will be lost in the forecast process relative to the original data series?
A) One.
B) Two.
C) Three.
D) Zero.
E) None of the above.
7. If a sample of n= 30 is used to estimate a multiple regression having four independent variables, the degrees of freedom for the F distribution test of model significance are
A) 3 and 24
B) 2 and 24
C) 3 and 23
D) 3 and 21
E) None of the above
8. How many initial values must the forecaster set using Holt's exponential smoothing?
A) 0.
B) 1.
C) 2.
D) 3.
E) None of the above.
9. Which of the following is not correct? Seasonality in a time series dataset containing quarterly observations can be handled by
A) using four dummy variables, one for each season.
B) using only three dummy variables.
C) using Winters' smoothing.
D) deseasonalizing the data and then applying nonseasonal methods.
10. In the linear model, the slope coefficient bi measures the expected change in Y per unit change in Xi given the other independent variables are fixed.
A) True B) False C) Not enough information
11. Which of the following is not a technique used to generate forecasts with time series decomposition?
A) Moving averages.
B) Trend projection.
C) Multiplicative seasonality.
D) Dummy variables.
E) All of the above.
12. In time-series decomposition analysis, decomposition refers to:
A) converting an annual trend line into a monthly trend line.
B) deseasonalizing the data.
C) separating a time series into component parts.
D) isolating the cyclical component of a time series.
E) None of the above.
13. When calculating centered moving-averages using a 4-period moving average, how many data points are lost at the beginning of the original series?
A) 1.
B) 2.
C) 3.
D) 4.
E) None of the above.
14. A seasonal index number of 1.80 for quarter one of an automobile parts manufacturer suggests
A) Quarter one sales are 80% above the norm.
B) Quarter one sales are 1.80% below the norm.
C) Quarter one sales are 20% below the norm.
D) Quarter one sales are 80% below the norm.
E) None of the above.
15. Quarter one sales for a tire manufacturer were $120,000,000. If the quarter one seasonal index was 1.20, what is an estimate of annual sales for this firm?
A) $100,000,000.
B) $144,000,000.
C) $400,000,000.
D) $576,000,000.
E) None of the above.
16. Suppose Nike sales are expected to be 1.2 billion dollars for the year 2005. If the January seasonal index for Nike is .98, what is a reasonable estimate for January 2005 sales revenue?
A) .098 billion.
B) .1 billion.
C) 1.176 billion.
D) 2.18 billion.
Note: The next three questions relate to the following data:
Time Period / Actual Series / Forecast Series / Forecast Error
1 / 100 / 100 / 0
2 / 110 / -- / --
3 / 115 / -- / --
17. If a smoothing constant of .3 is used, what is the exponentially smoothed forecast for period 4?
A) 106.6.
B) 103.0.
C) 115.0.
D) 112.6.
E) 104.4.
18. What is the forecast error for period 3?
A) +3.
B) +12.
C) +12.
D) -7.
E) +7.
19. If a three-month moving-average model is used, what is the forecast for period 4?
A) 104.4.
B) 106.6.
C) 107.1.
D) 108.3.
E) 110.2.
20. If the smoothing constant were chosen to be unity, the exponential smoothing model would equal
A) moving average smoothing.
B) Holt's exponential smoothing.
C) the simple naive model.
D) Winters' exponential smoothing.
E) moving average smoothing with a one-year lag.
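The calculations behind questions 17 through 20 can be sketched as follows, assuming the usual simple exponential smoothing recursion F(t+1) = αA(t) + (1 – α)F(t), started from the period 1 forecast of 100 given in the table. The function name `exp_smooth_forecasts` is illustrative.

```python
def exp_smooth_forecasts(actual, alpha, first_forecast):
    """Return forecasts for periods 1..n+1 via F[t+1] = alpha*A[t] + (1-alpha)*F[t]."""
    forecasts = [first_forecast]
    for value in actual:
        forecasts.append(alpha * value + (1 - alpha) * forecasts[-1])
    return forecasts


actual = [100, 110, 115]

# Questions 17 and 18: smoothing constant .3, period 1 forecast of 100.
f = exp_smooth_forecasts(actual, alpha=0.3, first_forecast=100)
forecast_4 = f[3]                  # forecast for period 4
error_3 = actual[2] - f[2]         # actual minus forecast in period 3

# Question 19: three-period moving average forecast for period 4.
ma_forecast_4 = sum(actual[-3:]) / 3

# Question 20: with alpha = 1 each forecast is the previous actual value.
naive = exp_smooth_forecasts(actual, alpha=1.0, first_forecast=100)

print(round(forecast_4, 1), round(error_3, 1), round(ma_forecast_4, 1), naive[1:])
```

With the smoothing constant set to 1, the weight on the previous forecast is zero, so the recursion collapses to carrying the last observed value forward.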