STAT 202

Recitation Problem: Simple Regression Analysis

Betty’s Bodacious Burgers sells burgers (duh), and Betty has recorded advertising expenses and gross sales data for n = 12 randomly selected months. Expense and sales values are measured in $1k units. Here’s Betty’s data and some other useful results. For accurate calculation I recommend that you work to five (or more) significant digits!

x = advertising expense ($1k units) / y = gross sales ($1k units)
2.4 / 22.4
3.1 / 31.8
4.2 / 34.7
1.8 / 12.1
3.5 / 30.2
3.7 / 39.1
4.1 / 45.9
3.6 / 28.7
3.4 / 25.2
4.1 / 38.7
3.7 / 42.8
2.9 / 33.6

(a) Find the equation of the sample regression line.

(b) Next month, Betty plans to spend $2,500 on advertising. Predict (point predictor) next month’s gross sales.

(c) Calculate sy.x, the ‘standard error of estimate.’

(d) At the 5% level of significance, test to see if gross sales is related to advertising.

(e) Refer to Part (b). Find a 95% prediction interval for next month’s gross sales.

‘Take-home’ problem:

Use Excel or Minitab to generate a regression analysis printout for this data. Verify that your results for (a), (c), and (d) match those given on the printout, give or take rounding error. Also calculate the correlation coefficient and verify that the square of your correlation coefficient matches the R-squared value on the printout.

Possibly Useful Formulas:

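CI (for the mean of y at a given x): ŷ ± t(α/2, n - 2) · sy.x · sqrt(1/n + (x - x̄)² / SSxx)

PI (for a single new y at a given x): ŷ ± t(α/2, n - 2) · sy.x · sqrt(1 + 1/n + (x - x̄)² / SSxx)

where SSxx = Σ(x - x̄)² = Σx² - (Σx)²/n.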


Solution:

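(a) First note that n = 12, Σx = 40.5, Σy = 385.2, Σx² = 142.43, and Σxy = 1361.89, so x̄ = 3.375, ȳ = 32.1, SSxx = Σx² - (Σx)²/n = 5.7425, and SSxy = Σxy - (Σx)(Σy)/n = 61.84. Then b1 = SSxy / SSxx = 61.84 / 5.7425 = 10.76883 and b0 = ȳ - b1·x̄ = 32.1 - (10.76883)(3.375) = -4.2448.

So the sample regression line is ŷ = -4.2448 + 10.76883x.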

(b) At x = 2.5, ŷ = -4.2448 + (10.76883)(2.5) = 22.6773, but that’s in $1k units, so predicted gross sales is $22,677.30.
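The arithmetic for Parts (a) and (b) can be checked with a short script. This is only a sketch using the standard least-squares formulas (slope = SSxy / SSxx, intercept = ȳ minus slope times x̄); the variable names are my own and are not part of the handout.

```python
# Betty's data from the table above (both columns in $1k units).
x = [2.4, 3.1, 4.2, 1.8, 3.5, 3.7, 4.1, 3.6, 3.4, 4.1, 3.7, 2.9]
y = [22.4, 31.8, 34.7, 12.1, 30.2, 39.1, 45.9, 28.7, 25.2, 38.7, 42.8, 33.6]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sum of squares for x and corrected sum of cross-products.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx        # sample slope
b0 = y_bar - b1 * x_bar   # sample intercept

# Part (b): point prediction for $2,500 of advertising (x = 2.5 in $1k units).
x_new = 2.5
y_hat = b0 + b1 * x_new

print(f"y-hat = {b0:.4f} + {b1:.5f} x")
print(f"predicted sales at x = {x_new}: {y_hat:.4f} ($1k units)")
```

Working from deviations about the means, rather than from the raw sums, gives the same SSxx and SSxy and is a little less sensitive to rounding.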

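(c) SSyy = Σy² - (Σy)²/n = 13328.38 - 12364.92 = 963.46 and SSE = SSyy - b1·SSxy = 963.46 - (10.76883)(61.84) = 297.516, so sy.x = sqrt(SSE / (n - 2)) = sqrt(297.516 / 10) = 5.4545.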
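As a cross-check on Part (c), the standard error of estimate can also be computed directly from the residuals. The sketch below assumes the usual definition, sy.x = sqrt(SSE / (n - 2)); the names `sse` and `s_yx` are illustrative only.

```python
import math

x = [2.4, 3.1, 4.2, 1.8, 3.5, 3.7, 4.1, 3.6, 3.4, 4.1, 3.7, 2.9]
y = [22.4, 31.8, 34.7, 12.1, 30.2, 39.1, 45.9, 28.7, 25.2, 38.7, 42.8, 33.6]
n = len(x)

# Least-squares fit (same formulas as in Part (a)).
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Error sum of squares: squared gaps between observed and fitted y values.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Two parameters were estimated, so the error degrees of freedom are n - 2.
s_yx = math.sqrt(sse / (n - 2))

print(f"SSE = {sse:.4f}, s_y.x = {s_yx:.4f}")
```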

(d) Let’s use a t-test to test H0: β1 = 0 (no relation) vs. H1: β1 ≠ 0 (is a relation).

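Test stat is t = b1 / sb1, where sb1 is the standard error of the slope:

sb1 = sy.x / sqrt(SSxx) = 5.4545 / sqrt(5.7425) = 2.2762. So

t = 10.76883 / 2.2762 = 4.7311. Degrees of freedom = n – 2 = 10, so the decision rule is to reject H0 if t < -2.2281 or t > +2.2281. Since 4.7311 > 2.2281, we reject H0 and conclude that there is a relation (the slope is different from zero).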
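The test in Part (d) can be reproduced numerically. This sketch assumes the usual slope standard error, sy.x / sqrt(SSxx), and hard-codes the critical value 2.2281 quoted above instead of looking it up from a t distribution; carried at full precision, the statistic can be compared with the "t Stat" entry for x on the printout.

```python
import math

x = [2.4, 3.1, 4.2, 1.8, 3.5, 3.7, 4.1, 3.6, 3.4, 4.1, 3.7, 2.9]
y = [22.4, 31.8, 34.7, 12.1, 30.2, 39.1, 45.9, 28.7, 25.2, 38.7, 42.8, 33.6]
n = len(x)

# Least-squares fit and standard error of estimate.
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: slope = 0.
s_b1 = s_yx / math.sqrt(ss_xx)
t_stat = b1 / s_b1

# Two-tailed 5% critical value for 10 df, taken from the text above.
T_CRIT = 2.2281
reject_h0 = abs(t_stat) > T_CRIT

print(f"s_b1 = {s_b1:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```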

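(e) The 95% prediction interval at x = 2.5 is ŷ ± t · sy.x · sqrt(1 + 1/n + (x - x̄)² / SSxx) = 22.6773 ± (2.2281)(5.4545) · sqrt(1 + 1/12 + (2.5 - 3.375)² / 5.7425) = 22.6773 ± 13.4052, or 9.2721 to 36.0825 (in $1k units).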

So there is a 0.95 probability that next month’s gross sales will fall between $9,272.10 and $36,082.50.
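The prediction interval in Part (e) can be checked the same way. The sketch below assumes the standard prediction-interval formula for one new observation (the extra "1 +" under the square root is what separates it from a confidence interval for the mean) and again takes the critical value 2.2281 from Part (d) as given.

```python
import math

x = [2.4, 3.1, 4.2, 1.8, 3.5, 3.7, 4.1, 3.6, 3.4, 4.1, 3.7, 2.9]
y = [22.4, 31.8, 34.7, 12.1, 30.2, 39.1, 45.9, 28.7, 25.2, 38.7, 42.8, 33.6]
n = len(x)

# Least-squares fit and standard error of estimate.
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))

T_CRIT = 2.2281   # t(0.025, 10 df), as quoted in Part (d)
x_new = 2.5       # $2,500 of advertising, in $1k units

y_hat = b0 + b1 * x_new
# Half-width of the interval for a single new observation at x_new.
half_width = T_CRIT * s_yx * math.sqrt(
    1 + 1 / n + (x_new - x_bar) ** 2 / ss_xx
)
lower, upper = y_hat - half_width, y_hat + half_width

print(f"95% PI: {lower:.4f} to {upper:.4f} ($1k units)")
```

The interval is wide partly because x = 2.5 sits well below the mean advertising level of 3.375, which inflates the (x - x̄)² term.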


Here’s most of the regression printout:

SUMMARY OUTPUT
Regression Statistics
Multiple R / 0.831385
R Square / 0.691201
Adjusted R Square / 0.660321
Standard Error / 5.454499
Observations / 12
ANOVA
Source / df / SS / MS / F / Significance F
Regression / 1 / 665.9443796 / 665.9444 / 22.38351 / 0.000803
Residual / 10 / 297.5156204 / 29.75156
Total / 11 / 963.46
Term / Coefficients / Standard Error / t Stat / P-value / Lower 95% / Upper 95%
Intercept / -4.2448 / 7.841777188 / -0.54131 / 0.600148 / -21.7174 / 13.22777
x / 10.76883 / 2.276168371 / 4.731121 / 0.000803 / 5.69721 / 15.84045
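
For the take-home check on the correlation coefficient, a minimal sketch is shown here. It assumes the usual formula r = SSxy / sqrt(SSxx · SSyy); the result can be compared with "Multiple R" and "R Square" on the printout. The variable names are illustrative only.

```python
import math

x = [2.4, 3.1, 4.2, 1.8, 3.5, 3.7, 4.1, 3.6, 3.4, 4.1, 3.7, 2.9]
y = [22.4, 31.8, 34.7, 12.1, 30.2, 39.1, 45.9, 28.7, 25.2, 38.7, 42.8, 33.6]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)   # "Total" SS on the printout
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation coefficient and its square.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = r ** 2

print(f"r = {r:.6f}, r^2 = {r_squared:.6f}")
```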