Practice questions on Chapter 9 (Multiple Regression) (Solutions)
- Regression models that include more than one dependent variable are called multiple regression models. False
In multiple regression there is more than one independent variable but only one dependent variable.
- In the linear multiple regression model E(y) = α + β1x1 + β2x2 + β3x3, β2 represents the slope of the line relating y to x2 when β1 and β3 are both held fixed. False, because x1 and x3 are held fixed, not β1 and β3.
- The printout shows the results of a linear multiple regression analysis relating the sales price y of a product to the time in hours x1 and the cost of raw materials x2 needed to make the product.
a. What is the least squares prediction equation? Y-hat = -26.484 - 2.1686 (Time) + 8.142 (Materials)
b. Identify the SSE or SS-Residual from the printout. 2.809638
c. What is the standard error of the model? 1.18525
d. Which independent variable is the most significant? Materials, because its p-value is the lowest (0.0176). The lower the p-value, the greater the significance.
e. Is the overall model significant at the 0.01 level? Yes, because the p-value of the overall model is 0.004837, which is less than 0.01.
- Following is the regression output of a multiple regression model with three independent variables.
Predictor / Coeff / Std. Error / T / P
Intercept / 900 / 1589 / 0.5664 / 0.5789
X1 / 0.45 / 0.2458 / 1.8307 / 0.0858
X2 / 1.82 / 0.8506 / 2.1397 / 0.0481
X3 / 3.29 / 1.2565 / 2.6184 / 0.0186
ANOVA
Source / DF / SS / MS / F / P
Regression / 3 / 8,126,548 / 2,708,849.33 / 3.73 / 0.033
Residual / 20 / 14,523,685 / 726,184.25
Total / 23 / 22,650,233
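The derived columns in this printout follow from the others: each T value is the coefficient divided by its standard error, each MS is SS divided by DF, and F is MS-Regression divided by MS-Residual. A minimal sketch that recomputes them from the figures above (the variable names are illustrative, not part of the printout; small differences from the printed values come from rounding):

```python
# Coefficient estimates and standard errors copied from the printout.
coefficients = {
    "Intercept": (900.0, 1589.0),
    "X1": (0.45, 0.2458),
    "X2": (1.82, 0.8506),
    "X3": (3.29, 1.2565),
}

# t statistic = estimate / standard error.
t_stats = {name: est / se for name, (est, se) in coefficients.items()}

# ANOVA sums of squares and degrees of freedom copied from the printout.
ss_regression, df_regression = 8_126_548, 3
ss_residual, df_residual = 14_523_685, 20

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic = MS-Regression / MS-Residual.
f_stat = ms_regression / ms_residual

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}")
print(f"MSR = {ms_regression:.2f}, MSE = {ms_residual:.2f}, F = {f_stat:.2f}")
```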
- What is the least squares regression line? Y-hat = 900 + 0.45 X1 + 1.82 X2 + 3.29 X3
- What is the SSE? 14,523,685
- If X2 and X3 are held constant and X1 increases by 5 units, what will be the estimated average change in y? y will increase by an estimated average of 5 * 0.45 = 2.25 units.
- Which independent variable is the most significant? X3, because its p-value is the lowest (0.0186).
- Is X1 significant at the 0.05 level? No, because its p-value (0.0858) is higher than 0.05.
- Is the overall model significant at the 0.05 level? Yes, because the overall p-value of 0.033 is less than 0.05.
- What is the R-square value? SSR/SSyy = 8,126,548/22,650,233 = 0.3588
- Test the hypothesis (at alpha of 0.05) that beta1 is not zero
H0 : beta1 = 0
Ha : beta1 ≠ 0
p-value = 0.0858
Fail to reject the null, because 0.0858 > 0.05.
Conclusion: There is not sufficient evidence at the 0.05 significance level that beta1 is not zero.
- Test the hypothesis (at alpha of 0.05) that beta2 is not zero
H0 : beta2 = 0
Ha : beta2 ≠ 0
p-value = 0.0481
Reject the null, because 0.0481 < 0.05.
Conclusion: There is sufficient evidence at the 0.05 significance level that beta2 is not zero.
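The arithmetic behind these answers can be checked in a few lines. This is only a sketch using the numbers from the printout; the variable names are illustrative, and the decision rule is the usual one of comparing each p-value to alpha.

```python
# Sums of squares copied from the ANOVA table.
ss_regression = 8_126_548
ss_total = 22_650_233

# R-square = SS-Regression / SS-Total.
r_squared = ss_regression / ss_total

# Estimated average change in y when X1 rises by 5 units,
# with X2 and X3 held constant: slope times the change in X1.
b1 = 0.45
change_in_x1 = 5
estimated_change_in_y = b1 * change_in_x1

# Two-sided t-test decisions at alpha = 0.05, using the printed p-values.
alpha = 0.05
p_values = {"X1": 0.0858, "X2": 0.0481, "X3": 0.0186}
decisions = {
    name: "reject H0" if p < alpha else "fail to reject H0"
    for name, p in p_values.items()
}

# The most significant predictor is the one with the smallest p-value.
most_significant = min(p_values, key=p_values.get)

print(f"R-square = {r_squared:.4f}")
print(f"Estimated change in y = {estimated_change_in_y:.2f}")
print(decisions)
print(f"Most significant: {most_significant}")
```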
- Retail price data for n = 60 hard disk drives were recently reported in a computer magazine. Three variables were recorded for each hard disk drive:
y = Retail PRICE (measured in dollars)
x1 = Microprocessor SPEED (measured in megahertz) (Values in sample range from 10 to 40)
x2 = CHIP size (measured in computer processing units) (Values in sample range from 286 to 486)
A linear regression model was fit to the data. Part of the printout follows:
VARIABLE    DF    PARAMETER ESTIMATE    STANDARD ERROR    T         PROB > |T|
INTERCEPT   1     -373.526392           1258.1243396      -0.297    0.7676
SPEED       1     104.838940            22.36298195       4.688     0.0001
CHIP        1     3.571850              3.89422935        0.917     0.3629
a. Identify and interpret the estimate for the SPEED β-coefficient.
A) b1 = 105; For every 1-megahertz increase in SPEED, we estimate PRICE (y) to increase $105, holding CHIP fixed.
B) b1 = 105; For every $1 increase in PRICE, we estimate SPEED to increase 105 megahertz, holding CHIP fixed.
C) b1 = 3.57; For every 1-megahertz increase in SPEED, we estimate PRICE to increase $3.57, holding CHIP fixed.
D) b1 = 3.57; For every $1 increase in PRICE, we estimate SPEED to increase by about 4 megahertz, holding CHIP fixed.
Answer: A. The SPEED estimate in the printout is 104.84 (about 105), and a slope describes the change in y per unit change in x, not the reverse.
- Identify and interpret the estimate of β2.
The estimate of β2 is b2 = 3.57. Interpretation: for every one-unit increase in CHIP size, PRICE (y) is estimated to increase by an average of $3.57, holding SPEED fixed.
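The "holding the other variable fixed" interpretation can be seen by evaluating the fitted equation at two points that differ in only one variable. This is a sketch: the function name predict_price and the input values (SPEED = 25, CHIP = 386) are hypothetical, chosen inside the sample ranges given above.

```python
def predict_price(speed, chip):
    """Fitted equation from the printout: estimated PRICE in dollars."""
    return -373.526392 + 104.838940 * speed + 3.571850 * chip

# Hypothetical drive within the sample ranges (SPEED 10-40, CHIP 286-486).
base = predict_price(speed=25, chip=386)

# Raise SPEED by 1 megahertz, CHIP unchanged: the difference is b1.
speed_effect = predict_price(speed=26, chip=386) - base

# Raise CHIP by 1 unit, SPEED unchanged: the difference is b2.
chip_effect = predict_price(speed=25, chip=387) - base

print(f"Estimated price at base point: {base:.2f}")
print(f"Effect of +1 SPEED: {speed_effect:.2f}")
print(f"Effect of +1 CHIP:  {chip_effect:.2f}")
```

The differences do not depend on the base point chosen, because the model is linear with no interaction term.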
- As part of a study at a large university, data were collected on n = 224 freshmen computer science (CS) majors in a particular year. The researchers were interested in modeling y, a student’s grade point average (GPA) after three semesters, as a function of the following independent variables (recorded at the time the students enrolled in the university):
x1 = average high school grade in mathematics (HSM)
x2 = average high school grade in science (HSS)
x3 = average high school grade in English (HSE)
x4 = SAT mathematics score (SATM)
x5 = SAT verbal score (SATV)
A linear regression model was fit to data. A 95% confidence interval for β1 is (.06, .22). Interpret this result.
A) We are 95% confident that a CS freshman’s GPA increases by an amount between .06 and .22 for every 1-point increase in average HS math grade, holding x2 - x5 constant.
B) 95% of the GPAs fall within .06 to .22 of their true values.
C) We are 95% confident that a CS freshman’s HS math grade increases by an amount between .06 and .22 for every 1-point increase in GPA, holding x2 - x5 constant.
D) We are 95% confident that the mean GPA of all CS freshmen after three semesters falls between .06 and .22.
Answer: A. The interval is for the slope β1, the change in mean GPA per 1-point increase in HSM with the other variables held constant.
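Two consequences of this interval can be checked directly. The sketch below assumes the standard link between a 95% confidence interval and a two-sided test at alpha = 0.05; the 2-point increase is a hypothetical value used only for illustration.

```python
# 95% confidence interval for beta1 (HSM), copied from the question.
ci_low, ci_high = 0.06, 0.22

# If the interval excludes zero, a two-sided test of H0: beta1 = 0
# at alpha = 0.05 rejects H0.
excludes_zero = not (ci_low <= 0.0 <= ci_high)

# Interval for the change in mean GPA for a 2-point increase in HSM
# (hypothetical increase), other variables held constant: scale both ends.
increase = 2
scaled_interval = (increase * ci_low, increase * ci_high)

print(f"Interval excludes zero: {excludes_zero}")
print(f"Interval for a {increase}-point increase: "
      f"({scaled_interval[0]:.2f}, {scaled_interval[1]:.2f})")
```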
- A linear regression model was fit to data with the following results:
______
SOURCE DF SS MS F VALUE PROB > F
MODEL 5 28.64 5.73 11.69 .0001
ERROR 218 106.82 0.49
TOTAL 223 135.46
ROOT MSE 0.700 R-SQUARE 0.211
DEP MEAN 4.635 ADJ R-SQ 0.193
Interpret the value under the column heading PROB > F.
A) There is sufficient evidence (at α = .01) to conclude that the linear model is statistically useful for predicting GPA.
B) There is insufficient evidence (at α = .01) to conclude that the linear model is statistically useful for predicting GPA.
C) Over 99% of the variation in GPAs can be explained by the model.
D) Fail to reject H0 (at α = .01) where H0 says that all the betas are zero.
Answer: A. PROB > F is the p-value of the overall F test; .0001 is less than .01, so H0 (all betas are zero) is rejected.
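The summary figures in this printout can be recomputed from the sums of squares and degrees of freedom. This is a sketch with illustrative variable names; the adjusted R-square formula used is the usual 1 - MSE / (SS-Total / (n - 1)).

```python
import math

# Sums of squares and degrees of freedom copied from the printout.
ss_model, df_model = 28.64, 5
ss_error, df_error = 106.82, 218
ss_total = ss_model + ss_error
df_total = df_model + df_error  # n - 1 = 223

# Mean squares and the overall F statistic.
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error

# ROOT MSE is the square root of the error mean square.
root_mse = math.sqrt(ms_error)

# R-square and adjusted R-square.
r_square = ss_model / ss_total
adj_r_square = 1 - ms_error / (ss_total / df_total)

# Overall test at alpha = .01 using the printed p-value.
alpha, p_value = 0.01, 0.0001
model_is_useful = p_value < alpha

print(f"F = {f_value:.2f}, ROOT MSE = {root_mse:.3f}")
print(f"R-square = {r_square:.3f}, Adj R-sq = {adj_r_square:.3f}")
print(f"Reject H0 at alpha = {alpha}: {model_is_useful}")
```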