1) Consider the Following Partially Completed Computer Printout for a Regression Analysis

1) Consider the following partially completed computer printout for a regression analysis where the

dependent variable is the price of a personal computer and the independent variable is the size of

the hard drive.

Based on the information provided, what is the estimate for the standard error of the estimate for

the regression model?

A) Approximately 690.50 B) About 4,026

C) 476,800 D) Just under 376.23

2) A recent study of 15 shoppers showed that the correlation between the time spent in the store and

the dollars spent was 0.235. Using a significance level equal to 0.05, which of the following is the

appropriate null hypothesis to test whether the population correlation is zero?

A) Ho: μ = 0.0 B) Ho: ρ = 0.0 C) Ho: ρ ≠ 0 D) Ho: r = 0.0

3) Recently, an automobile insurance company performed a study of a random sample of 15 of its

customers to determine if there is a positive relationship between the number of miles driven and

the age of the driver. The sample correlation coefficient is r = .38. Given this information, which of

the following is the appropriate critical value for testing the null hypothesis at an alpha = .05 level?

A) t = 1.7531 B) t = 1.7709 C) t = 1.7613 D) t = 2.6104
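For questions 2 and 3, the usual test statistic for a sample correlation is t = r·sqrt(n − 2) / sqrt(1 − r²), with n − 2 degrees of freedom. The sketch below assumes that textbook formula and evaluates it for both samples. The function name is illustrative and not part of the quiz. The critical value itself still has to be read from a t table at the stated alpha.

```python
import math

def correlation_t_stat(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation r and sample size n."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)
    return t, df

# Question 2: r = 0.235, n = 15. Question 3: r = 0.38, n = 15.
t_q2, df_q2 = correlation_t_stat(0.235, 15)
t_q3, df_q3 = correlation_t_stat(0.38, 15)

print(f"Q2: t = {t_q2:.4f}, df = {df_q2}")
print(f"Q3: t = {t_q3:.4f}, df = {df_q3}")
```

Both samples have 15 observations, so both tests use 13 degrees of freedom.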

4) The following regression output was generated based on a sample of utility customers. The

dependent variable was the dollar amount of the monthly bill and the independent variable was

the size of the house in square feet.

Based on this regression output, which of the following statements is not true?

A) The average increase in the monthly power bill is about 66.4 for each additional square foot of

space in the house.

B) The correlation of the monthly power bill with the square footage of the house is 0.149.

C) The number of square feet in the house explains only about 2 percent of the variation in the

monthly power bill.

D) At an alpha level equal to 0.05, there is no basis for rejecting the hypothesis that the slope

coefficient is equal to zero.

5) When using regression analysis for descriptive purposes, which of the following is of importance?

A) The standard error of the regression slope coefficient

B) The size of the regression slope coefficient

C) The sign of the regression slope coefficient

D) All of the above

6) The assumption that the errors or residuals are independent is best checked by:

A) looking at a normal probability plot of the residuals.

B) looking at a residual plot versus x and checking for curvature.

C) looking at a scatter plot of each x versus y.

D) looking at a plot of the residuals versus time and checking for trends.

7) The following model:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + ε

A) is a convex model. B) is a linear model with interaction.

C) is a composite model. D) is a second order polynomial model.
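One way to see what the product term x₁x₂ does in the model of question 7 is to evaluate the mean response numerically. The sketch below uses hypothetical coefficients, chosen only for illustration and not taken from the quiz, and shows that the change in y per unit of x₁ depends on the value of x₂.

```python
# Hypothetical coefficients chosen only for illustration; they are not from the quiz.
B0, B1, B2, B3 = 10.0, 2.0, -1.0, 0.5

def predict(x1, x2):
    """Mean response for y = b0 + b1*x1 + b2*x2 + b3*x1*x2 (error term omitted)."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def slope_in_x1(x2):
    """Change in the mean response per unit increase in x1, holding x2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)

for x2 in (0.0, 2.0, 4.0):
    print(f"x2 = {x2}: slope in x1 = {slope_in_x1(x2)}")
```

With these assumed values the slope in x₁ equals B1 + B3·x₂, so it changes as x₂ changes. Without the product term it would be the same at every x₂.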

8) Use the following regression results to answer the question below.

Regression Statistics
Multiple R           0.8851
R Square             0.7835
Adjusted R Square    0.7474
Standard Error       5.4006
Observations         8

ANOVA
              df    SS         MS         F
Regression    1     633.242    633.242    21.711
Residual      6     175.000    29.167
Total         7     808.242

              Coefficients    Standard Error    t Stat      P-value
Intercept     5.93118         4.17721           1.41989     0.20545
Total Bill    -2.71551        0.58279           -4.65952    0.00347

In conducting a hypothesis test of the slope using a 0.05 level of significance, which of the following

is correct?

A) The slope differs significantly from 0 because p = 0.003 is less than 0.05.

B) The slope differs significantly from 0 because p = 0.205 is greater than 0.05.

C) The slope does not differ significantly from 0 because p = 0.003 is less than 0.05.

D) The slope does not differ significantly from 0 because p = 0.205 is greater than 0.05.
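The figures in the question 8 printout are tied together by standard identities for simple linear regression. As a rough check, assuming those textbook relationships, the sketch below recomputes several entries from the sums of squares and coefficients shown in the printout. The variable names are illustrative.

```python
import math

# Values copied from the printout in question 8.
ss_regression, ss_residual, ss_total = 633.242, 175.000, 808.242
df_regression, df_residual = 1, 6
slope, slope_se = -2.71551, 0.58279
n = 8

r_squared = ss_regression / ss_total                    # R Square
ms_residual = ss_residual / df_residual                 # MS for the Residual row
std_error_of_estimate = math.sqrt(ms_residual)          # Standard Error
f_stat = (ss_regression / df_regression) / ms_residual  # F
t_slope = slope / slope_se                              # t Stat for the slope row
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

print(f"R Square          = {r_squared:.4f}")
print(f"Adjusted R Square = {adj_r_squared:.4f}")
print(f"Standard Error    = {std_error_of_estimate:.4f}")
print(f"F                 = {f_stat:.3f}")
print(f"t Stat (slope)    = {t_slope:.5f}")
print(f"t squared         = {t_slope ** 2:.3f}")
```

In a model with one independent variable, the squared t statistic for the slope matches the F statistic up to rounding, which is a quick way to confirm which row of the coefficient table belongs to the slope.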

9) The editors of a national automotive magazine recently studied 30 different automobiles sold in the

United States with the intent of seeing whether they could develop a multiple regression model to

explain the variation in highway mileage per gallon. A number of different independent variables

were collected. The following correlation matrix was developed:

If only one variable were to be brought into the model, which variable should it be if the goal is to

explain the highest possible percentage of variation in the dependent variable?

A) Horsepower B) Displacement C) Curb weight D) 0 to 60 mph

10) In analyzing the residuals to determine whether the simple regression analysis satisfies the

regression assumptions, which of the following is true?

A) The histogram of the residuals should be approximately bell-shaped.

B) The scatter plot of the residuals against the dependent variable should illustrate that the

variation in residuals is the same over all levels of y.

C) Neither A nor B is true.

D) Both A and B are true.

True/False

11) In a study of 30 customers' utility bills in which the monthly bill was the dependent variable and

the number of square feet in the house was the independent variable, the resulting regression model

is y = 23.40 + 0.4x. Given this model, for a customer with a 2,000 square foot house and a monthly

utility bill equal to $100.00, the residual from the regression model is approximately -$3.40.
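A residual is the observed value minus the fitted value, with its sign kept. The sketch below applies that definition to the model and customer described in question 11. The function and variable names are illustrative.

```python
def predict_bill(square_feet):
    """Fitted monthly bill from the model in question 11: y-hat = 23.40 + 0.4x."""
    return 23.40 + 0.4 * square_feet

actual_bill = 100.00
predicted_bill = predict_bill(2000)
residual = actual_bill - predicted_bill  # signed: actual minus predicted

print(f"predicted bill = {predicted_bill:.2f}")
print(f"residual       = {residual:.2f}")
```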

12) A bank is interested in determining whether its customers' checking balances are linearly related to

their savings balances. A sample of n = 20 customers was selected and the correlation was

calculated to be +0.40. If the bank is interested in testing to see whether there is a significant linear

relationship between the two variables using a significance level of .05, the correct null and

alternative hypotheses to test are:

Ho: r = 0.0

Hα: r ≠ 0.0.

13) In curvilinear regression modeling, a composite model is one that contains either the basic terms or

the interactive terms but not both.

14) The standard error of the estimate for a simple linear regression model measures the variation in

the slope coefficient from sample to sample.
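The standard error of the estimate and the standard error of the slope are two different quantities, and computing both side by side makes the distinction in question 14 concrete. The sketch below uses a small hypothetical data set, made up for illustration, together with the usual simple-regression formulas s_e = sqrt(SSE / (n − 2)) and s_b1 = s_e / sqrt(Σ(x − x̄)²).

```python
import math

# Hypothetical (x, y) data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals are actual minus fitted values; SSE is their sum of squares.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)

# Spread of the observed y values around the fitted line.
std_error_of_estimate = math.sqrt(sse / (n - 2))
# Sample-to-sample variability of the slope estimate b1.
std_error_of_slope = std_error_of_estimate / math.sqrt(s_xx)

print(f"slope b1                   = {b1:.4f}")
print(f"standard error of estimate = {std_error_of_estimate:.4f}")
print(f"standard error of slope    = {std_error_of_slope:.4f}")
```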

15) In multiple regression analysis, the residual is the absolute difference between the actual value of y

and the predicted value of y.