Date:______ Period:______

Chapter 3: Practice Quiz – Bivariate Data

Part 1: Multiple Choice. (2 points each) Hand write the letter corresponding to the best answer in the space provided on page 6.

_____1. In a statistics course, a linear regression equation was computed to predict the final exam score from the score on the first test. The equation was ŷ = 10 + 0.9x, where y is the final exam score and x is the score on the first test. Carla scored 95 on the first test. What is the predicted value of her score on the final exam?

(a) 95

(b) 85.5

(c) 90

(d) 95.5

(e) None of the above

_____2. Refer to the previous problem. On the final exam Carla scored 98. What is the value of her residual?

(a) 98

(b) 2.5

(c) –2.5

(d) 0

(e) None of the above

_____3. A study of the fuel economy for various automobiles plotted the fuel consumption (in liters of gasoline used per 100 kilometers traveled) vs. speed (in kilometers per hour). A least squares line was fit to the data. Here is the residual plot from this least squares fit. What does the pattern of the residuals tell you about the linear model?

(a) The evidence is inconclusive.

(b) The residual plot confirms the linearity of the fuel economy data.

(c) The residual plot does not confirm the linearity of the data.

(d) The residual plot clearly contradicts the linearity of the data.

(e) None of the above

_____4. All but one of the following statements contains a blunder. Which statement is correct?

(a) There is a correlation of 0.54 between the position a football player plays and their weight.

(b) The correlation between planting rate and yield of corn was found to be r = 0.23.

(c) The correlation between the gas mileage of a car and its weight is r = 0.71 MPG.

(d) We found a high correlation (r = 1.09) between the height and age of children.

(e) We found a correlation of r = –.63 between gender and political party preference.

_____5. In regression, the residuals are which of the following?

(a) Those factors unexplained by the data

(b) The difference between the observed responses and the values predicted by the regression line

(c) Those data points which were recorded after the formal investigation was completed

(d) Possible models unexplored by the investigator

(e) None of the above

_____6. What does the square of the correlation (r²) measure?

(a) The slope of the least squares regression line

(b) The intercept of the least squares regression line

(c) The extent to which cause and effect is present in the data

(d) The fraction of the variation in the values of y that is explained by least-squares regression on the other variable

_____7. A researcher finds that the correlation between the personality traits “greed” and “superciliousness” is –.40. What percentage of the variation in greed can be explained by the relationship with superciliousness?

(a) 0%

(b) 16%

(c) 20%

(d) 40%

(e) 60%

_____8. Suppose the following information was collected, where X = diameter of tree trunk in inches, and Y = tree height in feet.

If the LSRL equation is ŷ = –3.6 + 3.1x, what is your estimate of the average height of all trees having a trunk diameter of 7 inches?

(a) 18.1

(b) 19.1

(c) 20.1

(d) 21.1

(e) 22.1

_____9. Which of the following are resistant?

(a) Least squares regression line

(b) Correlation coefficient

(c) Both the least squares line and the correlation coefficient

(d) Neither the least squares line nor the correlation coefficient

(e) It depends

_____10. If dataset A of (x,y) data has correlation coefficient r = 0.65, and a second dataset B has correlation r = –0.65, then

(a) The points in A exhibit a stronger linear association than B.

(b) The points in B exhibit a stronger linear association than A.

(c) Neither A nor B has a stronger linear association.

(d) You can’t tell which dataset has a stronger linear association without seeing the data or seeing the scatterplots.

_____11. A set of data relates the amount of annual salary raise and the performance rating. The least squares regression equation is ŷ = 1,400 + 2,000x, where ŷ is the estimated raise and x is the performance rating. Which of the following statements is not correct?

(a) For each increase of one point in performance rating, the raise will increase on average by $2,000.

(b) This equation produces predicted raises with an average error of 0.

(c) A rating of 0 will yield a predicted raise of $1,400.

(d) The correlation for the data is positive.

(e) All of the above are true.

_____12. There is a linear relationship between the number of chirps made by the striped ground cricket and the air temperature. A least squares fit of some data collected by a biologist gives the model ŷ = 25.2 + 3.3x, 9 < x < 25, where x is the number of chirps per minute and ŷ is the estimated temperature in degrees Fahrenheit. What is the estimated increase in temperature that corresponds to an increase of 5 chirps per minute?

(a) 3.3°F

(b) 16.5°F

(c) 25.2°F

(d) 28.5°F

(e) 41.7°F

_____13. A simple random sample of 35 world-ranked chess players provides the following statistics:

Number of hours of study per day: x̄ = 6.2, s_x = 1.3

Yearly winnings: ȳ = $208,000, s_y = $42,000

Correlation r = 0.15

Based on this data, what is the resulting linear regression equation?

(a) Winnings = 178,000 + 4850 Hours

(b) Winnings = 169,000 + 6300 Hours

(c) Winnings = 14,550 + 31,200 Hours

(d) Winnings = 7750 + 32,300 Hours

(e) Winnings = –52,400 + 42,000 Hours

_____14. Suppose the correlation is negative. Given two points from the scatterplot, which of the following is possible?

I. The first point has a larger x-value and a smaller y-value than the second point.

II. The first point has a larger x-value and a larger y-value than the second point.

III. The first point has a smaller x-value and a larger y-value than the second point.

(a) I and II

(b) I and III

(c) II and III

(d) I, II and III

(e) None of the above gives the complete set of true responses.

_____15. Suppose the regression line for a set of data, ŷ = 3x + b, passes through the point (2, 5). If x̄ and ȳ are the sample means of the x- and y-values, respectively, then ȳ =

(a) x̄

(b) x̄ – 2

(c) x̄ + 5

(d) 3x̄

(e) 3x̄ – 1

_____16. Consider the set of points {(2, 5), (3, 7), (4, 9), (5, 12), (10, n)}. What should the value for n be so that the correlation between the x- and y-values is 1?

(a) 21

(b) 24

(c) 25

(d) A value different from any of the above choices.

(e) No value for n can make r = 1.

_____17. Extra study sessions were offered to students after the midterm to help improve their understanding of statistics. Student scores on the midterm and the final exam were recorded. The following scatterplot shows final test scores against the midterm test scores.

[Scatterplot not shown; horizontal axis: Score_On_Midterm]

Which of the following statements correctly interprets the scatterplot?

(a) All students have shown significant improvements in the final exam scores as a result of the extra study sessions.

(b) The extra study sessions were of no help. Each student’s final exam score was about the same as his or her score on the midterm.

(c) The extra study sessions further confused students. All student scores decreased from midterm to final exam.

(d) Students who scored below 55 on the midterm showed considerable improvement on the final exam; those who scored between 55 and 80 on the midterm showed minimal improvement on the final exam; and those who scored above 80 on the midterm showed almost no improvement on the final exam.

(e) Students who scored below 55 on the midterm showed minimal improvement on the final exam; those who scored between 55 and 80 on the midterm showed moderate improvement on the final exam; and those who scored above 80 on the midterm showed considerable improvement on the final exam.

_____19. Which of the following statements about residuals are true?

I. The mean of the residuals is always zero.

II. The regression line for a residual plot is a horizontal line.

III. A definite pattern in the residual plot is an indicator that a nonlinear model will show a better fit to the data than the straight regression line.

(a) I and II

(b) I and III

(c) II and III

(d) I, II and III

(e) None of the above gives the complete set of true responses.

_____20. Data are obtained for a group of college freshmen examining their SAT scores (math plus verbal) from their senior year of high school (x) and their GPAs during their first year of college (y). The resulting regression equation is ŷ = 0.00161x + 1.35 with r = 0.632. What percentage of the variation in GPAs can be explained by looking at SAT scores?

(a) 0.0161%

(b) 16.1%

(c) 39.9%

(d) 63.2%

(e) The value cannot be computed from the information given.

PART II - Answer completely, but be concise. Write sequentially and show all steps. Show all your work. Indicate clearly the methods you used, because you will be graded on the correctness of your methods as well as on the accuracy of your results and explanations.

YOU MUST WRITE ANSWERS IN BY HAND – you need not print out the first 5 pages.

SKIP # 1

2. Each of 25 adult women was asked to provide her own height (y), in inches, and the height (x), in inches, of her father. The scatterplot below displays the results. Only 22 of the 25 pairs are distinguishable because some of the (x, y) pairs were the same. The equation of the least squares regression line is ŷ = 35.1 + 0.427x. (16 points)

(a) Draw the least squares regression line on the scatterplot below.

(b) One father’s height was x = 67 inches and his daughter’s height was y = 61 inches. Circle the point on the scatterplot above that represents this pair and draw the segment on the scatterplot that corresponds to the residual for it. Give a numerical value for the residual.

(c) Suppose the point x = 84, y = 71 is added to the data set. Would the slope of the least squares regression line increase, decrease, or remain about the same? Explain. (Note: No calculations are necessary to answer this question.)

(d) Would the correlation increase, decrease, or remain about the same? Explain. (Note: No calculations are necessary to answer this question.)