Question 1: The table represents data collected on the time spent studying (in minutes) and the resulting test grade.
Time Spent Studying (min) / 52 / 37 / 31 / 9 / 26 / 40 / 22 / 10 / 45 / 34 / 19 / 60
Grade on Test / 95 / 84 / 72 / 58 / 77 / 86 / 72 / 43 / 90 / 81 / 62 / 98
Part 1: Create a scatter plot
Determine the type of correlation (positive, negative, no correlation):
The correlation is positive because the test grades increase as the time spent studying increases.
Predict the model that will be used (linear/nonlinear):
Linear
Part 2: Find the line of best fit. Write your equation here:
I rounded the equation to the nearest hundredth:
First I took the point (52, 95) and entered it in GeoGebra. Then I used the best fit tool in the tools section.
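As a cross-check on the GeoGebra result, the line of best fit can also be worked out by hand with the least-squares formulas. The sketch below assumes an ordinary least-squares line is what the best fit tool produces; the function name `least_squares_line` is made up for this example. The model grades listed later in this answer are consistent with a line of roughly y = 0.965x + 45.55.

```python
# Study-time data copied from the table in Question 1.
minutes = [52, 37, 31, 9, 26, 40, 22, 10, 45, 34, 19, 60]
grades = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration; not part of the original worksheet.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of (x - mean_x)(y - mean_y) divided by sum of (x - mean_x)^2.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(minutes, grades)
print(f"y = {slope:.3f}x + {intercept:.2f}")
```

Here x is the time spent studying in minutes and y is the predicted grade, matching the definitions in this answer.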
What does x represent?
X represents the amount of time spent studying, in minutes.
What does y represent?
Y represents the grade on the test.
Based on your equation, write the values for y below:
Time Spent Studying (min) / 52 / 37 / 31 / 9 / 26 / 40 / 22 / 10 / 45 / 34 / 19 / 60
Grade on Test / 95 / 84 / 72 / 58 / 77 / 86 / 72 / 43 / 90 / 81 / 62 / 98
Grade per linear model / 95.73 / 81.25 / 75.47 / 54.24 / 70.64 / 84.15 / 66.78 / 55.2 / 88.98 / 78.36 / 63.89 / 103.45
Difference / 0.73 / -2.75 / 3.47 / -3.76 / -6.36 / -1.85 / -5.22 / 12.2 / -1.02 / -2.64 / 1.89 / 5.45
Note: I rounded each grade from the linear model to the nearest hundredth. Each difference is the model grade minus the actual grade.
Find the sum of the differences to calculate the residual:
Adding a negative difference is the same as subtracting it, so the sum turned out like this:
0.73 – 2.75 + 3.47 – 3.76 – 6.36 – 1.85 – 5.22 + 12.2 – 1.02 – 2.64 + 1.89 + 5.45
Which when solved is:
0.14
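The arithmetic above can be double-checked with a short script. This sketch uses the actual grades and the rounded model grades exactly as they appear in the table, and takes each difference as model minus actual, which is how the table's differences come out.

```python
# Actual grades and rounded model grades, copied from the table above.
actual = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]
model = [95.73, 81.25, 75.47, 54.24, 70.64, 84.15,
         66.78, 55.2, 88.98, 78.36, 63.89, 103.45]

# Difference = model grade minus actual grade, rounded to the nearest hundredth
# to clear away floating-point noise.
differences = [round(m - a, 2) for m, a in zip(model, actual)]

# Positive and negative differences partly cancel when they are added up.
total = round(sum(differences), 2)

print(differences)
print(total)
```

For a least-squares line the positive and negative differences nearly cancel by construction, so a sum close to zero mainly shows that the line is not consistently too high or too low.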
Do you think your equation was a good approximation of the data? Why? Why not?
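Yes. The differences add up to 0.14, which is close to zero, and most of the model's grades are within about 6 points of the actual grades. The biggest miss is 12.2 points, for 10 minutes of studying.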
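One way to back up this judgment is to measure how far off the model is without letting positive and negative differences cancel. The sketch below is only one possible check, not something the worksheet asks for: it computes the mean absolute difference and R squared (the share of the variation in grades that the line accounts for) from the table values. The variable names are chosen for this example.

```python
# Actual grades and rounded model grades, copied from the table in Question 1.
actual = [95, 84, 72, 58, 77, 86, 72, 43, 90, 81, 62, 98]
model = [95.73, 81.25, 75.47, 54.24, 70.64, 84.15,
         66.78, 55.2, 88.98, 78.36, 63.89, 103.45]

n = len(actual)

# Average size of a miss, ignoring whether the model was too high or too low.
mean_abs_diff = sum(abs(m - a) for m, a in zip(model, actual)) / n

# R squared = 1 - (sum of squared misses) / (total squared spread of the grades).
mean_actual = sum(actual) / n
ss_res = sum((a - m) ** 2 for m, a in zip(model, actual))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

print(f"mean absolute difference: {mean_abs_diff:.2f} points")
print(f"R squared: {r_squared:.2f}")
```

An R squared near 1 and a small mean absolute difference both point to a good fit; values far from that would suggest trying a different model.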
Question 2: Look back at the data from the New York Marathon.
Year / Competitors* / Year / Competitors* / Year / Competitors*
1976 / 2.1 / 1987 / 22.5 / 1997 / 31.4
1977 / 4.8 / 1988 / 23.5 / 1998 / 32.4
1978 / 9.8 / 1989 / 25 / 1999 / 32.5
1979 / 11.5 / 1990 / 25.8 / 2000 / 30
1980 / 14 / 1991 / 26.9 / 2001 / 24
1981 / 14.5 / 1992 / 28.6 / 2002 / 32.5
1982 / 14.3 / 1993 / 28.1 / 2003 / 35.3
1983 / 15.2 / 1994 / 31.1 / 2004 / 37.3
1984 / 14.6 / 1995 / 29 / 2005 / 37.6
1985 / 16.7 / 1996 / 29 / 2006 / 38.4
1986 / 20.5
* in thousands
Use this original data, but update it with the following information so that it is accurate through 2011. (The 2012 NY Marathon was cancelled due to the aftermath of Hurricane Sandy.) Use the entire data set (1976-2011) to answer all parts of Question 2.
Year / Competitors in thousands
2007 / 39.2
2008 / 37.9
2009 / 43.7
2010 / 44.8
2011 / 46.8
Part 1: Find a regression model for this new, updated data set (1976-2011).
Part 2: How well does it fit? Explain your answer and reasoning. (You could begin by comparing some of the values in your model to the actual values and see if it’s a good match)
Part 3: Use your model to predict the attendance in 2017.
The model predicts about 50.6 thousand competitors in 2017, rounded to the nearest tenth.
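The three parts of Question 2 can be sketched in one script. The answer above does not say which regression model was used, so this example assumes a linear model with x measured in years since 1976; both choices, and the helper names, are assumptions made for illustration. Under those assumptions the 2017 prediction should land close to the value reported above.

```python
# Competitors (in thousands) for 1976-2011, copied from the two tables in Question 2.
years = list(range(1976, 2012))
competitors = [
    2.1, 4.8, 9.8, 11.5, 14, 14.5, 14.3, 15.2, 14.6, 16.7, 20.5,   # 1976-1986
    22.5, 23.5, 25, 25.8, 26.9, 28.6, 28.1, 31.1, 29, 29,          # 1987-1996
    31.4, 32.4, 32.5, 30, 24, 32.5, 35.3, 37.3, 37.6, 38.4,        # 1997-2006
    39.2, 37.9, 43.7, 44.8, 46.8,                                  # 2007-2011
]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Part 1: fit the line, using years since 1976 as x (an assumption).
years_since_1976 = [year - 1976 for year in years]
slope, intercept = least_squares_line(years_since_1976, competitors)


def predict(year):
    """Predicted competitors (in thousands) for a given calendar year."""
    return slope * (year - 1976) + intercept


# Part 2: compare a few model values with the actual values.
for year in (1976, 1990, 2001, 2011):
    print(year, competitors[year - 1976], round(predict(year), 1))

# Part 3: extend the line to 2017.
prediction_2017 = predict(2017)
print("2017:", round(prediction_2017, 1))
```

Printing a few actual values next to the model values is the comparison suggested in Part 2; years where the two are far apart, such as an unusually low year, show where a straight line does not match the data well.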