Stat 4220 Homework

Due March 27

1)A research group is studying whether the height of a building affects the ground temperature around it. To find out, they randomly sampled 250 buildings and recorded each building's height and the surrounding ground temperature. The regression output is shown below. Test whether there is evidence that building height affects ground temperature.

Regression Analysis: Temperature versus Height

The regression equation is

Temperature = 75.0 - 0.00164 Height

Predictor / Coef / SE Coef / T / P
Constant / 74.9795 / 0.3069 / 244.35 / 0.000
Height / -0.001638 / 0.001072 / -1.53 / 0.127

S = 3.34230 R-Sq = 0.5% R-Sq(adj) = 0

2)A study on the average wattage of Laramie power lines sampled 100 randomly chosen power lines and found a 99% confidence interval of (1158.8, 1262.3) watts.

Which of the following sentences is statistically accurate?

a)If we did a new study of power lines in Laramie we have a 99% probability of getting a confidence interval for the Wattage between 1158.8 and 1262.3

b)We are 99% confident that the true population average Wattage for all power lines anywhere in the US is between 1158.8 and 1262.3

c)Of all Laramie power lines 99% of them will have a Wattage level that falls within the interval (1158.8, 1262.3)

d)There is a 99% probability the next confidence interval done would correctly capture the true average wattage of power lines in Laramie

e)99% of the time that an interval on the wattage of power lines is made from Laramie power lines the population average will be in (1158.8, 1262.3)

3)A regression model was fit to determine how time studying for a test affects grade. The plot of the residuals is given below. Based on this plot which assumptions necessary for regression do you think may have been violated?

4)Which of the following data sets has the highest value of R?

A)    B)    C)    D)

A study on how the time of exercise affects heart rate had the following output

5)According to the output, if I exercise for time=150, what should be my heart rate?

6)After exercising everybody has different heart rates, which means there is a lot of variability in heart rates. How much of that variability is explained by exercise time?

A study was done to compare tree height with trunk thickness. The following output was generated from the regression model.

Simple linear regression results:
Dependent Variable: Tree Height
Independent Variable: Trunk Size
Height = 26.540844 + 8.024617 Trunk
Sample size: 51
R (correlation coefficient) = 0.9415
R-sq = 0.88648456
Estimate of error standard deviation: 8.624407

Parameter estimates:

Parameter / Estimate / Std. Err.
Intercept / 26.540844 / 2.7416365
Trunk / 8.024617 / 0.41022122

7)Assuming the conditions are met, test whether trunk size is a good predictor of tree height.
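The output for question 7 lists the slope and its standard error but no T-Stat column. A minimal sketch of how that statistic could be computed from the printed values is shown here. The critical value 2.01 is an assumption: it approximates the two-sided 5% t-table value for 49 degrees of freedom and is not part of the output.

```python
# Values copied from the tree-height regression output.
slope = 8.024617
slope_se = 0.41022122
n = 51

# Simple linear regression estimates two parameters, so df = n - 2.
df = n - 2

# Test statistic for H0: slope = 0.
t_stat = slope / slope_se

# Assumption: approximate two-sided 5% critical value for df = 49.
T_CRIT_APPROX = 2.01

reject_null = abs(t_stat) > T_CRIT_APPROX
print(f"df = {df}, t = {t_stat:.2f}, reject H0: {reject_null}")
```

The same ratio, estimate divided by standard error, is what the T-Stat column reports in the other outputs on this sheet.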

The output below studies whether salary should increase each year that you get older.

Simple linear regression results:
Dependent Variable: Salary
Independent Variable: Age
Salary = 43130.348 + 8.739329 Age
Sample size: 100
R (correlation coefficient) = 0.0142
R-sq = 2.0028706E-4
Estimate of error standard deviation: 10103.091

Parameter estimates:

Parameter / Estimate / Std. Err. / DF / T-Stat / P-Value
Intercept / 43130.348 / 3426.885 / 98 / 12.5858755 / <0.0001
Slope / 8.739329 / 62.372787 / 98 / 0.14011447 / 0.00489

8)Would it be a good idea to use this model to predict salary given a specific age?

9)The temperature of the reactor in a nuclear submarine is normally distributed. A random sample of 3 different times showed an average temperature of 324°C with a standard deviation of 54°C. Find a 95% confidence interval for the true average temperature of the sub’s reactor.
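As a sketch of the arithmetic behind question 9, the one-sample t-interval is mean ± t* · s/√n. The value t* = 4.303 is an assumption taken from a standard t-table (two-sided 95%, 2 degrees of freedom); the other numbers come from the question.

```python
import math

# Sample summary from the question.
n = 3
mean = 324.0   # degrees C
sd = 54.0      # degrees C

# Assumption: t-table value for 95% confidence with df = n - 1 = 2.
T_CRIT = 4.303

std_error = sd / math.sqrt(n)
margin = T_CRIT * std_error
lower, upper = mean - margin, mean + margin

print(f"95% CI: ({lower:.1f}, {upper:.1f}) degrees C")
```

With only three observations the critical value is large, so the interval is wide.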

10)A 95% confidence interval for μ1-μ2, based on two independent samples of sizes 38 and 40, respectively, is (45.6, 56.7).

a)Is the difference between the two sample means included in the 95% confidence interval?

b)Is the difference between the two population means included in the 95% confidence interval?

c)Would the interval contain more values if the sample sizes were increased?

d)Is the probability that the difference between the two population means, μ1-μ2, falls between 45.6 and 56.7 equal to 0.95?

11)The average August temperatures (y) and geographic latitudes (x) of 20 cities in the United States were studied. The regression equation for these data is

Temperature = 113.6 – 1.01*(latitude)

  1. What is the slope of the line?
  2. Interpret the slope (how the mean August temperature is affected by a change in latitude)
  3. Estimate the mean August temperature for a city with latitude of 32.
  4. San Francisco has a latitude of 38. What would you predict for the mean August temperature of San Francisco?
  5. Given that the mean August temperature in San Francisco is actually 64, calculate the residual (prediction error) for San Francisco.
  6. The latitude at the equator is 0. Estimate the average August temperature at the equator.
  7. Explain why we should not use this equation to estimate average August temperature at the equator.
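The numeric parts of question 11 all come from plugging a latitude into the fitted equation. The sketch below does that, using a hypothetical helper name `predicted_temp` that does not appear in the original. A residual is taken as observed minus predicted.

```python
def predicted_temp(latitude: float) -> float:
    """Fitted mean August temperature from the regression equation."""
    return 113.6 - 1.01 * latitude

# Part 3: a city at latitude 32.
temp_32 = predicted_temp(32)

# Parts 4 and 5: San Francisco (latitude 38, observed mean 64).
sf_predicted = predicted_temp(38)
sf_residual = 64 - sf_predicted   # observed minus predicted

# Part 6: the equator (latitude 0) simply returns the intercept.
equator = predicted_temp(0)

print(f"{temp_32:.2f}, {sf_predicted:.2f}, {sf_residual:.2f}, {equator:.1f}")
```

A negative residual means the observed value lies below the fitted line.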

12)A car was driven 20 different times with different octane levels. Using the output from the regression, give a 99% confidence interval for the effect of octane on the car's mileage.

Simple linear regression results:
Dependent Variable: mileage
Independent Variable: octane
mileage = -53.426544 + 0.8503097 octane
Sample size: 20
R (correlation coefficient) = 0.9134
R-sq = 0.8343458
Estimate of error standard deviation: 1.8180993

Parameter estimates:

Parameter / Estimate / Std. Err. / DF / T-Stat / P-Value
Intercept / -53.426544 / 7.824635 / 18 / -6.827992 / <0.0001
Slope / 0.8503097 / 0.08930362 / 18 / 9.52156 / <0.0001
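A confidence interval for a regression slope takes the form estimate ± t* · (standard error), with the degrees of freedom shown in the DF column. The sketch below applies this to the octane output. The critical value 2.878 is an assumption read from a standard t-table (two-sided 99%, 18 degrees of freedom).

```python
# Values copied from the Slope row of the mileage output.
slope = 0.8503097
slope_se = 0.08930362
df = 18

# Assumption: t-table value for 99% confidence with df = 18.
T_CRIT_99 = 2.878

margin = T_CRIT_99 * slope_se
ci_lower, ci_upper = slope - margin, slope + margin

print(f"99% CI for slope: ({ci_lower:.3f}, {ci_upper:.3f})")
```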

13) 50 different companies are competing for a WYDOT bid to build roads. Each company submitted a sample of its asphalt for WYDOT to test. The plot below shows the relationship between asphalt strength and the asphalt tar concentration from each company.

What would be an appropriate conclusion based on this graph?

Check all that apply (there may be more than one correct answer)

______The more tar that is put into the asphalt the stronger the asphalt will be

______High levels of tar concentration are associated with stronger asphalt

______Tar causes asphalt to be stronger

______There is a correlation between tar concentration and asphalt strength

______The stronger the asphalt is the more tar that will be put into it

14)In January 2013, the journal Pediatrics published data collected from 214 mother and infant pairs of low-income African-American mothers aged 18 to 35 years in central North Carolina. Data was collected on the number of televisions in the household. The data showed a mean of 3.0 televisions with a standard deviation of 1.2.

“Maternal Characteristics and Perception of Temperament Associated with Infant TV Exposure,” Pediatrics, February 1, 2013, vol. 131, no. 2, e390-e397, doi: 10.1542/peds.2012-1224

a)Construct a 95% confidence interval for the mean number of televisions per household.

b)Interpret your interval with a proper English sentence

c)If the t-score were obtained from a computer instead of the t-table, how would the interval change?
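To illustrate the comparison in part c, the sketch below builds the interval twice with a hypothetical helper `t_interval`. Both critical values are assumptions: 1.984 is the df = 100 row that a printed t-table might force you to use, and 1.971 is roughly what software reports for the exact df = 213. Your table may have different rows.

```python
import math

def t_interval(mean: float, sd: float, n: int, t_crit: float) -> tuple[float, float]:
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Sample summary from the question.
n, mean, sd = 214, 3.0, 1.2

# Assumed critical values (see the note above).
table_ci = t_interval(mean, sd, n, t_crit=1.984)      # df = 100 row of a table
computer_ci = t_interval(mean, sd, n, t_crit=1.971)   # approx. exact df = 213

table_width = table_ci[1] - table_ci[0]
computer_width = computer_ci[1] - computer_ci[0]

print(f"table:    ({table_ci[0]:.3f}, {table_ci[1]:.3f})")
print(f"computer: ({computer_ci[0]:.3f}, {computer_ci[1]:.3f})")
```

Rounding the degrees of freedom down to a table row gives a slightly larger critical value, so the table-based interval is a little wider than the computer-based one.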