Introduction to Linear Regression and Correlation Analysis

1.

The following information, taken from the 1998 annual report of Baldor Electric Company, shows net sales and working capital (in thousands of dollars) for 1988 through 1998.

Year / Net Sales / Working Capital
1988 / 243463 / 67168
1989 / 281462 / 69788
1990 / 294030 / 75306
1991 / 286495 / 84740
1992 / 318930 / 97343
1993 / 356595 / 108601
1994 / 418152 / 118550
1995 / 473103 / 145069
1996 / 502875 / 146975
1997 / 557940 / 141268
1998 / 589406 / 176126
  1. Plot the variables Net Sales (y) and Working Capital (x) in scatter-plot format. What type of relationship appears to exist between Working Capital and Net Sales? Indicate whether a regression model or a correlation model would be more appropriate. Give a statistical reason for your answer.
  2. Compute the correlation coefficient between Working Capital and Net Sales. What does the correlation coefficient measure?
  3. Test whether Working Capital tends to decline when Net Sales declines. (Hint: Think about what this indicates for the value of the population correlation coefficient.) Clearly state your null and alternative hypotheses. Conduct your test at a significance level of 0.05. Be sure to state a conclusion for your test.
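
As a starting point for parts 2 and 3, the correlation coefficient and its test statistic can be sketched in plain Python using the table above. This is a minimal sketch, not a full solution: it assumes the hint is read as a one-tailed test (H0: rho <= 0 against HA: rho > 0), and the critical value 1.833 is taken from a standard t table for 9 degrees of freedom. The function names are illustrative.

```python
import math

# Data copied from the table above (thousands of dollars), 1988-1998.
working_capital = [67168, 69788, 75306, 84740, 97343, 108601,
                   118550, 145069, 146975, 141268, 176126]
net_sales = [243463, 281462, 294030, 286495, 318930, 356595,
             418152, 473103, 502875, 557940, 589406]


def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing rho, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = len(net_sales)
r = pearson_r(working_capital, net_sales)
t_stat = correlation_t_statistic(r, n)

# Assumption: one-tailed critical value t(0.05, df = 9) = 1.833 from a t table.
T_CRITICAL = 1.833
reject_h0 = t_stat > T_CRITICAL

print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The decision rule compares the computed t statistic with the table value; a p-value approach would lead to the same conclusion but needs a t-distribution routine that this sketch leaves out.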

2.

One of the editors of a major automobile publication has collected data on 30 of the best-selling cars in the United States. See the attached file Automobiles. The editor is particularly interested in the relationship between the highway mileage and the curb weight of the vehicles.

  1. Develop a scatter plot for these data. Discuss what the plot implies about the relationship between the two variables. Assume that you wish to predict highway mileage by using vehicle curb weight.
  2. Compute the correlation coefficient for the two variables and test to determine whether there is a linear relationship between the curb weight and the highway mileage of automobiles.
  3. 1. Compute the linear regression equation based on the sample data. 2. Cadillac’s 1999 Sedan DeVille weighs approximately 4,012 pounds. Provide an estimate of the average highway mileage you would expect to obtain from this model.
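
The regression equation and the point estimate in part 3 follow the usual least-squares formulas, which can be sketched as below. The Automobiles file is not reproduced here, so the sketch contains no data; `fit_line` and `predict` are illustrative names, and the two input lists are assumed to hold curb weight and highway mileage read from that file.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    return b0, b1


def predict(b0, b1, x0):
    """Point estimate of the mean response at x0."""
    return b0 + b1 * x0


# Hypothetical usage once the file has been loaded into two lists:
# b0, b1 = fit_line(curb_weight, highway_mpg)
# estimate = predict(b0, b1, 4012)
```

A negative slope would be expected if heavier cars get lower mileage, which is worth checking against the scatter plot from part 1.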

3.

Referring again to the automobile magazine editor discussed in Exercise 2, the editor now wants to examine the relationship between the price of a vehicle and the horsepower of its engine.

  1. 1. Develop a scatter plot for these data. 2. Discuss what the plot implies about the relationship between the two variables. Use the price as the dependent (y) variable.
  2. Compute the correlation coefficient for the two variables.
  3. Compute the linear regression equation based on the sample data.
  4. Toyota’s 1999 Camry four-cylinder model generates 133 horsepower. Provide an estimate of the price of the 1999 Camry. Toyota’s suggested retail price for the Camry LE 4A model was $20,278. Calculate the appropriate residual for this model of Camry.
  5. 1. Compute the R-squared value and discuss what this value means. 2. At a significance level of 0.01, can you conclude that engine horsepower is a good predictor of the price of an automobile?
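
For reference, the quantities asked for in parts 4 and 5 are usually defined as follows in simple linear regression. The notation is an assumption, not taken from the exercise: $b_0$ and $b_1$ are the fitted intercept and slope, $n$ is the number of cars with usable data, and the test of the slope uses $n - 2$ degrees of freedom.

```latex
% Residual for one observation (actual minus predicted price)
e_i = y_i - \hat{y}_i, \qquad \hat{y}_i = b_0 + b_1 x_i

% Coefficient of determination
R^2 = 1 - \frac{SSE}{SST}, \qquad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2

% Test of H_0: \beta_1 = 0 against H_A: \beta_1 \neq 0
t = \frac{b_1}{s_{b_1}}, \qquad
s_{b_1} = \frac{s_e}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}, \qquad
s_e = \sqrt{\frac{SSE}{n - 2}}
```

With a single predictor, $R^2$ equals the square of the correlation coefficient from part 2, which gives a quick consistency check.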

4.

A 1998 article in Fortune magazine titled "The 100 Best Companies to Work for in America" (January 12, 1998) contained data on the 100 companies. See the attached file Best Companies.

Three variables of interest are the revenue of each company, the number of hours of training per year per employee, and the number of employees. (Note: You will need to omit companies with data marked N.A. before completing the analysis.)

  1. Compute the linear regression equation based on the sample data if the revenue of each company is to be used to predict the number of hours of training per year per employee.
  2. Would you feel comfortable using the revenue of one of the 100 companies to determine the number of hours of training per year per employee with a simple regression model? Conduct a statistical procedure to answer this question.
  3. Synovus Financial has 8,827 employees. Predict the number of hours of training per year per employee for Synovus.
  4. Referring to part 3, develop and interpret a 90% prediction interval for the average training hours per employee for companies with 8,827 employees.
  5. Referring to part 4, what is a 90% prediction interval for average training hours per employee for companies with 40,000 employees? Compare this interval with the one computed in part 4 and discuss why the widths of the two are different.
  6. Referring to parts 4 and 5, at what number of employees would the width of the 90% prediction interval for average training hours be minimized?
  7. Referring to parts 4 and 5, develop and interpret a 90% prediction interval for the actual training hours per employee for Synovus.
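
The interval questions above turn on how the interval width depends on the chosen x value. A minimal sketch of the standard half-width formulas is given below, assuming a simple linear regression with the number of employees as x. The function name is illustrative, and the critical t value (two-tailed, n - 2 degrees of freedom) is assumed to be looked up by the caller rather than computed here.

```python
import math


def interval_half_widths(x, y, x0, t_crit):
    """Half-widths at x0 of (mean-response interval, individual-value interval).

    t_crit is the two-tailed critical t value for n - 2 degrees of freedom,
    supplied by the caller; this sketch does not compute it.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Standard error of the estimate from the residual sum of squares.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    # This term grows with the squared distance of x0 from the mean of x.
    distance_term = 1 / n + (x0 - mean_x) ** 2 / sxx
    mean_half = t_crit * s_e * math.sqrt(distance_term)
    individual_half = t_crit * s_e * math.sqrt(1 + distance_term)
    return mean_half, individual_half
```

Because the only part of either formula that depends on x0 is the squared distance from the sample mean of x, both intervals are narrowest at that mean and widen as x0 moves away from it. The interval for an individual company is always wider than the interval for the average, because of the extra 1 under the square root.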