Josh Vojtush

Economics 426

Applied Econometrics

Spring 2014

Possible Points: 20

Homework # 2

Date Due: Wednesday, February 12, 2014

1. A soda vendor at Louisiana State University football games observes that more sodas are sold the warmer the temperature at game time. Based on 32 home games covering 5 years, the vendor estimates the relationship between soda sales and temperature to be:

Sales = -240 + 6 Temperature + e

where Temperature is expressed in degrees Fahrenheit.

  a. Interpret the estimated slope and intercept. Do the estimates make sense? Why or why not? (2 points)

-The intercept is -240, so at 0 degrees the model predicts -240 sodas sold. That makes no sense, since sales cannot be negative; 0 degrees is presumably far below any game-time temperature in the sample, so the intercept should not be read literally. The slope is 6, so each one-degree increase in temperature is associated with 6 more sodas sold, which is plausible.

  b. On a day that the temperature at game time is forecast to be 80 degrees, predict how many sodas the vendor will sell. (1 point)

-Sales = -240 + 6(80) = 240, so the vendor is predicted to sell 240 sodas at 80 degrees.

  c. Below what temperature are the predicted sales zero? (1 point)

-0 = -240 + 6T -> 240 = 6T -> T = 40. Predicted sales are zero at 40 degrees, and at any lower temperature the model predicts “negative sales”.
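
The arithmetic in parts a through c can be checked with a short script. This is only a sketch in Python (the assignment itself uses SAS); the function names are made up for illustration, and the coefficients are the ones given in the problem statement.

```python
# Fitted line from the problem statement: Sales = -240 + 6 * Temperature
INTERCEPT = -240.0
SLOPE = 6.0


def predicted_sales(temperature_f):
    """Predicted soda sales at a given game-time temperature (degrees F)."""
    return INTERCEPT + SLOPE * temperature_f


def zero_sales_temperature():
    """Temperature at which predicted sales are exactly zero."""
    return -INTERCEPT / SLOPE


print(predicted_sales(0))         # the intercept: prediction at 0 degrees
print(predicted_sales(80))        # part b
print(zero_sales_temperature())   # part c
```

Any temperature below the value returned by `zero_sales_temperature()` gives a negative prediction, which is why the fitted line should not be trusted outside the range of temperatures actually observed.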

2. The data file cps_small.csv contains 1,000 observations on hourly wage rates, education, and other variables from the 1997 Current Population Survey. The variables are listed below:

Variable / Definition
Wage / Earnings per hour
Educ / Years of education
Exper / Years of work experience
Female / =1 if female; 0 otherwise
Black / =1 if black, 0 otherwise
White / =1 if white; 0 otherwise
Midwest / =1 if from the Midwest, 0 otherwise
South / =1 if from the South, 0 otherwise
West / =1 if from the West, 0 otherwise

With the help of SAS, answer the following questions:

  a. Attach the SAS output with descriptive statistics of the data set. (1 point)

Simple Statistics
Variable / N / Mean / Std Dev / Sum / Minimum / Maximum
Wage / 1000 / 10.21302 / 6.24664 / 10213 / 2.03000 / 60.19000
Educ / 1000 / 13.28500 / 2.46817 / 13285 / 1.00000 / 18.00000
Exper / 1000 / 18.78000 / 11.31882 / 18780 / 0 / 52.00000
Female / 1000 / 0.49400 / 0.50021 / 494.00000 / 0 / 1.00000
Black / 1000 / 0.08800 / 0.28344 / 88.00000 / 0 / 1.00000
White / 1000 / 0.91200 / 0.28344 / 912.00000 / 0 / 1.00000
Midwest / 1000 / 0.23700 / 0.42546 / 237.00000 / 0 / 1.00000
South / 1000 / 0.31500 / 0.46475 / 315.00000 / 0 / 1.00000
West / 1000 / 0.22200 / 0.41580 / 222.00000 / 0 / 1.00000
Pearson Correlation Coefficients, N = 1000
Prob > |r| under H0: Rho=0 (shown in parentheses after each coefficient)
Variable / Wage / Educ / Exper / Female / Black / White / Midwest / South / West
Wage / 1.00000 / 0.44985 (<.0001) / 0.14928 (<.0001) / -0.21275 (<.0001) / -0.09722 (0.0021) / 0.09722 (0.0021) / 0.01616 (0.6097) / -0.11177 (0.0004) / -0.00269 (0.9324)
Educ / 0.44985 (<.0001) / 1.00000 / -0.18232 (<.0001) / -0.02334 (0.4609) / -0.05020 (0.1127) / 0.05020 (0.1127) / -0.02149 (0.4972) / -0.04605 (0.1456) / -0.03635 (0.2508)
Exper / 0.14928 (<.0001) / -0.18232 (<.0001) / 1.00000 / 0.00896 (0.7772) / 0.00136 (0.9657) / -0.00136 (0.9657) / 0.05678 (0.0727) / -0.05113 (0.1061) / 0.01294 (0.6828)
Female / -0.21275 (<.0001) / -0.02334 (0.4609) / 0.00896 (0.7772) / 1.00000 / 0.03197 (0.3125) / -0.03197 (0.3125) / -0.05681 (0.0725) / 0.07057 (0.0256) / 0.01122 (0.7230)
Black / -0.09722 (0.0021) / -0.05020 (0.1127) / 0.00136 (0.9657) / 0.03197 (0.3125) / 1.00000 / -1.00000 (<.0001) / -0.04031 (0.2028) / 0.19970 (<.0001) / -0.14045 (<.0001)
White / 0.09722 (0.0021) / 0.05020 (0.1127) / -0.00136 (0.9657) / -0.03197 (0.3125) / -1.00000 (<.0001) / 1.00000 / 0.04031 (0.2028) / -0.19970 (<.0001) / 0.14045 (<.0001)
Midwest / 0.01616 (0.6097) / -0.02149 (0.4972) / 0.05678 (0.0727) / -0.05681 (0.0725) / -0.04031 (0.2028) / 0.04031 (0.2028) / 1.00000 / -0.37794 (<.0001) / -0.29771 (<.0001)
South / -0.11177 (0.0004) / -0.04605 (0.1456) / -0.05113 (0.1061) / 0.07057 (0.0256) / 0.19970 (<.0001) / -0.19970 (<.0001) / -0.37794 (<.0001) / 1.00000 / -0.36224 (<.0001)
West / -0.00269 (0.9324) / -0.03635 (0.2508) / 0.01294 (0.6828) / 0.01122 (0.7230) / -0.14045 (<.0001) / 0.14045 (<.0001) / -0.29771 (<.0001) / -0.36224 (<.0001) / 1.00000

b. Estimate the following linear regression and discuss the results for the education variable parameter estimate:

Wage = β0 + β1 Educ + e

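-Wage = -4.91 + 1.14 Educ

-Each additional year of education is associated with an hourly wage about $1.14 higher, and the estimate is statistically significant (t = 15.91, p < .0001). However, education explains only about 20% of the variation in wages (R-Square = 0.2024), leaving about 80% unexplained, so the model has limited explanatory power.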

The REG Procedure

Model: MODEL1

Dependent Variable: Wage

Number of Observations Read / 1000
Number of Observations Used / 1000
Analysis of Variance
Source / DF / Sum of Squares / Mean Square / F Value / Pr > F
Model / 1 / 7888.51140 / 7888.51140 / 253.20 / <.0001
Error / 998 / 31093 / 31.15530
Corrected Total / 999 / 38981
Root MSE / 5.58169 / R-Square / 0.2024
Dependent Mean / 10.21302 / Adj R-Sq / 0.2016
Coeff Var / 54.65272
Parameter Estimates
Variable / DF / Parameter Estimate / Standard Error / t Value / Pr > |t|
Intercept / 1 / -4.91218 / 0.96679 / -5.08 / <.0001
Educ / 1 / 1.13852 / 0.07155 / 15.91 / <.0001

(2 points)
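
As a cross-check on the regression output, the simple-regression estimates can be recovered from the descriptive statistics and the Wage-Educ correlation reported in part a. The sketch below is in Python rather than SAS and relies on the standard one-regressor identities (slope = r × sd(y)/sd(x), intercept = mean(y) − slope × mean(x), R-Square = r²). All numbers are copied from the SAS output, so small differences from the printed estimates are rounding.

```python
# Values copied from the SAS output above (PROC CORR and PROC REG).
mean_wage, sd_wage = 10.21302, 6.24664
mean_educ, sd_educ = 13.28500, 2.46817
r_wage_educ = 0.44985
se_slope = 0.07155  # standard error of the Educ coefficient

# One-regressor OLS identities.
slope = r_wage_educ * sd_wage / sd_educ
intercept = mean_wage - slope * mean_educ
r_squared = r_wage_educ ** 2   # with a single regressor, R-Square = r squared
t_value = slope / se_slope

print(f"slope     = {slope:.5f}")
print(f"intercept = {intercept:.5f}")
print(f"R-Square  = {r_squared:.4f}")
print(f"t value   = {t_value:.2f}")
```

The printed values should agree with the Parameter Estimates table (1.13852, -4.91218, 0.2024, 15.91) up to rounding in the inputs.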

c. Compute the least squares residuals and plot them against Educ. Attach the output to this assignment. Discuss any patterns that are evident from this plot. (2 points)

Hint: Following the model statement in SAS you can plot the residuals using the following statement:

Plot r.*Educ;

-The residual plot shows that as education increases, the residuals spread farther from 0 in both directions. The error variance appears to grow with education, which is a sign of heteroskedasticity.
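
The fanning-out pattern can be illustrated numerically. The sketch below does not use the CPS data: it generates synthetic data in which the error spread is assumed to grow with education, fits a one-regressor least squares line by hand, and compares the residual spread for low and high education groups. The parameter values and the cutoff at 12 years are arbitrary choices made only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Synthetic data (NOT the CPS sample): error spread is proportional to educ.
educ = [random.randint(8, 18) for _ in range(2000)]
wage = [-5.0 + 1.1 * e + random.gauss(0.0, 0.4 * e) for e in educ]

# Ordinary least squares with one regressor.
mean_x, mean_y = statistics.fmean(educ), statistics.fmean(wage)
sxx = sum((x - mean_x) ** 2 for x in educ)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(educ, wage))
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [y - (intercept + slope * x) for x, y in zip(educ, wage)]

# Compare the residual spread for low and high education groups.
low = [r for x, r in zip(educ, residuals) if x <= 12]
high = [r for x, r in zip(educ, residuals) if x > 12]
spread_low = statistics.pstdev(low)
spread_high = statistics.pstdev(high)

print(f"residual SD, educ <= 12: {spread_low:.2f}")
print(f"residual SD, educ  > 12: {spread_high:.2f}")
```

A larger residual standard deviation in the high-education group is the numerical counterpart of the widening band seen in the plot.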

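  d. Add experience (Exper) as an additional regressor to the model you estimated in “b” above. Interpret the parameter estimates for the education and experience variables in a way that your non-economist boss could understand. (2 points)

-Holding experience constant, each additional year of education is associated with about $1.25 more in hourly wages. Holding education constant, each additional year of experience is associated with about $0.13 more per hour. Both estimates are statistically significant (p < .0001). As background from the correlation table, wages and education are moderately positively correlated (0.45), the positive correlation between wages and experience is weak (0.15), and there is a weak negative correlation between experience and education (-0.18). A positive correlation means the two variables tend to move in the same direction, whereas a negative correlation means they tend to move in opposite directions.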

The REG Procedure

Model: MODEL1

Dependent Variable: Wage

Number of Observations Read / 1000
Number of Observations Used / 1000
Analysis of Variance
Source / DF / Sum of Squares / Mean Square / F Value / Pr > F
Model / 2 / 10046 / 5022.82440 / 173.06 / <.0001
Error / 997 / 28936 / 29.02292
Corrected Total / 999 / 38981
Root MSE / 5.38729 / R-Square / 0.2577
Dependent Mean / 10.21302 / Adj R-Sq / 0.2562
Coeff Var / 52.74926
Parameter Estimates
Variable / DF / Parameter Estimate / Standard Error / t Value / Pr > |t|
Intercept / 1 / -8.85844 / 1.03934 / -8.52 / <.0001
Educ / 1 / 1.24891 / 0.07023 / 17.78 / <.0001
Exper / 1 / 0.13204 / 0.01532 / 8.62 / <.0001
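
To make the two-regressor estimates concrete, the sketch below (Python, not part of the SAS output) plugs in a hypothetical worker with 16 years of education and 10 years of experience, then changes one variable at a time. It also checks why the Educ coefficient rose from 1.13852 to 1.24891 when Exper was added, using the in-sample identity: short slope = long slope + (Exper coefficient) × (slope from regressing Exper on Educ). The example worker is invented, and the other numbers are copied from the output above.

```python
# Coefficients copied from the two-regressor PROC REG output above.
B0, B_EDUC, B_EXPER = -8.85844, 1.24891, 0.13204


def predicted_wage(educ_years, exper_years):
    """Predicted hourly wage from Wage = b0 + b1*Educ + b2*Exper."""
    return B0 + B_EDUC * educ_years + B_EXPER * exper_years


# Hypothetical worker: 16 years of education, 10 years of experience.
base = predicted_wage(16, 10)
print(f"predicted wage: {base:.2f}")
# One more year of education, experience held fixed.
print(f"extra year of education:  {predicted_wage(17, 10) - base:.2f}")
# One more year of experience, education held fixed.
print(f"extra year of experience: {predicted_wage(16, 11) - base:.2f}")

# Link to the one-regressor model: Exper and Educ are negatively correlated,
# so leaving Exper out pulls the Educ coefficient down.
r_exper_educ, sd_exper, sd_educ = -0.18232, 11.31882, 2.46817
delta = r_exper_educ * sd_exper / sd_educ   # slope of Exper on Educ
short_slope = B_EDUC + B_EXPER * delta
print(f"implied one-regressor Educ slope: {short_slope:.4f}")
```

The implied one-regressor slope should come out close to the 1.13852 reported in part b, with any gap due to rounding in the printed statistics.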

3. Please answer question 6 (all parts) at the end of Chapter 2 in the Studenmund text (pp. 60-61).

(6 points)

4. Suppose that average worker productivity at manufacturing firms (avgprod) depends on two factors, average hours of training (avgtrain) and average worker ability (avgabil):

avgprod = β0 + β1 avgtrain + β2 avgabil + u

Suppose that workers with lower ability tend to need more training. What, then, is the consequence of omitting avgabil from the RHS, for the estimate of the coefficient on avgtrain? Explain fully the basis for your answer. (3 points)

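-Omitting avgabil from the RHS omits a confounding factor, so the estimated coefficient on avgtrain suffers from omitted variable bias. The direction of the bias depends on two things: the effect of the omitted variable on the dependent variable, and its correlation with the included regressor. Higher ability should raise productivity (β2 > 0), and because lower-ability workers tend to need more training, avgabil and avgtrain are negatively correlated. The bias is therefore negative: the estimated coefficient on avgtrain is biased downward and understates the true effect of training on productivity.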
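
A small simulation can illustrate the direction of the bias. The sketch below is in Python, and every number in it is an assumption chosen for illustration: the "true" coefficients, the strength of the negative link between ability and training, and the sample size. It regresses avgprod on avgtrain alone and compares the result with the true β1 and with the bias term β2 × δ, where δ is the slope from regressing avgabil on avgtrain.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical "true" parameters; BETA2 > 0 means ability raises productivity.
BETA0, BETA1, BETA2 = 10.0, 0.5, 2.0

n = 5000
avgabil = [random.gauss(0.0, 1.0) for _ in range(n)]
# Lower ability -> more training, so avgtrain and avgabil move oppositely.
avgtrain = [20.0 - 3.0 * a + random.gauss(0.0, 2.0) for a in avgabil]
avgprod = [BETA0 + BETA1 * t + BETA2 * a + random.gauss(0.0, 1.0)
           for t, a in zip(avgtrain, avgabil)]


def simple_slope(x, y):
    """OLS slope from regressing y on x alone (with an intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


short_slope = simple_slope(avgtrain, avgprod)   # avgabil omitted
delta = simple_slope(avgtrain, avgabil)         # avgabil regressed on avgtrain

print(f"true BETA1:                    {BETA1}")
print(f"estimate with avgabil omitted: {short_slope:.3f}")
print(f"bias implied by BETA2 * delta: {BETA2 * delta:.3f}")
```

Under these assumptions the estimate with avgabil omitted falls below the true β1, and the shortfall is close to β2 × δ, matching the downward bias argued above.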

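/* Copy the imported data set into a working data set. */
data one;
  set work.csv;
run;

/* Simple regression of wage on education, with residuals plotted against educ. */
proc reg;
  model wage=educ;
  symbol2 value=plus color=red;
  plot r.*educ;
run;

/* Add experience as a second regressor. */
proc reg;
  model wage=educ exper;
run;

/* Descriptive statistics and correlations. */
proc means;
run;

proc corr;
run;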