Economics 3111 Assignment 3: Wage Patterns in the Labour Force Survey

Due Date: November 10, 2014

Part 1: Distribution of Hourly Wages and a Profile of Low-Wage Workers

Digbe, Jean / Quebec
Grecica, Jason / British Columbia
Groulx, Kyle G / Ontario
Lamb, Cory / Quebec
Martineau, Celine / British Columbia
Matis, Janessa / Ontario
Mercado, Marvin / British Columbia
Nanayakkara, P. / Alberta
Purtell, Pierre-Luc / Ontario
Vanderwey, E / Quebec

This question looks at the distribution of hourly wages for the assigned province or region. Use the Labour Force Survey (LFS) datafile for June 2014 from assignments 1 and 2. As before, see the LFS codebook for variable definitions. To make things easier, you do not need to weight your data in this assignment.

Use Gretl's smpl command to restrict your sample to people who have data on hourly earnings (HRLYEARN>0) and are in the province assigned to you.
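A minimal sketch of this restriction, assuming the LFS file is already open. The province variable name PROV and the code 35 for Ontario are assumptions, not taken from this handout; confirm both in the LFS codebook before using them.

```gretl
# Assumption: the province variable is called PROV and Ontario is coded 35
# (check the codebook and substitute the code for your assigned province)

# Clear any restriction left over from earlier work
smpl full

# Keep people with wage data who live in the assigned province
smpl HRLYEARN > 0 && PROV == 35 --restrict
```

Restrictions made with --restrict accumulate, so "smpl full" is the way back to the complete dataset if you need to start over.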

(a) Use the variable HRLYEARN to look at the distribution of hourly wage rates. Gretl's “freq” command is your best option. In assignment 2 you used this command to look at the frequency of responses to the question on why a person's last job ended. A complication here is that there are thousands of possible values for hourly earnings. When a variable has many possible values Gretl groups observations into "bins", where a bin is defined over some range of values and the size of the range is the same in each bin. When calculating frequencies Gretl's default is to allow for only 30 or so bins or categories, which doesn't give you much detail! To get a better idea of what the wage distribution looks like, set the number of bins to 500 when you issue your "freq" command:

freq HRLYEARN --nbins=500 --show-plot

This will allow for 500 equal-size wage range categories and will display a graph of the resulting distribution (your output file will also contain a table of frequencies which you can use to answer part (b)). Save or copy the graph and include it as part of your answer to part (a).

Now look at your graph of the hourly earnings distribution and describe its shape.

(b) Based on your results in (a) answer the following:

- What is the most common hourly earnings range in your frequency table and what percent share of people in your sample are in that range? (the percentage share in each category is in the column labelled "rel.")

- What are the next three most common ranges?

- The median value of hourly earnings has 50% of observations above it and 50% below it. What earnings range contains the median? (See the last column of your frequency table: this is the cumulative share, i.e. the share of people who have wages either in the range you are looking at or below it.)

- The median is located at the 50th percentile of the distribution (50% of observations are below it). The 10th percentile has 10% of observations below it and the 90th percentile has 90% of observations below it. Report the earnings range for your sample at the 10th and 90th percentiles.

- Differences in earnings or ratios between earnings at various percentiles are often used as measures of inequality. Calculate the following differences:

(i) the wage at the 90th percentile minus the wage at the 10th percentile (90-10 differential -- a measure of overall spread);

(ii) the wage at the 90th percentile minus the wage at the 50th percentile (90-50 differential -- a measure of upper-tail spread); and

(iii) the wage at the 50th percentile minus the wage at the 10th percentile (50-10 differential -- a measure of lower-tail spread).

In calculating these differences use the midpoint of the hourly earnings range that contains the percentile of interest, e.g. if the 50th percentile is in the $20.70-$21.00 range use (20.70+21.00)/2=$20.85 as the wage at the 50th percentile (the second column of the freq table reports midpoints). The 90-50 and 50-10 differentials for a symmetric distribution would be about the same. Is your distribution symmetric? If it is not symmetric, is the upper or lower tail longer?
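As an optional cross-check on the midpoint method, Gretl's quantile() function returns percentiles directly from the restricted sample. This is only a sketch: the scalar names p10, p50 and p90 are illustrative, and the values may differ slightly from the bin midpoints read off the freq table.

```gretl
# Percentiles computed directly from the data
# (a cross-check on the freq table, not a substitute for it)
scalar p10 = quantile(HRLYEARN, 0.10)
scalar p50 = quantile(HRLYEARN, 0.50)
scalar p90 = quantile(HRLYEARN, 0.90)

# The three differentials asked for in part (b)
printf "90-10 differential: %.2f\n", p90 - p10
printf "90-50 differential: %.2f\n", p90 - p50
printf "50-10 differential: %.2f\n", p50 - p10
```

If the 90-50 figure is clearly larger than the 50-10 figure, the upper tail is the longer one.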

(c) The "general" minimum wage applies to most workers in a given province (several provinces have a few additional (usually lower) minimum wage rates for specified groups but we will ignore them below). The general minimum wages in June 2014 for the three largest provinces were as follows:

Quebec: $10.35
British Columbia: $10.25
Ontario: $11.00

(i) Use the "summary" command to calculate the average wage for your province or region. How high is the minimum wage compared to the average wage in your province? How high is the minimum wage compared to the median wage (either use the median from (b) or the median reported by the summary command)?

(ii) Generate a dummy variable that equals 1 if a person's hourly earnings are exactly equal to the minimum wage. How many people in your province report earning the minimum wage? What percentage share of workers is this?

(iii) Does the minimum wage seem to alter the shape of the wage distribution? How? (Refer to your diagram in (a).)
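The calculations in (c)(i) and (c)(ii) could be sketched as follows. The example uses Quebec's rate from the table above; the names MW and ATMIN are made up for illustration, and the sample is assumed to be already restricted to your province.

```gretl
# Quebec's June 2014 general minimum wage; substitute your own province's rate
scalar MW = 10.35

# (i) minimum wage relative to the mean and to the median wage
printf "minimum/mean   = %.3f\n", MW / mean(HRLYEARN)
printf "minimum/median = %.3f\n", MW / median(HRLYEARN)

# (ii) dummy equal to 1 for people paid exactly the minimum wage
series ATMIN = (HRLYEARN == MW)

# The sum of a dummy is a count; its mean is a share
printf "count = %g, share = %.2f%%\n", sum(ATMIN), 100 * mean(ATMIN)
```

The summary command reports the same mean and median, so either route can be used for (i).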

(d) Define another dummy, call it M10, that equals 1 if the person's wage is at or between the minimum wage and the minimum wage plus 10%, e.g. if the minimum wage in your province was $9.00, M10 equals 1 for everyone earning from $9.00 to $9.90. In that case the generate command for 'M10' would be as follows:

genr M10=(HRLYEARN>=9.00 & HRLYEARN<=9.90)

For this next step restrict your sample to people whose HRLYEARN is at or above the minimum wage for your assigned province (otherwise your M10=0 observations will mix people with very low wages in with people with higher wages).

Use Gretl's 'xtab' command to generate the figures you need to get some idea of who these low-wage workers (M10=1 observations) are and how they differ from higher-wage workers (people with M10=0). In particular, look at how the two samples differ by sex (SEX), age (AGE_12), education (EDUC90), industry (NAICS_18), union status (UNION) and firm size (FIRMSIZE).

For example if you wanted to see how the two samples differ by marital status you would use:

xtab M10 MARSTAT --row

This will give a table where the first row gives the percentage breakdown of the M10=0 sample (higher-wage workers) across the possible values of MARSTAT and the second row gives the breakdown of the M10=1 sample (low-wage workers) across values of MARSTAT. Do this for each of the variables mentioned above.
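The sample restriction and the six cross-tabulations can also be run in one pass. This is only a sketch: it assumes M10 has already been generated, reuses the hypothetical $9.00 minimum wage from the example above, and the list name CHARS is made up.

```gretl
# Drop workers paid less than the (hypothetical) $9.00 minimum wage
smpl HRLYEARN >= 9.00 --restrict

# The six characteristics named in the question
list CHARS = SEX AGE_12 EDUC90 NAICS_18 UNION FIRMSIZE

# One row-percentage table per characteristic
loop foreach i CHARS
    xtab M10 $i --row
endloop
```

Running the xtab commands one at a time gives the same tables; the loop only saves typing.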

Provide a summary table of your results and a written description of how the low- and higher-wage groups differ. In your description highlight which categories of each characteristic are most common in the higher-wage group and which are most common in the low-wage group.

Part 2: Estimation of an hourly wage regression

As in Part 1, use the sample of people with HRLYEARN>0 from your assigned province.

(a) In your Gretl program define the following dummy variables:

- a sex dummy (=1 if a woman): use SEX.

- six separate Age dummies: for age 15-24, age 25-34, age 35-44, age 45-54, age 55-64 and age 65+ (use LFS variable AGE_12).

- four separate Education dummies: for Less than High school (0-8 years plus Some secondary), High school graduate (Grade 11-13), Post-secondary not university degree (Some post-secondary plus Post-secondary certificate or diploma) and University degree (Bachelor’s or Graduate) (use LFS variable EDUC90).

- A union status dummy (=1 if the person is a union member or is not a union member but is covered by a collective agreement, =0 otherwise): use UNION

- A public sector dummy (use the variable COWMAIN in the codebook to define this dummy: define it to equal 1 if a public employee and 0 otherwise).

- Four firm size dummies: for ‘Less than 20’ employees, ‘20-99’ employees, ‘100-500’ employees and ‘more than 500’ employees (use FIRMSIZE).

Report the means for your dummy variables on your assigned sample.
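The pattern for building these dummies might look like the sketch below, which covers only a few of them. Every numeric code and every dummy name here is an assumption made for illustration; confirm the coding of SEX, AGE_12 and UNION in the LFS codebook before relying on any of it.

```gretl
# All numeric codes below are assumptions; confirm each one in the LFS codebook

# Sex: assumed 1 = male, 2 = female
series FEMALE = (SEX == 2)

# Age: assumed AGE_12 runs in five-year groups (1 = 15-19, 2 = 20-24, 3 = 25-29, ...)
series AGE1524 = (AGE_12 <= 2)
series AGE2534 = (AGE_12 == 3 || AGE_12 == 4)

# Union: assumed 1 = member, 2 = covered non-member, 3 = not unionized
series UNIONCOV = (UNION == 1 || UNION == 2)

# The mean of a dummy is the share of the sample in that group
summary FEMALE AGE1524 AGE2534 UNIONCOV --simple
```

The remaining age, education, public sector and firm size dummies follow the same pattern, one logical condition per dummy.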

(b) Estimate an hourly wage regression equation (go back and look at your assignment 2 regression -- as in assignment 2 include a constant term in your regression). The dependent variable in your regression is: HRLYEARN. The explanatory variables are the dummy variables defined in the previous section.

Note: as mentioned in the set of course notes, and in the last assignment, when a set of dummy variables describes the same characteristic one of the dummies in the set must be used as a default group (i.e. left out of the regression). In your regression this issue arises for the sex, age, education and firm size dummies. To deal with this problem choose men, age 15-24, “Less than High School” and firm size ‘Less than 20’ employees as your default groups. Once this is done, the coefficients on the remaining dummies in the set measure the effect of having the characteristic described by that dummy rather than the characteristic of the default group, e.g. the coefficient on “University degree” will measure the effect on hourly earnings of having a university degree rather than having the default characteristic “Less than High School”.
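With the default groups left out, the regression command would take roughly the following shape. The dummy names are hypothetical and must match whatever names you gave your own dummies in part (a).

```gretl
# Dummy names are hypothetical. The default groups (men, age 15-24,
# less than high school, fewer than 20 employees) are deliberately left out.
ols HRLYEARN const FEMALE AGE2534 AGE3544 AGE4554 AGE5564 AGE65UP \
    EDHS EDPOSTSEC EDUNIV UNIONCOV PUBLIC FIRM2099 FIRM100500 FIRM500UP
```

The union and public sector dummies each stand alone, so they enter as they are; their default groups are simply the people for whom the dummy equals 0.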

Report your regression output and based upon it answer the following questions:

(i) How do wages vary by age and education according to your regression?

(ii) What is the difference between the pay of men and women? Is the difference statistically significant? What is the effect of being unionized (technically of being a union member or of being non-union but covered by a collective agreement) on wages? What is the effect of being a public employee?

(iii) How do wages vary by firm size?

(c) Use the regression result to predict the average hourly wage rate of a woman, aged 45-54, who has a university degree, is a union member, works for a large employer (>500 employees) and is in the public sector. Show your work.
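To check a hand calculation, the prediction can be rebuilt from the stored coefficients straight after the ols command. This is a sketch under assumed names: every dummy name below is hypothetical and should be replaced with the names used in your own regression.

```gretl
# Run immediately after the ols command; dummy names are hypothetical.
# Prediction = constant + the coefficient on each dummy that equals 1 for this person.
scalar pred = $coeff(const) + $coeff(FEMALE) + $coeff(AGE4554) \
    + $coeff(EDUNIV) + $coeff(UNIONCOV) + $coeff(FIRM500UP) + $coeff(PUBLIC)
printf "Predicted hourly wage: %.2f\n", pred
```

Dummies that equal 0 for this person contribute nothing, which is why only seven terms appear in the sum.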

(d) What does labour demand theory suggest about how wages are determined? Do the results of your wage regression seem consistent with labour demand theory? Explain.

Part 3:

Chapter 5, Problem 1, pp. 170-171 (for (b) assume all costs are labour costs when calculating profit).

Chapter 6, Problem 2, pp. 189-190 (note that VMP is what we have been calling MRP).