IR 611

Professor Graham

Spring 2014

Homework 2: Merging and T-tests

As with Homework 1, you must post a working .do file to the course Dropbox folder. Do Part 1 in one .do file and Part 2 in a separate .do file. For Part 2, annotate the file very clearly so I can see where you did each step. To hand in the homework, post both .do files to Dropbox (don’t post the data).

Part 1:

Step 1: Take two datasets and merge them together. You can use the WDI data and the additional data that you downloaded for homework 1, or you can download and merge-prep one or two new datasets. I encourage you to work with data you might use in the future.

Step 2: To what degree do these datasets overlap? In a comment in the .do file, note the overlap (or lack thereof) for both units and time periods (e.g. countries and years if you’re working with country-year data).

Step 3: In addition to observations that lack coverage entirely in one of these datasets, are there variables that have missing values? Write me another note as a comment describing the pattern of missingness for the variables of interest.
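
As a rough guide, Steps 1 through 3 might look something like the sketch below. It is only an illustration: the two toy datasets, the variable names (iso3, year, gdppc, commonlaw), and all values are made up, and your own data will need its own key variables and merge type. Run it as a single .do file so the temporary file name stays defined.

```stata
* Hypothetical toy data standing in for your first source (made-up values)
clear
input str3 iso3 int year double gdppc
"USA" 2009 47000
"USA" 2010 48000
"FRA" 2009 .
"FRA" 2010 40000
end
* Confirm that country-year uniquely identifies each row before merging
isid iso3 year
tempfile econ
save `econ'

* Hypothetical toy data standing in for your second source
clear
input str3 iso3 int year byte commonlaw
"USA" 2010 1
"FRA" 2010 0
"GBR" 2010 1
end
isid iso3 year

* Step 1: one-to-one merge on country-year
merge 1:1 iso3 year using `econ'

* Step 2: _merge records which dataset each row came from
* (1 = master only, 2 = using only, 3 = matched in both)
tabulate _merge
tabulate iso3 _merge
tabulate year _merge

* Step 3: count missing values in the variables of interest
misstable summarize gdppc commonlaw
```

The two cross-tabulations against _merge show which countries and which years appear in only one source, which is the information to summarize in your overlap comment.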

Part 2:

Step 1: Choose an independent variable and a dependent variable in your data between which there is plausibly a causal relationship.

Step 2: What is the universe of cases over which you have coverage for both of these variables (e.g. years, countries)? How large is the missing data problem?

Step 3: Are there any nonsensical values or suspicious outliers recorded for these variables? How did you check?
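
One possible way to check is sketched below. It uses the auto dataset that ships with Stata purely as a stand-in for your own data, and the 99th-percentile cutoff and the "price must be positive" rule are assumptions chosen for illustration, not required thresholds.

```stata
* Stand-in data: the auto dataset installed with Stata
sysuse auto, clear

* Detailed summary: percentiles, the smallest and largest values, skewness
summarize price, detail

* List observations above the 99th percentile for a closer look
list make price if price > r(p99) & !missing(price)

* Count values that cannot be right (here, a non-positive price)
count if price <= 0
```

What counts as nonsensical depends on the variable, so substitute a rule that fits your own measure (for example, a share outside 0 to 100).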

Step 4: Is the dependent variable close to normally distributed? If not, does it make sense to log it? (See Gujarati and Porter, pp. 159-166.)
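
A minimal sketch of this check follows, again using Stata's built-in auto dataset as a stand-in. The variable name ln_price is made up for the example, and the skewness-kurtosis test is just one of several reasonable diagnostics; looking at the histogram is often more informative than the test alone.

```stata
* Stand-in data: the auto dataset installed with Stata
sysuse auto, clear

* Histogram with a normal density overlaid for visual comparison
histogram price, normal

* Skewness and kurtosis test of normality
sktest price

* Natural log of the variable (undefined for zero or negative values)
generate ln_price = ln(price)

* Repeat the checks on the logged variable
histogram ln_price, normal
sktest ln_price
```

Because ln() returns missing for zero or negative inputs, check how many observations you lose before committing to the logged version.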

Step 5: If your independent variable is not currently binary, create a binary version of the variable. This gives you your “treatment” and “control” groups.
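
One common approach, sketched here under the assumption that a median split is appropriate for your variable, is shown below. The auto dataset and the names high_mpg and high_mpg_lbl are stand-ins for illustration; a substantively meaningful cutoff is usually preferable to the median if you have one.

```stata
* Stand-in data: the auto dataset installed with Stata
sysuse auto, clear

* Store the median in r(p50)
summarize mpg, detail

* 1 if above the median, 0 otherwise. The if condition matters:
* Stata treats missing as larger than any number, so without it
* missing values would be coded as 1
generate byte high_mpg = mpg > r(p50) if !missing(mpg)

* Label the groups so output is readable
label define high_mpg_lbl 0 "At or below median" 1 "Above median"
label values high_mpg high_mpg_lbl

* Check group sizes, including any missing values
tabulate high_mpg, missing
```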

Step 6: Read through the UCLA page on t-tests.

Step 7: Run a t-test to see if your DV takes on different values in your two groups. Write a 2-3 sentence comment into your .do file giving me the substantive interpretation of this result (what do we learn from the test?).
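
The t-test itself can be sketched as follows, using Stata's built-in auto dataset as a stand-in, with price playing the role of the DV and foreign the binary grouping variable. Whether to assume equal variances is a judgment call for your own data; both versions are shown.

```stata
* Stand-in data: the auto dataset installed with Stata
sysuse auto, clear

* Two-sample t-test of the DV across the two groups (equal variances assumed)
ttest price, by(foreign)

* Same test without assuming equal variances
ttest price, by(foreign) unequal

* The two-sided p-value from the most recent test is stored in r(p)
display r(p)
```

The output reports the mean of the DV in each group, the difference between them, and p-values for the one-sided and two-sided alternatives. Your comment should say what the size and sign of the difference mean substantively, not just whether it is significant.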

Example comments to put in your .do file (with made-up numbers):

/********************************

For Part 2 of Homework 2, I will be examining the relationship between GDP per capita and a country’s judicial system (common law vs. civil law).

**********************************/

command

another command

/**************************

Step 2: The data on GDP per capita cover 184 countries from 1960-2011. The data on common law vs. civil law cover 122 countries for 1975-2010. There are 120 countries with coverage for both variables.

***************************/