Stat 921 Notes 8

Reading:

Observational Studies, Chapter 3.2.5, 10.1-10.2

Multivariate matching:

In matching, the first impulse is to try to match each treated subject to a control who appears nearly the same in terms of observed covariates; however, this is quickly seen to be impractical when there are many covariates. For instance, with 20 binary covariates, there are 2^20 = 1,048,576, or about a million, possible types of individuals, so even with thousands of potential controls, it will often be difficult to find a control who matches a treated subject on all 20 covariates.
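To see the scale of the problem, here is a back-of-the-envelope sketch; the control-pool size and the uniform-pattern assumption are illustrative, not from the notes:

```python
# With 20 binary covariates there are 2^20 distinct covariate patterns.
n_patterns = 2 ** 20
print(n_patterns)  # 1048576

# If a control pool of size n drew its patterns uniformly at random,
# the chance that no control matches a given treated subject exactly
# would be (1 - 1/n_patterns)^n -- close to 1 even for a large pool.
n_controls = 10_000
p_no_exact_match = (1 - 1 / n_patterns) ** n_controls
print(round(p_no_exact_match, 3))  # 0.991
```

Real covariate patterns are not uniform, but the calculation conveys why exact matching on all 20 covariates usually fails.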

Randomization produces covariate balance, not perfect matches. Perfect matches are not needed to balance observed covariates. Multivariate matching methods attempt to produce matched pairs or sets that balance observed covariates, so that, in aggregate, the distributions of observed covariates are similar in treated and control groups. Of course, unlike randomization, matching cannot be expected to balance unobserved covariates.

A fundamental tool for constructing matched sets is the propensity score, proposed by Rosenbaum and Rubin (1983, Biometrika) (cited by 2219 papers on scholar.google.com).

I. Propensity Score

The propensity score e(x) is the probability that a person with observed covariates x receives the treatment rather than the control, e(x) = pr(Z = 1 | x).

Suppose an observational study is free of hidden bias. Instead of stratifying or matching exactly on x, imagine forming matched sets in which units in the same matched set have the same chance e(x) of receiving the treatment. Then within a stratum or matched set, units may have different values of x but have the same propensity score e(x). Call this exact matching on the propensity score. In this case, the arguments from Notes 7 about making inferences with exact matches go through without changes: in those arguments, equal x’s were only used to ensure equal probabilities of treatment assignment. In short, in an observational study free of hidden bias, exact matching or stratification on the propensity score yields a conditional distribution of treatment assignments that is the same as in a uniform randomized experiment. In this case, the randomization inference methods may be applied. The same conclusion is reached if the matched sets are formed based on e(x) and parts of x.

In practice, e(x) is unknown and must be estimated.

II. Balancing Property of the Propensity Score

In a randomized trial in which subjects are assigned to treatment or control by the flip of a fair coin, e(x) = 1/2 for all x; therefore, subjects with different patterns of covariates all have the same chance of receiving the treatment, and each possible value of x is as likely to turn up in the treated group as in the control group. Quite often, the published report of a randomized trial includes a table (typically Table 1) documenting that the randomization was effective, i.e., that the treated and control groups were comparable in terms of the distributions of important covariates.

In contrast, in an observational study, some subjects are more likely than others to receive the treatment, so e(x) varies from person to person, and the pattern of covariates x often helps to predict whether a subject will receive the treatment or control. However, suppose that we compare two subjects who have the same chance of receiving the treatment given their observed covariates x, say two subjects with the same value of e(x). These two subjects may be very different in terms of x, but these differences do not help predict which subject is more likely to receive the treatment. Given only the information in the observed covariates x, both subjects have the same probability, namely e(x), of receiving the treatment. So the first subject with his x and the second subject with her perhaps very different x have the same probability of ending up in the treated group; the first subject’s x is as likely to be found in the treated group as the second subject’s x.

This argument shows that if we sample from the population pairs of treated and control subjects whose propensity scores are the same, then, on average, the treated and control subjects’ distributions of observed covariates are the same; that is, on average the treated and control groups are balanced on the observed covariates.
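The balancing argument can be checked in a small simulation; the covariates, propensity values, and population size below are hypothetical, chosen so that two different covariate patterns share the same propensity score:

```python
import random

random.seed(0)

# Two binary covariates; the hypothetical propensity score depends only
# on (x1, x2), and patterns (0,1) and (1,0) share e(x) = 0.5.
def propensity(x1, x2):
    table = {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.8}
    return table[(x1, x2)]

population = []
for _ in range(100_000):
    x1, x2 = random.randint(0, 1), random.randint(0, 1)
    e = propensity(x1, x2)
    z = 1 if random.random() < e else 0
    population.append((x1, x2, e, z))

# Within the stratum with e(x) = 0.5, the two covariate patterns differ
# in x, yet each pattern should be equally common among treated and
# control subjects -- the stratum is balanced on x.
stratum = [p for p in population if p[2] == 0.5]
treated = [p for p in stratum if p[3] == 1]
control = [p for p in stratum if p[3] == 0]
frac_t = sum(p[0] for p in treated) / len(treated)  # share with x1 = 1
frac_c = sum(p[0] for p in control) / len(control)
print(round(frac_t, 2), round(frac_c, 2))  # both near 0.50
```

The two fractions agree even though x1 strongly predicts treatment in the population as a whole.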

Proposition (page 298 of textbook): If e(x) is the propensity score, then

pr(x | Z = 1, e(x) = e) = pr(x | Z = 0, e(x) = e).

Proof: By Bayes’ theorem,

pr(x | Z = 1, e(x) = e) = pr(Z = 1 | x, e(x) = e) pr(x | e(x) = e) / pr(Z = 1 | e(x) = e).

Now, pr(Z = 1 | x, e(x) = e) = e and pr(Z = 1 | e(x) = e) = e, so

pr(x | Z = 1, e(x) = e) = pr(x | e(x) = e).

The same argument with Z = 0 in place of Z = 1 gives pr(x | Z = 0, e(x) = e) = pr(x | e(x) = e), proving the result.

III. Constructing Matched Sets Using Estimated Propensity Scores

Typically the propensity score is unknown and must be estimated. Most commonly the propensity score is estimated using logistic regression of the binary category, treatment/control, on the observed covariates.

Logistic Regression: model the propensity score as log{e(x)/(1 − e(x))} = α + β′x and estimate (α, β) by maximum likelihood.
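As a sketch of what the logistic fit does, the toy example below estimates a one-covariate propensity model by maximizing the average Bernoulli log-likelihood with gradient ascent; the data-generating model and tuning constants are invented for illustration, and in practice one uses R’s glm with family=binomial, as in the session below:

```python
import math
import random

random.seed(1)

def logistic(t):
    return 1 / (1 + math.exp(-t))

# Toy data: a single covariate x in [0, 1]; the (hypothetical) true
# propensity score is e(x) = logistic(-1 + 2x).
data = []
for _ in range(2000):
    x = random.random()
    z = 1 if random.random() < logistic(-1 + 2 * x) else 0
    data.append((x, z))

# Fit logit e(x) = b0 + b1*x by gradient ascent on the average
# log-likelihood -- a pedagogical stand-in for glm(..., family=binomial).
b0, b1, lr = 0.0, 0.0, 4.0
for _ in range(500):
    g0 = sum(z - logistic(b0 + b1 * x) for x, z in data) / len(data)
    g1 = sum((z - logistic(b0 + b1 * x)) * x for x, z in data) / len(data)
    b0, b1 = b0 + lr * g0, b1 + lr * g1

# Estimated propensity score for any x:
ehat = lambda x: logistic(b0 + b1 * x)
print(round(b0, 2), round(b1, 2))  # estimates of the true (-1, 2)
```

The fitted values ehat(x) play the role of the estimated propensity scores used for matching.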

What to include in the propensity score model?

We want to consider any covariates that subject-matter experts judge play a role in the selection of treatments and/or might be associated with the outcome.

Propensity Score Model Building:

(Images from Thomas Love, Using Propensity Scores Effectively.)

The Ten Commandments of propensity score model building:

Thou shalt ignore Commandments 1 through 9… and instead simply ensure that the model adequately balances the covariates.

Following a strategy similar to that suggested by Rosenbaum and Rubin (1984, Journal of the American Statistical Association), we build the propensity score logistic regression model as follows:

1. We use stepwise logistic regression with all covariates, squares of covariates and interactions to build an initial model.

2. We form the matched pairs to minimize the absolute propensity score differences between the matched pairs. The solution to this optimal matching problem can be formulated as finding a minimum cost flow in a certain network, a problem that has been extensively studied and for which good algorithms exist (Rosenbaum, Observational Studies, Chapter 10). An algorithm is implemented in the optmatch package in R that was developed by Ben Hansen.

3. We examine the standardized differences, on the matched sample, of all covariates, their squares, and their interactions; for any variable whose absolute standardized difference exceeds 0.10, we add that variable to the logistic regression model.
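The standardized difference used in step 3 can be computed as below; one common definition is the difference in means divided by the square root of the average of the two sample variances, and the sample values here are hypothetical:

```python
import math

def standardized_difference(x_treated, x_control):
    """|mean_t - mean_c| / sqrt((var_t + var_c) / 2), a scale-free
    measure of covariate imbalance between the two groups."""
    mt = sum(x_treated) / len(x_treated)
    mc = sum(x_control) / len(x_control)
    vt = sum((x - mt) ** 2 for x in x_treated) / (len(x_treated) - 1)
    vc = sum((x - mc) ** 2 for x in x_control) / (len(x_control) - 1)
    return abs(mt - mc) / math.sqrt((vt + vc) / 2)

# Hypothetical matched-sample values of one covariate:
d = standardized_difference([10, 12, 11, 13], [10, 11, 12, 12])
print(round(d, 2))  # 0.22 -- would trigger the 0.10 rule in step 3
```

In step 3 this quantity would be computed for every covariate, square, and interaction in the matched sample.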

For job training data:

The covariates are age, education, black, Hispanic, married, nodegree, earnings74 and earnings75.

Squared terms are labeled age.sq, education.sq, etc.

Interaction terms are labeled age.education, age.black, education.black, etc.

# Load Job training data

treated.table=read.table("nsw_treated_earn74.txt",header=TRUE);

control.table=read.table("psid_controls.txt",header=TRUE);

jobtraining=c(treated.table[,1],control.table[,1]);

age=c(treated.table[,2],control.table[,2]);

education=c(treated.table[,3],control.table[,3]);

black=c(treated.table[,4],control.table[,4]);

hispanic=c(treated.table[,5],control.table[,5]);

married=c(treated.table[,6],control.table[,6]);

nodegree=c(treated.table[,7],control.table[,7]);

earnings74=c(treated.table[,8],control.table[,8]);

earnings75=c(treated.table[,9],control.table[,9]);

earnings78=c(treated.table[,10],control.table[,10]);

age.sq=age^2;

education.sq=education^2;

earnings74.sq=earnings74^2;

earnings75.sq=earnings75^2;

age.education=age*education;

age.black=age*black;

age.hispanic=age*hispanic;

age.married=age*married;

age.nodegree=age*nodegree;

age.earnings74=age*earnings74;

age.earnings75=age*earnings75;

education.black=education*black;

education.hispanic=education*hispanic;

education.married=education*married;

education.nodegree=education*nodegree;

education.earnings74=education*earnings74;

education.earnings75=education*earnings75;

black.married=black*married;

black.nodegree=black*nodegree;

black.earnings74=black*earnings74;

black.earnings75=black*earnings75;

hispanic.married=hispanic*married;

hispanic.nodegree=hispanic*nodegree;

hispanic.earnings74=hispanic*earnings74;

hispanic.earnings75=hispanic*earnings75;

married.nodegree=married*nodegree;

married.earnings74=married*earnings74;

married.earnings75=married*earnings75;

nodegree.earnings74=nodegree*earnings74;

nodegree.earnings75=nodegree*earnings75;

earnings74.earnings75=earnings74*earnings75;

Xmat=cbind(age,education,black,hispanic,married,nodegree,earnings74,earnings75,age.sq,education.sq,earnings74.sq,earnings75.sq,age.education,age.black,age.hispanic,age.married,age.nodegree,age.earnings74,age.earnings75,education.black,education.hispanic,education.married,education.nodegree,education.earnings74,education.earnings75,black.married,black.nodegree,black.earnings74,black.earnings75,hispanic.married,hispanic.nodegree,hispanic.earnings74,hispanic.earnings75,married.nodegree,married.earnings74,married.earnings75,nodegree.earnings74,nodegree.earnings75,earnings74.earnings75);

# Load Design and optmatch libraries

library(Design);

library(optmatch);

# Model that uses all main effects, squared main effects

# and interactions

firstmodel=glmD(jobtraining~Xmat,family=binomial);

# Stepwise logistic regression

stepmodel=fastbw(firstmodel);

Factors in Final Model

[1] education black hispanic

[4] married nodegree earnings74

[7] age.education age.hispanic age.married

[10] age.nodegree education.hispanic education.married

[13] education.nodegree black.married black.nodegree

[16] hispanic.married hispanic.nodegree hispanic.earnings74

[19] married.earnings74

# Use terms in stepwise model

secondmodel=glm(jobtraining~education+black+hispanic+married+nodegree+earnings74+age.education+age.hispanic+age.married+age.nodegree+education.hispanic+education.married+education.nodegree+black.married+black.nodegree+hispanic.married+hispanic.nodegree+hispanic.earnings74+married.earnings74,family=binomial);

summary(secondmodel)

Call:

glm(formula = jobtraining ~ education + black + hispanic + married +

nodegree + earnings74 + age.education + age.hispanic + age.married +

age.nodegree + education.hispanic + education.married + education.nodegree +

black.married + black.nodegree + hispanic.married + hispanic.nodegree +

hispanic.earnings74 + married.earnings74, family = binomial)

Deviance Residuals:

Min 1Q Median 3Q Max

-2.023982 -0.097931 -0.019898 -0.004055 4.466233

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) 3.273e+00 2.390e+00 1.370 0.170841

education -4.145e-01 1.897e-01 -2.185 0.028908 *

black 1.957e+00 5.159e-01 3.793 0.000149 ***

hispanic 3.993e+01 2.585e+01 1.545 0.122338

married -6.712e-01 2.044e+00 -0.328 0.742575

nodegree -2.544e+00 2.696e+00 -0.944 0.345420

earnings74 -2.689e-04 3.476e-05 -7.734 1.04e-14 ***

age.education 2.899e-03 2.114e-03 1.371 0.170333

age.hispanic -5.733e-01 4.331e-01 -1.324 0.185533

age.married -5.413e-02 3.058e-02 -1.770 0.076683 .

age.nodegree -7.458e-02 2.408e-02 -3.097 0.001955 **

education.hispanic -1.845e+00 1.197e+00 -1.541 0.123396

education.married -6.841e-02 1.203e-01 -0.569 0.569457

education.nodegree 4.580e-01 1.983e-01 2.310 0.020887 *

black.married 1.349e+00 7.497e-01 1.800 0.071895 .

black.nodegree -3.572e-01 6.538e-01 -0.546 0.584812

hispanic.married -4.916e-01 2.039e+00 -0.241 0.809506

hispanic.nodegree -3.752e+00 2.585e+00 -1.451 0.146726

hispanic.earnings74 -2.550e-04 2.731e-04 -0.934 0.350410

married.earnings74 1.824e-05 5.491e-05 0.332 0.739688

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1345.30 on 2674 degrees of freedom

Residual deviance: 491.96 on 2655 degrees of freedom

AIC: 531.96

Number of Fisher Scoring iterations: 10

Next topic: Matching using the estimated propensity score.
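As a preview, step 2 above noted that optimal pair matching minimizes the total absolute difference in estimated propensity scores between matched pairs. The tiny example below brute-forces the assignment; the propensity scores are made-up numbers, and at realistic scale the equivalent minimum-cost network flow problem is solved by software such as the optmatch package:

```python
from itertools import permutations

# Hypothetical estimated propensity scores (not from the job training data).
treated = [0.8, 0.6, 0.3]
controls = [0.75, 0.55, 0.35, 0.1]

# Optimal pair matching: choose one distinct control per treated subject
# to minimize the total |e_hat(treated) - e_hat(control)|. For a tiny
# problem we can enumerate every assignment.
best_cost, best_match = min(
    (sum(abs(t - c) for t, c in zip(treated, perm)), perm)
    for perm in permutations(controls, len(treated))
)
print(best_match)            # (0.75, 0.55, 0.35)
print(round(best_cost, 2))   # 0.15
```

Enumeration is exponential in the number of subjects, which is why the network-flow formulation mentioned in step 2 matters in practice.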
