Chapter 3: Empirical Tools of Public Finance

Introduction

Empirical public finance is the use of data and statistical methodologies to measure the impact of government policy on individuals and markets.

For example, it helps us figure out the magnitude of the labor supply response to a TANF benefit cut.

The key issue in empirical public finance is separating causation from correlation.

Correlated means that two economic variables move together.

Causal means that one of the variables is causing the movement in the other.

This lesson overviews the kinds of methods that economists rely on to learn about the causal effects of government policy.

DISTINCTION BETWEEN CORRELATION AND CAUSATION

There are many examples where causation and correlation get confused.

It is critical for government policy to understand the difference; otherwise policy may not have the intended impact.

One interesting example is about Russian peasants.

There was a cholera epidemic. Government sent doctors to the worst-affected areas to help.

Peasants observed that in areas with lots of doctors, there was lots of cholera.

Peasants concluded doctors were making things worse.

Based on this insight, they murdered the doctors.

Another example concerns SAT preparation courses.

In 1988, Harvard interviewed its freshmen and found those who took SAT “coaching” courses scored 63 points lower than those who did not.

One dean concluded that the SAT courses were unhelpful and “the coaching industry is playing on parental anxiety.”

The Problem

In both examples, there is a common problem: an attempt to interpret a correlation as a causal relationship, without sufficient thought to the underlying data generating process.
For any correlation between two variables A and B, there are several possible explanations:

1. A is causing B (A → B).

2. B is causing A (B → A).

3. Some other factor X is causing both (X → A and X → B).

4. Some other factors are causing A and B separately (X → A and Y → B).

Ex1: In the case of the Russian peasants, the possibilities might be:

(1) Doctors cause peasants to die from cholera through incompetent treatment.

more doctors → more peasant deaths

(2) A higher incidence of death caused more physicians to be present.

more peasant deaths → more doctors

Peasants thought possibility (1) was correct.

Ex2: In the Harvard SAT case, the possibilities could be:

(1) SAT prep courses worsen preparation for the SATs.

(2) Those with poorer test-taking ability take prep courses to try to catch up; that is, students who would otherwise score poorly are the ones who take the prep courses.

poor performers → take SAT prep

(3) Those who are generally nervous both like to take prep courses and do the worst on standardized exams.

nervousness → taking SAT prep

nervousness → perform poorly

The Harvard dean thought possibility (1) was correct.
Although the peasants or the Harvard dean could actually be correct, odds are they are misinterpreting the underlying process at work.

For policy purposes, what we care about is causation.
Knowing only that two factors are correlated does not tell you what will happen to one of them when policy changes the other.

MEASURING CAUSATION WITH THE DATA

WE’D LIKE TO HAVE: RANDOMIZED TRIALS

The “gold standard” of causality is a randomized trial.
The trial proceeds by taking a group of volunteers and randomly assigning them to either a “treatment” group that gets the intervention, or a “control” (or comparison) group that is denied the intervention.
With random assignment, the assignment of the intervention is not determined by anything about the subjects.
As a result, the treatment group is identical to the control group in every facet but one: the treatment group gets the intervention.
In the SAT example,
the “treatment” group members are those who took the coaching course;
the “control” group members are those who did not.
In the Russian peasant example,
the “treatment” group were communities where doctors were assigned,
the “control” group were communities where doctors were not assigned.

The Problem of Bias

In both cases, the assignment of the intervention was not random.
This means the treatment and control groups are not identical.
Non-random assignment, in turn, could cause bias.
Bias represents any source of difference between treatment and control groups that is correlated with the treatment, but not due to the treatment.
In the SAT example, the impact of SAT courses is biased by the fact that those who take the prep course are likely to do worse on the SAT for other reasons.
In the Russian peasant example, the estimates are biased by the fact that the government assigned doctors to the worst-off communities.
By definition, such differences do not exist in a randomized trial, since the groups are not different in any consistent fashion.
As a result, randomized trials have no bias, and it is for this reason they are the “gold standard” for empirically estimating causal effects.
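The contrast between self-selection and random assignment can be sketched with a small simulation of the SAT example. Everything in it is an assumption made for illustration: the 20-point true effect of coaching, the rule that weaker test-takers are more likely to enroll, and the score scale are all hypothetical, not estimates from the Harvard data.

```python
import random
from statistics import mean

TRUE_EFFECT = 20  # hypothetical: coaching raises every score by 20 points


def simulate(n=20000, seed=0):
    """Return (observational gap, randomized gap) in mean SAT scores."""
    rng = random.Random(seed)
    obs = {True: [], False: []}  # students choose whether to take the course
    rct = {True: [], False: []}  # a coin flip decides who takes the course
    for _ in range(n):
        ability = rng.gauss(0, 1)  # not observed by the analyst
        baseline = 1200 + 100 * ability + rng.gauss(0, 30)

        # Self-selection: weaker test-takers are more likely to enroll.
        chooses_course = ability + rng.gauss(0, 1) < 0
        obs[chooses_course].append(baseline + TRUE_EFFECT * chooses_course)

        # Random assignment: enrollment is unrelated to ability.
        assigned = rng.random() < 0.5
        rct[assigned].append(baseline + TRUE_EFFECT * assigned)

    return (mean(obs[True]) - mean(obs[False]),
            mean(rct[True]) - mean(rct[False]))


observational_gap, randomized_gap = simulate()
print(f"observational gap (course takers minus others): {observational_gap:.1f}")
print(f"randomized gap (treatment minus control):       {randomized_gap:.1f}")
```

Under these assumptions the observational gap comes out negative even though the course helps every student, while the randomized gap lands close to the assumed true effect.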

Randomized Trials in the TANF Context

In the last lesson, we learned that economic theory predicts increases in labor supply when TANF benefits are cut, but the magnitude of the effect is unclear.
One could design a randomized trial to learn about the elasticity of employment with respect to TANF benefits.

Imagine a large group (say, 2000) of single mothers were randomly assigned to one of two groups with a coin flip:
The “control” group continues to receive a guarantee of $5,000.
The “treatment” group now has their TANF benefit cut to $3,000.
The groups are then followed for a period of time, and their work effort is measured.
In an experiment like this in California in 1992, the elasticity of employment with respect to welfare benefits was estimated to be -0.67.
Thus, a 10% decrease in benefits resulted in a 6.7% increase in employment.
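The arithmetic behind this statement is just elasticity times percent change. A minimal sketch follows; the function name is made up for illustration, and applying the California estimate to the hypothetical $5,000 to $3,000 cut is only a rough extrapolation, since a point elasticity is a local approximation.

```python
def implied_pct_change(elasticity, pct_change_in_benefits):
    """Percent change in employment implied by a constant elasticity."""
    return elasticity * pct_change_in_benefits


CALIFORNIA_ELASTICITY = -0.67  # estimate quoted in the notes

# A 10% benefit cut: about a 6.7% increase in employment.
print(implied_pct_change(CALIFORNIA_ELASTICITY, -10))

# Illustrative only: the hypothetical trial cuts the guarantee from $5,000 to $3,000.
pct_cut = 100 * (3000 - 5000) / 5000  # a 40% cut
print(implied_pct_change(CALIFORNIA_ELASTICITY, pct_cut))
```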

Why We Need to Go Beyond Randomized Trials

Randomized trials present some problems:
They can be expensive.
They can take a long time to complete.
They may raise ethical issues (especially in the context of medical treatments).
The inferences from them may not generalize to the population as a whole (they may be valid only for the kind of people who volunteer).
Subjects may drop out of the experiment for non-random reasons, a problem known as attrition.

For these reasons (especially the first one about randomized trials being expensive), economists often take different approaches to try to assess causal relationships in empirical research.

ESTIMATING CAUSATION WITH THE DATA

WE ACTUALLY GET: OBSERVATIONAL DATA

Often researchers are faced with observational data, data generated from individual behavior observed in the real world.
For example, the data from the SAT example would consist of data on which Harvard freshmen took the coaching course, along with their SAT scores.

The goal of this section is to review the kinds of approaches researchers use to estimate causal effects with data like this.
There are five main approaches:
Time series analysis
Cross-sectional regression analysis
Quasi-experiments
Structural modeling
Panel data

Time Series Analysis

Time series analysis documents the correlation between the variables of interest over time.
For example, one could gather data over time on the TANF income guarantee, and compare that to the labor supply of single mothers over time.
Figure 1 illustrates these trends.

Figure 1 reveals that real benefits have declined dramatically over time, while average hours have risen substantially.
This apparently supports the theory that TANF benefit cuts should increase labor supply.

Problems:

Two sub-periods (1968-1976 and 1978-1983) show a negative effect on labor supply, or zero effect.
This highlights the difficulty that, when there is a slow-moving trend (such as the benefit decline), it is very hard to infer its causal effect on another variable.
There are many potential explanations for the changes, too, such as:
Greater acceptance of women in workplace.
Better child care options.
Changes in social norms about working.
Other government programs, such as the Earned Income Tax Credit (EITC).
The EITC is a federal wage subsidy to low-income people.
Economic growth.

Thus, in many cases time series analysis is not all that useful.
But if there are sharp changes in a policy variable over time, then there may be some room for valid inference.
Cigarette price war in April 1993.
Tobacco settlement in 1998.
Figure 2 illustrates these trends.

Cross-Sectional Regression Analysis

Cross-sectional regression analysis is a statistical method for assessing the relationship between two variables while holding other factors constant.
“Cross-sectional” means comparing many individuals at one point in time.
Bivariate regression is a means of quantifying the extent to which two series covary.
For example, Figure 3 shows this with TANF benefit cuts and labor supply.

Figure 3

Regression analysis takes the correlation further by finding the line that best fits the data.
It describes the relationship between the dependent variable (in this case, labor supply), and the independent variables (in this case, TANF benefits).
With 2 data points, the line fits perfectly.

In real data, there will be many more individuals.
For example, the Current Population Survey (CPS) collects information on sources of income, hours of work, and health insurance.

Figure 4 graphs hours of work per year on the y-axis for all single mothers in the CPS data set against TANF benefits on the x-axis.

The “best fitting” line has a slope of -110.
This can be converted into an elasticity. The fitted line is L = a + b ln G, so:
b = dL/d ln G ≈ ∆L/%∆G, and therefore b/L = (∆L/L)/%∆G = %∆L/%∆G
elasticity = %∆L/%∆G = b/L
Average work effort is L = 748 hours per year, and b = -110.
A 100% rise in TANF benefits is associated with a roughly 15% reduction in hours worked: -110/748 = -0.147.
Thus the elasticity of work with respect to benefits is -0.15, a fairly inelastic response.
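The conversion can be checked numerically. This is a sketch that takes the slope and mean from the notes at face value; the 10% benefit increase used for the comparison is an arbitrary illustrative choice.

```python
import math

SLOPE_B = -110    # hours per unit of ln(benefit), from the fitted line
MEAN_HOURS = 748  # average hours of work per year

# Elasticity evaluated at the mean: b / L
elasticity = SLOPE_B / MEAN_HOURS
print(round(elasticity, 3))  # -0.147

# Cross-check with the fitted line itself: the change in hours predicted by
# L = a + b ln G when benefits rise by 10%.
delta_hours = SLOPE_B * math.log(1.10)
pct_change_hours = 100 * delta_hours / MEAN_HOURS
print(round(delta_hours, 1), round(pct_change_hours, 2))
```

The percent change from the fitted line is close to, but not exactly, 10 times the elasticity, because the elasticity formula is a local approximation.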

Technical

This line corresponds to the regression:

HOURS_i = α + β TANF_i + ε_i

where there is one observation for each mother i. In the regression, α is the constant term, β is the slope coefficient, and ε is the error term.
ε represents the difference for each observation between its actual value and its predicted value on the regression line.
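As a sketch of how the "best fitting" line is found, the function below computes the least-squares constant, slope, and residuals for one independent variable. The data points are invented for illustration and are not CPS values.

```python
from statistics import mean


def ols_fit(x, y):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta, residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Each residual is the actual value minus the value predicted by the line.
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    return alpha, beta, residuals


# Two hypothetical mothers: with two data points the line fits perfectly.
alpha2, beta2, resid2 = ols_fit([3000, 5000], [900, 500])
print(alpha2, beta2, resid2)

# Four hypothetical mothers: the line no longer passes through every point.
alpha4, beta4, resid4 = ols_fit([2000, 3000, 4000, 5000], [1100, 800, 900, 500])
print(alpha4, beta4, resid4)
```

With more than two points the residuals are generally non-zero, but they sum to zero whenever a constant term is included.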

Interpreting the results is potentially problematic.
One interpretation is that higher TANF benefits “cause” lower labor supply.
Another interpretation is that single mothers with a greater “taste” for leisure receive higher TANF benefits because of the implicit tax rate: benefits are reduced as earnings rise.

- These people wouldn’t work much even if TANF were not available.

- Working mothers automatically get lower benefits.

=> A higher taste for leisure leads to higher TANF benefits, and not the other way around.

Figure 3, for example, showed that the relationship may not be causal. Instead, preferences may differ.
This is less obvious in Figure 4, since we do not know the underlying utility functions of the CPS respondents.

One advantage of regression analysis is the ability to include control variables, that is, other independent variables that may affect the dependent variable (in addition to TANF benefits).
Control variables are included to account for differences between treatment and control groups that can lead to bias.

- For example, we might be able to include a control variable for “taste for leisure.”

- Including these control variables allows us to reduce the systematic differences between different groups.

In reality, however, the kinds of control variables that exist in typical data sets like the CPS are crude at best.
“Taste for leisure” might be proxied with education, number of kids, age, etc.

Technical

Adding control variables changes the regression:

HOURS_i = α + β TANF_i + δ CONTROL_i + ε_i

where the control variables account for race, education, age, and location.
As the Appendix shows, many of these control variables are “yes/no” indicator variables.
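A small constructed example shows what a control variable does to the estimated slope. The data are invented: hours are generated from an assumed rule with a true benefit effect of -0.1 hours per dollar, and "taste for leisure" is treated as an observed yes/no indicator, which real data sets do not provide. The slope with the control is computed by first removing the part of each variable explained by the control (the Frisch-Waugh-Lovell result), which is a substitute for running a full multiple regression.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope from regressing y on x (with a constant)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


def residualize(v, control):
    """Part of v left over after regressing it on the control variable."""
    b = slope(control, v)
    a = mean(v) - b * mean(control)
    return [vi - (a + b * ci) for vi, ci in zip(v, control)]


# Hypothetical data: mothers with a high taste for leisure (taste = 1)
# both receive higher benefits and work fewer hours.
taste = [0, 0, 0, 1, 1, 1]
benefits = [2000, 3000, 4000, 4000, 5000, 6000]
hours = [1500 - 0.1 * g - 300 * t for g, t in zip(benefits, taste)]

naive_slope = slope(benefits, hours)
controlled_slope = slope(residualize(benefits, taste), residualize(hours, taste))
print(round(naive_slope, 3), round(controlled_slope, 3))
```

In this constructed case the slope without the control overstates the effect of benefits, and the slope with the control recovers the assumed -0.1. With crude proxies instead of the true taste variable, only part of the bias would be removed.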

In most applications, including this one, it is unlikely that control variables will ever completely solve the problem.
Thus, it’s difficult to get rid of the bias totally.
Economists typically cannot set up randomized trials for many public policy discussions. Yet, the time-series and cross-sectional approaches are often unsatisfactory.

Quasi (or natural)-Experiments

A quasi-experiment treats a policy reform itself as an experiment and tries to find a naturally occurring comparison group that can mimic the properties of the control group in a properly designed experiment.
Quasi-experiments are changes in the economic environment that create roughly identical treatment and control groups for studying the effect of that environmental change.
This allows researchers to take advantage of randomization created by external forces: outside forces do the randomization for us. In some cases, the situation happens naturally.
Suppose, for example, that Arkansas cut its TANF benefit by 20% in 1997, and that we had a large sample of single mothers in Arkansas in 1996 and 1998.
At the same time, imagine that Louisiana’s benefits remained unchanged.

In principle, the alteration in the states’ policies has essentially performed our randomization for us.
The women in Arkansas who experienced the decrease in benefits are the treatment group.
The women in Louisiana whose benefits were unchanged are the control group.
By computing the change in labor supply across these groups, and then examining the difference between treatment (Arkansas) and control (Louisiana), we can obtain an estimate of the impact of benefits on labor supply that is free from bias.

Imagine we simply studied single mothers in Arkansas alone.
In this “experiment,” single mothers in 1996 are the control group, and those in 1998 are the treatment group.

HOURS_{AR,1998} - HOURS_{AR,1996} = Treatment effect + Bias from economic boom

In practice, this comparison runs into the same criticisms that confront time series analysis.
For example, the national economy was growing exceptionally fast during this period, so the change in hours contains both the treatment effect and the bias from the economic boom.

To fix this problem, we compare the treatment group for whom the policy changed to a control group for whom it did not.

Single mothers in Louisiana did not experience the TANF cut, yet benefited from the growth in the economy.
The treatment group for whom the policy changed: Arkansas

HOURS_{AR,1998} - HOURS_{AR,1996} = Treatment effect + Bias from economic boom

The control group for whom the policy did not change: Louisiana

HOURS_{LA,1998} - HOURS_{LA,1996} = Bias from economic boom

By subtracting the change in hours of work in Louisiana from that in Arkansas, we control for the bias caused by the economic boom.

(HOURS_{AR,1998} - HOURS_{AR,1996}) - (HOURS_{LA,1998} - HOURS_{LA,1996}) = Treatment effect

This assumes that, in the absence of the reform, hours of work of single mothers in Arkansas would have changed by the same amount as in Louisiana. The method relies on two important assumptions:
Common time effects across groups
No systematic composition changes in each group

We examine single mothers in the neighboring state of Louisiana, in the bottom panel of Table 1.

In Arkansas, while benefits fell by 20%, hours of work increased by 20%.

HOURS_{AR,1998} - HOURS_{AR,1996} = (1200 - 1000) = 200 (20% rise)

=> elasticity of labor supply w.r.t. benefit levels = 20%/(-20%) = -1.
This is larger in magnitude than the -0.67 elasticity estimate found in the randomized trial in California.

There is likely to be bias in this “first-difference,” because there was major economic growth during this period.
Thus, single mothers in Arkansas may have increased their work effort even if TANF benefits had not fallen.

This approach yields the difference-in-difference (DID) estimator: the difference between the change in outcomes for the treatment group that experiences an intervention and the change for a control group that does not.
The difference-in-difference estimator is:

(HOURS_{AR,1998} - HOURS_{AR,1996}) - (HOURS_{LA,1998} - HOURS_{LA,1996}) = (1200 - 1000) - (1100 - 1050) = 150 hr

This nets out the bias from the growing economy.

Thus, the causal effect of TANF benefit cuts would be a 150-hour increase in labor supply.
The implied elasticity of hours w.r.t. welfare benefits = -0.75
elast. = %∆L/%∆G = (150/1000)/(-1000/5000) = 15%/(-20%) = -0.75
This is similar to the California estimate of -0.67.
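The whole calculation can be laid out as a short sketch. It assumes Louisiana's hours were 1,050 in 1996 and 1,100 in 1998 (the ordering consistent with the 150-hour result) and that the Arkansas benefit fell from $5,000 to $4,000 (the 20% cut implied by the elasticity formula); the variable names are illustrative.

```python
# Average annual hours of work for single mothers, by state and year.
HOURS = {
    ("AR", 1996): 1000, ("AR", 1998): 1200,  # treatment state
    ("LA", 1996): 1050, ("LA", 1998): 1100,  # control state (assumed ordering)
}
AR_BENEFIT_1996, AR_BENEFIT_1998 = 5000, 4000  # assumed: a 20% cut


def change(state):
    """Change in average hours between 1996 and 1998 for one state."""
    return HOURS[(state, 1998)] - HOURS[(state, 1996)]


first_difference = change("AR")           # treatment effect + boom bias
control_change = change("LA")             # boom bias only
did = first_difference - control_change   # treatment effect
print(first_difference, control_change, did)  # 200 50 150

pct_benefit = (AR_BENEFIT_1998 - AR_BENEFIT_1996) / AR_BENEFIT_1996
naive_elasticity = (first_difference / HOURS[("AR", 1996)]) / pct_benefit
did_elasticity = (did / HOURS[("AR", 1996)]) / pct_benefit
print(round(naive_elasticity, 2), round(did_elasticity, 2))
```

The first-difference elasticity is larger in magnitude than the DID elasticity because it attributes the boom-driven rise in hours to the benefit cut.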

Note that a cross-sectional comparison of the two states in 1998 would suggest that the reduction in welfare benefits leads to only a 100-hour increase in work (1200 - 1100).

Technical

The analogous difference-in-difference regression would be:

H_{ijt} = α + β1 B_j + β2 Y_t + β3 (B_j × Y_t) + ε_{ijt}

In this case, H is hours of work, B is a dummy for the state with a TANF benefit cut, and Y is a dummy for post-1997.
The coefficient β3 represents the causal effect of the policy change.
The subscripts i, j, and t indicate individual i in state j in time period t.
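With two dummies and their interaction, a regression of this kind has one coefficient per state-year cell, so its fitted values equal the four cell averages. The sketch below uses that fact to back out the coefficients from the cell averages rather than running a regression on individual data. It assumes the constant is written as α and that Louisiana's average hours were 1,050 in 1996 and 1,100 in 1998.

```python
# Average hours by cell, keyed by (B, Y):
# B = 1 for the state with the benefit cut (Arkansas), Y = 1 for post-1997.
CELL_MEANS = {
    (0, 0): 1050,  # Louisiana, 1996 (assumed)
    (0, 1): 1100,  # Louisiana, 1998 (assumed)
    (1, 0): 1000,  # Arkansas, 1996
    (1, 1): 1200,  # Arkansas, 1998
}

alpha = CELL_MEANS[(0, 0)]                          # control state, before
beta1 = CELL_MEANS[(1, 0)] - alpha                  # pre-existing state gap
beta2 = CELL_MEANS[(0, 1)] - alpha                  # common time effect
beta3 = CELL_MEANS[(1, 1)] - alpha - beta1 - beta2  # DID estimate


def predicted_hours(b, y):
    """Fitted value from H = alpha + beta1*B + beta2*Y + beta3*(B*Y)."""
    return alpha + beta1 * b + beta2 * y + beta3 * b * y


print(alpha, beta1, beta2, beta3)  # 1050 -50 50 150
```

The interaction coefficient reproduces the 150-hour difference-in-difference estimate, while the other coefficients absorb the pre-existing gap between the states and the common time effect.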

Problems with quasi-experimental analysis

It is possible that the two states experienced different time effects.

- For example, they may have had different economic growth rates (this problem can be addressed with a difference-in-difference-in-difference (DDD) estimator; see the HW problem “A question in Quasi Experiment”).

- The economic boom can affect the two regions differently.

More generally, single mothers may be different across states.
We can never be completely certain that we have purged the treatment-control comparisons of bias.

Structural Modeling

Both randomized trials and quasi-experiments suffer from two drawbacks:

1) They only provide an estimate of the causal impact of a particular treatment. It is difficult to extrapolate beyond the changes in policy.

2) The approaches often do not tell us why the outcomes change. For example, the approaches do not separate out income and substitution effects in the TANF example.

The former approaches provide reduced-form estimates only, which leaves much of the economic theory in a “black box.”
Structural estimation attempts to estimate the underlying parameters of the utility function.
This allows a more thorough exploration of economic responses.
These structural models are more difficult to estimate, and tend to rely on the same set of limited information as the reduced form models.
Structural models essentially assume an explicit form to the utility function, but if the form is incorrect, the policy conclusions could be wrong.
The approach of the text will be to rely on reduced-form estimates because they are easier to think about and explain.