Outline for Class Meeting 2 (1/23/06)
Design-Based Estimation and the Horvitz-Thompson Estimator
I. An unbiased estimator for any design
A. If you know the first-order selection probabilities $\pi_i$ for any design, you can construct an unbiased estimator of the population total $t$ as
$\hat{t}_{HT} = \sum_{i \in S} \frac{y_i}{\pi_i}.$
This is called a Horvitz-Thompson estimator. To prove that this estimator is unbiased, notice that $\hat{t}_{HT}$ can be rewritten as
$\hat{t}_{HT} = \sum_{i=1}^{N} Z_i \frac{y_i}{\pi_i},$
where $Z_i = 1$ if unit $i$ is selected and $Z_i = 0$ otherwise. Note that $Z_i \sim \text{Bernoulli}(\pi_i)$. Then
$E[\hat{t}_{HT}] = \sum_{i=1}^{N} E[Z_i]\,\frac{y_i}{\pi_i} = \sum_{i=1}^{N} \pi_i\,\frac{y_i}{\pi_i} = \sum_{i=1}^{N} y_i = t.$
An unbiased estimator of the population mean is then $\hat{\bar{y}}_{HT} = \hat{t}_{HT}/N$.
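The unbiasedness argument above can be checked numerically by enumerating every possible sample of a small design. A minimal sketch (the population values, and the choice of a simple random sample of n = 2 from N = 4 so that each π_i = n/N, are made-up assumptions for illustration):

```python
# Hypothetical illustration: verify E[t_hat_HT] = t by enumerating
# every sample of a small design (SRS of n = 2 from N = 4, pi_i = n/N).
from itertools import combinations

y = [3.0, 7.0, 1.0, 9.0]          # small made-up population
N, n = len(y), 2
pi = n / N                         # first-order selection probability
t = sum(y)                         # true total (= 20.0 here)

samples = list(combinations(range(N), n))
# H-T estimate for one sample: sum of y_i / pi_i over sampled units
ht = [sum(y[i] / pi for i in s) for s in samples]

# Each SRS sample is equally likely, so the design expectation is the
# plain average of the sample-level estimates.
e_ht = sum(ht) / len(samples)
print(e_ht, t)                     # the two agree exactly
```

Because every sample is enumerated rather than simulated, the check is exact, not approximate.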
B. Weights
1. $1/\pi_i$ is often described as a “weight” attached to the ith unit. Units that have a high (relative) probability of selection are “weighted down” and those with a low (relative) probability of selection are “weighted up”.
2. A good check of your computation of selection probabilities is that the sum of the selection probabilities must equal the sample size (if fixed) or the expected sample size (if random). Why?
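This check can be sketched in a few lines (the values of N, n, and p below are made up). The reason it works: $n = \sum_i Z_i$, so $E[n] = \sum_i E[Z_i] = \sum_i \pi_i$, and for a fixed-size design $E[n] = n$.

```python
# Fixed sample size: SRS without replacement, pi_i = n/N for every unit,
# so the pi_i sum to n exactly.
N, n = 10, 4
pi_srs = [n / N] * N
print(sum(pi_srs))                 # 4.0, the fixed sample size

# Random sample size: Bernoulli design with pi_i = p, so the pi_i
# sum to N*p = E[n].
p = 0.3
pi_bern = [p] * N
print(sum(pi_bern))                # approximately 3.0 = E[n]
```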
C. Example: Bernoulli design.
1. Consider a design in which each unit is selected into the sample independently and with probability p. Then an unbiased estimator of the total is
$\hat{t} = \sum_{i \in S} \frac{y_i}{p} = \frac{1}{p} \sum_{i \in S} y_i.$
2. Is the sample mean an unbiased estimator of population mean?
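A simulation sketch of the Bernoulli design (the population, p, and replication count are arbitrary choices, not from the notes): averaged over many independent draws, the H-T total estimate should be close to the true total t.

```python
# Bernoulli design: each unit enters the sample independently with
# probability p; the H-T estimate for one draw is (1/p) * sum of sampled y.
import random

random.seed(1)
y = [float(i) for i in range(1, 21)]   # made-up population, t = 210
p = 0.4
t = sum(y)

reps, acc = 20000, 0.0
for _ in range(reps):
    sample = [yi for yi in y if random.random() < p]
    acc += sum(sample) / p             # H-T estimate for this draw
print(acc / reps)                       # close to t = 210
```

Note that the realized sample size varies from draw to draw; the estimator is unbiased anyway because it divides by p, not by the realized n.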
II. Variance
A. To find the variance of the estimator above, note that
$\mathrm{Var}(\hat{t}_{HT}) = \sum_{i=1}^{N} \sum_{j=1}^{N} \mathrm{Cov}(Z_i, Z_j)\,\frac{y_i}{\pi_i}\frac{y_j}{\pi_j},$
where $\mathrm{Cov}(Z_i, Z_j) = \pi_i(1-\pi_i)$ when $i = j$ and $\mathrm{Cov}(Z_i, Z_j) = \pi_{ij} - \pi_i \pi_j$ when $i \neq j$. Thus
$\mathrm{Var}(\hat{t}_{HT}) = \sum_{i=1}^{N} \pi_i(1-\pi_i)\frac{y_i^2}{\pi_i^2} + \sum_{i \neq j} (\pi_{ij} - \pi_i \pi_j)\frac{y_i y_j}{\pi_i \pi_j}.$
This means that if the first- and second-order selection probabilities are known, so is the variance of the estimator. That is why it is useful to know how to compute second-order selection probabilities.
B. To unbiasedly estimate the variance of the H-T estimator for designs in which $\pi_{ij} > 0$ for all i, j, just apply the indicator “trick” again:
$\widehat{\mathrm{Var}}(\hat{t}_{HT}) = \sum_{i \in S} (1-\pi_i)\frac{y_i^2}{\pi_i^2} + \sum_{\substack{i \neq j \\ i,j \in S}} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}}\,\frac{y_i y_j}{\pi_i \pi_j}.$
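Both the variance formula and the unbiasedness of its estimator can be verified at once on a small SRS design, where π_i = n/N and π_ij = n(n−1)/(N(N−1)) for i ≠ j: enumerate every sample, compute the exact design variance of the H-T estimator, and check that the variance estimator averages to it. (The population values below are made up.)

```python
# Exact check on SRS of n = 2 from N = 4: pi_i = n/N,
# pi_ij = n(n-1)/(N(N-1)) for i != j.
from itertools import combinations

y = [3.0, 7.0, 1.0, 9.0]
N, n = len(y), 2
pi = n / N
pij = n * (n - 1) / (N * (N - 1))

def var_hat(s):
    """Unbiased H-T variance estimate computed from one sample s."""
    v = sum((1 - pi) * y[i] ** 2 / pi ** 2 for i in s)
    v += sum((pij - pi * pi) / pij * y[i] * y[j] / (pi * pi)
             for i in s for j in s if i != j)
    return v

samples = list(combinations(range(N), n))
ht = [sum(y[i] / pi for i in s) for s in samples]

# Exact design variance: average squared deviation over equally
# likely samples (the design mean equals the true total here).
true_var = sum((e - sum(y)) ** 2 for e in ht) / len(samples)
e_var_hat = sum(var_hat(s) for s in samples) / len(samples)
print(true_var, e_var_hat)         # the two agree
```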
C. Example (cont’d): Bernoulli design
For the Bernoulli design, $\mathrm{Cov}(Z_i, Z_j) = 0$ for $i \neq j$. Thus
$\mathrm{Var}(\hat{t}) = \frac{1-p}{p} \sum_{i=1}^{N} y_i^2$
and
$\widehat{\mathrm{Var}}(\hat{t}) = \frac{1-p}{p^2} \sum_{i \in S} y_i^2.$
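A quick simulation check of the Bernoulli variance formula (the population and p below are arbitrary): the empirical variance of the H-T total estimates should approach $\frac{1-p}{p}\sum_i y_i^2$.

```python
# Under the Bernoulli design Cov(Z_i, Z_j) = 0 for i != j, so
# Var(t_hat) collapses to ((1-p)/p) * sum(y_i^2). Simulation check.
import random

random.seed(2)
y = [float(i) for i in range(1, 11)]   # made-up population
p = 0.5
theory = (1 - p) / p * sum(yi ** 2 for yi in y)   # 385.0 here

reps = 40000
ests = []
for _ in range(reps):
    ests.append(sum(yi for yi in y if random.random() < p) / p)

m = sum(ests) / reps
sim_var = sum((e - m) ** 2 for e in ests) / reps
print(theory, sim_var)                  # close to each other
```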
D. Is the H-T estimator the “best”?
III. The R.N. Example
Compare the H-T estimator with the “expected” estimator of total for the R.N. shift total example.
(a) Is the expected estimator (sample mean) unbiased?
(b) Is the expected estimator ever best?
(c) When does weighting have the most effect?