Outline for Class Meeting 5 (Chapters 2 & 3, Lohr, 2/1/06)

Model-based sampling; Intro to Ratio Estimation

I.  Model-based sampling

We have discussed the “randomization” or “design-based” theory of estimation for sampling from finite populations. An alternative is the “model-based” theory of estimation.

A. The estimator

In model-based theory, we assume that the population is a realization of a random process model. For an SRS, we assume the following model. Suppose that the population can be thought of as a realization of the independent r.v.'s $Y_1, \ldots, Y_N$, where $E_M[Y_i] = \mu$ and $V_M[Y_i] = \sigma^2$. Let $S$ denote the set of $n$ sampled units. Estimation of

$T = \sum_{i=1}^{N} Y_i$

can be thought of as a problem of predicting

$\sum_{i \notin S} Y_i$,

where we predict the latter sum by $(N-n)\bar{y}$, yielding $\hat{T} = \sum_{i \in S} y_i + (N-n)\bar{y} = N\bar{y}$.

B. Properties of the estimator

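The estimator above is model unbiased, since

$E_M[\hat{T} - T] = N\,E_M[\bar{y}] - \sum_{i=1}^{N} E_M[Y_i] = N\mu - N\mu = 0$.

One can show that the mean square error of $\hat{T}$ is

$E_M[(\hat{T} - T)^2] = N^2 \left(1 - \frac{n}{N}\right) \frac{\sigma^2}{n}$.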

Since $\sigma^2$ can be estimated by the sample variance $s^2$, the standard error here is the same as for the design-based approach.
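The two properties in this section (model unbiasedness, and a mean square error of N^2 (1 - n/N) sigma^2 / n for the estimator N times the sample mean) can be checked with a small simulation. The sketch below is only an illustration: the normal distribution, the values mu = 100 and sigma = 20, the number of replications, and the function name simulate_model_based are all assumptions made for the example, not part of the notes. The model itself only requires independent Y's with common mean and variance.

```python
import random


def simulate_model_based(N=57, n=10, mu=100.0, sigma=20.0, reps=20000, seed=1):
    """Simulate the prediction error T_hat - T under the model.

    Returns (average error, average squared error) over `reps` realizations.
    Normality is an assumption made only for this illustration.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        # One realization of the population Y_1, ..., Y_N.
        y = [rng.gauss(mu, sigma) for _ in range(N)]
        # The Y's are i.i.d. under the model, so any fixed set of n units
        # can play the role of the sample; here we take the first n.
        sample = y[:n]
        t_hat = N * sum(sample) / n      # T_hat = N * ybar
        errors.append(t_hat - sum(y))    # T_hat - T
    mean_err = sum(errors) / reps
    mse = sum(e * e for e in errors) / reps
    return mean_err, mse


N, n, sigma = 57, 10, 20.0
mean_err, mse = simulate_model_based(N=N, n=n, sigma=sigma)
theory_mse = N**2 * (1 - n / N) * sigma**2 / n

print(f"average error (should be near 0): {mean_err:.2f}")
print(f"simulated MSE: {mse:.0f}   formula: {theory_mse:.0f}")
```

The average error should be small relative to the root mean square error, and the simulated MSE should be close to the value given by the formula.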

What about interpretation of confidence intervals?

II.  Ratio Estimation

Ratio estimation is primarily used to improve estimation of means and totals by making use of auxiliary information.

A. Example

Suppose we wanted to estimate the mean number of visitors to the state Capitol for the period Jan 1997 through Sept 2001. Suppose we had selected an SRS of 10 of the 57 months and had counted visitors in those months by standing at the door. Using the random number table in your text, I obtained an SRSWOR of months as 5 7 8 11 13 23 28 34 45 54. This would correspond to May 1997, July 1997, August 1997, etc. The actual count data obtained for those months are shown in the Attachment. Construct a 95% c.i. for the total number of visitors during the period.

Some visitors to the Capitol visit the Capitol Visitors Center (CVC) and sign the guest register. Suppose that the number of visitors to the Capitol Visitors Center during the entire period of interest is available. How might one use this auxiliary information to improve estimation?

B.  Ratio Estimators

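1.  A ratio estimator of the total is $\hat{t}_{yr} = \hat{B}\,t_x$, where $\hat{B} = \bar{y}/\bar{x}$ is the ratio of the sample means and $t_x$ is the known population total of the auxiliary variable.

2.  A ratio estimator of the mean is $\hat{\bar{y}}_r = \hat{B}\,\bar{x}_U$, where $\bar{x}_U = t_x/N$ is the known population mean of the auxiliary variable.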

When would you expect ratio estimators to be good?

Example

Data Display

Row  monthid  numvis
  1        5   51325
  2        7   38742
  3        8   81540
  4       11   62954
  5       13  115832
  6       23   73842
  7       28   66590
  8       34   27932
  9       45   42653
 10       54   33985

Descriptive Statistics

Variable    N   Mean  Median  TrMean  StDev  SE Mean
numvis     10  59540   57140   56454  26575     8404

Variable  Minimum  Maximum     Q1     Q3
numvis      27932   115832  37553  75767
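As a sketch of how the summary statistics above feed into the confidence interval for the total, the following computes N times the sample mean and its standard error with the finite population correction, N * sqrt((1 - n/N) s^2 / n). The multiplier 1.96 is an assumption (normal approximation); with only 10 sampled months one might prefer a t multiplier with 9 degrees of freedom.

```python
import math
import statistics

# Sampled monthly visitor counts from the Data Display.
numvis = [51325, 38742, 81540, 62954, 115832,
          73842, 66590, 27932, 42653, 33985]

N = 57            # months in the population
n = len(numvis)   # months in the sample

ybar = statistics.mean(numvis)
s = statistics.stdev(numvis)   # sample standard deviation (divisor n - 1)

# Estimated total and its standard error with the finite population correction.
t_hat = N * ybar
se_t_hat = N * math.sqrt((1 - n / N) * s**2 / n)

z = 1.96  # assumption: normal approximation for a 95% interval
ci = (t_hat - z * se_t_hat, t_hat + z * se_t_hat)

print(f"ybar = {ybar:.1f}, s = {s:.1f}")
print(f"estimated total = {t_hat:.1f}, SE = {se_t_hat:.1f}")
print(f"approximate 95% CI for the total: ({ci[0]:.0f}, {ci[1]:.0f})")
```

Note that the SE Mean reported in the output above (8404) is s / sqrt(n) without the finite population correction, so the standard error computed here is somewhat smaller than N times that value.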

CVC totals by month, with survey totals for the 10 sampled months. A dash marks an empty cell.

Months 1996 / CVC Totals / Survey Totals / Months 1997 / CVC Totals / Survey Totals / Months 1998 / CVC Totals / Survey Totals
1 JAN / 3,620 / - / 13 JAN / 4,629 / 115,832 / 25 JAN / 3,940 / -
2 FEB / 4,664 / - / 14 FEB / 6,078 / - / 26 FEB / 5,419 / -
3 MAR / 7,789 / - / 15 MAR / 10,753 / - / 27 MAR / 8,674 / -
4 APR / 3,921 / - / 16 APR / 10,845 / - / 28 APR / 9,489 / 66,590
5 MAY / 4,589 / 51,325 / 17 MAY / 10,234 / - / 29 MAY / 10,025 / -
6 JUN / 5,130 / - / 18 JUN / 7,441 / - / 30 JUN / 6,216 / -
7 JUL / 3,921 / 38,742 / 19 JUL / 7,945 / - / 31 JUL / 6,306 / -
8 AUG / 4,589 / 81,540 / 20 AUG / 5,589 / - / 32 AUG / 5,249 / -
9 SEPT / 5,130 / - / 21 SEPT / 4,762 / - / 33 SEPT / 3,091 / -
10 OCT / 4,198 / - / 22 OCT / 5,672 / - / 34 OCT / 4,300 / 27,932
11 NOV / 5,672 / 62,954 / 23 NOV / 6,046 / 73,842 / 35 NOV / 4,582 / -
12 DEC / 4,419 / - / 24 DEC / 5,333 / - / 36 DEC / 2,746 / -
TOTAL / 57,642 / - / TOTAL / 85,327 / - / TOTAL / 70,037 / -

Months 1999 / CVC Totals / Survey Totals / Months 2000 / CVC Totals / Survey Totals
37 JAN / 2,142 / - / 49 JAN / 3,563 / -
38 FEB / 3,914 / - / 50 FEB / 5,144 / -
39 MAR / 9,037 / - / 51 MAR / 8,705 / -
40 APR / 9,859 / - / 52 APR / 10,322 / -
41 MAY / 10,711 / - / 53 MAY / 9,321 / -
42 JUN / 5,754 / - / 54 JUN / 5,072 / 33,985
43 JUL / 7,105 / - / 55 JUL / 6,589 / -
44 AUG / 5,912 / - / 56 AUG / 4,343 / -
45 SEPT / 3,955 / 42,653 / SEPT / - / -
46 OCT / 4,787 / - / OCT / - / -
47 NOV / 5,378 / - / NOV / - / -
48 DEC / 4,495 / - / DEC / - / -
TOTAL / 73,049 / - / TOTAL / 53,059 / -

GRAND TOTAL CVC / 339,114
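The ratio estimators of Section II.B can be sketched with the table above, pairing each sampled month's survey count (y) with its CVC register total (x). The estimated ratio is B_hat = ybar / xbar, the ratio estimate of the total is B_hat * t_x, and the ratio estimate of the mean is B_hat * t_x / N. Two assumptions are made here and should be checked before relying on the numbers: the CVC grand total of 339,114 is used as t_x even though the table shows no CVC figure for the final September, and N = 57 is used for the population mean of x. The function name ratio_estimates is hypothetical.

```python
# Survey counts (y) and CVC register totals (x) for the sampled months,
# keyed by month ID as read from the table.
survey = {5: 51325, 7: 38742, 8: 81540, 11: 62954, 13: 115832,
          23: 73842, 28: 66590, 34: 27932, 45: 42653, 54: 33985}
cvc = {5: 4589, 7: 3921, 8: 4589, 11: 5672, 13: 4629,
       23: 6046, 28: 9489, 34: 4300, 45: 3955, 54: 5072}

t_x = 339114  # assumption: the CVC grand total stands in for the population total of x
N = 57        # assumption: number of months used for the population mean of x


def ratio_estimates(y, x, t_x, N):
    """Return (B_hat, ratio estimate of the total, ratio estimate of the mean)."""
    b_hat = sum(y) / sum(x)        # equals ybar / xbar, since both share n
    t_hat_yr = b_hat * t_x         # ratio estimator of the total
    ybar_r = b_hat * (t_x / N)     # ratio estimator of the mean
    return b_hat, t_hat_yr, ybar_r


months = sorted(survey)
y = [survey[m] for m in months]
x = [cvc[m] for m in months]
b_hat, t_hat_yr, ybar_r = ratio_estimates(y, x, t_x, N)

print(f"B_hat = {b_hat:.4f}")
print(f"ratio estimate of total visitors = {t_hat_yr:.0f}")
print(f"ratio estimate of mean visitors per month = {ybar_r:.0f}")
```

Comparing this total with N times the sample mean of the survey counts shows how much the auxiliary information shifts the estimate; whether it is an improvement depends on how closely the survey counts track the register totals.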