STAT 3507 Midterm B

Time: 2 hours    March 2006

SHOW YOUR SOLUTIONS. Answer in the words of the problem.

1. A chain of department stores is interested in estimating the proportion of accounts receivable that are delinquent. The chain consists of 2 stores in different parts of the country. For convenience, stratified random sampling is used with each store as a stratum. The results are shown below:

Stratum / Stratum Size / Sample Size / Sample Proportion Delinquent
1 / 100 / 50 / 0.80
2 / 80 / 20 / 0.50

a) Find an estimate for the proportion of delinquent accounts for the chain and give an approximate 95% confidence interval for your estimate.

b) For a future survey, it is desired to estimate the proportion of delinquent accounts to within 0.1 with approximate 99% confidence. If the costs of sampling are c1 = 4 and c2 = 1, find the approximate sample size and allocation under optimal allocation.

c) If Neyman allocation had been used in part (b), would the results still have been optimal? Why or why not?

d) Why might a survey designer decide to use proportional allocation in this situation?
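The arithmetic behind part (a) can be sketched as follows. This is only an illustrative setup, not an official solution: it assumes the last column of the table is the sample proportion of delinquent accounts in each stratum, uses the usual stratified estimator with a finite population correction, and takes 1.96 as the 95% multiplier (some texts use 2). All variable names are hypothetical.

```python
from math import sqrt

# Stratum data: (stratum size N_h, sample size n_h, sample proportion p_h).
# Reading the last table column as the sample proportion is an assumption.
strata = [(100, 50, 0.80), (80, 20, 0.50)]

N = sum(N_h for N_h, _, _ in strata)

# Stratified estimate: weight each stratum proportion by N_h / N.
p_st = sum(N_h * p_h for N_h, _, p_h in strata) / N

# Estimated variance, with a finite population correction in each stratum.
var_p_st = sum(
    (N_h / N) ** 2 * (1 - n_h / N_h) * p_h * (1 - p_h) / (n_h - 1)
    for N_h, n_h, p_h in strata
)
se = sqrt(var_p_st)

z = 1.96  # assumed normal multiplier for approximate 95% confidence
ci = (p_st - z * se, p_st + z * se)

print(round(p_st, 4), round(se, 4), tuple(round(x, 4) for x in ci))
```

Changing `z` to 2 gives the slightly wider interval that some textbooks report.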

2. For the following situations, explain why you might choose to use stratified random sampling rather than a SRS. What would you use for strata?

a) It is desired to estimate the average wheat yield for a province. Farm sizes range from 3 acres to 1000 acres.

b) It is desired to compare the workforce experience of male and female engineering graduates.

c) The formula for optimal allocation in stratified random sampling shows that we should take larger samples in those strata for which

i) ______

ii) ______

iii) ______

3. a) Give 3 reasons why it might be better to take a sample rather than carry out a census.

b) What is a probability sample and why should it be used?

c) Name 2 types of samples that are not probability samples.

4. Green Turf is a company that makes fertilizers. One of the basic ingredients of these fertilizers is nitrogen. The company estimates the total quantity of nitrogen used in a year on the basis of a SRS of the production orders for the year. Each order shows the quantity of nitrogen (y) and other ingredients used for a particular job. There were 2000 production orders this past year. A SRS of 200 production orders from the 2000 gave lb, and lb.

a) Give a 95% confidence interval for the total amount of nitrogen used.

b) How large should next year's sample be so that the estimated total quantity of nitrogen used will be within 10,000 lb of the true total with 99% confidence? Assume the number of production orders is still N = 2000.

5. A small town is interested in estimating the proportion of its households that have at least one member over 65 years of age. The town has 621 households. How large a SRS should be taken to estimate this proportion to within 0.08 with 90% confidence?
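One way to set up the sample-size calculation for this question is sketched below. It is a hedged illustration rather than an official solution: since no prior estimate of the proportion is given, it assumes the conservative value p = 0.5, computes the infinite-population sample size n0 = z^2 p(1 - p) / e^2, and then applies the finite population correction n = n0 / (1 + n0 / N). The variable names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

N = 621      # households in the town
e = 0.08     # desired margin of error
conf = 0.90  # desired confidence level
p = 0.5      # assumed worst-case proportion (no prior estimate is given)

# Two-sided normal critical value for the chosen confidence level.
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)

# Sample size ignoring the finite population, then corrected for N.
n0 = z ** 2 * p * (1 - p) / e ** 2
n = ceil(n0 / (1 + n0 / N))

print(n)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.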

6. In order to estimate the average wages of cashiers at supermarkets in a certain city, a SRS is chosen from all the supermarkets listed in the telephone book, and the supermarket managers are asked to supply a list of all cashier wages.

a) What is the target population?

b) What is the sampling frame?

c) What is the sampling unit?

d) What is the observation unit or element?

e) Discuss any possible sources of selection bias.

f) Discuss possible sources of measurement error.

7. For each of the following situations, state whether there is nonresponse error, coverage error, measurement error, or sampling error, and whether such an error would result in selection bias or measurement bias, or neither.

a) The error in a SRS with no measurement error, no nonresponse, and for which the sampling frame is the same as the target population.

b) UK residents who drink alcohol tend to under-report their alcohol consumption in face-to-face interviews.

c) Part of the reason that public interest in saving the wetlands may have been overestimated is that our sample was selected from lists of contributors to charities.

d) Critics charge that the poll overestimated public interest in restored train service because those interested were more likely to have returned the questionnaire.

8. In order to find estimates of the total number of hogs in a certain region and of the average number of hogs per farm, the 500 farms in the region were stratified according to size (small, medium, large). A SRS was selected from each stratum. The results were as follows (y represents the number of hogs on a farm):

Stratum / Stratum Size / Sample Size / /
large / 80 / 30 / 144 / 80
medium / 160 / 40 / 64 / 30
small / 260 / 30 / 16 / 10

a) Find an estimate for the average number of hogs per farm in the region and give an approximate 99% confidence interval for your estimate.

b) What is the estimate of the total number of hogs in the region? What is the estimated variance of this estimate?

c) Why is stratified random sampling better than SRS for this problem? (Give at least two reasons.)

9. The article "What Readers Say About Marijuana" reported that more than 75% of the readers who took part in an informal PARADE telephone poll said marijuana should be as legal as alcoholic beverages (Parade, July 31, 1994). The telephone poll was announced on page 5 of the June 12 issue; readers were instructed to "call 1-900-773-1200, at 75 cents a call, if you would like to answer the following questions. Use touch-tone phones only. To participate, call between 8 a.m. EDT [Eastern Daylight Time] on Saturday, June 11, and midnight EDT on Wednesday, June 15."

a) What type of survey was this?

b) What might have been the target population?

c) Is 75% a valid estimate of the proportion of your part (b) target population who think marijuana should be as legal as alcoholic beverages? Why or why not?

10. Consider a population of 10 units and a sample size of 3 selected by SRS.

a) How many possible such samples are there?

b) What is the probability that the second element in the population belongs to the sample?
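Both parts of this question can be checked by brute-force enumeration. The sketch below assumes the SRS is drawn without replacement and labels the units 1 through 10 for illustration (the labels and variable names are hypothetical); it lists every possible sample and counts those that contain unit 2.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

units = range(1, 11)  # hypothetical labels for the 10 population units

# Every possible sample of size 3, drawn without replacement.
samples = list(combinations(units, 3))
n_samples = len(samples)

# Under SRS each sample is equally likely, so the inclusion probability
# of unit 2 is the share of samples that contain it.
containing_second = sum(1 for s in samples if 2 in s)
prob = Fraction(containing_second, n_samples)

print(n_samples, containing_second, prob)
```

The enumeration agrees with the closed forms: `comb(10, 3)` samples in total, and an inclusion probability of n/N for any single unit.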