Estimation of Population Mean in Simple and Stratified Random Sampling

Hulya Cingi and Nilgün Özgül

Hacettepe University, Department of Statistics, Beytepe, 06800, Ankara, Turkey.


Abstract

We propose a ratio estimator for the population mean in simple random sampling using the estimators in Bahl and Tuteja [1] and Prasad [7]. We also adapt the proposed estimator to stratified random sampling using the separate ratio estimation method. After obtaining the mean square error (MSE) equations of the proposed estimators in both simple and stratified random sampling, we find theoretical conditions under which the proposed estimators are more efficient than the other estimators. These conditions are supported by a numerical example.

Key words: Separate ratio estimator, auxiliary information, sampling, efficiency.

2000 AMS Classification: 62D05

1. Introduction

When information is available on an auxiliary variable, x, that is positively correlated with the study variable, y, the ratio estimator is widely used to estimate the population mean, $\bar{Y}$, as follows:

$$\bar{y}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X}, \qquad (1)$$

where $\bar{y}$ and $\bar{x}$ are the sample means of the study and auxiliary variables, respectively, and it is assumed that the population mean, $\bar{X}$, of the auxiliary variable is known. It is well known that the MSE equation of the ratio estimator is given by

$$\mathrm{MSE}(\bar{y}_R) \cong \lambda\,\bar{Y}^2\left(C_y^2 + C_x^2 - 2C_{yx}\right), \qquad (2)$$

where $\lambda = (1-f)/n$; $f = n/N$; n is the sample size; N is the number of units in the population; $C_{yx} = \rho\,C_y C_x$; $\rho$ is the population correlation coefficient between the auxiliary and the study variables; $S_x^2$ and $S_y^2$ are the population variances of the auxiliary and the study variables, respectively; $C_x = S_x/\bar{X}$ and $C_y = S_y/\bar{Y}$ are the population coefficients of variation of the auxiliary and study variables, respectively.

Prasad [7] suggested the following ratio estimator:

,(3)

where $\kappa$ is a constant whose optimal value for the estimator in (3) is . The MSE of this estimator can be given by

.(4)

Bahl and Tuteja [1] suggested the following estimator:

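$$\bar{y}_{BT} = \bar{y}\,\exp\!\left(\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right), \qquad (5)$$

where exp is the exponential function. The MSE equation of this estimator can be given by

$$\mathrm{MSE}(\bar{y}_{BT}) \cong \lambda\,\bar{Y}^2\left(C_y^2 + \frac{C_x^2}{4} - C_{yx}\right). \qquad (6)$$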

Although there have been many studies on combined estimators in stratified random sampling in recent years, such as Shabbir and Gupta [9,10], Singh et al. [11], and Koyuncu and Kadilar [6], separate estimators are rarely considered in the stratified random sampling literature. However, Vishwakarma and Singh [12] show that, for their proposed estimators, the separate estimators are always more efficient than the combined estimators. For this reason, in this study we adapt the estimator proposed for simple random sampling to stratified random sampling using the separate method.

2. Suggested Estimator in Simple Random Sampling

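Replacing the sample mean $\bar{y}$ in the estimator of Bahl and Tuteja [1], given in (5), with the traditional ratio estimator, given in (1), and motivated by the estimator of Prasad [7], given in (3), we propose a new ratio estimator as follows:

$$\bar{y}_{pr} = \kappa\,\frac{\bar{y}}{\bar{x}}\,\bar{X}\,\exp\!\left(\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right). \qquad (7)$$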

To obtain the MSE equation for the proposed estimator, we use the Taylor series method defined by

,(8)

[13] where , , , and so for the proposed estimator.

= 1,

where .

Then, with the aid of (8), we can write

where

(see [8]).

Using these equations, we can write

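$$\mathrm{MSE}(\bar{y}_{pr}) \cong \bar{Y}^2\left[(\kappa-1)^2 + \kappa^2\lambda\left(C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}\right)\right]. \qquad (9)$$

Setting $\partial\,\mathrm{MSE}(\bar{y}_{pr})/\partial\kappa = 0$, we get the optimum value of $\kappa$ as

$$\kappa_{pr} = \frac{1}{1+\lambda A}, \qquad (10)$$

where $A = C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}$. In this way, when $\kappa$ is replaced with $\kappa_{pr}$ in (9), the minimum MSE of the proposed estimator can be written as

$$\mathrm{MSE}_{\min}(\bar{y}_{pr}) \cong \kappa_{pr}\,\lambda\,\bar{Y}^2 A. \qquad (11)$$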
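The step from the MSE to its minimum can be spelled out as follows. This is only a sketch: it assumes that the first-order MSE has the form shown in the first line, with $A = C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}$, a form that agrees with the values of A and of the optimum constant reported in Table 1.

```latex
\begin{align*}
\mathrm{MSE}(\bar{y}_{pr}) &\cong \bar{Y}^2\left[(\kappa-1)^2 + \kappa^2\lambda A\right],\\
\frac{\partial\,\mathrm{MSE}(\bar{y}_{pr})}{\partial\kappa}
  &= 2\bar{Y}^2\left[(\kappa-1) + \kappa\lambda A\right] = 0
  \;\Longrightarrow\; \kappa_{pr} = \frac{1}{1+\lambda A},\\
\mathrm{MSE}_{\min}(\bar{y}_{pr})
  &\cong \bar{Y}^2\left[\frac{\lambda^2A^2}{(1+\lambda A)^2} + \frac{\lambda A}{(1+\lambda A)^2}\right]
  = \bar{Y}^2\,\frac{\lambda A}{1+\lambda A}
  = \kappa_{pr}\,\lambda\,\bar{Y}^2 A .
\end{align*}
```

The second derivative, $2\bar{Y}^2(1+\lambda A)$, is positive whenever $\lambda A > -1$, so the stationary point is a minimum.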

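When the population parameters are unknown, one can estimate $\kappa_{pr}$ from the sample by

$$\hat{\kappa}_{pr} = \frac{1}{1+\lambda\hat{A}},$$

where $\hat{A} = \hat{C}_y^2 + \frac{9}{4}\hat{C}_x^2 - 3\hat{C}_{yx}$. Here $\hat{C}_x$ and $\hat{C}_y$ are the sample coefficients of variation of the auxiliary and study variables, respectively, and $\hat{C}_{yx} = r\,\hat{C}_y\hat{C}_x$, where r is the sample correlation coefficient between the auxiliary and the study variables.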

Note that the value of $\kappa_{pr}$ is always between 0 and 1 ($0 < \kappa_{pr} < 1$), because $\lambda$ and A are always positive.
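As a rough illustration of how the proposed estimator might be computed in practice, the Python sketch below plugs a sample-based estimate of the optimum constant into the estimator. It assumes the estimator has the form $\kappa(\bar{y}/\bar{x})\bar{X}\exp[(\bar{X}-\bar{x})/(\bar{X}+\bar{x})]$ and that the optimum constant is $1/(1+\lambda A)$ with $A = C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}$, which is consistent with the values in Table 1. The function name `proposed_estimate` and its arguments are hypothetical and not taken from the paper.

```python
import math


def proposed_estimate(y, x, X_bar, N):
    """Plug-in version of the proposed estimator (illustrative sketch only).

    y, x  : sample values of the study and auxiliary variables
    X_bar : known population mean of the auxiliary variable
    N     : population size
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n

    # Sample variances and covariance (n - 1 in the denominator).
    s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_yx = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x)) / (n - 1)

    # Squared sample coefficients of variation and their cross term.
    cy2 = s_yy / y_bar ** 2
    cx2 = s_xx / x_bar ** 2
    cyx = s_yx / (y_bar * x_bar)

    lam = (1 - n / N) / n                 # lambda = (1 - f) / n with f = n / N
    A_hat = cy2 + 2.25 * cx2 - 3 * cyx    # assumed form of A
    kappa_hat = 1 / (1 + lam * A_hat)     # estimated optimum constant

    ratio_part = y_bar / x_bar * X_bar
    exp_part = math.exp((X_bar - x_bar) / (X_bar + x_bar))
    return kappa_hat * ratio_part * exp_part
```

When the sample mean of x happens to equal the known population mean, the ratio and exponential factors both reduce to one, so the estimate is simply the shrunken sample mean $\hat{\kappa}_{pr}\bar{y}$.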

3. Efficiency Comparisons in Simple Random Sampling

In this section, we derive the efficiency conditions for the proposed estimator by comparing its MSE with the MSE of the sample mean, the traditional ratio estimator, and the ratio estimators suggested by Prasad [7] and Bahl and Tuteja [1].

It is well known that under simple random sampling without replacement (SRSWOR) the variance of the sample mean is

$$V(\bar{y}) = \lambda S_y^2 = \lambda\,\bar{Y}^2 C_y^2. \qquad (12)$$

We first compare the MSE of the proposed estimator, given in (11), with the variance of the sample mean. This comparison gives the following condition on the optimum constant $\kappa_{pr}$ in (10):

$$\kappa_{pr} < \frac{C_y^2}{A}. \qquad (13)$$

When this condition is satisfied, the proposed estimator is more efficient than the sample mean.

Secondly, we compare the MSE of the proposed estimator with the MSE of the traditional ratio estimator, given in (2). We have the following condition:

$$\kappa_{pr} < \frac{B}{A}, \qquad (14)$$

where $B = C_y^2 + C_x^2 - 2C_{yx}$. When condition (14) is satisfied, the proposed estimator is more efficient than the traditional ratio estimator.

Thirdly, comparing the MSE of the proposed estimator with the MSE of the estimator in Prasad [7], given in (4), we have the following condition:

$$\kappa_{pr} < \frac{D}{\lambda A}, \qquad (15)$$

where $D = \mathrm{MSE}(\bar{y}_P)/\bar{Y}^2$ is obtained from (4). When condition (15) is satisfied, the proposed estimator is more efficient than the ratio estimator suggested by Prasad [7].

Finally, we compare the MSE of the proposed estimator with the MSE of the estimator in Bahl and Tuteja [1], given in (6), and we have the following condition:

$$\kappa_{pr} < \frac{E}{A}, \qquad (16)$$

where $E = C_y^2 + \frac{1}{4}C_x^2 - C_{yx}$. When condition (16) is satisfied, the proposed estimator is more efficient than the ratio estimator suggested by Bahl and Tuteja [1]. From (13)-(16), we can also find the upper bound of $\kappa_{pr}$ for the proposed estimator to be more efficient than the other estimators.
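The four comparisons can be checked mechanically once the population statistics are known. The sketch below is illustrative only: the function `efficiency_bounds` is a hypothetical helper, the forms assumed for A, B and E are $C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}$, $C_y^2 + C_x^2 - 2C_{yx}$ and $C_y^2 + \frac{1}{4}C_x^2 - C_{yx}$, and D is passed in as a number because its closed form is not reproduced here. The usage lines take their inputs from Table 1.

```python
def efficiency_bounds(C_y, C_x, C_yx, lam, D):
    """Return the optimum constant and the upper bounds in (13)-(16).

    Assumed forms (consistent with Table 1, not quoted from the paper):
      A = C_y^2 + (9/4) C_x^2 - 3 C_yx   proposed estimator
      B = C_y^2 + C_x^2 - 2 C_yx         traditional ratio estimator
      E = C_y^2 + (1/4) C_x^2 - C_yx     Bahl-Tuteja estimator
    D is supplied by the caller (Prasad estimator).
    """
    A = C_y ** 2 + 2.25 * C_x ** 2 - 3 * C_yx
    B = C_y ** 2 + C_x ** 2 - 2 * C_yx
    E = C_y ** 2 + 0.25 * C_x ** 2 - C_yx
    kappa_pr = 1 / (1 + lam * A)
    bounds = {
        "(13)": C_y ** 2 / A,
        "(14)": B / A,
        "(15)": D / (lam * A),
        "(16)": E / A,
    }
    return kappa_pr, bounds


# Usage with the population statistics reported in Table 1 (N = 106, n = 20).
lam = (1 - 20 / 106) / 20
kappa_pr, bounds = efficiency_bounds(4.181, 2.018, 6.881, lam, D=0.208)
for label, bound in bounds.items():
    print(f"condition {label}: kappa_pr = {kappa_pr:.3f}, bound = {bound:.3f}, "
          f"satisfied = {kappa_pr < bound}")
```

Only the bound in (15) involves $\lambda$ explicitly, which is why that condition is the one singled out as depending on the sample size.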

4. Suggested Estimator in Stratified Random Sampling

The separate ratio estimator for the population total, Y, in stratified random sampling is defined by

$$\hat{Y}_{RS} = \sum_{h=1}^{\ell}\frac{\bar{y}_h}{\bar{x}_h}\,X_h, \qquad (17)$$

where $X_h = N_h\bar{X}_h$; $N_h$ is the population size in stratum h; $\bar{X}_h$ is the population mean of the auxiliary variable in stratum h; $\bar{x}_h$ and $\bar{y}_h$ are the sample means of the auxiliary and study variables, respectively, in stratum h; and $\ell$ is the total number of strata [3].

Dividing both sides of (17) by N, we obtain the separate ratio estimator for the population mean in stratified random sampling as

$$\bar{y}_{RS} = \sum_{h=1}^{\ell}\omega_h\,\frac{\bar{y}_h}{\bar{x}_h}\,\bar{X}_h, \qquad (18)$$

where $\omega_h = N_h/N$ [2].

Adapting the proposed estimator in (7) to the separate ratio estimator in (18), we suggest a new estimator for the population mean in the stratified random sampling as follows:

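$$\bar{y}_{stpr} = \sum_{h=1}^{\ell}\omega_h\,\kappa_h\,\frac{\bar{y}_h}{\bar{x}_h}\,\bar{X}_h\,\exp\!\left(\frac{\bar{X}_h-\bar{x}_h}{\bar{X}_h+\bar{x}_h}\right), \qquad (19)$$

where $\kappa_h$ is a constant for stratum h. The MSE of this estimator can be obtained by

$$\mathrm{MSE}(\bar{y}_{stpr}) = \sum_{h=1}^{\ell}\omega_h^2\,\mathrm{MSE}(\bar{y}_{pr,h}),$$

where $\bar{y}_{pr,h}$ denotes the estimator in (7) computed within stratum h. Using (9) in this equation, we can write the MSE of the proposed estimator as

$$\mathrm{MSE}(\bar{y}_{stpr}) \cong \sum_{h=1}^{\ell}\omega_h^2\,\bar{Y}_h^2\left[(\kappa_h-1)^2 + \kappa_h^2\gamma_h\left(C_{yh}^2 + \frac{9}{4}C_{xh}^2 - 3C_{yxh}\right)\right], \qquad (20)$$

where $\gamma_h = (1-f_h)/n_h$; $f_h = n_h/N_h$; $n_h$ is the sample size in stratum h; $\bar{Y}_h$ is the population mean of the study variable in stratum h; $C_{yxh} = \rho_h C_{yh} C_{xh}$; $\rho_h$ is the population correlation coefficient between the auxiliary and the study variables in stratum h; $C_{xh}$ and $C_{yh}$ are the population coefficients of variation of the auxiliary and study variables, respectively, in stratum h.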

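Setting $\partial\,\mathrm{MSE}(\bar{y}_{stpr})/\partial\kappa_h = 0$ for each stratum, we get the optimum value of $\kappa_h$ as

$$\kappa_{prh} = \frac{1}{1+\gamma_h A_h}, \quad h = 1, 2, \ldots, \ell, \qquad (21)$$

where $A_h = C_{yh}^2 + \frac{9}{4}C_{xh}^2 - 3C_{yxh}$. As in simple random sampling, $\kappa_{prh}$ can also be estimated from the sample for each stratum. When $\kappa_h$ is replaced with $\kappa_{prh}$ in (20), the minimum MSE of the proposed estimator can be written as

$$\mathrm{MSE}_{\min}(\bar{y}_{stpr}) \cong \sum_{h=1}^{\ell}\omega_h^2\,\gamma_h\,\bar{Y}_h^2\,\kappa_{prh}\,A_h. \qquad (22)$$

The values of $\kappa_{prh}$ differ from stratum to stratum, but all of them are between 0 and 1.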

5. Efficiency Comparisons in Stratified Random Sampling

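The traditional estimator in stratified random sampling is defined by

$$\bar{y}_{st} = \sum_{h=1}^{\ell}\omega_h\,\bar{y}_h.$$

It is well known that the MSE equations of the traditional and the separate ratio estimators in stratified random sampling are, respectively,

$$\mathrm{MSE}(\bar{y}_{st}) = \sum_{h=1}^{\ell}\omega_h^2\,\gamma_h\,\bar{Y}_h^2\,C_{yh}^2, \qquad (23)$$

$$\mathrm{MSE}(\bar{y}_{RS}) \cong \sum_{h=1}^{\ell}\omega_h^2\,\gamma_h\,\bar{Y}_h^2\,B_h, \qquad (24)$$

where $B_h = C_{yh}^2 + C_{xh}^2 - 2C_{yxh}$. When we compare these MSE equations with the MSE equation of the proposed estimator, given in (22), we have the following conditions:

$$\sum_{h=1}^{\ell}\kappa_{prh}A_h < \sum_{h=1}^{\ell}C_{yh}^2, \qquad (25)$$

$$\sum_{h=1}^{\ell}\kappa_{prh}A_h < \sum_{h=1}^{\ell}B_h. \qquad (26)$$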

When the condition (25) is satisfied, the proposed estimator is more efficient than the traditional stratified estimator and similarly when the condition (26) is satisfied, the proposed estimator is more efficient than the separate ratio estimator.

6. Numerical Example

We use the data in Kadilar and Cingi [4,5] to compare the efficiencies of the classical and proposed estimators in simple and stratified random sampling, respectively. In these data sets, the study variable is the level of apple production and the auxiliary variable is the number of apple trees, in 106 villages of the Marmara Region and in 854 villages in 6 strata of Turkey, respectively (1: Marmara, 2: Aegean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, 6: East and Southeast Anatolia), in 1999 (Source: Institute of Statistics, Republic of Turkey).

6.1 Numerical example for simple random sampling

Table 1 shows the population statistics. Using simple random sampling, we take the sample size as n = 20. Recall that the sample size has no effect on the efficiency comparisons of the estimators, except for condition (15), as shown in Section 3. Note that the correlation coefficient ($\rho$) between the auxiliary and study variables is 0.82 for this data set.

INSERT TABLE 1

INSERT TABLE 2

We compute the MSE values of the sample mean, traditional ratio, Prasad, Bahl-Tuteja, and proposed estimators using equations (12), (2), (4), (6) and (11), respectively. Using these MSE values, we compute the relative efficiency of each estimator, say $\bar{y}_i$, with respect to the sample mean by

$$\mathrm{RE}(\bar{y}_i) = \frac{V(\bar{y})}{\mathrm{MSE}(\bar{y}_i)} \times 100, \quad i = R, P, BT, pr.$$

These relative efficiency values are shown in Table 2. We observe that the most efficient estimator is the proposed estimator. This result is expected because conditions (13)-(16) are all satisfied, with the following upper bounds:

$C_y^2/A = 2.914$; $B/A = 1.299$; $D/(\lambda A) = 0.854$; $E/A = 1.937$.

Note that $\kappa_{pr} = 0.804$ for this data set. In addition, we evaluated condition (15) for various sample sizes, and it is satisfied for all of them.
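The relative efficiencies in Table 2 can be approximately reproduced from the statistics in Table 1. The sketch below works with each MSE divided by $\bar{Y}^2$, since that common factor cancels in the ratio. It assumes the forms $A = C_y^2 + \frac{9}{4}C_x^2 - 3C_{yx}$, $B = C_y^2 + C_x^2 - 2C_{yx}$ and $E = C_y^2 + \frac{1}{4}C_x^2 - C_{yx}$, and it treats the tabulated D as the scaled MSE of the Prasad estimator; these are assumptions that match the reported numbers rather than formulas quoted from the paper.

```python
# Population statistics from Table 1.
N, n = 106, 20
C_y, C_x, C_yx = 4.181, 2.018, 6.881
D = 0.208  # tabulated value, assumed to equal MSE(Prasad) / Ybar^2

lam = (1 - n / N) / n
A = C_y ** 2 + 2.25 * C_x ** 2 - 3 * C_yx
B = C_y ** 2 + C_x ** 2 - 2 * C_yx
E = C_y ** 2 + 0.25 * C_x ** 2 - C_yx
kappa_pr = 1 / (1 + lam * A)

# MSE of each estimator divided by Ybar^2.
scaled_mse = {
    "sample mean": lam * C_y ** 2,
    "traditional ratio": lam * B,
    "Prasad": D,
    "Bahl-Tuteja": lam * E,
    "proposed": kappa_pr * lam * A,
}

# Relative efficiency with respect to the sample mean, in percent.
rel_eff = {name: 100 * scaled_mse["sample mean"] / mse
           for name, mse in scaled_mse.items()}
for name, value in rel_eff.items():
    print(f"{name:18s} {value:8.2f}")
```

Small differences from Table 2 are to be expected because the inputs are rounded to three decimals.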

6.2 Numerical example for stratified random sampling

Table 3 shows the population statistics. Using Neyman allocation in stratified random sampling, we obtain the sample size for each stratum, $n_h$ (h = 1, 2, …, 6), as shown in Table 3. For details, see Kadilar and Cingi [5].

INSERT TABLE 3

INSERT TABLE 4

We compute the MSE values of the proposed, traditional and separate ratio estimators using equations (22)-(24), respectively. Using these MSE values, we compute the relative efficiency of each estimator, say $\bar{y}_i$, with respect to the traditional stratified estimator by

$$\mathrm{RE}(\bar{y}_i) = \frac{\mathrm{MSE}(\bar{y}_{st})}{\mathrm{MSE}(\bar{y}_i)} \times 100, \quad i = RS, stpr.$$

These relative efficiency values are shown in Table 4. We observe that the most efficient estimator is the proposed estimator. This result is expected because conditions (25) and (26) are both satisfied, with the following right-hand sides:

$\sum_{h} C_{yh}^2 = 92.745$; $\sum_{h} B_h = 29.388$.

We obtain $\sum_{h}\kappa_{prh}A_h = 17.370$ for this data set. Note that the sample size has no effect on the efficiency comparisons of the estimators for these conditions.
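The stratified comparison can be approximately reproduced from Table 3 in the same way. The sketch assumes that the three MSEs are $\sum_h \omega_h^2\gamma_h\bar{Y}_h^2 T_h$ with $T_h$ equal to $C_{yh}^2$, $B_h$ and $\kappa_{prh}A_h$ for the traditional, separate ratio and proposed estimators, respectively, and that $\kappa_{prh} = 1/(1+\gamma_h A_h)$. The rows with values 1537, 2213, ... are taken to be the stratum means of the study variable. All inputs are the rounded values from Table 3, so the results match Tables 3 and 4 only approximately.

```python
# Stratum-level statistics from Table 3 (h = 1, ..., 6).
N_h = [106, 106, 94, 171, 204, 173]
Y_bar_h = [1537, 2213, 9384, 5588, 967, 404]   # assumed: study-variable means
gamma_h = [0.102, 0.049, 0.016, 0.009, 0.138, 0.006]
C_yh = [4.181, 5.221, 3.187, 5.126, 2.471, 2.339]
A_h = [6.000, 9.043, 2.119, 1.237, 3.663, 1.701]
B_h = [7.789, 12.919, 2.334, 2.208, 3.004, 1.135]

N = sum(N_h)
omega_h = [Nh / N for Nh in N_h]
kappa_prh = [1 / (1 + g * a) for g, a in zip(gamma_h, A_h)]

# Common stratum weights omega_h^2 * gamma_h * Ybar_h^2.
w_h = [w ** 2 * g * yb ** 2 for w, g, yb in zip(omega_h, gamma_h, Y_bar_h)]

mse_st = sum(w * cy ** 2 for w, cy in zip(w_h, C_yh))               # (23)
mse_rs = sum(w * b for w, b in zip(w_h, B_h))                       # (24)
mse_pr = sum(w * k * a for w, k, a in zip(w_h, kappa_prh, A_h))     # (22)

re_rs = 100 * mse_st / mse_rs
re_pr = 100 * mse_st / mse_pr

# Sums that appear in conditions (25) and (26).
sum_cy2 = sum(cy ** 2 for cy in C_yh)
sum_B = sum(B_h)
sum_kA = sum(k * a for k, a in zip(kappa_prh, A_h))

print(f"RE separate ratio: {re_rs:.1f}, RE proposed: {re_pr:.1f}")
print(f"sum kappa*A = {sum_kA:.3f}, sum C_y^2 = {sum_cy2:.3f}, sum B = {sum_B:.3f}")
```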

7. Conclusion

We develop a new ratio estimator for the population mean in simple random sampling using the estimator suggested in Bahl and Tuteja [1] and adapt this new estimator to stratified random sampling using the separate method. Theoretically and numerically, we demonstrate that the proposed estimators in both simple and stratified random sampling have the smallest MSE values under certain conditions and for a specific data set.

References

[1] Bahl, S. and Tuteja, R.K. Ratio and Product Exponential Estimators, Journal of Information and Optimization Sciences 12(1), 159-164, 1991.

[2] Cingi, H. Sampling Theory (Hacettepe University Press, 1994). (in Turkish)

[3] Cochran, W.G. Sampling Techniques (John Wiley and Sons, 1977).

[4] Kadilar, C. and Cingi, H. A Study on the Chain Ratio-Type Estimator, Hacettepe Journal of Mathematics and Statistics 32(1), 105-108, 2003.

[5] Kadilar, C. and Cingi, H. Ratio Estimators in Stratified Random Sampling, Biometrical Journal 45(2), 218-225, 2003.

[6] Koyuncu, N. and Kadilar, C. Ratio and Product Estimators in Stratified Random Sampling, Journal of Statistical Planning and Inference 139(8), 2552-2558, 2009.

[7] Prasad, B. Some Improved Ratio Type Estimators of Population Mean and Ratio in Finite Population Sample Surveys, Communications in Statistics: Theory and Methods 18(1), 379-392, 1989.

[8] Searls, D.T. Utilization of Known Coefficient of Kurtosis in the Estimation Procedure of Variance, Journal of American Statistical Association 59, 1225-1226, 1964.

[9] Shabbir, J. and Gupta, S. Improved Ratio Estimators in Stratified Sampling, American Journal of Mathematical and Management Sciences 25, 293-311, 2005.

[10] Shabbir, J. and Gupta, S. A New Estimator of Population Mean in Stratified Sampling, Communications in Statistics: Theory and Methods 35(7), 1201-1209, 2006.

[11] Singh, H.P., Tailor, R., Singh, S., and Kim, J.M. A Modified Estimator of Population Mean Using Power Transformation, Statistical Papers 49, 37-58, 2008.

[12] Vishwakarma, G.K. and Singh, H.P. Ratio-Product Estimators in Stratified Sampling, Statistical Methodology, 2009 (accepted).

[13] Wolter, K.M. Introduction to Variance Estimation (Springer-Verlag, 1985).

Table 1 Data statistics of the population for the simple random sampling.

N = 106 / n = 20 / $\lambda$ = 0.041
$\bar{Y}$ = 1536.774 / $\bar{X}$ = 24375.594
$\rho$ = 0.816 / $C_{yx}$ = 6.881 / $C_y$ = 4.181 / $C_x$ = 2.018
A = 6.000 / B = 7.789 / D = 0.208 / E = 11.616
$\kappa_{pr}$ = 0.804

Table 2 Relative efficiency of estimators in the simple random sampling.

Estimator / RE
sample mean / 100
traditional ratio / 224.415
Prasad / 341.205
Bahl-Tuteja / 150.475
proposed / 362.345

Table 3 Data statistics of the population for the stratified random sampling.

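N=854 / $N_1$=106 / $N_2$=106 / $N_3$=94 / $N_4$=171 / $N_5$=204 / $N_6$=173
n=140 / $n_1$=9 / $n_2$=17 / $n_3$=38 / $n_4$=67 / $n_5$=7 / $n_6$=2
$\bar{X}$=37600 / $\bar{X}_1$=24375 / $\bar{X}_2$=27422 / $\bar{X}_3$=72410 / $\bar{X}_4$=74365 / $\bar{X}_5$=26442 / $\bar{X}_6$=9844
$\bar{Y}$=2930 / $\bar{Y}_1$=1537 / $\bar{Y}_2$=2213 / $\bar{Y}_3$=9384 / $\bar{Y}_4$=5588 / $\bar{Y}_5$=967 / $\bar{Y}_6$=404
$\omega_1$=0.124 / $\omega_2$=0.124 / $\omega_3$=0.110 / $\omega_4$=0.200 / $\omega_5$=0.239 / $\omega_6$=0.203
$\gamma_1$=0.102 / $\gamma_2$=0.049 / $\gamma_3$=0.016 / $\gamma_4$=0.009 / $\gamma_5$=0.138 / $\gamma_6$=0.006
$\rho$=0.917 / $\rho_1$=0.816 / $\rho_2$=0.856 / $\rho_3$=0.901 / $\rho_4$=0.986 / $\rho_5$=0.713 / $\rho_6$=0.894
$C_x$=3.851 / $C_{x1}$=2.018 / $C_{x2}$=2.095 / $C_{x3}$=2.220 / $C_{x4}$=3.841 / $C_{x5}$=1.717 / $C_{x6}$=1.909
$C_y$=5.838 / $C_{y1}$=4.181 / $C_{y2}$=5.221 / $C_{y3}$=3.187 / $C_{y4}$=5.126 / $C_{y5}$=2.471 / $C_{y6}$=2.339
$C_{yx}$=20.604 / $C_{yx1}$=6.881 / $C_{yx2}$=9.365 / $C_{yx3}$=6.376 / $C_{yx4}$=19.408 / $C_{yx5}$=3.026 / $C_{yx6}$=3.990
$A_1$=6.000 / $A_2$=9.043 / $A_3$=2.119 / $A_4$=1.237 / $A_5$=3.663 / $A_6$=1.701
$B_1$=7.789 / $B_2$=12.919 / $B_3$=2.334 / $B_4$=2.208 / $B_5$=3.004 / $B_6$=1.135
$\kappa_{pr1}$=0.621 / $\kappa_{pr2}$=0.691 / $\kappa_{pr3}$=0.968 / $\kappa_{pr4}$=0.989 / $\kappa_{pr5}$=0.664 / $\kappa_{pr6}$=0.990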

Table 4 Relative efficiency of estimators in the stratified random sampling.

Estimator / RE
stratified / 100
separate ratio / 416.507
proposed / 658.390