AMS570.01 Practice Midterm Exam Spring 2018

Name: ______ ID: ______ Signature: ______

Instructions: This is a closed-book exam. You are allowed a one-page 8x11 formula sheet (2-sided). No cellphones, calculators, or computers are allowed. Cheating shall result in a course grade of F. Please provide complete solutions for full credit. The exam runs from 8:30 to 9:50 am. Good luck!

The midterm will have three problems, one of each of the following three types.

Type 1: Bayes Estimator and Bayesian Credible Set; may also include the frequentist counterparts (estimator, CI) for comparison.

1a. Let X1, X2, ..., Xn be a random sample from a geometric distribution with parameter p.

(a) Please find a sufficient statistic for the parameter p.

(b) Now, let p have a Beta(2,3) prior, please derive the Bayes estimator of p.

Solution:

(a)Geometric:

The likelihood is:

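So according to the factorization theorem, $\sum_{i=1}^n X_i$ is a sufficient statistic for $p$.

(b) The prior density is: $\pi(p) = \frac{\Gamma(5)}{\Gamma(2)\Gamma(3)}\, p^{2-1}(1-p)^{3-1} = 12\,p(1-p)^2$, $0 < p < 1$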

The likelihood is:

The posterior can therefore be derived as:

The kernel of this product (dropping the constant multiplier) is:

Only one distribution has this kernel: the Beta($a$, $b$) distribution, whose kernel has the form $p^{a-1}(1-p)^{b-1}$.

Matching up parameter values, the posterior distribution is a Beta(, ) distribution.

The Bayes estimator is the posterior mean, that is:
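The steps of part (b) can be written out as a worked sketch. It assumes the geometric pmf counts trials up to and including the first success, $f(x \mid p) = p(1-p)^{x-1}$ for $x = 1, 2, \dots$; that parametrization is an assumption made here, and the second Beta parameter changes if a different form of the pmf is intended.

```latex
\begin{align*}
L(p) &= \prod_{i=1}^{n} p(1-p)^{x_i-1} = p^{n}(1-p)^{\sum x_i - n}, \\
\pi(p) &= 12\,p(1-p)^{2}, \qquad 0 < p < 1, \\
\pi(p \mid x_1,\dots,x_n) &\propto L(p)\,\pi(p) \propto p^{(n+2)-1}(1-p)^{(\sum x_i - n + 3)-1}, \\
p \mid x_1,\dots,x_n &\sim \mathrm{Beta}\Big(n+2,\ \sum x_i - n + 3\Big), \\
\hat{p}_B &= E(p \mid x_1,\dots,x_n) = \frac{n+2}{\sum x_i + 5}.
\end{align*}
```

If the pmf is instead $p(1-p)^{x}$ for $x = 0, 1, \dots$, the same steps would give a Beta$(n+2, \sum x_i + 3)$ posterior and the estimator $(n+2)/(\sum x_i + n + 5)$.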

1b. Given a random sample $X_1, X_2, \dots, X_n$ from a normal population with mean $\mu$ and variance 1, please:

(a) Derive a sufficient statistic for $\mu$.

(b) Derive the maximum likelihood estimator (MLE) of $\mu$.

(c) Assuming the prior of , derive the Bayes estimator of $\mu$.

(d) Which of the two estimators (the Bayes estimator and the MLE) is better? Why?

(e) Derive the 100(1-α)% Bayesian HPD credible set for $\mu$.

Solution:

(a)

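By the factorization theorem, $\bar{X}$ (equivalently, $\sum_{i=1}^n X_i$) is a sufficient statistic (SS) for $\mu$.

(b) Likelihood function: $L(\mu) = (2\pi)^{-n/2} \exp\left\{-\frac{1}{2}\sum_{i=1}^n (x_i - \mu)^2\right\}$

Setting $\frac{d}{d\mu}\ln L(\mu) = \sum_{i=1}^n (x_i - \mu) = 0$, we see that $\hat{\mu} = \bar{X}$ is the MLE.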

(c)

The prior distribution is:

The posterior distribution is:

The Bayes estimator under the squared error loss:
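As a worked sketch of part (c), suppose the prior is $\mu \sim N(\mu_0, \tau^2)$. The symbols $\mu_0$ and $\tau^2$ are hypothetical placeholders introduced here, since the exam's own prior is not shown; substitute its actual values into the final line.

```latex
\begin{align*}
\pi(\mu \mid x_1,\dots,x_n)
  &\propto \exp\Big\{-\tfrac{1}{2}\sum_{i=1}^{n}(x_i-\mu)^2\Big\}
           \exp\Big\{-\frac{(\mu-\mu_0)^2}{2\tau^2}\Big\} \\
  &\propto \exp\Big\{-\tfrac{1}{2}\Big(n+\frac{1}{\tau^2}\Big)
           \Big[\mu-\frac{n\bar{x}+\mu_0/\tau^2}{n+1/\tau^2}\Big]^2\Big\}, \\
\mu \mid x_1,\dots,x_n
  &\sim N\Big(\frac{n\bar{x}+\mu_0/\tau^2}{n+1/\tau^2},\ \frac{1}{n+1/\tau^2}\Big), \\
\hat{\mu}_B &= E(\mu \mid x_1,\dots,x_n) = \frac{n\tau^2\,\bar{x}+\mu_0}{n\tau^2+1}.
\end{align*}
```

For instance, with $\mu_0 = 0$ and $\tau^2 = 1$ this reduces to $\hat{\mu}_B = n\bar{x}/(n+1)$, which shrinks the sample mean toward the prior mean.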

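(d) For the MLE $\hat{\mu} = \bar{X}$, we have: $E(\bar{X}) = \mu$, $Var(\bar{X}) = 1/n$, and hence $MSE(\bar{X}) = 1/n$.

Here the MLE is indeed also the best unbiased estimator for $\mu$.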

Therefore, the MLE is better when we do not have reliable prior information on the underlying distribution.

For the Bayes estimator , we have:

One can see that, in general, the Bayes estimator is biased and could have a larger MSE than the MLE. However, when the prior information is accurate (for example, in the extreme case where the true $\mu$ equals the prior mean), the Bayes estimator is not only unbiased but also has a smaller MSE than the MLE. Therefore, when we do have reliable prior information, the Bayes estimator is preferred.

We also point out that the MSE used for the comparison here is the MSE in the frequentist framework, and thus the expectation is taken over the sample (or equivalently, the sufficient statistic $\bar{X}$) only.
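The comparison in part (d) can be checked numerically. The sketch below assumes a hypothetical $N(\mu_0, \tau^2)$ prior (the defaults `mu0=0.0` and `tau2=1.0` are illustrative only), under which the Bayes estimator is a weighted average of $\bar{X}$ and $\mu_0$, and it evaluates the frequentist MSE in closed form.

```python
def mse_mle(n):
    """Frequentist MSE of the MLE (the sample mean) when the variance is 1."""
    return 1.0 / n


def mse_bayes(mu, n, mu0=0.0, tau2=1.0):
    """Frequentist MSE of the posterior mean under an ASSUMED N(mu0, tau2) prior.

    The estimator is w * xbar + (1 - w) * mu0 with w = n*tau2 / (n*tau2 + 1).
    """
    w = n * tau2 / (n * tau2 + 1.0)   # weight placed on the sample mean
    variance = w * w / n              # Var(w * xbar), since Var(xbar) = 1/n
    bias = (1.0 - w) * (mu0 - mu)     # E[estimator] - mu
    return variance + bias * bias


if __name__ == "__main__":
    n = 10
    for true_mu in (0.0, 0.5, 1.0, 3.0):
        print(true_mu, mse_mle(n), mse_bayes(true_mu, n))
```

When the true mean sits at the prior mean the Bayes estimator has the smaller MSE, and the advantage reverses as the true mean moves far from it.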

(e) The posterior distribution for $\mu$ is:

Since the posterior pdf of $\mu$ is symmetric around its mean, we should cut the pdf of $\mu$ symmetrically around its mean to obtain the optimal (shortest) interval. Therefore, the 100(1-α)% Bayesian HPD credible set for $\mu$ is of the form:

Since

we see immediately that:

That is,

Therefore we have:

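Therefore, for a normal posterior, the Bayesian HPD credible set for $\mu$ is of the form $E(\mu \mid x) \pm z_{\alpha/2}\sqrt{Var(\mu \mid x)}$, where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of $N(0,1)$.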
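A minimal sketch of the symmetric cut described in part (e), assuming the posterior is normal (as it is under a normal prior). The posterior mean and variance passed in below are hypothetical inputs, not values from the exam.

```python
from statistics import NormalDist


def normal_hpd(post_mean, post_var, alpha=0.05):
    """HPD set for a normal posterior: mean +/- z_{alpha/2} * sd.

    For a symmetric unimodal density the HPD set coincides with the
    equal-tailed interval, so cutting symmetrically is enough.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile of N(0,1)
    half_width = z * post_var ** 0.5
    return post_mean - half_width, post_mean + half_width


if __name__ == "__main__":
    # Hypothetical posterior N(0.9, 0.1), for illustration only.
    print(normal_hpd(0.9, 0.1, alpha=0.05))
```

The interval is centered at the posterior mean and its posterior probability content is 1-α by construction.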

Type 2: (Frequentist) Confidence Intervals

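2a. Let $X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma^2)$ and $Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma^2)$ be two independent samples. Furthermore, $\sigma^2$ is known. Please derive the 100(1-α)% confidence interval for $\mu_1 - \mu_2$.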

Solution:

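1) Parameter of interest

$\mu_1 - \mu_2$ (or $\mu_1 / \mu_2$)

2) Point estimator for the parameter of interest

$\bar{X} - \bar{Y}$ (or $\bar{X} / \bar{Y}$)

The ratio is not used because it is much harder to derive its distribution.

3) $Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}} \sim N(0,1)$ is a pivotal quantity for $\mu_1 - \mu_2$ when $\sigma^2$ is known.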

We will prove the distribution of Z using the moment generating function (mgf) method as follows.
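A sketch of the mgf argument follows. It assumes $X_i \sim N(\mu_1, \sigma^2)$ for $i = 1, \dots, n_1$ and $Y_j \sim N(\mu_2, \sigma^2)$ for $j = 1, \dots, n_2$, all independent; the sample-size symbols $n_1$, $n_2$ and the shorthand $c$ are notation chosen here.

```latex
\begin{align*}
M_{\bar{X}}(t) &= \Big[M_{X}\big(t/n_1\big)\Big]^{n_1}
  = \exp\Big\{\mu_1 t + \frac{\sigma^2 t^2}{2 n_1}\Big\}, \qquad
M_{-\bar{Y}}(t) = \exp\Big\{-\mu_2 t + \frac{\sigma^2 t^2}{2 n_2}\Big\}, \\
M_{\bar{X}-\bar{Y}}(t) &= M_{\bar{X}}(t)\, M_{-\bar{Y}}(t)
  = \exp\Big\{(\mu_1-\mu_2)\, t + \frac{\sigma^2}{2}\Big(\frac{1}{n_1}+\frac{1}{n_2}\Big) t^2\Big\}, \\
M_{Z}(t) &= \exp\Big\{-\frac{(\mu_1-\mu_2)\, t}{c}\Big\}\, M_{\bar{X}-\bar{Y}}\Big(\frac{t}{c}\Big)
  = \exp\Big\{\frac{t^2}{2}\Big\},
  \qquad c = \sigma\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}.
\end{align*}
```

Since $\exp\{t^2/2\}$ is the mgf of $N(0,1)$ and an mgf determines the distribution, $Z \sim N(0,1)$, free of any unknown parameter.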

4) Now we derive the confidence interval. First, we draw the pdf of our pivotal quantity Z as follows.

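Let α be any small positive value less than 1 (*usually less than 0.5). In the above figure, we have: $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$

The 100(1-α)% confidence interval for $\mu_1 - \mu_2$ (when $\sigma^2$ is known and the population is normal) is $(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\,\sigma\sqrt{1/n_1 + 1/n_2}$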
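A minimal sketch of computing this interval, assuming two independent normal samples with a common known standard deviation. The function name and the sample values in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def two_sample_z_interval(x, y, sigma, alpha=0.05):
    """CI for mu1 - mu2 from the pivot Z ~ N(0,1), with sigma known and common."""
    n1, n2 = len(x), len(y)
    diff = fmean(x) - fmean(y)                   # point estimator Xbar - Ybar
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile of N(0,1)
    margin = z * sigma * sqrt(1.0 / n1 + 1.0 / n2)
    return diff - margin, diff + margin


if __name__ == "__main__":
    # Made-up data, for illustration only.
    x = [5.1, 4.9, 5.3, 5.0]
    y = [4.2, 4.5, 4.0, 4.4, 4.1]
    print(two_sample_z_interval(x, y, sigma=0.5))
```

The interval is centered at the difference of sample means, and its width depends on the data only through the sample sizes.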

2b. Suppose we have two independent random samples from two normal populations: $X_1, \dots, X_{n_1} \sim N(\mu_1, \sigma_1^2)$, and $Y_1, \dots, Y_{n_2} \sim N(\mu_2, \sigma_2^2)$. Furthermore, $\sigma_1^2$ and $\sigma_2^2$ are known.

(a) Please derive the 100(1-α)% confidence interval for $\mu_1 - \mu_2$ using the pivotal quantity approach. (*Please include the derivation of the pivotal quantity, the proof of its distribution, and the derivation of the confidence interval for full credit.)

(b) For a fixed total sample size $N = n_1 + n_2$ (i.e., N is fixed), please derive the best sample sizes $n_1$ and $n_2$ such that the length of the confidence interval is minimized.

Solution:

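(a) The point estimator for $\mu_1 - \mu_2$ is $\bar{X} - \bar{Y}$.

PQ: $Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0,1)$ (*prove the distribution of the PQ)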

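(b) The length of the confidence interval is $L = 2 z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

Minimizing $L$ is equivalent to minimizing $\sigma_1^2/n_1 + \sigma_2^2/n_2$.

Let $f(n_1) = \sigma_1^2/n_1 + \sigma_2^2/(N - n_1)$; then $f'(n_1) = -\sigma_1^2/n_1^2 + \sigma_2^2/(N - n_1)^2 = 0$ gives $n_1 = N\sigma_1/(\sigma_1 + \sigma_2)$ and $n_2 = N\sigma_2/(\sigma_1 + \sigma_2)$.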

*** Recall:

Chain Rule: $\frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x)$
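The allocation in part (b) can be sanity-checked numerically. This sketch assumes known standard deviations `sigma1` and `sigma2` and total size `N` (all values below are hypothetical), and compares the calculus solution $n_1 = N\sigma_1/(\sigma_1+\sigma_2)$ with a brute-force search over integer splits.

```python
from math import sqrt


def ci_length_factor(n1, N, sigma1, sigma2):
    """sqrt(sigma1^2/n1 + sigma2^2/n2) with n2 = N - n1; the CI length is
    2 * z_{alpha/2} times this factor, so minimizing one minimizes the other."""
    return sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / (N - n1))


def best_allocation(N, sigma1, sigma2):
    """Closed-form minimizer from setting the derivative to zero
    (real-valued; round to integers in practice)."""
    n1 = N * sigma1 / (sigma1 + sigma2)
    return n1, N - n1


if __name__ == "__main__":
    N, s1, s2 = 60, 2.0, 1.0  # hypothetical inputs
    print(best_allocation(N, s1, s2))
    brute = min(range(1, N), key=lambda n1: ci_length_factor(n1, N, s1, s2))
    print(brute)
```

The sample sizes are proportional to the standard deviations, so the noisier population receives the larger share of the fixed total.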

Type 3: (Frequentist) Hypothesis Test

3a. Suppose we have two independent random samples from two normal populations: , and .

(a) At the significance level α, please construct a test using the pivotal quantity approach to test whether or not. (*Please include the derivation of the pivotal quantity, the proof of its distribution, and the derivation of the rejection region for full credit.)

(b) At the significance level α, please derive the likelihood ratio test for testing whether or not. Subsequently, please show whether this test is equivalent to the one derived in part (a).

Solution: Let , ; then we have

Thus the problem translates into the usual derivation of the pooled-variance t-test to test whether or not, based on the following samples with unknown but equal population variances:

Sample 1:

Sample 2:

(a)

Point estimator:

Therefore,

Then the pivotal quantity for is

Under H0, the test statistic

Intuitively, we reject H0 iff

Now we derive the rejection region based on the Type I error rate:

Therefore, we reject H0 iff

(b)

Under H0, let

The restricted likelihood under H0 is:

Solving:

We obtain the restricted MLEs:

Plugging these restricted MLEs into the restricted likelihood, we obtain the restricted maximum likelihood

The unrestricted likelihood is:

Solving:

We obtain the unrestricted MLEs (*that is, the usual MLEs):

Plugging these unrestricted MLEs into the unrestricted likelihood, we obtain the unrestricted maximum likelihood

Thus, the likelihood ratio is:

where

In the following, we show that rejecting H0 when

By the definition of Type I error rate, we have:

Thereby we have proven that the two tests are indeed equivalent.
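The equivalence can also be checked numerically once the problem has been reduced to two samples with equal unknown variance. The sketch below assumes the null hypothesis is equality of the two means for those reduced samples; the function names and data are hypothetical. It compares the likelihood ratio with the closed form $(1 + t^2/(n_1+n_2-2))^{-(n_1+n_2)/2}$, which is a decreasing function of $t^2$.

```python
from math import sqrt
from statistics import fmean


def pooled_t(x, y):
    """Pooled-variance t statistic for H0: equal means."""
    n1, n2 = len(x), len(y)
    xbar, ybar = fmean(x), fmean(y)
    ss_within = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    sp2 = ss_within / (n1 + n2 - 2)  # pooled sample variance
    return (xbar - ybar) / sqrt(sp2 * (1.0 / n1 + 1.0 / n2))


def likelihood_ratio(x, y):
    """Restricted over unrestricted maximum likelihood for normal samples
    with a common unknown variance; equals (sigma_hat^2 / sigma_hat_0^2)^(n/2)."""
    pooled = list(x) + list(y)
    n = len(pooled)
    grand = fmean(pooled)            # restricted MLE of the common mean
    xbar, ybar = fmean(x), fmean(y)  # unrestricted MLEs of the two means
    ss_restricted = sum((v - grand) ** 2 for v in pooled)
    ss_unrestricted = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    return (ss_unrestricted / ss_restricted) ** (n / 2)


if __name__ == "__main__":
    x = [2.1, 3.4, 1.9, 2.8, 3.0]  # made-up data
    y = [1.2, 2.0, 1.7, 0.9]
    t, n = pooled_t(x, y), len(x) + len(y)
    print(likelihood_ratio(x, y), (1 + t * t / (n - 2)) ** (-n / 2))
```

Because the likelihood ratio depends on the data only through $t^2$, rejecting for a small likelihood ratio is the same as rejecting for a large $|t|$.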

3b. Suppose that $X_1, \dots, X_n$ are a simple random sample from a Weibull distribution with density function

for some known constant . Note that when the Weibull distribution is simply an exponential distribution.

(a) Show that is exponentially distributed.

(b) If we wish to test versus , derive the value such that the test that rejects whenever is of size $\alpha$. Relate to a critical value of the $\chi^2$ distribution with $2n$ degrees of freedom.

(c) Show that the test in part (b) above is a UMP level $\alpha$ test.

(d) Please derive a confidence interval for by inverting the test.

Solution:

(a)The Weibull random variable X has a closed-form cumulative distribution function (cdf) given by

Thus the cdf of is

It is the cdf of the exponential distribution with parameter The mgf of is:

(b)The mgf of is:

The mgf of is:

Thus we know

This means that if we set

Then the test that rejects whenever is of size.

(c)The likelihood is:

So T is sufficient for this family of distributions. By the Karlin-Rubin Theorem, it suffices to verify that this family has a monotone likelihood ratio (MLR) in the statistic T. To verify this, we check that for ,

is a strictly decreasing function of T.
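Returning to part (a), the claim that a power of a Weibull variable is exponential can be illustrated by simulation. The sketch uses Python's `random.weibullvariate(alpha, beta)`, whose scale/shape parametrization may differ from the density in this exam; under Python's convention, raising the draw to the shape power gives an exponential variable with mean `scale ** shape`. The shape and scale values are hypothetical.

```python
import random
from math import exp


def simulate_power_transform(shape, scale, size, seed=0):
    """Draw Weibull variates and return X**shape for each draw."""
    rng = random.Random(seed)
    return [rng.weibullvariate(scale, shape) ** shape for _ in range(size)]


if __name__ == "__main__":
    shape, scale = 2.0, 1.5
    ys = simulate_power_transform(shape, scale, size=100_000)
    mean_y = sum(ys) / len(ys)
    theoretical_mean = scale ** shape
    # For an exponential variable, P(Y > E[Y]) = exp(-1).
    tail = sum(y > theoretical_mean for y in ys) / len(ys)
    print(mean_y, theoretical_mean)
    print(tail, exp(-1))
```

Both the sample mean and the tail proportion should land close to their exponential targets, up to simulation noise.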

(d) The acceptance region corresponding to the size $\alpha$ test is:

Therefore a confidence interval for by inverting the test is:

3c. We have two independent samples $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$, where $X_i \sim N(\mu_1, \sigma^2)$ and $Y_j \sim N(\mu_2, \sigma^2)$. For the hypothesis of

(a) Please derive the general formula for power calculation for the pooled variance t-test based on an effect size of EFF at the significance level of α.

Definition: Effect size = EFF = $|\mu_1 - \mu_2| / \sigma$ (e.g. Eff=1)

(b)With a sample size of 20 per group, α = 0.05, and an estimated effect size ranging from 0.8 to 1.2, please calculate the power of your pooled variance t-test.

Solution:

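(a) Test statistic (T.S.): $T = \frac{\bar{X} - \bar{Y}}{S_p\sqrt{1/n_1 + 1/n_2}}$, where $S_p^2$ is the pooled sample variance

At α=0.05, reject H0 in favor of Ha iff

Power = 1-β = P(reject H0 | Ha) =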

≈ (Effect size =)

(b)With n = 20, α = 0.05, Eff = 0.8 to 1.2, the power is calculated as follows:

Power (Eff = 0.8) =

Power (Eff = 1.2) =

Note: the T statistic above follows a t-distribution with 38 (=20+20-2) degrees of freedom.

Therefore we conclude that the power will range from 80% to 98% for a given effect size of 0.8 to 1.2.
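The stated range can be reproduced with a short sketch. It assumes a one-sided test and the shifted central-t approximation Power ≈ P(T > t_{α, 2n-2} - Eff·sqrt(n/2)) with T following a t-distribution with 2n-2 degrees of freedom; both are assumptions chosen here because they match the 80% to 98% range, and an exact calculation would use the noncentral t distribution. The t cdf and quantile are computed numerically with the standard library only.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via composite Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_upper_quantile(alpha, df):
    """Value c with P(T > c) = alpha, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def approx_power(eff, n, alpha=0.05):
    """Approximate power of the one-sided pooled t-test with n per group."""
    df = 2 * n - 2
    shift = eff * sqrt(n / 2)  # (mu1 - mu2) / (sigma * sqrt(2/n))
    return 1 - t_cdf(t_upper_quantile(alpha, df) - shift, df)


if __name__ == "__main__":
    for eff in (0.8, 1.0, 1.2):
        print(eff, round(approx_power(eff, n=20), 3))
```

With n = 20 per group and α = 0.05, the values at effect sizes 0.8 and 1.2 come out at roughly 0.80 and 0.98, in line with the conclusion above.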

*** That’s all, folks! ***