Bayesian Inference and computational methods – Practical 3

Solutions including some R code

1. Code for Gibbs sampler to generate from uniform distribution on finite region bounded by x = 0, x = 1, x + y = 1, x + y = 2.

A single iteration of the Gibbs sampler updates the current bivariate state vector (xi, yi) by drawing a new value xi+1 from the conditional density p(x|yi) and then drawing yi+1 from the conditional density p(y|xi+1). Note that we condition on the updated value xi+1 when we update y.

For the uniform target density above, a sketch should tell you that:

for y such that 0 < y < 2,

p(x|y) ~ U(max(0, 1-y), min(2-y, 1))

and

for x such that 0 < x < 1,

p(y|x) ~ U(1-x, 2 - x).

Coding this in R gives:

nits = 10000
x = c(1:nits)
y = c(1:nits)
x[1] = 0.5
y[1] = 1 # start somewhere in the region!
for(i in 1:(nits - 1)) {
  x[i+1] = runif(1, max(0, 1 - y[i]), min(2 - y[i], 1))
  y[i+1] = runif(1, 1 - x[i+1], 2 - x[i+1]) # remember to use the updated x!
}

A scatter plot of the 10000 sampled values looks uniform over the region, which suggests the sampler is doing what we intended. The Gibbs sampler also seems to mix well, as seen from a trace plot of the x values against iteration: within 100 iterations the chain has explored the range of x fairly efficiently.
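The plots referred to above can be produced along these lines (a sketch only, using the x and y vectors generated by the code):

plot(x, y, pch = ".") # scatter plot of the 10000 sampled points
plot(x[1:100], type = "l", xlab = "iteration", ylab = "x") # trace of the first 100 x values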

b) The marginal density of x is U(0, 1). The MCMC output seems to agree OK!

The marginal density of y is triangular and symmetric about y = 1 (check this: for 0 < y < 1 the admissible range of x has width y, while for 1 < y < 2 it has width 2 - y). Again the MCMC output suggests that our algorithm is correctly implemented.
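A quick graphical check of these marginals (again a sketch):

hist(x, breaks = 30) # should look roughly uniform on (0, 1)
hist(y, breaks = 30) # should look triangular with its peak at y = 1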

c) Suppose now we consider the case where the upper diagonal bounding line is x + y = 1.1. The magnitude of the correlation between x and y under the target density is then much greater, and we should expect the Gibbs sampler to mix less efficiently in these circumstances.

The R code changes to become:

nits = 10000
x = c(1:nits)
y = c(1:nits)
x[1] = 0.5
y[1] = 0.55 # start somewhere in the region!
for(i in 1:(nits - 1)) {
  x[i+1] = runif(1, max(0, 1 - y[i]), min(1.1 - y[i], 1))
  y[i+1] = runif(1, 1 - x[i+1], 1.1 - x[i+1]) # remember to use the updated x!
}

A scatter plot of 10000 iterations of the chain shows the points confined to the narrow band between the lines x + y = 1 and x + y = 1.1. A plot of the first 100 iterations shows the slow mixing of this chain: only a small part of the state space is explored in these 100 iterations.
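One way to quantify the difference in mixing between parts (a) and (c) is to compare the autocorrelation of the x values from the two chains (a sketch; it assumes the two runs were saved in separate vectors, here called x_wide and x_narrow - these names are not part of the practical):

acf(x_wide, main = "region bounded by x + y = 1 and x + y = 2")
acf(x_narrow, main = "region bounded by x + y = 1 and x + y = 1.1")

The autocorrelations for the narrow region should die away much more slowly.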

Question 2:

Code for a Gibbs sampler to investigate the joint posterior of mu and psi = 1/sigma^2 for a normal sample of size n. This is an adaptation of the code used in lectures, in which the prior densities for mu and psi are chosen to be (respectively) uniform and inversely proportional to psi. This is achieved by letting the prior variance for mu tend to infinity and the parameters in the gamma prior for psi tend to zero.
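With these limiting priors, the full conditional distributions used in the code below take the standard forms (stated here for reference; they match the updates in the loop):

psi | mu, x ~ Gamma(n/2, sum((xi - mu)^2)/2)

and

mu | psi, x ~ N(xbar, 1/(n*psi)).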

n = 20       # set sample size
nits = 20000 # set number of iterations
# sample generated from N(1, 0.5) - note rnorm uses the standard deviation!
x = rnorm(n, 1, sqrt(0.5))
xbar = mean(x)
s2 = var(x)
mu = c(1:nits)
psi = c(1:nits)
mu[1] = xbar # initiate the chain using the sample moments
psi[1] = 1/s2
for(i in 1:(nits - 1)) {
  psi[i+1] = rgamma(1, n/2, 0.5*sum((x - mu[i])*(x - mu[i])))
  lam1 = (n*psi[i+1])^-1*(psi[i+1]*sum(x)) # conditional posterior mean of mu (= xbar here)
  lam2 = (n*psi[i+1])^-1                   # conditional posterior variance of mu
  mu[i+1] = rnorm(1, lam1, sqrt(lam2))
}

From my sequence of 20000 iterations we can estimate the posterior means and variances:

> mean(mu )

[1] 0.7764556

> var(mu)

[1] 0.02777374

> mean(psi)

[1] 1.992330

> var(psi)

[1] 0.4194472

We estimate posterior probabilities P(mu > 1.5) and P(psi < 1.3333)

> sum(mu > 1.5)/nits

[1] 0

(i.e. the probability is extremely small since the Markov chain never visits states for which mu > 1.5)

> sum(psi<4/3)/nits

[1] 0.14815

(i.e. there is a substantial posterior probability that psi < 4/3, or equivalently sigma^2 > 0.75).

If we repeat the exercise for n = 60, we should expect to see a greater degree of certainty in the values of mu and psi (evidenced by smaller posterior variances).

This is indeed the case and the data from the larger sample yield posteriors for the parameters that are concentrated close to the true values (mu = 1, psi = 2).

> mean(mu)

[1] 0.9633437

> var(mu)

[1] 0.007971639

> mean(psi)

[1] 2.137741

> var(psi)

[1] 0.1535509
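The n = 60 run can be reproduced along the following lines, by wrapping the sampler in a function of the sample size (a sketch only; the function name gibbs_normal and its argument defaults are my own choices, not part of the practical - equivalently, just set n = 60 and re-run the script above):

gibbs_normal = function(n, nits = 20000) {
  x = rnorm(n, 1, sqrt(0.5))  # simulate the data
  mu = numeric(nits)
  psi = numeric(nits)
  mu[1] = mean(x)             # initiate with the sample moments
  psi[1] = 1/var(x)
  for(i in 1:(nits - 1)) {
    psi[i+1] = rgamma(1, n/2, 0.5*sum((x - mu[i])^2))
    mu[i+1] = rnorm(1, mean(x), sqrt(1/(n*psi[i+1])))
  }
  list(x = x, mu = mu, psi = psi)
}

out = gibbs_normal(60)
mean(out$mu); var(out$mu)
mean(out$psi); var(out$psi)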

c) To check things against standard results, we can compare with the known results (see Chapter 3 of the course notes) which tell us that the posteriors for mu and psi should satisfy

(xbar - mu)/sqrt(s2/n) ~ t on n - 1 degrees of freedom

and

(n - 1)*s2*psi ~ Chi-squared on n - 1 degrees of freedom,

where xbar and s2 denote the sample mean and variance.

To check that the MCMC samples conform to the expected distributions for our last example we can examine:

hist((mean(x)-mu)/sqrt(var(x)/n))

which gives something that looks like a t59 distribution.

> mean((mean(x)-mu)/sqrt(var(x)/n))

[1] 0.003630326

> var((mean(x)-mu)/sqrt(var(x)/n))

[1] 1.021392

confirms this. Similarly, hist(psi*(n-1)*var(x)) gives something that looks like a Chi-squared distribution on 59 degrees of freedom, and the output

> mean(psi*(n-1)*var(x))

[1] 59.06275

> var(psi*(n-1)*var(x))

[1] 117.2113

likewise suggests that the results are valid. (Recall the mean and variance are 59 and 118 for Chi-squared on 59 degrees of freedom.) Therefore we can be reasonably confident that our algorithm is correctly coded, given that the mean and variance for our MCMC sample are reasonably close to these values.
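A slightly sharper graphical check than the histograms is to QQ-plot the MCMC draws against the reference quantiles (a sketch; the object names tsamp and csamp are mine):

tsamp = (mean(x) - mu)/sqrt(var(x)/n) # should be approximately t on n - 1 df
qqplot(qt(ppoints(length(tsamp)), df = n - 1), tsamp,
       xlab = "t quantiles", ylab = "MCMC quantiles")
abline(0, 1)
csamp = psi*(n - 1)*var(x)            # should be approximately chi-squared on n - 1 df
qqplot(qchisq(ppoints(length(csamp)), df = n - 1), csamp,
       xlab = "chi-squared quantiles", ylab = "MCMC quantiles")
abline(0, 1)

Points lying close to the 45-degree line support the expected distributions.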

Question 3. (See lecture notes.) Recall the solution to question 3 in Practical 2, where the Metropolis algorithm was used to investigate the posterior distribution of (a, b). All we need to do is replace the following block of code from that solution:

bp = b[i] + 2*delb*(runif(1, 0, 1) - 0.5) # propose new b
if(bp < 0) b[i+1] = b[i] # reject if a negative value is proposed
else {
  pacc = (bp/b[i])^(nsamp*a[i+1])*exp(-1.0*(sx + theta)*(bp - b[i]))
  if(runif(1, 0, 1) < pacc) b[i+1] = bp
  else b[i+1] = b[i]
}

with a Gibbs step whereby we draw from the conditional distribution of b given a and the data. This Gibbs step is represented by the single command in R:

b[i+1] = rgamma(1, nsamp*a[i+1], sx + theta)
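(Note that rgamma here is parameterised by shape and rate, so sx + theta enters as a rate; this is consistent with the exp(-(sx + theta)*(bp - b[i])) factor in the acceptance probability above.)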