REVIEW OF STATISTICS

. sample space S = the set of all possible outcomes that are of interest to the investigator.

. event A = a subset of the sample space S.

. events Ai and Aj are disjoint if Ai ∩ Aj = ∅.

. Axioms of probability: Pr(·) is a probability function if it satisfies:

1. Pr(Ai) ≥ 0 for any Ai ⊂ S

2. Pr(S) = 1

3. Pr(Ai ∪ Aj) = Pr(Ai) + Pr(Aj) if Ai and Aj are disjoint events.

. A random variable X is a real-valued function that has a specific value at each point of the sample space.

For any B ⊂ S, the probability that X ∈ B is Pr(X ∈ B).

. A distribution function is the function F(t) = Pr(X ≤ t) such that:

1. F(t) is non-decreasing and continuous from the right

2. F(-∞) = 0

3. F(+∞) = 1.

. A probability function is a function f(x), where x can be a discrete or a continuous variable.

- discrete case: when X can take a countable number of distinct values: x1, x2, x3, ... Then,

f(xi) = Pr(X = xi),

and

Pr(X ∈ B) = Σi {f(xi) : xi ∈ B}.

- continuous case: the function f(x) satisfies

Pr(X ∈ B) = ∫x∈B f(x) dx,

where f(x) = ∂F(x)/∂x.

Note: Pr(X = x) = 0 in the continuous case.
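
A minimal Python sketch of the two cases, with an illustrative die for the discrete part and an exponential density for the continuous part (the sets B and parameter values are chosen only for illustration):

    import numpy as np
    from scipy.integrate import quad

    # Discrete case: a fair six-sided die, B = {2, 4, 6}
    f = {x: 1/6 for x in range(1, 7)}                 # f(xi) = Pr(X = xi)
    B = {2, 4, 6}
    prob_discrete = sum(f[x] for x in f if x in B)    # sum of f(xi) over xi in B -> 0.5

    # Continuous case: exponential density f(x) = lam*exp(-lam*x), B = [1, 2]
    lam = 0.5
    f_cont = lambda x: lam * np.exp(-lam * x)
    prob_continuous, _ = quad(f_cont, 1.0, 2.0)       # integral of f(x) over B

    print(prob_discrete, prob_continuous)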

. In the multivariate case, x = (x1, x2, ..., xn) where n is the number of random variables.

- The joint distribution function of x = (x1, x2, ..., xn) is Fn(x) = Pr(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn).

- The marginal distribution of the subset (x1, x2, ..., xk), k < n, is Fk(x1, x2, ..., xk) = Fn(x1, x2, ..., xk, ∞, ..., ∞).

. The marginal probability function is

fk(x1, …, xk) = ∫ … ∫ fn(x1, …, xn) dxk+1 … dxn, in the continuous case,

and

fk(x1, …, xk) = Σxk+1 … Σxn fn(x1, …, xn), in the discrete case,

where fn(x1, ..., xn) is the joint probability function.
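
A small Python sketch of the discrete case: the joint probability function is stored as a 2-D array and the marginal of x1 is obtained by summing out x2 (the joint table is made up for illustration):

    import numpy as np

    # joint probability function f2(x1, x2): rows index x1, columns index x2
    f2 = np.array([[0.10, 0.20, 0.10],
                   [0.25, 0.15, 0.20]])
    assert np.isclose(f2.sum(), 1.0)                  # probabilities sum to one

    # marginal probability function f1(x1): sum over the values of x2
    f1 = f2.sum(axis=1)                               # -> [0.40, 0.60]
    print(f1)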

. The random variables (x1, x2, ..., xn) are independent if


Fn(x1, x2, ..., xn) = F1(x1) F2(x2) ... Fn(xn)

or

fn(x1, x2, ..., xn) = f1(x1) f2(x2) ... fn(xn).

. Conditional distribution: Let f(x,y) be the joint probability function for (x, y). Then,

g1(x) = ∫ f(x, y) dy is the marginal probability function for x,

and

g2(y) = ∫ f(x, y) dx is the marginal probability function for y.

The conditional probability function of x given y is

h1(xy) = f(x,y)/g2(y),

and the conditional probability function of y given x is

h2(yx) = f(x,y)/g1(x).

. Bayes theorem:

h2(y|x) = h1(x|y) g2(y) / ∫ h1(x|y) g2(y) dy

in the continuous case, and

h2(y|x) = h1(x|y) g2(y) / Σy h1(x|y) g2(y)

in the discrete case.

Proof: (in the continuous case)

h2(y|x) = f(x, y)/g1(x) = h1(x|y) g2(y)/g1(x) = h1(x|y) g2(y)/∫ f(x, y) dy = h1(x|y) g2(y)/∫ h1(x|y) g2(y) dy.

Q.E.D.

In the case where x corresponds to sample information, g2(y) is called the prior probability, h1(x|y) is called the likelihood function of the sample, and h2(y|x) is called the posterior probability.
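
A Python sketch of the discrete form of Bayes' theorem, using a grid of candidate values for y and a binomial likelihood (the prior, grid, and data are illustrative choices):

    import numpy as np
    from math import comb

    # prior g2(y) over a grid of candidate success probabilities y
    y = np.linspace(0.05, 0.95, 19)
    prior = np.ones_like(y) / len(y)             # uniform prior

    # likelihood h1(x|y): x = 7 successes in n = 10 trials
    n, x = 10, 7
    likelihood = comb(n, x) * y**x * (1 - y)**(n - x)

    # posterior h2(y|x) = prior * likelihood, normalized by the sum over y
    posterior = prior * likelihood
    posterior /= posterior.sum()
    print(y[np.argmax(posterior)])               # posterior mode, near 0.7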

. Expectations: The expected value of some function r(x) is given by

E[r(x)] =  r(x) f(x) dx, in the continuous case,

or

E[r(x)] = x r(x) f(x), in the discrete case,

where E is the "expectation operator."

. The kth moment: Choose r(x) = x^k, k = 1, 2, ... Then,

mk = E(x^k) is the kth moment of x.

if k = 1, then m1 = E(x) = the mean (or average) of x, a common measure of the "location" of x.

if k = 2, then m2 = E(x²) = the second moment of x.

if k = 3, then m3 = E(x³) = the third moment of x, ...

. The kth central moment: Choose r(x) = (x - m1)^k, k = 2, 3, ... Then, Mk = E[(x - m1)^k] is the kth central moment of x.

if k = 2, then M2 = E[(x - m1)²] = the variance of x, a common measure of the "spread" or "dispersion" of x.

if k = 3, then M3 = E[(x - m1)³], the third central moment of x.

if k = 4, then M4 = E[(x - m1)⁴], the fourth central moment of x, ...

. Note: variance of x = V(x) = E[(x - m1)²] = E(x² + m1² - 2xm1) = m2 - m1².

. Other measures:

standard deviation = (M2)½

coefficient of variation = (M2)½/m1

relative skewness = M3/(M2)^1.5

relative kurtosis = M4/(M2)²

covariance = Cov(x, y) = E[(x - E(x))(y - E(y))]

= E[x y - x E(y) - y E(x) + E(x) E(y)]

= E(x y) - E(x) E(y)

correlation = ρ(x, y) = Cov(x, y)/[M2(x) M2(y)]½; -1 ≤ ρ ≤ 1.
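
A Python sketch that computes these moments and measures for a simple discrete probability function (the support and probabilities are made up for illustration):

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0])           # support
    f = np.array([0.1, 0.4, 0.3, 0.2])           # probability function f(x)

    m1 = np.sum(x * f)                           # mean
    m2 = np.sum(x**2 * f)                        # second moment
    M2 = m2 - m1**2                              # variance = m2 - m1^2
    M3 = np.sum((x - m1)**3 * f)                 # third central moment
    M4 = np.sum((x - m1)**4 * f)                 # fourth central moment

    std_dev = M2**0.5
    coeff_var = std_dev / m1
    skewness = M3 / M2**1.5
    kurtosis = M4 / M2**2
    print(m1, M2, std_dev, coeff_var, skewness, kurtosis)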

. Let x = (x1, x2, ..., xn)' be an (n×1) vector with mean E(x) = μ = (μ1, μ2, ..., μn)' and variance Σ = {σij}, where σii = V(xi) is the variance of xi, σij = Cov(xi, xj) is the covariance of xi with xj, and Σ is an (n×n) matrix. Let y = Ax + b. Then,

E(y) = A E(x) + b = A μ + b

V(y) = A V(x) A' = A Σ A'.
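
A numpy sketch of these two rules, E(y) = Aμ + b and V(y) = AΣA' (the mean vector, covariance matrix, A, and b are illustrative):

    import numpy as np

    mu = np.array([1.0, 2.0])                    # E(x)
    Sigma = np.array([[2.0, 0.5],
                      [0.5, 1.0]])               # V(x)
    A = np.array([[1.0, 1.0],
                  [2.0, -1.0]])
    b = np.array([0.5, 0.0])

    mean_y = A @ mu + b                          # E(y) = A mu + b
    var_y = A @ Sigma @ A.T                      # V(y) = A Sigma A'
    print(mean_y)
    print(var_y)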

. Note: If x and y are independently distributed with finite variances, then Cov(x, y) = 0 and V(x + y) = V(x) + V(y).

. Chebyshev inequality: If V(x) exists (i.e. if it is finite), then for any t > 0,

Pr[|x - E(x)| ≥ t] ≤ V(x)/t².
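
A quick Monte Carlo check of the bound in Python, for an exponential random variable (the rate and the threshold t are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0
    x = rng.exponential(scale=1/lam, size=100_000)   # E(x) = 1/lam, V(x) = 1/lam**2

    t = 1.0
    lhs = np.mean(np.abs(x - 1/lam) >= t)            # Pr[|x - E(x)| >= t]
    rhs = (1/lam**2) / t**2                          # V(x)/t**2
    print(lhs, rhs, lhs <= rhs)                      # the bound should hold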

. Moment Generating Function: G(t) = E(e^(tx))

If mr exists (i.e. if it is finite), then the moment generating function G(t) satisfies

[∂^r G(t)/∂t^r]t=0 = E(x^r) = mr, r = 1, 2, 3, ...

Proof: A Taylor series expansion of e^(tx) around tx = 0 gives

G(t) = E[1 + tx + (tx)²/2! + (tx)³/3! + ...].

Evaluating the derivative of this expression with respect to t at t = 0 gives the desired result.

Q.E.D.
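
A sympy sketch of this property for the normal MGF G(t) = exp(μt + σ²t²/2): differentiating at t = 0 recovers m1 = μ and m2 = μ² + σ² (the choice of distribution is illustrative):

    import sympy as sp

    t, mu, sigma = sp.symbols('t mu sigma', real=True)
    G = sp.exp(mu*t + sigma**2 * t**2 / 2)       # normal moment generating function

    m1 = sp.diff(G, t, 1).subs(t, 0)             # first derivative at t = 0 -> mu
    m2 = sp.diff(G, t, 2).subs(t, 0)             # second derivative at t = 0 -> mu**2 + sigma**2
    print(sp.simplify(m1), sp.simplify(m2))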

. Conditional Expectation: Let f(x, y) be a joint probability function, g2(y) be the marginal probability function of y, and h1(x|y) = f(x, y)/g2(y) be the conditional probability of x given y. The conditional expectation of a random variable x given y is the expectation based on the conditional probability h1(x|y). The unconditional expectation Ex,y of some function r(x, y) is given by

Ex,y r(x, y) = Ey[Ex|y r(x, y)],

where Ex|y is the conditional expectation operator and Ey is the expectation based on the marginal probability of y.

Proof: Ex,y r(x, y) = Σx,y r(x, y) f(x, y)

= Σx,y r(x, y) h1(x|y) g2(y)

= Σy [Σx r(x, y) h1(x|y)] g2(y)

= Ey[Ex|y r(x, y)].

Q.E.D.
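
A Monte Carlo sketch of the result in Python: with y Poisson and x | y binomial, the direct average of r(x, y) = x*y matches the average of the conditional expectation y*E[x|y] = p*y² (parameter values are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    lam, p, n = 4.0, 0.3, 200_000

    y = rng.poisson(lam, size=n)                 # y ~ Poisson(lam)
    x = rng.binomial(y, p)                       # x | y ~ Binomial(y, p), so E[x|y] = p*y

    lhs = np.mean(x * y)                         # E_{x,y}[r(x, y)] by direct averaging
    rhs = np.mean(p * y.astype(float)**2)        # E_y[E_{x|y} r(x, y)] = E_y[p*y**2]
    print(lhs, rhs)                              # both close to p*(lam + lam**2) = 6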

. Conjugate Distributions: In a Bayesian framework, a distribution is conjugate if, for some likelihood function, the prior and posterior distributions belong to the same family.

Example: a normal prior distribution for the unknown mean of a random sample from a normal distribution (with known variance) yields a normal posterior.
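
A Python sketch of that conjugate pair using the standard closed-form update: a N(μ0, τ0²) prior on the mean, combined with a sample from N(θ, σ²) with σ known, gives a normal posterior (the prior parameters and the simulated data are illustrative):

    import numpy as np

    rng = np.random.default_rng(2)

    mu0, tau0 = 0.0, 2.0                         # prior on the unknown mean: N(mu0, tau0**2)
    sigma, theta_true, n = 1.0, 1.5, 20          # data variance sigma**2 is known
    x = rng.normal(theta_true, sigma, size=n)    # random sample from N(theta, sigma**2)

    # posterior is again normal: precision-weighted combination of prior and sample
    post_precision = 1/tau0**2 + n/sigma**2
    post_mean = (mu0/tau0**2 + x.sum()/sigma**2) / post_precision
    post_sd = post_precision**-0.5
    print(post_mean, post_sd)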

. Some Special Discrete Distributions:

For each distribution below: probability function f(x), moment generating function G(t), mean, and variance.

. Binomial:

f(x) = [n!/(x!(n-x)!)] p^x (1-p)^(n-x), G(t) = (p e^t + (1-p))^n, mean = np, variance = np(1-p),

for 0 < p < 1; x = 0, 1, ..., n.

. Bernoulli = Binomial when n = 1

. Negative Binomial:

f(x) = [(r+x-1)!/(x!(r-1)!)] p^r (1-p)^x, G(t) = [p/(1-(1-p)e^t)]^r, mean = r(1-p)/p, variance = r(1-p)/p²,

for 0 < p < 1; x = 0, 1, 2, ...; the MGF holds for (1-p)e^t < 1.

. Geometric = Negative Binomial when r = 1

. Poisson:

f(x) = e^(-λ) λ^x / x!, G(t) = exp(λ(e^t - 1)), mean = λ, variance = λ,

for x = 0, 1, 2, ...; λ > 0.

. Uniform (discrete):

f(x) = 1/n, mean = (n+1)/2, variance = (n²-1)/12,

for x = 1, 2, ..., n; n = integer.

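A short numpy check of two rows of this table: simulated binomial and Poisson draws have sample mean and variance close to np and np(1-p), and to λ and λ (parameter values are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    N = 500_000

    n, p = 12, 0.3
    xb = rng.binomial(n, p, size=N)
    print(xb.mean(), n*p, xb.var(), n*p*(1-p))   # sample vs. theoretical mean and variance

    lam = 5.0
    xp = rng.poisson(lam, size=N)
    print(xp.mean(), lam, xp.var(), lam)         # both close to lam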

. Some Special Continuous Distributions:

For each distribution below: probability function f(x), moment generating function G(t) (where listed), mean, and variance.

. Beta:

f(x) = [Γ(α+β)/(Γ(α)Γ(β))] x^(α-1) (1-x)^(β-1), mean = α/(α+β), variance = αβ/[(α+β)²(α+β+1)],

for 0 < x < 1; α > 0; β > 0.

. Uniform:

f(x) = 1/(b-a), mean = (b+a)/2, variance = (b-a)²/12,

for a < x < b.

. Normal:

f(x) = [1/(2πσ²)^½] exp{-(x-μ)²/(2σ²)}, G(t) = exp{μt + σ²t²/2}, mean = μ, variance = σ²,

for σ > 0.

. Gamma:

f(x) = [β^α/Γ(α)] x^(α-1) e^(-βx), G(t) = (β/(β-t))^α for t < β, mean = α/β, variance = α/β²,

for α > 0; β > 0; x > 0.

. Exponential = Gamma with α = 1

. Chi Square = Gamma with α = k/2; β = 1/2; k = positive integer

. Pareto:

f(x) = θ k^θ / x^(θ+1), mean = θk/(θ-1) for θ > 1, variance = θk²/[(θ-2)(θ-1)²] for θ > 2,

for x > k > 0; θ > 0.

. Lognormal:

f(x) = [1/(xσ(2π)^½)] exp{-(log(x)-m)²/(2σ²)}, mean = exp(m + σ²/2), variance = [exp(σ²)-1] exp(2m+σ²),

for x > 0; σ > 0.

Note: n! = n (n-1) (n-2) … 1.

Γ(α) = ∫₀^∞ y^(α-1) e^(-y) dy

= 1 if α = 1

= (α-1)! if α is an integer.
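
A Python check of the notes above and of the gamma row: math.gamma reproduces Γ(α) = (α-1)! for integer α, and simulated gamma draws (numpy parameterizes by shape α and scale 1/β) have mean and variance close to α/β and α/β² (parameter values are illustrative):

    import math
    import numpy as np

    # Gamma function: Gamma(alpha) = (alpha - 1)! for integer alpha
    print(math.gamma(5), math.factorial(4))      # 24.0 and 24

    # Gamma distribution with shape alpha and rate beta (numpy uses scale = 1/beta)
    alpha, beta = 3.0, 2.0
    rng = np.random.default_rng(4)
    x = rng.gamma(shape=alpha, scale=1/beta, size=500_000)
    print(x.mean(), alpha/beta)                  # ~ 1.5
    print(x.var(), alpha/beta**2)                # ~ 0.75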