251v2outl 4/19/2006 (Open this document in 'Outline' view!)

K. Two Random Variables.

1. Regression (Summary).

2. Covariance ($\sigma_{xy}$ and $s_{xy}$)

a. Population Covariance

The population covariance is defined, using probability, as $\sigma_{xy} = E[(x-\mu_x)(y-\mu_y)] = E(xy) - \mu_x\mu_y$. This can be used to describe the relationship between $x$ and $y$. If the covariance is positive we can say that $x$ and $y$ tend to move together, while if it is negative we can say that they tend to move in opposite directions. In order to use this formula we must realize that $E(xy) = \sum_x \sum_y xy\,P(x,y)$. This means that, for each possible pair of values of $x$ and $y$, we must multiply $x$, $y$ and their joint probability together and then add up the results. For example, assume that $x$ and $y$ are related by the following joint probability table:

We begin by taking the upper left hand probability, .12, which is the probability that both $x$ and $y$ are 400, and multiplying it by 400 twice. Then we take the next probability in the same row, .15, which is the probability that $x$ is 600 and $y$ is 400, and multiply it by both 600 and 400. If we continue in this way we get

We can now use the following tableau to compute the means and variances of $x$ and $y$.

To summarize (a check),

,,

, and

We will need the variances below. To complete what we have done, write $\sigma_{xy} = E(xy) - \mu_x\mu_y$.
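The double sum can be sketched in a few lines of Python. This is only an illustration: the joint probability table below is hypothetical (it is not the table used in this example), and the function name `population_covariance` is our own.

```python
# Hypothetical joint probability table: keys are (x, y) pairs, values are P(x, y).
# These numbers are made up for illustration and are NOT the table from the text.
joint = {
    (1, 0): 0.1, (2, 0): 0.2, (3, 0): 0.1,
    (1, 1): 0.3, (2, 1): 0.1, (3, 1): 0.2,
}

def population_covariance(table):
    """Return E(xy) - mu_x * mu_y for a joint probability table."""
    mu_x = sum(x * p for (x, y), p in table.items())      # E(x)
    mu_y = sum(y * p for (x, y), p in table.items())      # E(y)
    e_xy = sum(x * y * p for (x, y), p in table.items())  # E(xy), the double sum
    return e_xy - mu_x * mu_y

total_prob = sum(joint.values())   # a check: the probabilities should sum to 1
cov = population_covariance(joint)
print(total_prob, cov)
```

For this made-up table $E(x) = 1.9$, $E(y) = 0.6$ and $E(xy) = 1.1$, so the covariance is $1.1 - (1.9)(0.6) = -0.04$, a slight tendency to move in opposite directions.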

b. The Sample Covariance

The sample covariance is much easier to compute, the formula being $s_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y})}{n-1} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1}$.

For example, assume that we have data on income ($x$) and savings ($y$) (in thousands) for 5 families.

Family / $x$ / $y$ / $x^2$ / $y^2$ / $xy$
1 / 1.9 / 0.0 / 3.61 / 0.00 / 0.00
2 / 12.4 / 0.9 / 153.76 / 0.81 / 11.16
3 / 6.4 / 0.4 / 40.96 / 0.16 / 2.56
4 / 7.0 / 1.2 / 49.00 / 1.44 / 8.40
5 / 7.0 / 0.3 / 49.00 / 0.09 / 2.10
Sum / 34.7 / 2.8 / 296.33 / 2.50 / 24.22

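From the sums, $\sum x = 34.7$, $\sum y = 2.8$, $\sum x^2 = 296.33$, $\sum y^2 = 2.50$, $\sum xy = 24.22$ and $n = 5$.

Then $\bar{x} = \frac{34.7}{5} = 6.94$ and $\bar{y} = \frac{2.8}{5} = 0.56$.

$s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{24.22 - 5(6.94)(0.56)}{4} = \frac{4.788}{4} = 1.197$,

and since $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1}$, $s_x^2 = \frac{296.33 - 5(6.94)^2}{4} = 13.878$ and $s_y^2 = \frac{2.50 - 5(0.56)^2}{4} = 0.233$.

The positive sign of $s_{xy}$, the sample covariance, indicates that $x$ and $y$ tend to move together.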
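The computational formula for the sample covariance can be checked with a short Python sketch using the five families' income and savings from the table. The function name `sample_covariance` is our own choice.

```python
# Income (x) and savings (y), in thousands, for the 5 families in the table.
income = [1.9, 12.4, 6.4, 7.0, 7.0]
savings = [0.0, 0.9, 0.4, 1.2, 0.3]

def sample_covariance(x, y):
    """Sample covariance via (sum of xy - n * x_bar * y_bar) / (n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - n * x_bar * y_bar) / (n - 1)

s_xy = sample_covariance(income, savings)
print(round(s_xy, 3))
```

The result agrees with the sum of 24.22 in the $xy$ column and gives a covariance of about 1.197.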

3. The Correlation Coefficient ($\rho_{xy}$ and $r_{xy}$)

The size of a covariance is hard to interpret on its own, since it depends on the units of $x$ and $y$; to judge the strength of the relationship between $x$ and $y$ we need to compute the correlation, which is found by dividing the covariance by the standard deviations of $x$ and $y$.

a. Population Correlation.

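For the population correlation, recall the covariance $\sigma_{xy}$ and the variances $\sigma_x^2$ and $\sigma_y^2$ computed above, so that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$.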

The correlation must always be between positive 1 and negative 1 ($-1 \le \rho_{xy} \le 1$). A correlation close to zero is called weak. A correlation that is close to one in absolute value is called strong. (Actually statisticians prefer to look at the value of the correlation squared.) A strong positive correlation indicates that $x$ and $y$ have a relationship that is close to a straight line with a positive slope. A strong negative correlation means that the relationship approximates a straight line with a negative slope. Unfortunately, the correlation only indicates linear relationships; a nonlinear relationship that is obvious on a graph may give a zero correlation.
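To illustrate the last point, here is a small Python sketch with made-up data in which $y$ is an exact function of $x$ ($y = x^2$) and yet the correlation is zero. The data and the helper name `correlation` are our own and are not part of the original example.

```python
import math

def correlation(x, y):
    """Covariance divided by the two standard deviations (the n - 1 factors cancel)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: y is perfectly determined by x, but not linearly.
x_vals = [-2, -1, 0, 1, 2]
y_vals = [v ** 2 for v in x_vals]

r_nonlinear = correlation(x_vals, y_vals)
print(r_nonlinear)
```

Because the $x$ values are symmetric around zero, the positive and negative products cancel and the covariance, and hence the correlation, is zero even though a graph would show a clear parabola.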

b. Sample Correlation.

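Recall that $s_{xy} = 1.197$, $s_x^2 = 13.878$ and $s_y^2 = 0.233$. If we divide the covariance by the two standard deviations, we find that $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{1.197}{\sqrt{13.878}\sqrt{0.233}} = 0.666$.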
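As a check on the arithmetic, the sample variances and the sample correlation for the income and savings data can be recomputed with a short Python sketch. The function name `sample_stats` is our own.

```python
import math

# Income (x) and savings (y), in thousands, for the 5 families.
income = [1.9, 12.4, 6.4, 7.0, 7.0]
savings = [0.0, 0.9, 0.4, 1.2, 0.3]

def sample_stats(x, y):
    """Return (variance of x, variance of y, covariance) using n - 1 divisors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    var_x = (sum(a * a for a in x) - n * x_bar ** 2) / (n - 1)
    var_y = (sum(b * b for b in y) - n * y_bar ** 2) / (n - 1)
    cov_xy = (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (n - 1)
    return var_x, var_y, cov_xy

var_x, var_y, cov_xy = sample_stats(income, savings)
r_xy = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))
print(round(var_x, 3), round(var_y, 3), round(r_xy, 3))
```

The correlation comes out at roughly two thirds, a moderately strong positive relationship between income and savings in this small sample.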

4. Functions of Two Random Variables.

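If $w = ax + b$ and $v = cy + d$, where $a$ and $c$ are not zero, then $\sigma_{wv} = \text{Cov}(ax+b, cy+d) = ac\,\sigma_{xy}$ and $\rho_{wv} = \text{sign}(ac)\,\rho_{xy}$, where $\text{sign}(ac)$ has the value $-1$ or $+1$ depending on whether the product of $a$ and $c$ is negative or positive.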
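The same rules hold for sample covariances and correlations, so they can be checked numerically. The Python sketch below applies two linear transformations to the income and savings data; the constants `a`, `b`, `c` and `d` are hypothetical values chosen so that the product $ac$ is negative.

```python
import math

def cov(u, v):
    """Sample covariance with an n - 1 divisor."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    return sum((p - u_bar) * (q - v_bar) for p, q in zip(u, v)) / (n - 1)

def corr(u, v):
    """Sample correlation: covariance over the product of standard deviations."""
    return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))

x = [1.9, 12.4, 6.4, 7.0, 7.0]   # income
y = [0.0, 0.9, 0.4, 1.2, 0.3]    # savings

a, b, c, d = -2.0, 3.0, 5.0, 1.0  # hypothetical constants, a * c < 0
w = [a * xi + b for xi in x]
v = [c * yi + d for yi in y]

cov_wv, cov_xy = cov(w, v), cov(x, y)
corr_wv, corr_xy = corr(w, v), corr(x, y)
print(cov_wv, a * c * cov_xy)   # covariance is scaled by a * c
print(corr_wv, -corr_xy)        # correlation only changes sign
```

The additive constants $b$ and $d$ drop out entirely, and the correlation keeps its size and only changes sign because $ac$ is negative.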

5. Sums of Random Variables.

a. $E(x+y) = E(x) + E(y)$ and $\text{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}$

b. Independence.

(i) Definition.
(ii) Consequences

If $x$ and $y$ are independent, $E(xy) = \mu_x\mu_y$, $\sigma_{xy} = 0$, and $\text{Var}(x+y) = \sigma_x^2 + \sigma_y^2$.

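c. If $a$ and $b$ are constants, $\text{Cov}(ax, by) = ab\,\sigma_{xy}$.

This and a. imply that $E(ax+by) = aE(x) + bE(y)$ and $\text{Var}(ax+by) = a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\sigma_{xy}$.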
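For readers who want to see where the variance of a weighted sum comes from, the following is a short derivation sketch. It assumes only the definitions of variance and covariance as expected values, with $\mu_x = E(x)$ and $\mu_y = E(y)$.

```latex
\begin{align*}
\mathrm{Var}(ax+by)
  &= E\Big[\big(a(x-\mu_x) + b(y-\mu_y)\big)^2\Big] \\
  &= a^2 E\big[(x-\mu_x)^2\big] + b^2 E\big[(y-\mu_y)^2\big]
     + 2ab\,E\big[(x-\mu_x)(y-\mu_y)\big] \\
  &= a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\sigma_{xy}.
\end{align*}
```

Setting $a = b = 1$ gives the variance of $x + y$, and if $x$ and $y$ are independent the last term vanishes because $\sigma_{xy} = 0$.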

d. Application to portfolio analysis – Most of this is

from the document 251var2 in the supplement.

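If $R = P_1R_1 + P_2R_2$ is the return on a portfolio that puts shares $P_1$ and $P_2$ of our money in two stocks with returns $R_1$ and $R_2$, then $E(R) = P_1E(R_1) + P_2E(R_2)$ and $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$ is the variance of the return. Thus if $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$, we can say $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\rho_{12}\sigma_1\sigma_2$.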

For example, assume that ,,

but the correlation $\rho_{12}$ is unknown. Then

.

If we use the formula for $\sigma_R^2$ immediately above,

= .

Now we can see the effect various values of $\rho_{12}$ will have on $\sigma_R^2$ and $\sigma_R$.

The purpose of this section is to show how to find the minimum value for $\sigma_R^2$, the variance of the return. Since variance is a measure of risk, minimizing variance minimizes risk, though the best measure of risk is probably the coefficient of variation, the standard deviation divided by the mean, in this case $\sigma_R / E(R)$.

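Remember that $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$.

Also recall that, since $P_1$ and $P_2$ are shares of one dollar, $P_1 + P_2 = 1$, so that $P_2 = 1 - P_1$.

Remember too, that $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$.

If we put all this together, $\sigma_R^2 = P_1^2\sigma_1^2 + (1-P_1)^2\sigma_2^2 + 2P_1(1-P_1)\rho_{12}\sigma_1\sigma_2$.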

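Now let us assume some values for the standard deviations $\sigma_1$ and $\sigma_2$ and the correlation $\rho_{12}$. If we substitute them into $\sigma_R^2 = P_1^2\sigma_1^2 + (1-P_1)^2\sigma_2^2 + 2P_1(1-P_1)\rho_{12}\sigma_1\sigma_2$ and collect terms in $P_1$, we get a quadratic in $P_1$: $\sigma_R^2 = (\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2)P_1^2 - 2(\sigma_2^2 - \rho_{12}\sigma_1\sigma_2)P_1 + \sigma_2^2$.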

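In order to minimize risk we pick our value of $P_1$, the share put in stock 1, to give us a minimum variance.

If we know calculus, the way that we find this minimum variance is by taking the first derivative of $\sigma_R^2 = P_1^2\sigma_1^2 + (1-P_1)^2\sigma_2^2 + 2P_1(1-P_1)\rho_{12}\sigma_1\sigma_2$ with respect to $P_1$ and setting it equal to zero. Since $\frac{d\sigma_R^2}{dP_1} = 2P_1\sigma_1^2 - 2(1-P_1)\sigma_2^2 + 2(1-2P_1)\rho_{12}\sigma_1\sigma_2$, if we set the derivative equal to zero we get $P_1(\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2) = \sigma_2^2 - \rho_{12}\sigma_1\sigma_2$, which, for the values assumed in this example, implies that $P_1 \approx 0.23$. Now since $P_2 = 1 - P_1$, we set $P_2 \approx 0.77$. That is, to minimize risk, we put about 23% of our money in stock 1 and 77% in stock 2.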

If we do not know calculus, we can still minimize $\sigma_R^2$. Try values of $P_1$ at intervals of 0.1 between zero and one. We will find that the smallest values of $\sigma_R^2$ occur at $P_1 = 0.2$ and $P_1 = 0.3$. Now we can try values of $P_1$ at intervals of 0.01 between 0.2 and 0.3. We will find that the smallest value of $\sigma_R^2$ occurs at $P_1 = 0.23$.
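The trial-and-error search is easy to sketch in Python. The standard deviations and correlation below are hypothetical stand-ins, not the values assumed in this example, so the minimizing share will not be the 23% found above. The function names `portfolio_variance` and `grid_min` are our own.

```python
def portfolio_variance(p1, sd1, sd2, rho):
    """Variance of R = p1*R1 + (1 - p1)*R2 for two stocks."""
    p2 = 1.0 - p1
    return p1**2 * sd1**2 + p2**2 * sd2**2 + 2 * p1 * p2 * rho * sd1 * sd2

def grid_min(lo, hi, steps, sd1, sd2, rho):
    """Return the grid point between lo and hi with the smallest variance."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: portfolio_variance(p, sd1, sd2, rho))

# Hypothetical inputs for illustration only (NOT the values from the text).
SD1, SD2, RHO = 0.30, 0.20, 0.25

# First pass: intervals of 0.1 between zero and one.
coarse = grid_min(0.0, 1.0, 10, SD1, SD2, RHO)

# Second pass: intervals of 0.01 in the neighborhood of the coarse minimum.
fine = grid_min(max(coarse - 0.1, 0.0), min(coarse + 0.1, 1.0), 20, SD1, SD2, RHO)

# Calculus answer for comparison: P1 = (sd2^2 - rho*sd1*sd2) / (sd1^2 + sd2^2 - 2*rho*sd1*sd2).
closed = (SD2**2 - RHO * SD1 * SD2) / (SD1**2 + SD2**2 - 2 * RHO * SD1 * SD2)

print(coarse, fine, closed)
```

With these stand-in values the calculus answer is $P_1 = 0.025/0.10 = 0.25$, and the two-pass search lands on the same share to within rounding, which is the point of the exercise: the search and the derivative agree.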
