251v2outl 4/19/2006 (Open this document in 'Outline' view!)
K. Two Random Variables.
1. Regression (Summary).
2. Covariance ($\sigma_{xy}$ and $s_{xy}$)
a. Population Covariance
The population covariance is defined, using probability, as
$\sigma_{xy} = \mathrm{Cov}(x,y) = E[(x-\mu_x)(y-\mu_y)] = E(xy) - \mu_x\mu_y$. This
can be used to describe the relationship between $x$ and $y$.
If the covariance is positive we can say that $x$ and $y$ tend to
move together, while if it is negative we can say that they
tend to move in opposite directions. In order to use this
formula we must realize that $E(xy) = \sum_x \sum_y xy\,P(x,y)$.
This means that, for each possible pair of values of $x$ and $y$,
we must multiply $x$, $y$ and their joint probability $P(x,y)$,
and then add the results. For example, assume that $x$ and $y$ are
related by the following joint probability table: .
We begin by taking the upper left hand probability, .12, which
is the probability that both $x$ and $y$ are 400, and multiplying
it by 400 twice. Then we take the next probability in the same
row, .15, which is the probability that $x$ is 600 and $y$ is 400,
and multiply it by both 600 and 400. If we continue in this way
through every cell of the table and add the results, we get $E(xy)$.
We can now use the following tableau to compute the means
and variances of $x$ and $y$.
To summarize (a check),
,,
, and
We will need the variances below. To complete what we have
done, write $\sigma_{xy} = E(xy) - \mu_x\mu_y$.
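The cell-by-cell calculation described above can be sketched in a few lines of Python. The joint probability table in this sketch is hypothetical and is not the table from the text; the names `joint` and `expectation` are likewise assumptions made for illustration.

```python
# Hypothetical joint probability table (NOT the table from the text).
# Keys are (x, y) pairs; values are the joint probabilities P(x, y).
joint = {
    (400, 400): 0.30, (600, 400): 0.20,
    (400, 600): 0.10, (600, 600): 0.40,
}

def expectation(table, f):
    """Add f(x, y) * P(x, y) over every cell of the joint table."""
    return sum(f(x, y) * p for (x, y), p in table.items())

mu_x = expectation(joint, lambda x, y: x)      # E(x)
mu_y = expectation(joint, lambda x, y: y)      # E(y)
e_xy = expectation(joint, lambda x, y: x * y)  # E(xy)

# Population covariance: E(xy) minus the product of the means.
cov_xy = e_xy - mu_x * mu_y
print(mu_x, mu_y, e_xy, cov_xy)
```

Each term in `e_xy` is one cell of the table: the probability multiplied by both the $x$ value and the $y$ value, exactly as in the .12 and .15 steps described above.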
b. The Sample Covariance
The sample covariance is much easier to compute, the formula
being
$s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1}$.
For example, assume that we have data on income ($x$) and
savings ($y$) (in thousands) for 5 families.
Family / $x$ / $y$ / $x^2$ / $y^2$ / $xy$
1 / 1.9 / 0.0 / 3.61 / 0.00 / 0.00
2 / 12.4 / 0.9 / 153.76 / 0.81 / 11.16
3 / 6.4 / 0.4 / 40.96 / 0.16 / 2.56
4 / 7.0 / 1.2 / 49.00 / 1.44 / 8.40
5 / 7.0 / 0.3 / 49.00 / 0.09 / 2.10
Sum / 34.7 / 2.8 / 296.33 / 2.50 / 24.22
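So $n = 5$, $\sum x = 34.7$, $\sum y = 2.8$, $\sum x^2 = 296.33$,
$\sum y^2 = 2.50$ and $\sum xy = 24.22$.
Then $\bar{x} = \frac{34.7}{5} = 6.94$ and $\bar{y} = \frac{2.8}{5} = 0.56$.
$s_{xy} = \frac{24.22 - 5(6.94)(0.56)}{4} = \frac{4.788}{4} = 1.197$,
and since $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1}$,
$s_x^2 = \frac{296.33 - 5(6.94)^2}{4} = 13.878$, $s_y^2 = \frac{2.50 - 5(0.56)^2}{4} = 0.233$.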
The positive sign of $s_{xy}$, the sample covariance, indicates
that $x$ and $y$ tend to move together.
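As a check on the arithmetic, the sample covariance for the five families can be computed directly from the table. This is a minimal sketch; the function name `sample_covariance` and the variable names are assumptions chosen for illustration.

```python
# Income (x) and savings (y), in thousands, for the 5 families in the table.
income = [1.9, 12.4, 6.4, 7.0, 7.0]
savings = [0.0, 0.9, 0.4, 1.2, 0.3]

def sample_covariance(x, y):
    """Computational formula: (sum of xy - n * xbar * ybar) / (n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - n * x_bar * y_bar) / (n - 1)

s_xy = sample_covariance(income, savings)
print(round(s_xy, 3))
```

The result is about 1.197, and its positive sign is what indicates that income and savings tend to move together.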
3. The Correlation Coefficient ($\rho_{xy}$ and $r_{xy}$)
The size of a covariance is relatively meaningless; to judge the strength
of the relationship between $x$ and $y$ we need to compute the
correlation, which is found by dividing the covariance by the product of
the standard deviations of $x$ and $y$.
a. Population Correlation.
For the population correlation, recall from above the values of
$\sigma_{xy}$, $\sigma_x^2$ and $\sigma_y^2$. Then
$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$.
The correlation must always be between positive 1 and
negative 1 ($-1 \le \rho_{xy} \le 1$). A correlation close to zero is
called weak. A correlation that is close to one in absolute value
is called strong. (Actually statisticians prefer to look at the
value of the correlation squared.) A strong positive correlation
indicates that $x$ and $y$ have a relationship that is close to a
straight line with a positive slope. A strong negative
correlation means that the relationship approximates a straight
line with a negative slope. Unfortunately, the correlation only
indicates linear relationships; a nonlinear relationship that is
obvious on a graph may give a zero correlation.
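The last point can be illustrated with a small made-up data set in which $y$ is exactly $x^2$. The data and the function name `sample_correlation` are assumptions for illustration only; the sample formulas are used here simply because they are easy to compute.

```python
import math

def sample_correlation(x, y):
    """Sample covariance divided by the product of the sample standard deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - x_bar) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - y_bar) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)

# Hypothetical data: y is a perfect (but nonlinear) function of x.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

r = sample_correlation(x, y)
print(r)
```

Because the positive and negative products of deviations cancel, the correlation is zero even though $y$ is completely determined by $x$.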
b. Sample Correlation.
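Recall that $s_{xy} = 1.197$, $s_x^2 = 13.878$ and $s_y^2 = 0.233$. If we
divide the covariance by the two standard deviations, we find
that
$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{1.197}{\sqrt{13.878}\sqrt{0.233}} = 0.666$.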
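The sample correlation for the income and savings data can be checked numerically. This is a sketch using the five families from the sample covariance example; the helper name `sample_cov` is an assumption.

```python
import math

# Income (x) and savings (y), in thousands, for the 5 families.
income = [1.9, 12.4, 6.4, 7.0, 7.0]
savings = [0.0, 0.9, 0.4, 1.2, 0.3]

def sample_cov(x, y):
    """Sample covariance; sample_cov(x, x) is the sample variance of x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

s_xy = sample_cov(income, savings)
s_x = math.sqrt(sample_cov(income, income))
s_y = math.sqrt(sample_cov(savings, savings))

# Correlation: covariance divided by the product of the standard deviations.
r_xy = s_xy / (s_x * s_y)
print(round(r_xy, 3))
```

The value is roughly two thirds, a moderately strong positive linear relationship between income and savings in this small sample.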
4. Functions of Two Random Variables.
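$\mathrm{Cov}(ax+b,\, cy+d) = ac\,\mathrm{Cov}(x,y)$ and if $w = ax+b$ and $v = cy+d$,
$\rho_{wv} = \frac{ac}{|ac|}\rho_{xy}$ or $\rho_{wv} = \mathrm{sign}(ac)\,\rho_{xy}$, where
$\mathrm{sign}(ac)$ has the value $-1$ or $+1$ depending on whether the
product of $a$ and $c$ is negative or positive.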
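The effect of linear transformations on covariance and correlation can be checked with the income and savings data from the sample covariance example. The constants `a`, `b`, `c`, `d` below are hypothetical, and the sample formulas stand in for the population ones, since the same rules hold for both.

```python
import math

income = [1.9, 12.4, 6.4, 7.0, 7.0]   # x
savings = [0.0, 0.9, 0.4, 1.2, 0.3]   # y

def sample_cov(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((p - x_bar) * (q - y_bar) for p, q in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical constants; a * c is negative, so the correlation changes sign.
a, b, c, d = -2.0, 1.0, 3.0, 5.0
w = [a * xi + b for xi in income]    # w = ax + b
v = [c * yi + d for yi in savings]   # v = cy + d

cov_ratio = sample_cov(w, v) / sample_cov(income, savings)    # equals a * c
corr_ratio = sample_corr(w, v) / sample_corr(income, savings)  # equals sign(a * c)
print(cov_ratio, corr_ratio)
```

The additive constants `b` and `d` drop out entirely; only the multipliers matter, and for the correlation only their signs matter.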
5. Sums of Random Variables.
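a. $E(x+y) = E(x) + E(y)$ and $\mathrm{Var}(x+y) = \mathrm{Var}(x) + \mathrm{Var}(y) + 2\,\mathrm{Cov}(x,y)$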
b. Independence.
(i) Definition. $x$ and $y$ are independent if $P(x,y) = P(x)P(y)$ for every pair of values.
(ii) Consequences
If $x$ and $y$ are independent,
$E(xy) = \mu_x\mu_y$, $\sigma_{xy} = 0$,
and $\mathrm{Var}(x+y) = \sigma_x^2 + \sigma_y^2$.
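c. If $a$ and $b$ are constants, $\mathrm{Cov}(ax, by) = ab\,\mathrm{Cov}(x,y)$.
This and a. imply that $E(ax+by) = aE(x) + bE(y)$
and $\mathrm{Var}(ax+by) = a^2\mathrm{Var}(x) + b^2\mathrm{Var}(y) + 2ab\,\mathrm{Cov}(x,y)$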
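The variance rule for a weighted sum can be verified numerically. The sketch below uses the income and savings data from the sample covariance example with hypothetical constants `a` and `b`; the rule holds exactly for sample variances and covariances as well as for population ones.

```python
income = [1.9, 12.4, 6.4, 7.0, 7.0]   # x
savings = [0.0, 0.9, 0.4, 1.2, 0.3]   # y

def sample_cov(x, y):
    """Sample covariance; sample_cov(x, x) is the sample variance of x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((p - x_bar) * (q - y_bar) for p, q in zip(x, y)) / (n - 1)

a, b = 2.0, -3.0  # hypothetical constants
combo = [a * xi + b * yi for xi, yi in zip(income, savings)]

# Left side: variance of ax + by computed directly from the combined data.
direct = sample_cov(combo, combo)

# Right side: a^2 Var(x) + b^2 Var(y) + 2ab Cov(x, y).
by_rule = (a ** 2 * sample_cov(income, income)
           + b ** 2 * sample_cov(savings, savings)
           + 2 * a * b * sample_cov(income, savings))
print(direct, by_rule)
```

The two numbers agree up to floating-point rounding, and setting `a = b = 1` recovers the rule in a.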
d. Application to portfolio analysis – Most of this is
from the document 251var2 in the supplement.
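If $R = P_1R_1 + P_2R_2$, then $E(R) = P_1E(R_1) + P_2E(R_2)$
and $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$
is the variance of the return. Thus if $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$,
we can say
$\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\rho_{12}\sigma_1\sigma_2$.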
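The variance of a two-asset portfolio can be sketched as a small function. The shares, standard deviations and correlations below are hypothetical values chosen for illustration, not the ones used in the text, and the names `portfolio_variance`, `p1`, `sigma1`, `sigma2` and `rho` are assumptions.

```python
def portfolio_variance(p1, sigma1, sigma2, rho):
    """Variance of R = p1*R1 + p2*R2 with p2 = 1 - p1.

    sigma1, sigma2: standard deviations of the two returns.
    rho: correlation between the two returns.
    """
    p2 = 1.0 - p1
    cov12 = rho * sigma1 * sigma2  # covariance from the correlation
    return p1 ** 2 * sigma1 ** 2 + p2 ** 2 * sigma2 ** 2 + 2 * p1 * p2 * cov12

# Hypothetical inputs: equal shares, standard deviations of 0.2 and 0.1.
variances = {rho: portfolio_variance(0.5, 0.2, 0.1, rho)
             for rho in (-1.0, 0.0, 1.0)}
for rho, var in variances.items():
    print(rho, var)
```

With these inputs the variance rises steadily as the correlation moves from $-1$ to $+1$, which is the usual argument for combining assets that do not move together.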
For example, assume that ,,
but is unknown. Then
.
If we use the formula for immediately above,
= .
Now we can see the effect various values of $\rho_{12}$ will have
on $\sigma_R^2$ and $\sigma_R$.
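The purpose of this section is to show how to find the minimum value
for $\sigma_R^2$. Since variance is a measure of risk, minimizing variance
minimizes risk, though actually the best measure of risk is probably the
coefficient of variation, the standard deviation divided by the mean, in
this case $\sigma_R / E(R)$.
Remember that $\sigma_R^2 = P_1^2\sigma_1^2 + P_2^2\sigma_2^2 + 2P_1P_2\sigma_{12}$.
Also recall that, since $P_1$ and $P_2$ are shares of one dollar, $P_1 + P_2 = 1$, so that $P_2 = 1 - P_1$.
Remember too, that $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$.
If we put all this together,
$\sigma_R^2 = P_1^2\sigma_1^2 + (1-P_1)^2\sigma_2^2 + 2P_1(1-P_1)\rho_{12}\sigma_1\sigma_2$.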
Now let us assume some values for the standard deviations and the
correlation.
Let
Then
If we collect terms in , we get
or .
In order to minimize risk we pick our value of $P_1$ to give
us a minimum variance.
If we know calculus, the way that we find this minimum
variance is by taking the first derivative of $\sigma_R^2$ with
respect to $P_1$ and setting it equal to zero.
Setting this derivative equal to zero and solving for $P_1$
gives $P_1 \approx 0.23$. Now since $P_2 = 1 - P_1$, we set
$P_2 \approx 0.77$. That is, to minimize risk, we put about
23% of our money in stock 1 and 77% in stock 2.
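The calculus step can be written out in general form. The following is a sketch, assuming the notation $P_1$ for the share in stock 1, $\sigma_1$ and $\sigma_2$ for the standard deviations of the two returns, and $\rho_{12}$ for their correlation:

```latex
\begin{align*}
\sigma_R^2 &= P_1^2\sigma_1^2 + (1-P_1)^2\sigma_2^2
              + 2P_1(1-P_1)\rho_{12}\sigma_1\sigma_2 \\
\frac{d\sigma_R^2}{dP_1} &= 2P_1\sigma_1^2 - 2(1-P_1)\sigma_2^2
              + 2(1-2P_1)\rho_{12}\sigma_1\sigma_2 = 0 \\
P_1 &= \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}
            {\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}
\end{align*}
```

The second derivative, $2(\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2)$, is never negative, so whenever the denominator is not zero this value of $P_1$ gives a minimum rather than a maximum.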
If we do not know calculus, we can still minimize $\sigma_R^2$. Try values
of $P_1$ at intervals of 0.1 between zero and one. We will find that the
smallest values of $\sigma_R^2$ occur at $P_1 = 0.2$ and $P_1 = 0.3$.
Now we can try values of $P_1$ at intervals of 0.01 between 0.2 and 0.3.
We will find that the smallest value of $\sigma_R^2$ occurs at $P_1 = 0.23$.
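The trial-and-error search can be sketched in Python as a coarse grid followed by a fine grid. The standard deviations and correlation below are hypothetical, not the values assumed in the text, so the minimizing share differs from 0.23; the function names are also assumptions.

```python
def portfolio_variance(p1, sigma1, sigma2, rho):
    """Variance of the return when share p1 is in stock 1 and 1 - p1 in stock 2."""
    p2 = 1.0 - p1
    return (p1 ** 2 * sigma1 ** 2 + p2 ** 2 * sigma2 ** 2
            + 2 * p1 * p2 * rho * sigma1 * sigma2)

def grid_minimum(lo, hi, step, sigma1, sigma2, rho):
    """Try shares from lo to hi at the given step; return the one with least variance."""
    n = round((hi - lo) / step)
    candidates = [round(lo + i * step, 10) for i in range(n + 1)]
    return min(candidates,
               key=lambda p: portfolio_variance(p, sigma1, sigma2, rho))

# Hypothetical inputs (NOT the values from the text).
sigma1, sigma2, rho = 0.3, 0.2, 0.5

# Step 1: intervals of 0.1 between zero and one.
coarse = grid_minimum(0.0, 1.0, 0.1, sigma1, sigma2, rho)

# Step 2: intervals of 0.01 in the neighborhood of the coarse minimum.
fine = grid_minimum(max(0.0, coarse - 0.1), min(1.0, coarse + 0.1), 0.01,
                    sigma1, sigma2, rho)
print(coarse, fine)
```

Refining around the best coarse point, rather than between the two best points, is a design choice that also works when the coarse minimum falls at zero or one.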