Chapter 5 Probability Distributions (Discrete Variables)


Defn: A random variable is a variable whose values are determined by chance. We will denote a random variable by a capital letter, such as X, and denote particular values of the variable by the corresponding lower case letter, x. Thus we read P(X = x) as: “the probability that the random variable X takes on the value x.”

Example: Consider the random experiment of rolling two dice, a green one and a red one. Let the variable X be the sum of the numbers showing on the top faces. X can have the possible values 2, 3, 4, 5, 6, …, 11, or 12. When we roll the two dice once, we cannot predict with certainty which value of X will occur.

Defn: A probability distribution consists of a set of pairs of numbers. In each pair, the first number is a possible value of a random variable, and the second number is the probability that the value will occur in a performance of the random experiment.

Example: For our random experiment of rolling two dice, if we assume that the dice are fair, then each possible outcome in the sample space has the same probability of occurring. If we let the random variable X be the sum of the numbers on the top faces, then there is exactly one way for X to take on the value 2, namely if the numbers showing on the top faces are both 1. Thus P(X = 2) = 1/36.

There are two ways that X can take on the value 3: the outcome is either (1, 2) or (2, 1). Thus P(X = 3) = 2/36 = 1/18.

The complete probability distribution for the random variable X is given in the table below:

x / P(X = x)
2 / 1/36
3 / 1/18
4 / 1/12
5 / 1/9
6 / 5/36
7 / 1/6
8 / 5/36
9 / 1/9
10 / 1/12
11 / 1/18
12 / 1/36
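The table above can be reproduced by listing all 36 equally likely outcomes and counting how many give each sum. The following is a minimal Python sketch, assuming fair dice; the function name two_dice_distribution is chosen here for illustration and does not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for the sum of two fair six-sided dice."""
    # Count how many of the 36 equally likely (green, red) outcomes give each sum.
    counts = Counter(green + red for green, red in product(range(1, 7), repeat=2))
    # Each outcome has probability 1/36; Fraction keeps the results exact
    # and reduces them automatically (e.g., 2/36 becomes 1/18).
    return {total: Fraction(ways, 36) for total, ways in sorted(counts.items())}


if __name__ == "__main__":
    for total, prob in two_dice_distribution().items():
        print(total, prob)
```

Exact fractions are used instead of decimals so that the output can be compared directly with the entries in the table.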

Required Properties of a Probability Distribution

1) The probability of the occurrence of any event must be a number between 0 and 1, inclusive; in particular, $0 \le P(X = x) \le 1$ for every possible value x.

2) For probability distributions with a countable number of possible values of the random variable, the sum of the probabilities for the various values must be one; i.e.,

$\sum_x P(X = x) = 1$, where the sum is over all possible values of the random variable.
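The two required properties can be checked mechanically for any proposed table of values and probabilities. The sketch below is only an illustration: the function name is_valid_distribution and the two example tables are assumptions made here, not part of the text.

```python
from fractions import Fraction


def is_valid_distribution(dist):
    """Check the two required properties for a {value: probability} mapping."""
    probs = list(dist.values())
    # Property 1: every probability lies between 0 and 1 (inclusive).
    in_range = all(0 <= p <= 1 for p in probs)
    # Property 2: the probabilities sum to exactly one.
    sums_to_one = sum(probs) == 1
    return in_range and sums_to_one


# Illustrative tables (not from the text).
fair_die = {x: Fraction(1, 6) for x in range(1, 7)}
not_a_distribution = {0: Fraction(1, 2), 1: Fraction(3, 4)}  # sums to 5/4

print(is_valid_distribution(fair_die))            # True
print(is_valid_distribution(not_a_distribution))  # False
```

The exact comparison with 1 works because the probabilities are stored as fractions. With decimal (floating-point) probabilities, a tolerance-based comparison such as math.isclose would be the safer choice.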

Example: p. 208, Exercise 5.15

Example: p. 208, Exercise 5.19

Mean, Variance and Expectation

Defn: The expectation, or mean, of a probability distribution for a random variable X is defined by

$\mu = E(X) = \sum_x x \, P(X = x)$.

I.e., we multiply each possible value of the random variable by the probability of occurrence of that value, and add all of these products together. We also call $\mu$ the mean of the random variable X, or the expectation of the random variable X.

Example: Let the random experiment be rolling two fair dice, one green and one red. Let X be the sum of the numbers on the dice. The probability distribution is given in the table above. Then the mean of the distribution is

$\mu = 2(1/36) + 3(2/36) + 4(3/36) + \dots + 12(1/36) = 252/36 = 7$.

In this case, the mean of the distribution is the most likely value. This is not always the case, however.
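The mean for the two-dice example can be checked numerically. The sketch below is one possible way to do it, assuming fair dice; the names ways, dist, and mean are illustrative. The counts in ways are the numerators of the unreduced fractions in the table (for example, 1/18 = 2/36).

```python
from fractions import Fraction

# Number of outcomes (out of 36) that give each sum from 2 to 12.
ways = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
dist = {x: Fraction(w, 36) for x, w in ways.items()}


def mean(dist):
    """Expectation: the sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


print(mean(dist))  # 7
```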

Example: p. 213, Exercise 5.29

Example: p. 214, Exercise 5.33

The mean of the random variable (or of the distribution) tells us the long-run average of the variable when the random experiment is performed many times. We also want a measure of the variability of the random variable.

Defn: The variance of a random variable X is $\sigma^2 = \sum_x (x - \mu)^2 P(X = x)$. I.e., we first find the mean, $\mu$, of the random variable, then for each value of x, we square $x - \mu$ and multiply the result by the probability that the value x occurs. Then we add all of these quantities together to get the variance of X. The standard deviation of X, $\sigma$, is the nonnegative square root of the variance.

Note: As with the sample variance and standard deviation, larger values of $\sigma^2$ or of $\sigma$ mean that the distribution is more spread out. Also, if $\sigma^2 = 0$, or if $\sigma = 0$, then there is no variability in X; X has only one possible value, namely its mean.

Example: Roll a fair die. Let X be the number showing on the top face. The probability distribution of X is given in the table below:

x / 1 / 2 / 3 / 4 / 5 / 6
P(X = x) / 1/6 / 1/6 / 1/6 / 1/6 / 1/6 / 1/6

The mean of X is $\mu = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 3.5$.

The variance of X is

$\sigma^2 = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + (3 - 3.5)^2(1/6) + (4 - 3.5)^2(1/6) + (5 - 3.5)^2(1/6) + (6 - 3.5)^2(1/6)$

$= 35/12 \approx 2.92$,

and the standard deviation of X is $\sigma = \sqrt{35/12} \approx 1.71$.
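The fair-die calculation can be repeated in a few lines of Python. This is a minimal sketch applying the definitions of mean and variance directly; the function and variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import sqrt


def mean(dist):
    """Sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Sum of (x - mean)^2 * P(X = x) over all possible values x."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


die = {x: Fraction(1, 6) for x in range(1, 7)}

mu = mean(die)       # exact value 7/2
var = variance(die)  # exact value 35/12
sd = sqrt(var)       # standard deviation as a decimal

print(mu, var, round(sd, 2))  # 7/2 35/12 1.71
```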

Example: p. 213, Exercise 5.29.

Example: p. 214, Exercise 5.33

The Binomial Distribution

An important special probability distribution, which appears often when doing surveys, is called the binomial distribution. A binomial experiment is a random experiment which has the following characteristics:

1) There are n independent and identical trials.

2) Each trial results in one of two possible outcomes, Success or Failure.

3) The probability of Success, denoted p, is the same for each individual trial.

We then define a random variable X to be the number of Successes which occur in the n trials. The random variable X has a binomial probability distribution.

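The binomial probability distribution is given by the following equation:

$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, for x = 0, 1, 2, 3, …, n,

where p is the probability of Success on a single trial and $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$ is the number of ways to choose which x of the n trials are Successes.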

By “independent” in condition 1, we mean that the outcome of one trial does not affect the outcome of any other trial. By “identical” we mean that the trials are performed in exactly the same way.
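The binomial probability formula translates directly into code. The following is a sketch only, assuming Python 3.8 or later (for math.comb); the function name binomial_pmf and the coin-tossing case are illustrative choices, not taken from the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable: C(n, x) * p^x * (1 - p)^(n - x)."""
    if not 0 <= x <= n:
        return 0.0  # values outside 0, 1, ..., n cannot occur
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative case: the number of heads in 4 tosses of a fair coin.
probs = [binomial_pmf(x, 4, 0.5) for x in range(5)]
print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0
```

The probabilities sum to one, as required of any probability distribution.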

Example: Which of the following are binomial experiments or can be made into binomial experiments?

a) Surveying 100 people to determine whether they like Sudsy Soap.

b) Tossing a coin 100 times to see how many heads occur.

c) Drawing a card from a deck and getting a heart.

d) Asking 1000 people which brand of cigarette they smoke.

e) Testing four different brands of aspirin to see which brands are effective.

f) Testing one brand of aspirin using 10 people to determine whether the brand is effective at relieving headaches.

g) Asking 100 people whether they smoke.

We can find binomial probabilities using the TI-83 calculator:

Example: A Gallup Poll survey of adult Americans found that 90% of the people interviewed were unaware that maintaining a healthy weight could reduce the risk of stroke. If 15 adult Americans are selected at random, find the probability that exactly 9 are unaware that maintaining a healthy weight could reduce the risk of stroke. What is the probability that no more than 9 are unaware that maintaining a healthy weight could reduce the risk of stroke? What is the probability that at least 10 of the people in the sample are unaware?

Since the selection is done randomly, the 15 trials are independent of each other (what does “independent” mean?). Since the same information is being sought for each person, the 15 trials are identical to each other. Hence, condition (1) is satisfied. Either a person is unaware that maintaining a healthy weight could reduce the risk of stroke (Success) or is aware (Failure), so condition (2) is satisfied. The probability that a person in the sample is unaware that maintaining a healthy weight could reduce the risk of stroke is 0.90 for each of the 15 people selected, so condition (3) is satisfied. If we define X to be the number of persons in the sample who are unaware that maintaining a healthy weight could reduce the risk of stroke, then X has a binomial distribution with parameters n = 15 and p = 0.90.

a) (i) Choose 2nd, DISTR, and binompdf(.

(ii) Enter 15, 0.90, and 9.

(iii) What is the result?

b) (i) Choose 2nd, DISTR, and binomcdf(.

(ii) Enter 15, 0.90, and 9.

(iii) What is the result?

c) How would we find this probability? What rule would we use?
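For readers without a TI-83, the same three probabilities can be computed in Python. The sketch below is one possible approach, not the calculator's own implementation: the functions binompdf and binomcdf are written here to mirror the calculator's argument order (n, p, x), and part (c) is handled with the complement rule, P(X ≥ 10) = 1 − P(X ≤ 9).

```python
from math import comb


def binompdf(n, p, x):
    """P(X = x); the argument order mirrors the calculator's binompdf(n, p, x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomcdf(n, p, x):
    """P(X <= x): the sum of the probabilities of 0, 1, ..., x Successes."""
    return sum(binompdf(n, p, k) for k in range(x + 1))


n, p = 15, 0.90  # parameters from the Gallup Poll example

exactly_9 = binompdf(n, p, 9)
at_most_9 = binomcdf(n, p, 9)
# "At least 10" is the complement of "no more than 9".
at_least_10 = 1 - at_most_9

print(f"P(X = 9)   = {exactly_9:.4f}")
print(f"P(X <= 9)  = {at_most_9:.4f}")
print(f"P(X >= 10) = {at_least_10:.4f}")
```

Rounded to four decimal places, the three results are about 0.0019, 0.0022, and 0.9978.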

Mean, Variance and Standard Deviation for the Binomial Distribution

For a random variable X which has a binomial distribution with parameters n and p, we have

(1) $\mu = np$

(2) $\sigma^2 = np(1 - p)$

(3) $\sigma = \sqrt{np(1 - p)}$
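These shortcut formulas agree with the general definitions of mean and variance. The sketch below checks this numerically for the Gallup Poll parameters n = 15 and p = 0.90. It is a numerical check under those assumed parameters, not a proof, and the helper name binomial_pmf is illustrative.

```python
from math import comb, isclose, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 15, 0.90  # parameters from the Gallup Poll example

# Mean and variance computed directly from the definitions.
mu = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mu) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

# Compare with the shortcut formulas for the binomial distribution.
print(isclose(mu, n * p))               # True
print(isclose(var, n * p * (1 - p)))    # True
print(round(sqrt(n * p * (1 - p)), 3))  # 1.162
```

Here the mean is np = 13.5 and the variance is np(1 − p) = 1.35, so the standard deviation is about 1.162.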

Example: p. 224, Exercise 5.58.

Is this a binomial experiment? What is the mean? The standard deviation?