Chapter: The Normal Distribution

1 Introduction
2 The Standard Normal Distribution, N(0,1)
3 Use of the standard normal tables
4 Use of the standard normal tables for any normal distribution
5 Problems that involve finding the value of $\mu$ or $\sigma$ or both
6 The Normal Approximation to the Binomial distribution
7 The Normal Approximation to the Poisson distribution
8 When to Use the Different Approximations
9 When There are More Than One Independent Normal Variables
10 Miscellaneous Examples

 1 Introduction

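A continuous random variable X having probability density function f(x), where

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,$$

with $\mu$ and $\sigma > 0$ real constants, is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$.

$\mu$ and $\sigma^2$ are the parameters of the distribution.

If X is distributed in this way, we write $X \sim N(\mu, \sigma^2)$.

[Figure: sketch of y = f(x), with the x-axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$ and $\mu + 3\sigma$.]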

Note:

a) The distribution is bell-shaped and symmetrical about $x = \mu$.

b) Mean = mode = median = $\mu$.

c) Approximately 95% of the distribution lies within 2 standard deviations of the mean, i.e. $P(|X - \mu| < 2\sigma) \approx 0.95$.

d) $P(|X - \mu| < \sigma) \approx 0.68$ and $P(|X - \mu| < 3\sigma) \approx 0.997$.

e) The maximum value of f(x) occurs when $x = \mu$ and is given by $f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$.

f) f(x) is a valid probability density function because it can be shown that $\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1$.

g) The probability that X lies between a and b is given by

$$P(a \le X \le b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.$$

The position and spread of the bell-shaped curve depend on the values of $\mu$ and $\sigma$. A larger $\sigma$ gives a wider base and a lower peak.
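As a rough numerical check of notes c) to f), the sketch below integrates the density with a simple trapezium rule. The helper names `normal_pdf` and `integrate`, and the illustrative choice of mean 50 and standard deviation 2, are assumptions made for this example and are not part of the notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=20000):
    """Composite trapezium rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

mu, sigma = 50.0, 2.0  # illustrative values, i.e. X ~ N(50, 4)

def f(x):
    return normal_pdf(x, mu, sigma)

# Note f): the total area under the curve should be 1.
# The tails beyond 10 standard deviations are negligible.
total_area = integrate(f, mu - 10 * sigma, mu + 10 * sigma)

# Notes c) and d): probability within k standard deviations of the mean.
within = {k: integrate(f, mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

# Note e): the peak value should equal 1 / (sigma * sqrt(2 * pi)).
peak = f(mu)

print("total area:", round(total_area, 4))
for k, prob in within.items():
    print("within", k, "sd:", round(prob, 4))
print("peak:", round(peak, 4))
```

To four decimal places the three probabilities come out as 0.6827, 0.9545 and 0.9973, which the notes round to 0.68, 0.95 and 0.997.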

[Figure: sketches of the density curves of X ~ N(50,4), X ~ N(0,1) and a third normal variable with mean 4; vertical scale marked at 0.2, 0.4 and 0.8.]

 2 The Standard Normal Distribution, N(0,1)

If a continuous random variable Z has a normal distribution with mean 0 and variance 1,

then Z is said to have a standard normal distribution i.e. Z ~ N(0,1).

The probability density function of the standard normal variable Z is denoted by $\phi(z)$, where

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty.$$

A sketch of the probability density function of Z shows that:

a) The curve is symmetrical about the line z = 0, as $\phi(-z) = \phi(z)$.

b) The curve has a maximum point at $(0, \frac{1}{\sqrt{2\pi}})$, where $\frac{1}{\sqrt{2\pi}} \approx 0.4$,

and it has points of inflexion at $z = \pm 1$.

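The cumulative distribution function of the standard normal variable Z is denoted by $\Phi(z)$, where

$$\Phi(z) = P(Z < z) = \int_{-\infty}^{z} \phi(t)\,dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt.$$

This integral cannot be expressed in terms of elementary functions, so we refer to tables giving $\Phi(z)$.

[Figure: sketch of $\phi(z)$ showing $\Phi(z)$ as an area under the curve.]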
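Outside the exam room, values of the cumulative distribution function can also be computed rather than looked up. A minimal sketch follows, assuming the standard identity that expresses the standard normal cumulative distribution function through the error function (available as `math.erf` in Python); the function names `std_normal_pdf` and `std_normal_cdf` are chosen for this example only.

```python
import math

def std_normal_pdf(z):
    """Density of the standard normal variable Z at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """P(Z < z), computed as 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A few entries in the style of a printed table, to four decimal places.
for z in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"z = {z:.1f}   cdf = {std_normal_cdf(z):.4f}")
```

The printed values (0.5000, 0.6915, 0.8413, 0.9332, 0.9772) should agree with the corresponding entries of a four-figure table.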

 3 Use of the standard normal tables

Only positive values of z are printed in the tables, so for negative values of z, the symmetrical properties of the curve are used:

1. $P(Z < -a) = 1 - \Phi(a)$, where a > 0

2. $P(Z > -a) = P(Z < a) = \Phi(a)$, where a > 0

3. $P(a < Z < b) = \Phi(b) - \Phi(a)$, where 0 < a < b

4. $P(-a < Z < b) = \Phi(b) - [1 - \Phi(a)] = \Phi(b) + \Phi(a) - 1$, where a, b > 0

[Figures: sketches of $\phi(z)$ illustrating each rule.]
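The symmetry rules above can be sketched in code. In the example below, `table_lookup` is a hypothetical stand-in for a printed table: like the table, it refuses negative arguments, so every probability has to be reduced to lookups at non-negative z. All function names are assumptions made for illustration.

```python
import math

def table_lookup(z):
    """Stand-in for a printed table of P(Z < z); accepts z >= 0 only."""
    if z < 0:
        raise ValueError("the printed table covers z >= 0 only")
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_less(a):
    """P(Z < a) for any real a; a negative a uses P(Z < -c) = 1 - P(Z < c)."""
    return table_lookup(a) if a >= 0 else 1.0 - table_lookup(-a)

def prob_greater(a):
    """P(Z > a); by symmetry this equals P(Z < -a)."""
    return prob_less(-a)

def prob_between(lo, hi):
    """P(lo < Z < hi) as a difference of two cumulative probabilities."""
    return prob_less(hi) - prob_less(lo)

print(round(prob_less(-1.377), 4))            # one negative lookup
print(round(prob_greater(-1.377), 4))         # symmetry about z = 0
print(round(prob_between(-1.4, -0.6), 4))     # both limits negative
print(round(prob_between(-2.696, 1.865), 4))  # limits on either side of 0
```

Because the computed values are not rounded to four figures before subtraction, the last decimal place can differ slightly from an answer obtained from a printed table.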

It should be noted that the tables may be printed in one of two different formats.

1. They may give the values of $\Phi(z)$, where $\Phi(z) = P(Z < z)$, or

2. the values of Q(z), where $Q(z) = P(Z > z)$.

We will only refer to the tables giving $\Phi(z)$, the cumulative probability of the standard normal distribution.

Example 3.1

If Z ~ N(0,1) find from tables

a) P(Z < 1.377)

b) P(Z > 1.377)

c) P(Z < -1.377)

d) P(Z > -1.377). [0.9158, 0.0842, 0.0842, 0.9158]

Solution

Example 3.2

If Z ~ N(0,1), find

a) P(0.345 < Z < 1.751)

b) P(-2.696 < Z < 1.865)

c) P(-1.4 < Z < -0.6)

d) P(|Z| < 1.433)

e) P(Z > 0.863 or Z < -1.527) [0.3250, 0.9655, 0.1935, 0.8480, 0.2576]

Solution

Example 3.3

If Z~N(0,1), find the value of a if

a) P(Z > a) = 0.3802

b) P(Z > a) = 0.7818

c) P(Z < a) = 0.0793

d) P(Z < a) = 0.9693

e) P(|Z| < a) = 0.9 [0.305, -0.778, -1.41, 1.87, 1.645]

Solution

 4 Use of the standard normal tables for any normal distribution

Every normal distribution with mean $\mu$ and variance $\sigma^2$ can be transformed into a standard normal distribution. This process is called the standardization of the normal distribution.

If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ is a random variable with Z ~ N(0,1).
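A minimal sketch of standardization in code, using `NormalDist` from Python's standard `statistics` module. The helper names `standardize` and `prob_less` are assumptions for this example. Note that the notation N(mean, variance) quotes the variance, whereas `NormalDist` expects the standard deviation, so a square root is needed. The numbers are those of a variable distributed as N(300, 25).

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def standardize(x, mu, variance):
    """Return z = (x - mu) / sigma, where sigma = sqrt(variance)."""
    return (x - mu) / math.sqrt(variance)

def prob_less(x, mu, variance):
    """P(X < x) for X ~ N(mu, variance), via the standard normal cdf."""
    return Z.cdf(standardize(x, mu, variance))

# X ~ N(300, 25), so sigma = 5.
p_above_305 = 1.0 - prob_less(305, 300, 25)   # z = 1.0
p_below_291 = prob_less(291, 300, 25)         # z = -1.8

# Cross-check without standardizing by hand.
direct = 1.0 - NormalDist(mu=300, sigma=5).cdf(305)

print(round(p_above_305, 4), round(p_below_291, 4), round(direct, 4))
```

Both routes give the same probability, which is the point of standardization: one table (or one function) for Z serves every normal distribution.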

Example 4.1

Given that $Z = \frac{X - \mu}{\sigma}$, $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$, show that E(Z) = 0 and Var(Z) = 1,

where Z is the standardized variable.

Solution

Example 4.2

The random variable X ~ N(300,25). Find

a)P(X > 305)

b)P(X < 291)

c)P(X < 312)

d) P(X > 286) [0.1587, 0.0359, 0.9918, 0.9974]

Solution

Example 4.3

Given X ~ N(50,8) find

a) P(48 < X < 54)

b) P(52 < X < 55)

c) P(46 < X < 49)

d) $P(|X - 50| < \sqrt{8})$ [0.6814, 0.2014, 0.2830, 0.6826]

Solution

Example 4.4

If X ~ N(100,80), find

a) P(99 < X < 105)

b) $P(|X - 100| < \sqrt{80})$

c) P(105 < X < 115) [0.2565, 0.6826, 0.2413]

Solution

Example 4.5

If X ~ N(45,16) and if

a)P(X < a) = 0.0317, find a

b)P(X < b) = 0.895, find b

c)P(X < c) = 0.0456, find c

d) P(X < d) = 0.996, find d. [37.572, 50.012, 38.244, 55.6]

Solution

Example 4.6

If X ~ N(100,36) and P(X > a) = 0.1093, find the value of a. [107.38]

Solution

Example 4.7

If X ~ N(24,9) and P(X > a) = 0.974, find the value of a. [18.171]

Solution

Example 4.8

If X ~ N(70,25), find the value of a such that P( |X-70| < a) = 0.8. Hence find the limits within which the central 80% of the distribution lies. [a=6.41, limits : 63.59 and 76.41]

Solution

Example 4.9

If X ~ N(80,36), find c such that P( |X-80| < c) = 0.9.

Hence find the limits within which the central 90% of the distribution lies. [ c=9.87, limits: 70.13 and 89.87]

Solution
Example 4.10

A certain type of cabbage has a mass which is normally distributed with mean 1 kg and standard deviation 0.15kg. In a lorry load of 800 of these cabbages, estimate how many will have mass

a)greater than 0.79 kg

b)less than 1.13 kg

c) between 0.85kg and 1.15kg

d) between 0.75kg and 1.29kg [ 735, 646, 546, 740]

Solution

Example 4.11

The time taken by a milkman to deliver milk to CJC is normally distributed with mean 12 minutes and standard deviation 2 minutes. He delivers milk daily. Estimate the number of days during the year when he takes

a)longer than 17 minutes

b)less than 10 minutes

c)between 9 and 13 minutes. [2, 58, 228]

Solution

 5 Problems that involve finding the value of $\mu$ or $\sigma$ or both

Example 5.1

The lengths of certain items follow a normal distribution with mean $\mu$ and standard deviation 6 cm.

It is known that 4.78% of the items have a length greater than 82 cm. Find the value of the mean $\mu$. [72.0]

Solution

Example 5.2

The masses of articles produced in a particular workshop are normally distributed with mean $\mu$ and standard deviation $\sigma$. 5% of the articles have a mass greater than 85g and 10% have a mass less than 25g.

Find the values of $\mu$ and $\sigma$, and find the range, symmetrical about the mean, within which 75% of the masses lie.

[$\mu$ = 51.3, $\sigma$ = 20.5, limits: 27.7g and 74.9g]

Solution
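The quoted answers to Example 5.2 can be checked numerically. The sketch below is one possible approach, not a model solution: each percentage condition is converted to a z-value with `NormalDist().inv_cdf`, which gives two linear equations of the form x = mean + z × (standard deviation). The function name `solve_mu_sigma` is an assumption for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def solve_mu_sigma(x1, p1, x2, p2):
    """Solve P(X < x1) = p1 and P(X < x2) = p2 for (mu, sigma).

    Each condition gives x = mu + z * sigma with z = inv_cdf(p);
    subtracting the two equations eliminates mu.
    """
    z1, z2 = Z.inv_cdf(p1), Z.inv_cdf(p2)
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma

# 10% lie below 25g; 5% lie above 85g, i.e. 95% lie below 85g.
mu, sigma = solve_mu_sigma(25, 0.10, 85, 0.95)

# Central 75%: 12.5% in each tail, so the upper limit sits at the 87.5% point.
half_width = Z.inv_cdf(0.875) * sigma
limits = (mu - half_width, mu + half_width)

print(round(mu, 1), round(sigma, 1))
print(round(limits[0], 1), round(limits[1], 1))
```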

 6 The Normal Approximation to the Binomial distribution

Under certain circumstances the normal distribution can be used as an approximation to the binomial distribution. If X ~ Bin(n,p) then E(X) = np and Var(X) = npq where q = 1-p.

For large n and p not too small or too large (for example n > 10 with p close to $\frac{1}{2}$, or n > 30 with p moving away from $\frac{1}{2}$),

X ~ N(np, npq) approximately.

This approximation is very good if both np and nq are greater than 5.

This property is used to simplify calculation in problems involving the Binomial distribution with large n.

Since we are using a continuous Normal distribution to approximate a discrete Binomial distribution,

we have to make a continuity correction.

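If X ~ Bin(n,p) is approximated by Y ~ N(np,npq), then for integers r and k, with n > 30 and np > 5,

$P(X = r) \approx P(r - \frac{1}{2} < Y < r + \frac{1}{2})$

$P(r \le X \le k) \approx P(r - \frac{1}{2} < Y < k + \frac{1}{2})$

$P(r < X < k) \approx P(r + \frac{1}{2} < Y < k - \frac{1}{2})$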

Below are some examples:

Discrete (binomial X) / Continuous (approximating normal Y)
$P(X = 3)$ / $P(2.5 < Y < 3.5)$
$P(X \le 3)$ / $P(Y < 3.5)$
$P(X < 3)$ / $P(Y < 2.5)$
$P(X > 3)$ / $P(Y > 3.5)$
$P(X \ge 3)$ / $P(Y > 2.5)$
$P(3 < X < 7)$ / $P(3.5 < Y < 6.5)$
$P(3 \le X \le 7)$ / $P(2.5 < Y < 7.5)$
$P(3 < X \le 7)$ / $P(3.5 < Y < 7.5)$
$P(3 \le X < 7)$ / $P(2.5 < Y < 6.5)$
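The continuity correction can be sketched in code by comparing an exact binomial probability with its normal approximation. The function names below are assumptions for this example; the numbers are those of 12 tosses of a fair coin with between 4 and 7 heads inclusive.

```python
import math
from statistics import NormalDist

def binomial_between(n, p, r, k):
    """Exact P(r <= X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(r, k + 1))

def normal_approx_between(n, p, r, k):
    """Approximate P(r <= X <= k) by P(r - 0.5 < Y < k + 0.5), Y ~ N(np, npq)."""
    q = 1 - p
    Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
    return Y.cdf(k + 0.5) - Y.cdf(r - 0.5)

exact = binomial_between(12, 0.5, 4, 7)
approx = normal_approx_between(12, 0.5, 4, 7)

print(round(exact, 3), round(approx, 3))
```

The half-unit widening matters: each integer value of X is represented by the unit-wide strip centred on it, so an inclusive range from 4 to 7 becomes the interval from 3.5 to 7.5.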

Example 6.1

Find the probability of obtaining between 4 and 7 heads inclusive with 12 tosses of a fair coin,

a) using the binomial distribution,

b) using the normal approximation to the binomial distribution. [0.733, 0.732]

Solution

[Figure: probability histogram of the number of heads, horizontal axis marked from 0 to 13; the tallest bar has height 0.226.]

Example 6.2

It is known that in a sack of mixed grass seeds 35% are rye grass seeds. Use the normal approximation to the binomial distribution to find the probability that in a sample of 400 seeds, there are

a)less than 120 rye grass seeds,

b)between 120 and 150 rye grass seeds (inclusive),

c)more than 160 rye grass seeds. [0.0158, 0.8487, 0.0158]

Solution

 7 The Normal Approximation to the Poisson distribution

Under certain circumstances the normal distribution can be used as an approximation to the Poisson distribution.

If $X \sim Po(\lambda)$ then $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$.

For large $\lambda$ ($\lambda > 10$), $X \sim N(\lambda, \lambda)$ approximately.

Since we are using a continuous Normal distribution to approximate a discrete Poisson distribution,

we have to make a continuity correction.
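A short sketch comparing an exact Poisson probability with its normal approximation, including the continuity correction. The function names are assumptions for this example; the numbers are a mean of 25 and a count between 23 and 27 inclusive.

```python
import math
from statistics import NormalDist

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def poisson_between(lam, r, k):
    """Exact P(r <= X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(x, lam) for x in range(r, k + 1))

def normal_approx_between(lam, r, k):
    """Approximate P(r <= X <= k) by P(r - 0.5 < Y < k + 0.5), Y ~ N(lam, lam)."""
    Y = NormalDist(mu=lam, sigma=math.sqrt(lam))
    return Y.cdf(k + 0.5) - Y.cdf(r - 0.5)

exact = poisson_between(25, 23, 27)
approx = normal_approx_between(25, 23, 27)

print(round(exact, 3), round(approx, 3))
```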

Example 7.1 (Crawshaw and Chambers)

A radioactive disintegration gives counts that follow a Poisson distribution with mean count per second of 25.

Find the probability that in 1 second, the count is between 23 and 27 inclusive

a)using the Poisson distribution

b)using the Normal approximation to the Poisson distribution. [ a) 0.383 b) 0.383]

Solution

Example 7.2 (Crawshaw and Chambers)

In a certain factory the number of accidents occurring in a month follows a Poisson distribution with mean 4.

Find the probability that there will be at least 40 accidents during one year. [0.8901]

Solution

 8 When to Use the Different Approximations

Distribution of X / Restriction on parameters / Approximation
X ~ Bin(n,p) / n large (>50), p small (<0.1), np < 5 / X ~ Po(np)
X ~ Bin(n,p) / n > 10 with p close to $\frac{1}{2}$, or n > 30 with p moving away from $\frac{1}{2}$, or np > 5 and nq > 5. The bigger the value of n, the better the approximation. / X ~ N(np,npq)
X ~ Bin(n,p) / n is small, i.e. < 30 / None, use original distribution i.e. Bin(n,p)
$X \sim Po(\lambda)$ / $\lambda$ large ($\lambda > 10$) / $X \sim N(\lambda, \lambda)$
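To see why the restrictions in the table matter, the sketch below compares the exact probability with both approximations for X ~ Bin(20, 0.4) and the range 6 to 10 inclusive. Here p is not small, so the Poisson row of the table does not apply, and the Poisson figure is correspondingly poor. Variable and function names are assumptions for this example.

```python
import math
from statistics import NormalDist

n, p, lo, hi = 20, 0.4, 6, 10  # P(6 <= X <= 10) for X ~ Bin(20, 0.4)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

exact = sum(binomial_pmf(x, n, p) for x in range(lo, hi + 1))

# Normal approximation N(np, npq), with continuity correction.
Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
normal_approx = Y.cdf(hi + 0.5) - Y.cdf(lo - 0.5)

# Poisson approximation Po(np); still discrete, so no continuity correction.
poisson_approx = sum(poisson_pmf(x, n * p) for x in range(lo, hi + 1))

print(round(exact, 4), round(normal_approx, 4), round(poisson_approx, 4))
```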

Example 8.1

If X ~ Bin(20,0.4), find the probability that $6 \le X \le 10$.

Then find the approximations to this probability using

a)the normal distribution

b)the Poisson distribution. [ 0.7469, 0.7462, 0.6246]

Solution

Example 8.2

A large batch of clay pots is molded and fired. After firing, a random sample of 10 pots is inspected for flaws before glazing, decoration and final firing. If 25% of the pots in the batch have flaws, calculate correct to 3 significant figures, the probability that the random sample contains

a) no pot with flaws,

b) at least 3 pots with flaws.

The batch is accepted without further checking if the random sample contains no more than 2 pots with flaws. Calculate the probability of a batch being accepted.

80 samples of 10 pots are inspected. Calculate, using a suitable approximation,

c) the probability that at most 4 samples contain no pot with flaws,

d) the probability that more than 40 samples contain at least 3 pots with flaws.

[ 0.0563, 0.474, 0.526, 0.531, 0.2840]

Solution

 9 When There are More Than One Independent Normal Variables

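If X and Y are two independent normal variables such that $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, and a, b are constants, then

a) $X + Y \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$

b) $X - Y \sim N(\mu_1 - \mu_2,\ \sigma_1^2 + \sigma_2^2)$

c) $W \sim N(a\mu_1 + b,\ a^2\sigma_1^2)$ where W = aX + b

Proof: $E(W) = E(aX + b) = aE(X) + b = a\mu_1 + b$

$\mathrm{Var}(W) = \mathrm{Var}(aX + b) = a^2\mathrm{Var}(X) = a^2\sigma_1^2$

d) $aX \sim N(a\mu_1,\ a^2\sigma_1^2)$

e) $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$

f) $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$.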
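Results e) and f) can be sketched as a small helper that returns the mean and variance of a linear combination of two independent normal variables. The names `linear_combination` and `prob_between` are assumptions for this example, and the numbers X ~ N(60, 16), Y ~ N(70, 9) are used purely as an illustration.

```python
import math
from statistics import NormalDist

def linear_combination(a, x_params, b, y_params):
    """(mean, variance) of aX + bY for independent X and Y.

    Each variable is given as a (mean, variance) pair. The variances
    are always added, even when a or b is negative.
    """
    (mu1, var1), (mu2, var2) = x_params, y_params
    return a * mu1 + b * mu2, a**2 * var1 + b**2 * var2

def prob_between(mean, variance, lo, hi):
    """P(lo < W < hi) for W ~ N(mean, variance)."""
    W = NormalDist(mu=mean, sigma=math.sqrt(variance))
    return W.cdf(hi) - W.cdf(lo)

X, Y = (60, 16), (70, 9)

sum_params = linear_combination(1, X, 1, Y)     # X + Y
diff_params = linear_combination(-1, X, 1, Y)   # Y - X

p_sum = prob_between(*sum_params, 120, 135)
p_diff = prob_between(*diff_params, 2, 12)

print(sum_params, round(p_sum, 4))
print(diff_params, round(p_diff, 4))
```

Both combinations have variance 25: subtracting one variable from another does not reduce the spread.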

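If $X_1, X_2, \ldots, X_n$ are n independent normal variables such that $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$, ..., $X_n \sim N(\mu_n, \sigma_n^2)$, then

$X_1 + X_2 + \cdots + X_n \sim N(\mu_1 + \mu_2 + \cdots + \mu_n,\ \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2)$.

Also, $a_1X_1 + a_2X_2 + \cdots + a_nX_n \sim N(a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2)$.

In the special case when $X_1, X_2, \ldots, X_n$ are independent observations from the same normal distribution,

so that $X_i \sim N(\mu, \sigma^2)$ for i = 1, 2, ..., n, then $X_1 + X_2 + \cdots + X_n \sim N(n\mu,\ n\sigma^2)$.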

Note : Care must be taken to distinguish between the random variable 2X and the random variable X1 + X2,

where X1 and X2 are two independent observations of the random variable X.

If $X \sim N(\mu, \sigma^2)$, then $2X \sim N(2\mu,\ 4\sigma^2)$

but $X_1 + X_2 \sim N(2\mu,\ 2\sigma^2)$.
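The distinction can be illustrated with `NormalDist` from Python's standard `statistics` module, which supports multiplying a distribution by a constant and adding two distributions (addition treats them as independent). The values mean 10 and standard deviation 3 are arbitrary choices for this sketch.

```python
from statistics import NormalDist

# Two independent observations of the same variable X ~ N(10, 9).
X1 = NormalDist(mu=10, sigma=3)
X2 = NormalDist(mu=10, sigma=3)

doubled = X1 * 2    # the variable 2X: one observation, scaled by 2
summed = X1 + X2    # the variable X1 + X2: two independent observations

print(doubled.mean, round(doubled.variance, 6))  # mean 20, variance 4 * 9
print(summed.mean, round(summed.variance, 6))    # mean 20, variance 2 * 9
```

The means agree but the variances do not: doubling a single observation doubles every deviation from the mean, whereas in a sum of two independent observations the deviations partly cancel.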

Example 9.1

If X ~ N(60,16) and Y ~ N(70,9), find

a)P(120 < X + Y < 135)

b) P(2 < Y - X < 12) [0.8185, 0.6006]

Solution

Example 9.2

The weights of Thai durians are normally distributed with mean 2000g and standard deviation 120g.

Determine, correct to 3 significant figures, the probability that

a) a sample of 4 Thai durians weighs more than 8200g.

The weights of ‘XO’ durians are normally distributed with mean 1750g and standard deviation 90g.

Determine, correct to 2 significant figures, the probability that

b) an ‘XO’ durian weighs less than a Thai durian,

c) a sample of 8 ‘XO’ durians weighs more than a sample of 7 Thai durians. [0.203, 0.952, 0.500]

Solution

Example 9.3

In a cafeteria, baked beans are served either in ordinary portions or in children’s portions. The quantity given for an ordinary portion is a normal variable with mean 90g and standard deviation 3g and the quantity given for a children’s portion is a normal variable with mean 43g and standard deviation 2g. What is the probability that John, who has 2 children’s portions, is given more than his father, who has an ordinary portion? [Ans: 0.166]

Solution

Example 9.4 (N81/2/11) (modified part)

The thickness, P cm, of a randomly chosen paperback is normally distributed with mean 2 and variance 0.730. The thickness, H cm, of a randomly chosen hardback is normally distributed with mean 4.9 and variance 1.920.

a)Find the probability that the combined thickness of four randomly chosen paperbacks is greater than the combined thickness of two randomly chosen hardbacks.

b)Find the probability that a randomly chosen paperback is less than half the thickness of a randomly chosen hardback. [0.2444, 0.6586]

Solution

Example 9.5 (J87/2/8)

The weight of the contents of a randomly chosen packet of breakfast cereal A may be taken to have a normal distribution with mean 625g and standard deviation 15g. The weight of the packaging may be taken to have an independent normal distribution with mean 25g and standard deviation 3g.

Find, giving 3 significant figures in your answers,

a) the probability that a randomly chosen packet of A has a total weight exceeding 630g

b) the probability that the total weight of the contents of 4 randomly chosen packets of A exceeds 2450g.

The weight of the contents of a randomly chosen packet of breakfast cereal B may be taken to have a normal distribution with mean 465g and standard deviation 10g. Find the probability that the contents of 4 randomly chosen packets of B weigh more than the contents of 3 randomly chosen packets of A.

[0.904, 0.952, 0.324]

Solution

 10 Miscellaneous Examples

Example 10.1 (HCJC 96/2/8 part)

The heights of males in a certain age group are normally distributed with mean 172 cm and standard deviation $\sigma$ cm. The heights of females in the same age group are also normally distributed with mean 166 cm and standard deviation 12 cm.

a) If 95% of the males are taller than 155.55 cm, show that $\sigma$ = 10 cm.

b)Find the probability that a randomly chosen male has a height between 170 cm and 171 cm.

Deduce, to three significant figures, the probability that each of two randomly chosen males has a height between 170 cm and 171 cm.

c)Find the probability that the total height of three randomly chosen males exceeds three times the height of a randomly chosen female by at least 11 cm. [0.0395, 0.00156, 0.5695]

Solution

SUMMARY

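Notation: $X \sim N(\mu, \sigma^2)$.

Probability density function: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$, where $\mu$ and $\sigma > 0$ are real constants.

Cumulative distribution function: $\Phi(z) = P(Z < z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$.

Use of cumulative tables: $P(Z < a) = \Phi(a)$, where a, b > 0

$P(Z > a) = 1 - P(Z < a) = 1 - \Phi(a)$

$P(a \le Z \le b) = \Phi(b) - \Phi(a)$

$P(-a \le Z \le b) = \Phi(b) + \Phi(a) - 1$

Standardization: If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ is a random variable with Z ~ N(0,1).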

Normal distribution as an Approximation

Distribution of X / Restriction on parameters / Approximation
X ~ Bin(n,p) / n large (>50), p small (<0.1), np < 5 / X ~ Po (np)
X ~ Bin(n,p) / n > 30, np > 5 / X ~ N(np,npq)
X ~ Bin(n,p) / n is small, i.e < 30 / None, use original distribution i.e. Bin(n,p)
$X \sim Po(\lambda)$ / $\lambda$ large ($\lambda > 10$) / $X \sim N(\lambda, \lambda)$

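If X and Y are two independent normal variables such that $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, and a, b are constants, then

a) $X + Y \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$

b) $X - Y \sim N(\mu_1 - \mu_2,\ \sigma_1^2 + \sigma_2^2)$

c) $W \sim N(a\mu_1 + b,\ a^2\sigma_1^2)$ where W = aX + b

d) $aX \sim N(a\mu_1,\ a^2\sigma_1^2)$

e) $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$

f) $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$.

If $X_1, X_2, \ldots, X_n$ are n independent normal variables such that $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$, ..., $X_n \sim N(\mu_n, \sigma_n^2)$, then $X_1 + X_2 + \cdots + X_n \sim N(\mu_1 + \mu_2 + \cdots + \mu_n,\ \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2)$.

Also, $a_1X_1 + a_2X_2 + \cdots + a_nX_n \sim N(a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2)$.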


A Simple Monk Puzzle

One morning, exactly at sunrise, a Buddhist monk began to climb a tall mountain. The narrow path, no more than a foot or two wide, spiraled around the mountain to a glittering temple at the summit.
The monk ascended the path at varying rates of speed, stopping many times along the way to rest and to eat the dried fruit he carried with him. He reached the temple shortly before sunset. After several days of fasting and meditation, he began his journey back along the same path, starting at sunrise and again walking at variable speeds with many pauses along the way. His average speed descending was, of course, greater than his average climbing speed.
Prove that there is a spot along the path that the monk will occupy on both trips at precisely the same time of day. [Hint: Think simply; you only need mathematics up to Secondary One level.] [Remark: This puzzle leads on to a result known as a fixed-point theorem.]

Puzzle taken from lectures by Prof Tan Eng Chye, NUS Maths Dept