Using Statistics Note sheet - 3
The Normal Distribution
This symmetrical, continuous distribution is described by two parameters: the mean, μ (describing the position), and the variance, σ² (describing the spread); s² is the sample estimate of σ². [Remember that a Poisson distribution is described by just one parameter, the mean.] A normal distribution always has the characteristic bell shape. It is sometimes called the Gaussian distribution. The standard deviation, σ, is the square root of the variance.
In a normal distribution...
μ ± σ contains 68.27% of the observations - 50% fall within μ ± 0.674σ
μ ± 2σ contains 95.45% of the observations - 95% fall within μ ± 1.96σ
μ ± 3σ contains 99.73% of the observations - 99% fall within μ ± 2.576σ
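The percentages above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `coverage` and `half_width` are made up for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def half_width(proportion):
    """Number of standard deviations either side of the mean enclosing `proportion`."""
    return std_normal.inv_cdf(0.5 + proportion / 2)

for k in (1, 2, 3):
    print(f"mean ± {k} sd contains {coverage(k):.2%} of the observations")

for p in (0.50, 0.95, 0.99):
    print(f"{p:.0%} of the observations fall within mean ± {half_width(p):.3f} sd")
```

Because every normal distribution has the same shape once it is measured in standard deviations, the same proportions apply whatever the values of the mean and variance.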
Most parametric statistical tests assume that the data are normally distributed. Check that this is true first!
The ‘Standard’ Normal Distribution
This is a derived distribution where each observation is processed by subtracting the mean and dividing by the standard deviation. This gives a normal distribution with a mean of 0 and a variance of 1.
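As an illustration of the standardising step, here is a minimal sketch in Python. The function name `standardise` and the `heights` values are invented for the example, and the data are treated as a whole population (so `pstdev` is used rather than the sample estimate).

```python
from statistics import mean, pstdev

def standardise(values):
    """Subtract the mean from each observation and divide by the standard deviation."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation (assumption for this sketch)
    return [(x - m) / sd for x in values]

# Made-up observations, purely for illustration.
heights = [150.0, 160.0, 165.0, 170.0, 180.0]
z_scores = standardise(heights)

print("standardised values:", [round(z, 3) for z in z_scores])
print("mean of standardised values:", round(mean(z_scores), 6))
print("variance of standardised values:", round(pstdev(z_scores) ** 2, 6))
```

Standardising only shifts and rescales the observations, so it does not make non-normal data normal; it simply expresses each observation in units of standard deviations from the mean.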
Convergence
A Poisson distribution with a large mean will approximate to a normal distribution, as will a binomial distribution with >100 observations (or fewer if p ≈ 0.5).
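The convergence of the Poisson distribution can be seen numerically. The sketch below compares Poisson probabilities with the density of a normal distribution that has the same mean and variance; the function names and the choice of means (2 and 100) are assumptions made for this illustration only.

```python
import math
from statistics import NormalDist

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    # Computed on the log scale to avoid overflow for large k.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def relative_gap(lam):
    """Largest difference between Poisson probabilities and the matching normal
    density, expressed as a fraction of the height of the normal curve's peak."""
    sd = math.sqrt(lam)  # for a Poisson distribution the variance equals the mean
    normal = NormalDist(mu=lam, sigma=sd)
    upper = int(lam + 10 * sd) + 1
    gap = max(abs(poisson_pmf(k, lam) - normal.pdf(k)) for k in range(upper))
    return gap / normal.pdf(lam)

for lam in (2, 100):
    print(f"mean = {lam}: largest relative gap = {relative_gap(lam):.3f}")
```

The gap for the larger mean should be much smaller than for the smaller mean, which is what 'approximates to a normal distribution' means in practice.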
Sampling Distributions (Central Limit Theorem)
The means of samples taken from any shape of parent distribution will themselves have an approximately normal distribution, provided the samples are reasonably large. This is the basis for the rule that the standard deviation of the mean (i.e. the standard error) of a sample is σ/√n.
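A small simulation can illustrate the standard error rule. This sketch draws repeated samples from a strongly right-skewed parent distribution (an exponential distribution, chosen arbitrarily for the example) and compares the standard deviation of the sample means with the parent standard deviation divided by √n. The function name, sample size, number of samples and random seed are all assumptions for illustration.

```python
import math
import random
from statistics import mean, stdev

def sample_means(draw, n, n_samples):
    """Return the means of n_samples independent samples, each of n observations."""
    return [sum(draw() for _ in range(n)) / n for _ in range(n_samples)]

rng = random.Random(1)   # fixed seed so the run is repeatable
parent_sd = 1.0          # an exponential distribution with rate 1 has mean 1 and sd 1
n = 30                   # observations per sample

means = sample_means(lambda: rng.expovariate(1.0), n, 5000)

observed_se = stdev(means)
expected_se = parent_sd / math.sqrt(n)

print(f"mean of the sample means: {mean(means):.3f}")
print(f"sd of the sample means:   {observed_se:.4f}")
print(f"parent sd / sqrt(n):      {expected_se:.4f}")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the parent distribution is far from normal.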
Describing the Distribution Further.
Two types of departure from normality in a data set are:
Using Statistics - 3 - Calvin Dytham
Skewness - This is another word for asymmetry; skewness means that one tail of the bell-shaped curve is drawn out more than the other. Skews are either to the right or to the left, depending on whether the right or left tail is drawn out (e.g. a long right tail gives a right-skewed distribution).
Kurtosis - This can be either leptokurtic or platykurtic. A leptokurtic distribution has more observations very close to the mean and in the tails than a normal distribution. A platykurtic distribution has more observations in the ‘shoulders’ and fewer around the mean and in the tails. A bimodal distribution is, therefore, extremely platykurtic.
The skewness parameter is γ1 and the kurtosis parameter is γ2. These are estimated by the statistics g1 and g2.
For data from a normal distribution both g1 and g2 should be close to zero. A negative g1 indicates skewness to the left and a positive g1 skewness to the right. A negative g2 indicates a platykurtic distribution and a positive g2 a leptokurtic distribution.
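To show how the signs of g1 and g2 behave, here is a minimal sketch in Python. It assumes the simple moment-based definitions (third and fourth central moments scaled by powers of the second); statistical packages often apply small-sample corrections, so their values may differ slightly. The function name and the example data sets are invented for illustration.

```python
from statistics import mean

def g1_g2(values):
    """Moment-based skewness (g1) and kurtosis (g2) of a data set.

    Assumption: uncorrected moment estimators; packages may use
    bias-corrected versions that give slightly different values.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3  # subtract 3 so that a normal distribution scores zero
    return g1, g2

# Made-up data sets, purely for illustration.
examples = {
    "long right tail": [1, 1, 2, 2, 3, 10],
    "symmetrical and flat": [1, 2, 3, 4, 5],
    "bimodal": [1, 1, 1, 9, 9, 9],
}

for name, data in examples.items():
    g1, g2 = g1_g2(data)
    print(f"{name}: g1 = {g1:.3f}, g2 = {g2:.3f}")
```

The data set with the long right tail gives a positive g1, the symmetrical data sets give a g1 of zero, and the bimodal data set gives a strongly negative g2, in line with the description of a bimodal distribution as extremely platykurtic.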
In SPSS 9 these statistics can be calculated easily. Go into ‘Descriptive statistics’ then ‘Descriptives..’ and then change the ‘Options’ to include Skewness and Kurtosis.