CFA Level One – Quantitative Methods:

Session 2 – Basic Statistics and Probability

Frequency Distributions:

Appropriate Number of Classes k: 2^k ≥ N (N = total number of observations)

Sample Mean: x̄ = Σxᵢ / n

Weighted Mean: x̄w = Σwᵢxᵢ / Σwᵢ

Median: midpoint of the data (half of the data points above and half below) – important when the mean is affected by outliers

Mode: most frequently occurring value

Positive Skewness: Mode < Median < Mean

Negative Skewness: Mean < Median < Mode

Arithmetic Mean (using classes): x̄ = Σfx / n (where f = frequency of observations in a class and x = class midpoint)

Median: locate the class in which the median lies and interpolate with:

Median = L + ((n/2 – CF) / f) × I

(where L = lower limit of the class containing the median; CF = cumulative frequency preceding the median class; f = frequency of observations in the median class; I = class interval of the median class)
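A quick numeric check of the interpolation, as a minimal Python sketch; all class values are made up for illustration:

```python
# Interpolated median for grouped data: Median = L + ((n/2 - CF) / f) * I
L = 20.0    # lower limit of the class containing the median (made up)
n = 50      # total number of observations (made up)
CF = 18     # cumulative frequency preceding the median class (made up)
f = 12      # frequency of observations in the median class (made up)
I = 10.0    # class interval of the median class (made up)

median = L + ((n / 2 - CF) / f) * I
print(round(median, 2))   # 25.83
```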

Variance and Standard Deviation:

Sample Standard Deviation: s = √[Σ(x – x̄)² / (n – 1)]

For the population Std Dev σ, use the population mean μ instead of x̄ and use N instead of n – 1

(Variance is the square of Std Dev)

For a frequency distribution: s = √[Σf(x – x̄)² / (n – 1)] (f is the frequency of each class; x the class midpoint)
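A minimal Python sketch of the sample standard deviation on made-up data, cross-checked against the standard library:

```python
import math
import statistics

data = [4.0, 7.0, 6.0, 9.0, 4.0]                    # made-up observations
n = len(data)
mean = sum(data) / n                                # sample mean x-bar
var = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance (n - 1)
sd = math.sqrt(var)                                 # sample standard deviation

assert abs(sd - statistics.stdev(data)) < 1e-12     # stdev also uses n - 1
print(round(mean, 2), round(var, 2), round(sd, 4))  # 6.0 4.5 2.1213
```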

For normally distributed data, approximately:

68% of observations are within 1 S.D. of the mean

95% of observations are within 2 S.D. of the mean

99.7% of observations are within 3 S.D. of the mean

Coefficient of Variation enables comparison of two or more S.D.s where the units differ: CV = s / x̄

Skewness can be quantified using Pearson's coefficient: Sk = 3(Mean – Median) / s

Basic Probability

For mutually exclusive events: P(A or B) = P(A) + P(B)

For non-mutually exclusive events: P(A or B) = P(A) + P(B) – P(A and B)

For independent events: P(A and B) = P(A) x P(B)

For conditional events: P(A and B) = P(A) x P(B given that A occurs) = P(A) x P(B|A)

Bayes' theorem calculates the posterior probability, which is a revision of probability based on a new event, i.e. the probability that A occurs once we know B has occurred:

P(A1|B) = [P(A1) × P(B|A1)] / [P(A1) × P(B|A1) + P(A2) × P(B|A2)]
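A minimal Python sketch of the formula with two mutually exclusive, exhaustive events and made-up probabilities:

```python
p_a1, p_a2 = 0.30, 0.70                   # priors (must sum to 1); made up
p_b_given_a1, p_b_given_a2 = 0.90, 0.20   # likelihoods of B; made up

p_b = p_a1 * p_b_given_a1 + p_a2 * p_b_given_a2   # total probability of B
p_a1_given_b = p_a1 * p_b_given_a1 / p_b          # posterior P(A1|B)
print(round(p_a1_given_b, 4))                     # 0.6585
```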

Mean of a Probability Distribution: μ = Σ x·P(x)

(where x is the value of a discrete random variable and P(x) the probability of that value occurring)

Standard Deviation of a Probability Distribution: σ = √[Σ(x – μ)²·P(x)] (Variance = σ²)
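A minimal Python sketch of both formulas for a made-up discrete distribution:

```python
import math

xs = [0, 1, 2, 3]            # values of the discrete random variable (made up)
ps = [0.1, 0.3, 0.4, 0.2]    # P(x) for each value; must sum to 1

mu = sum(x * p for x, p in zip(xs, ps))               # mean = sum of x * P(x)
var = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))  # sum of (x - mu)^2 P(x)
print(round(mu, 2), round(var, 4), round(math.sqrt(var), 4))  # 1.7 0.81 0.9
```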

Binomial Probability

·  2 mutually exclusive outcomes for each trial

·  The random variable is the result of counting the successes of the total number of trials

·  The probability of success is constant from one trial to another

·  Trials are independent of one another

Binomial Probability of x successes in n trials: P(x) = nCx · p^x · (1 – p)^(n – x)

·  Good for small values of n

·  Binomial tables can be used, tabulated for a given value of n by:

- p (probability of success)

- x (number of observed successes)

·  Probabilities can be summed to give a cumulative probability

For a binomial distribution:

Mean: μ = np    Std. Dev: σ = √[np(1 – p)]
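A minimal Python sketch of the binomial pmf, mean and standard deviation with made-up n, p and x:

```python
import math

n, p, x = 10, 0.4, 3                                 # made-up trial parameters
prob = math.comb(n, x) * p**x * (1 - p) ** (n - x)   # P(exactly x successes)
mean = n * p                                         # np
sd = math.sqrt(n * p * (1 - p))                      # sqrt(np(1 - p))
print(round(prob, 4), mean, round(sd, 4))            # 0.215 4.0 1.5492
```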

Poisson Probability Distribution

Used when p < 0.05 and n > 100 (it is the limiting form of the binomial distribution function): P(x) = μ^x · e^(–μ) / x!

For a Poisson distribution mean = variance, so std. dev. σ = √μ

Poisson distribution tables are available and give the probability of x (number of successes) for a given μ (mean number of successes)
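A minimal Python sketch of the Poisson pmf with made-up μ and x:

```python
import math

mu, x = 2.0, 3                                      # made-up mean and count
prob = mu**x * math.exp(-mu) / math.factorial(x)    # P(x) = mu^x e^-mu / x!
print(round(prob, 4))                               # 0.1804
```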

Normal Probability Distribution

Standard Normal Distribution

·  Mean = median = mode

·  Mean = 0

·  σ = 1

·  Used as a comparison for other normal distributions

Z values

Used to standardize observations from normal distributions. The Z value describes how far an observation is from the population mean in terms of standard deviations:

Z = (Observation – Mean) / Std Dev = (x – μ) / σ

The area under the curve in a normal distribution between two values is the probability of an observation falling between the two values.

A table of Z values is used to find the area under the normal curve between the mean and a particular figure; these are given as decimals, so multiply by 100 to get the percentage probability. These tables are used for:

·  Finding the probability that an observation will fall between the mean and a given value

A one-tailed test of Z

·  Finding the probability that an observation will fall in a range around the mean (a confidence interval)

A two-tailed test of Z

Confidence intervals are two-tailed tests of Z and describe the probability of an observation falling within a given number of standard deviations of the mean

34% of observations fall between the mean and 1σ, so 68% fall within ±1σ

45% of observations fall between the mean and 1.65σ, so 90% fall within ±1.65σ

47.5% of observations fall between the mean and 1.96σ, so 95% fall within ±1.96σ

49.5% of observations fall between the mean and 2.58σ, so 99% fall within ±2.58σ
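These multipliers can be verified with the standard normal CDF from the Python standard library, as a minimal sketch:

```python
from statistics import NormalDist

z = NormalDist()                          # standard normal: mean 0, sigma 1
for k in (1.0, 1.65, 1.96, 2.58):
    centre_to_k = z.cdf(k) - 0.5          # area from the mean to +k sigma
    print(k, round(centre_to_k, 4), round(2 * centre_to_k, 4))
# 1.0  0.3413 0.6827
# 1.65 0.4505 0.9011
# 1.96 0.475  0.95
# 2.58 0.4951 0.9901
```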


Standard Errors

Sampling error = x̄ – μ (likewise for variance and std. dev.: it is the difference between the sample statistic and the population statistic)

A distribution of sample means from large samples has an approximately normal distribution, a mean of μ and a variance of σ²/n

Point estimates are single sample values used to estimate population parameters (e.g. the sample mean x̄). Interval estimates are calculated from the point estimates and describe the range around the point estimate in which the population parameter is likely to fall for a given level of confidence – as from Z values:

·  95% of the time the population mean μ is within ±1.96 standard errors of the sample mean

·  99% of the time the population mean μ is within ±2.58 standard errors of the sample mean

The standard error of sample means (the std. dev. of the distribution of sample means) quantifies this: σx̄ = σ / √n

If σ is not known, use the sample standard deviation s: sx̄ = s / √n

Calculation of the confidence intervals uses the Z values:

e.g. 95% confidence interval: x̄ ± 1.96σ/√n

For other intervals replace “1.96” with the Z value from the Z value table for a given confidence level
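A minimal Python sketch of a 95% confidence interval for the mean, using the sample standard deviation in place of σ and made-up sample values:

```python
import math
import statistics

sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]  # made-up data
n = len(sample)
xbar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)    # standard error of the mean
low, high = xbar - 1.96 * se, xbar + 1.96 * se  # 95% two-tailed interval
print(round(xbar, 3), round(low, 3), round(high, 3))
```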

Hypothesis Testing
Step 1 – Write the Null Hypothesis and the Alternative Hypothesis

H0 always contains an equality; H1 never contains an equality.

For a one-tailed test H0 is greater than or equal to (or less than or equal to) a given test value; H1 is then less than (or greater than) the test value. For a two-tailed test H0 is equal to a test value and H1 does not equal that value.

Step 2 – Select the level of significance

The significance level is the probability of rejecting the Null Hypothesis when it is true (related to confidence intervals):

5% significance level: 95% confidence level . . . . ±1.96σ (for a 2-tailed test)

1% significance level: 99% confidence level . . . . ±2.58σ (for a 2-tailed test)

The risk of rejecting a correct H0 is α risk; the risk of accepting a wrong H0 is β risk:

H0 / Accept / Reject

Is true / correct / α error

Is false / β error / correct

Step 3 – Calculate the test statistic

Z = (x̄ – μ) / σx̄

where: Z = test statistic; x̄ = sample mean; μ = population mean (H0); σx̄ = standard error of sample means (if σ is unavailable use the sample std. dev.: sx̄ = s / √n)

For single observations use just the Z value: Z = (x – μ) / σ

Step 4 – Establish the decision rule

The decision rule states when to reject the null hypothesis:

·  How many standard deviations (Z) the sample mean must be from H0 in order for H0 to be rejected

·  Driven by the significance level

Critical Z values:

Significance Level / One-tailed / Two-tailed

10% / 1.28σ / 1.65σ

5% / 1.645σ / 1.96σ

1% / 2.33σ / 2.58σ

If the computed Z for the sample mean is smaller in magnitude than the critical Z value, the null hypothesis is accepted.

Step 5 – Make the decision, based on the data.

Additionally, P-values are considered; the P-value is the probability of observing a sample at least as extreme as the current one, assuming that the null hypothesis is true.

If the P-value is less than the significance level then H0 is rejected. For a two-tailed test:

P-value = 2 × (0.5 – the area under the curve from the centre of the distribution to Z*)

* taken from the standard Z tables
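A minimal Python sketch of a two-tailed Z test with its P-value, using made-up sample statistics:

```python
import math
from statistics import NormalDist

xbar, mu0, s, n = 102.0, 100.0, 8.0, 64       # made-up sample vs H0: mu = 100
z = (xbar - mu0) / (s / math.sqrt(n))         # test statistic; here 2.0
centre_to_z = NormalDist().cdf(abs(z)) - 0.5  # area from the centre to |Z|
p_value = 2 * (0.5 - centre_to_z)             # two-tailed P-value
print(round(z, 2), round(p_value, 4))         # 2.0 0.0455
if p_value < 0.05:
    print("reject H0 at the 5% significance level")
```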

Correlation Coefficient: r = Cov(x, y) / (sx·sy)

Coefficient of determination = R² = r² = (Y's variability explained by variability in X) / (Y's total variability) = % of Y's variability explained by X

e.g. R2 = 0.80 between a stock and the index means that 80% of the stock’s movement is explained by movement in the index (systematic risk) and 20% is specific to the stock (unsystematic risk)

Total risk = Systematic Risk + Unsystematic Risk

t-statistics and hypothesis testing

t-Statistics are used in place of Z-statistics for small samples (n < 30); they are used in hypothesis testing just as the Z-statistic is.

t = r·√(n – 2) / √(1 – r²)

(r = correlation coefficient; r² = coefficient of determination; n – 2 = degrees of freedom)

Use t-statistics to determine whether the correlation of x and y is significantly different from zero; i.e. the null hypothesis is that x and y are uncorrelated:

·  H0: ρ = 0

·  H1: ρ ≠ 0

Once the t-value is computed, look it up in a table of critical t-values for the given number of degrees of freedom and significance level; if the computed t-value is greater than the critical t-value then reject the null hypothesis.
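A minimal Python sketch of the correlation significance test, with made-up r and n and a critical value read from standard t-tables:

```python
import math

r, n = 0.60, 20                       # made-up correlation and sample size
df = n - 2                            # degrees of freedom
t_stat = r * math.sqrt(df) / math.sqrt(1 - r**2)
t_crit = 2.101                        # two-tailed 5% critical t, 18 df (tables)
print(round(t_stat, 3), t_stat > t_crit)   # 3.182 True -> reject H0: rho = 0
```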

Linear Regression Analysis

The relationship between x and y can be shown on a regression line – the best-fit straight line through a set of data points. The regression line is found by the least squares principle and is based on the equation for a straight line:

y = a + bx

where: b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and a = ȳ – b·x̄

Non-linear series (e.g. a curvilinear series) can be plotted by taking logs (NB – a curvilinear series is an example of compounding and is exponential, Y = a·b^X):

log(Y) = log(a) + X·log(b)

Once the linear relationship has been established, the Standard Error of estimate can be calculated using the sum of squared errors (SSE) method:

se = √[Σ(yᵢ – y′)² / (n – 2)] = √[SSE / (n – 2)]

Where y' is the value of y for each value of x from the regression line and yi is the actual value of y for each x

An equivalent computational form could also be used: se = √[(Σy² – a·Σy – b·Σxy) / (n – 2)]

In order to determine how good the regression is (i.e. how closely variation in y is correlated with variation in x) we use the coefficient of determination R²:

R² = 1 – Σ(yᵢ – y′)² / Σ(yᵢ – ȳ)²

where Σ(yᵢ – y′)² = SSE and Σ(yᵢ – ȳ)² = y's total sum of squares

As SSE tends to 0, R² tends to 1, telling us that the observations of y for a given x fall closer and closer to the regression line.

Also, since r = √R², when R² = 1, r = 1; and as r tends to 0 the standard error of estimate tends to sy (the standard deviation of y)
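A minimal Python sketch of the whole fit (slope, intercept, SSE, standard error of estimate and R²) on made-up data points:

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up independent variable
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # made-up dependent variable
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n

b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((x - xbar) ** 2 for x in xs))                   # slope
a = ybar - b * xbar                                        # intercept

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # sum sq. errors
ss_total = sum((y - ybar) ** 2 for y in ys)                # y's total sum sq.
se = math.sqrt(sse / (n - 2))                   # standard error of estimate
r2 = 1 - sse / ss_total                         # coefficient of determination
print(round(b, 3), round(a, 3), round(se, 3), round(r2, 4))
```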

The following assumptions underpin linear regression:

·  Normality: for each observation of x the distribution of dependent y variables is normal

·  Linearity: the mean of each dependent observation y lies on the regression line

·  Homoskedasticity: the variability of y doesn’t change with x

·  Statistical Independence: the dependent observations of y are unrelated to one another

Confidence intervals are used in regression analysis to determine the range in which a dependent variable y lies for a given value of the independent variable x and to determine the probability that it applies to the whole group:

Confidence Interval: ŷ ± t·se·√[1/n + (x – x̄)² / Σ(x – x̄)²]

The prediction interval is very similar but applies only to a specific dependent variable and not to the entire group:

Prediction Interval: ŷ ± t·se·√[1 + 1/n + (x – x̄)² / Σ(x – x̄)²]

The prediction interval is wider than the confidence interval since we are less confident predicting a specific value of y from x than predicting the mean value for the group

As SSE increases the confidence and prediction intervals widen – i.e. the error factor (unsystematic risk) increases and confidence in using the regression model to predict y falls

Steps for using the confidence and prediction intervals:

·  First compute the intercept and the slope for the data set (from equations for the b and a constants)

·  Compute the standard error of estimate from: se = √[SSE / (n – 2)]

·  Then, clarify the confidence and prediction interval that we are trying to compute – i.e. what data are we seeking for which a confidence or prediction interval is needed

·  Find the critical t-Statistic for the given significance level and degrees of freedom (n – 2)

·  Compute the confidence and prediction intervals using the equations above (NB for a large sample size the adjustment term under the square root is negligible, so use ŷ ± t·se and confidence interval ≈ prediction interval); a worked sketch follows below
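A minimal Python sketch of both intervals at a chosen x0, carrying over made-up regression outputs (a, b, se, the x summaries and a table critical t):

```python
import math

a, b = 0.05, 1.99              # fitted intercept and slope (made up)
se = 0.19                      # standard error of estimate (made up)
n, xbar, ssx = 5, 3.0, 10.0    # sample size, mean of x, sum((x - xbar)^2)
t_crit = 3.182                 # two-tailed 5% critical t for n - 2 = 3 df
x0 = 4.0
y_hat = a + b * x0             # regression estimate at x0

half_ci = t_crit * se * math.sqrt(1 / n + (x0 - xbar) ** 2 / ssx)      # group
half_pi = t_crit * se * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / ssx)  # single
print(round(y_hat, 2), round(half_ci, 3), round(half_pi, 3))  # PI > CI
```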

Time Series

A data set which is a function of time – e.g. a stock price series

Key components:

·  Secular trend: smooth, long term trend embedded in a time series

·  Cyclical trend: intermediate-term variation (> 1 year) – e.g. the business cycle

·  Seasonal variation: shorter-term variation (< 1 year) patterns that may repeat over time

·  Irregular variation – episodic variation that is definable but unpredictable

The linear trend equation is the same as the basic regression equation but t replaces x:

y = a + bt

where: b = Σ(t – t̄)(y – ȳ) / Σ(t – t̄)² and a = ȳ – b·t̄

Moving averages smooth out the variability of a data series by calculating and plotting a series of simple averages. In order to apply a moving average the data series must have a Linear Trend (T) and a Rhythmic Cycle (C) – the moving average smoothes out the cyclical and irregular components of a time series.
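A minimal Python sketch of a simple moving average over a made-up price series:

```python
def moving_average(series, window):
    """Simple moving averages of `series` over `window` points."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

prices = [10, 12, 11, 13, 15, 14, 16, 18]   # made-up price series
print(moving_average(prices, 3))            # smooths cyclical/irregular noise
```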

PVs, FVs, Annuities and Holding Periods

Future Value: FV = PV × (1 + r)^n

Present Value: PV = FV / (1 + r)^n
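A minimal Python sketch of both identities with a made-up cash flow, rate and horizon:

```python
def future_value(pv, r, n):
    return pv * (1 + r) ** n        # FV = PV(1 + r)^n

def present_value(fv, r, n):
    return fv / (1 + r) ** n        # PV = FV / (1 + r)^n

fv = future_value(1000.0, 0.08, 5)  # 1469.33 (rounded)
pv = present_value(fv, 0.08, 5)     # recovers 1000.0
print(round(fv, 2), round(pv, 2))
```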