CFA Level One – Quantitative Methods:
Session 2 – Basic Statistics and Probability
Frequency Distributions:
Appropriate Number of Classes: choose the smallest k such that 2^k ≥ N (N = total number of observations)
Sample Mean: x̄ = Σx / n
Weighted Mean: x̄w = Σ(w × x) / Σw
Median: the middle value of the ordered data (half of the data points above and half below) – important when the mean is distorted by outliers
Mode: most frequently occurring value
Positive Skewness: Mode < Median < Mean
Negative Skewness: Mean < Median < Mode
Arithmetic Mean (using classes): x̄ = Σ(f × x) / n (where f = frequency of observations in a class and x = the class midpoint)
Median (locate the class in which the median lies and interpolate with):
Median = L + ((n/2 – CF) / f) × I
(where L = lower limit of the class containing the median; CF = cumulative frequency preceding the median class; f = frequency of observations in the median class; I = class interval of the median class)
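A minimal Python sketch of the two grouped-data formulas above; the class limits and frequencies are made-up illustrative values, not from the notes:

```python
# Grouped-data mean and interpolated median (illustrative data)
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]   # class intervals
freq    = [5, 12, 8, 3]                             # f per class
n = sum(freq)

# Arithmetic mean using class midpoints: x_bar = sum(f * x) / n
midpoints = [(lo + hi) / 2 for lo, hi in classes]
mean = sum(f * x for f, x in zip(freq, midpoints)) / n

# Median by interpolation: L + ((n/2 - CF) / f) * I
cf = 0
for (lo, hi), f in zip(classes, freq):
    if cf + f >= n / 2:                  # class containing the median
        median = lo + ((n / 2 - cf) / f) * (hi - lo)
        break
    cf += f

print(f"mean = {mean:.2f}, median = {median:.2f}")
```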
Variance and Standard Deviation:
Sample Standard Deviation: s = √( Σ(x – x̄)² / (n – 1) )
For the population Std Dev σ, use μ instead of x̄ and use N instead of n – 1
(Variance is the square of Std Dev)
For a frequency distribution: s = √( Σf(x – x̄)² / (n – 1) ) (f is frequency by class; x the class midpoint)
For a normal (bell-shaped) distribution:
68% of observations are within 1 S.D. of the mean
95% of observations are within 2 S.D. of the mean
99.7% of observations are within 3 S.D. of the mean
Coefficient of Variation enables comparison of two or more S.D.s where the units differ: CV = s / x̄
Skewness can be quantified using Pearson's coefficient of skewness: Sk = 3(mean – median) / s
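A short Python sketch of the dispersion and skewness measures above, using the standard library statistics module (the data series is illustrative):

```python
import statistics as st

# Illustrative return series (made up for the sketch)
x = [4.5, 6.1, 5.8, 7.2, 5.0, 9.4, 6.3]

mean   = st.mean(x)
median = st.median(x)
s      = st.stdev(x)            # sample std dev (n - 1 denominator)
sigma  = st.pstdev(x)           # population std dev (N denominator)

cv   = s / mean                 # coefficient of variation
skew = 3 * (mean - median) / s  # Pearson's coefficient of skewness

print(f"s = {s:.3f}, CV = {cv:.3f}, skewness = {skew:.3f}")
```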
Basic Probability
For mutually exclusive events: P(A or B) = P(A) + P(B)
For non-mutually exclusive events: P(A or B) = P(A) + P(B) – P(A and B)
For independent events: P(A and B) = P(A) x P(B)
For conditional events: P(A and B) = P(A) x P(B given that A occurs) = P(A) x P(B|A)
Bayes' theorem calculates the posterior probability, a revision of probability based on a new event, i.e. the probability that A occurs once we know B has occurred:
P(A1|B) = [P(A1) × P(B|A1)] / [P(A1) × P(B|A1) + P(A2) × P(B|A2)]
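A quick numeric sketch of Bayes' theorem; the prior and conditional probabilities below are hypothetical illustrative values:

```python
# Bayes' theorem with two mutually exclusive, exhaustive events A1, A2.
p_a1 = 0.05              # P(A1): prior probability of A1
p_a2 = 0.95              # P(A2) = 1 - P(A1)
p_b_given_a1 = 0.90      # P(B|A1)
p_b_given_a2 = 0.10      # P(B|A2)

posterior = (p_a1 * p_b_given_a1) / (
    p_a1 * p_b_given_a1 + p_a2 * p_b_given_a2
)
print(f"P(A1|B) = {posterior:.3f}")   # ~0.321
```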
Mean of a Probability Distribution: μ = Σ[x × P(x)]
(where x is the value of a discrete random variable and P(x) the probability of that value occurring)
Standard Deviation of a Probability Distribution: σ = √( Σ[(x – μ)² × P(x)] ) (Variance = σ²)
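Both moments can be computed directly from the definitions; the distribution below is illustrative:

```python
import math

# Discrete probability distribution (illustrative values)
xs = [0, 1, 2, 3]            # values of the random variable
ps = [0.1, 0.3, 0.4, 0.2]    # P(x); must sum to 1

mu    = sum(x * p for x, p in zip(xs, ps))               # mean
var   = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))   # variance
sigma = math.sqrt(var)                                   # std dev

print(f"mean = {mu:.2f}, variance = {var:.2f}, std dev = {sigma:.3f}")
```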
Binomial Probability
· 2 mutually exclusive outcomes for each trial
· The random variable is the result of counting the successes of the total number of trials
· The probability of success is constant from one trial to another
· Trials are independent of one another
Binomial probability of x successes in n trials: P(x) = nCx × p^x × (1 – p)^(n – x) (where nCx = n! / [x!(n – x)!])
· Good for small values of n
· For each value of n binomial tables can be used, plotted for a given value of n as:
- p (probability of success)
- x (number of observed successes)
· Probabilities can be summed to give a cumulative probability
For a binomial distribution:
Mean: μ = np    Std. Dev: σ = √(np(1 – p))
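A Python sketch of the binomial formulas above; n and p are illustrative choices:

```python
from math import comb, sqrt

n, p = 10, 0.3   # trials and probability of success (illustrative)

# P(x) = nCx * p^x * (1-p)^(n-x)
def binom_pmf(x: int) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probabilities summed to give a cumulative probability: P(X <= 3)
cum = sum(binom_pmf(x) for x in range(4))

mean = n * p                   # np
sd   = sqrt(n * p * (1 - p))   # sqrt(np(1-p))

print(f"P(X<=3) = {cum:.4f}, mean = {mean}, std dev = {sd:.3f}")
```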
Poisson Probability Distribution
Used when p < 0.05 and n > 100 (it is the limiting form of the binomial distribution): P(x) = (μ^x × e^(–μ)) / x! (where μ = np)
For a Poisson distribution mean = variance, so std dev σ = √μ
Poisson distribution tables are available and give the probability of x (number of successes) for a given μ (mean number of successes)
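A sketch of the Poisson approximation under the stated conditions (large n, small p); the figures are illustrative:

```python
from math import exp, factorial, sqrt

n, p = 200, 0.02       # large n, small p (illustrative)
mu = n * p             # Poisson mean approximates the binomial np

# P(x) = mu^x * e^(-mu) / x!
def poisson_pmf(x: int) -> float:
    return mu**x * exp(-mu) / factorial(x)

print(f"P(X=3) = {poisson_pmf(3):.4f}")   # prob of exactly 3 successes
print(f"std dev = {sqrt(mu):.3f}")        # sqrt(mean), since var = mean
```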
Normal Probability Distribution
Standard Normal Distribution
· Mean = median = mode
· Mean = 0
· σ = 1
· Used as a comparison for other normal distributions
Z values
Used to standardize observations from normal distributions. The Z value describes how far an observation is from the population mean in terms of standard deviations:
Z = (Observation – Mean) / Std Dev = (x – μ) / σ
The area under the curve in a normal distribution between two values is the probability of an observation falling between the two values.
A table of Z values is used to find the area under the normal curve between the mean and a particular figure; these are given as a decimal, so multiply by 100 to get the percentage probability. These tables are used for:
· Finding the probability that an observation will fall between the mean and a given value
A one-tailed test of Z
· Finding the probability that an observation will fall in a range around the mean (a confidence interval)
A two-tailed test of Z
Confidence intervals are two-tailed tests of Z at particular numbers of standard deviations and describe the probability of an observation falling within n standard deviations of the mean
34% of observations fall between the mean and 1σ, so 68% fall within ±1σ
45% of observations fall between the mean and 1.65σ, so 90% fall within ±1.65σ
47.5% of observations fall between the mean and 1.96σ, so 95% fall within ±1.96σ
49.5% of observations fall between the mean and 2.58σ, so 99% fall within ±2.58σ
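These areas can be reproduced without tables from the standard normal CDF, expressible via the error function in the Python standard library (the observation and parameters below are illustrative):

```python
from math import erf, sqrt

# Standard normal CDF via the error function (no external libraries)
def phi(z: float) -> float:
    return 0.5 * (1 + erf(z / sqrt(2)))

x, mu, sigma = 112.0, 100.0, 10.0   # illustrative observation and parameters
z = (x - mu) / sigma

one_tail = phi(z) - 0.5         # area between the mean and x
two_tail = 2 * (phi(z) - 0.5)   # area within +/- z of the mean

print(f"z = {z:.2f}")
print(f"P(mean < X < x)   = {one_tail:.4f}")
print(f"P(|X - mean| < z) = {two_tail:.4f}")
```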
Standard Errors
Sampling error = x̄ – μ (likewise for the variance and std. dev.): the difference between the sample statistic and the population parameter
A distribution of sample means from large samples is approximately normal, with a mean of μ and a variance of σ²/n (the central limit theorem)
Point estimates are single sample values used to estimate population parameters (e.g. the sample mean x̄). Interval estimates are calculated from the point estimates and describe the range around the point estimate in which the population parameter is likely to fall for a given level of confidence – as from Z values:
· 95% of the time the population mean μ is within ±1.96 standard errors of the sample mean
· 99% of the time the population mean μ is within ±2.58 standard errors of the sample mean
The standard error of sample means (the std. dev. of the distribution of sample means) quantifies this: σx̄ = σ / √n
If σ is not known, use the sample standard deviation s: sx̄ = s / √n
Calculation of the confidence intervals uses the Z values:
e.g. 95% confidence interval = x̄ ± 1.96 × (σ / √n)
For other intervals replace “1.96” with the Z value from the Z value table for a given confidence level
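A sketch of the interval calculation; the sample is illustrative, and following the notes a Z value is used even though a t value would strictly suit so small a sample better:

```python
from math import sqrt
import statistics as st

# Illustrative sample; population sigma unknown, so s is used
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3, 9.7, 10.6]
n      = len(sample)
x_bar  = st.mean(sample)
s      = st.stdev(sample)
se     = s / sqrt(n)            # standard error of the sample mean

z = 1.96                        # Z value for a 95% confidence level
lo, hi = x_bar - z * se, x_bar + z * se
print(f"95% CI for the population mean: ({lo:.3f}, {hi:.3f})")
```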
Hypothesis Testing
Step 1 – Write the Null Hypothesis (H0) and the Alternative Hypothesis (H1)
H0 always contains an equality; H1 never contains an equality
For a one-tailed test H0 is greater than or equal to (or less than or equal to) a given test value; H1 is then less than (or greater than) the test value
For a two-tailed test H0 is equal to a test value and H1 does not equal that value
Step 2 – Select the level of significance
The significance level is the probability of rejecting the Null Hypothesis when it is true (related to confidence intervals):
5% significance level: 95% confidence level . . . . ±1.96σ (for a 2-tailed test)
1% significance level: 99% confidence level . . . . ±2.58σ (for a 2-tailed test)
The risk of rejecting a correct H0 is the α risk; the risk of accepting a wrong H0 is the β risk
H0 / Accept / Reject
Is True / correct / α error
Is False / β error / correct
Step 3 – Calculate the test statistic
Use: Z = (x̄ – μ) / σx̄ *
Where:
Z = test statistic
x̄ = sample mean
μ = population mean (H0)
σx̄ = standard error of sample means (if unavailable use the sample std. dev. **)
* for single observations use just the Z value: Z = (x – μ) / σ
** sx̄ = s / √n
Step 4 – Establish the decision rule
The decision rule states when to reject the null hypothesis:
· How many standard deviations (Z) the sample mean must be from H0 in order for H0 to be rejected
· Driven by the significance level
Critical Z values:
Significance Level / One-tailed / Two-tailed
10% / 1.28σ / 1.65σ
5% / 1.645σ / 1.96σ
1% / 2.33σ / 2.58σ
If the computed Z statistic is smaller in absolute value than the critical Z value, the null hypothesis is accepted
Step 5 – Make the decision, based on the data
Additionally, P-values are considered; the P-value is the probability of observing a sample as extreme as the current one, assuming the null hypothesis is true.
If the P-value is less than the significance level then H0 is rejected. For a two-tailed test:
P-value = 2 × (0.5 – the area under the curve from the centre of the distribution to Z *)
* taken from the standard Z tables
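The five steps can be condensed into a few lines of Python; the sample statistics below are illustrative:

```python
from math import erf, sqrt

def phi(z: float) -> float:                 # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

# Two-tailed test at the 5% level (all numbers illustrative)
x_bar, mu0, sigma, n = 102.5, 100.0, 8.0, 64
se = sigma / sqrt(n)                        # standard error of the mean
z  = (x_bar - mu0) / se                     # test statistic

critical = 1.96                             # two-tailed, 5% significance
p_value  = 2 * (1 - phi(abs(z)))            # equals 2 * (0.5 - area to Z)

reject = abs(z) > critical
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```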
Correlation Coefficient: r = Σ(x – x̄)(y – ȳ) / √( Σ(x – x̄)² × Σ(y – ȳ)² )
Coefficient of determination = R² = r² = (Y's variability explained by variability in X) / (Y's total variability), i.e. the % of Y's variability explained by variability in X
e.g. R2 = 0.80 between a stock and the index means that 80% of the stock’s movement is explained by movement in the index (systematic risk) and 20% is specific to the stock (unsystematic risk)
Total risk = Systematic Risk + Unsystematic Risk
t-statistics and hypothesis testing
t-Statistics are used in place of Z-statistics for small (n < 30) samples; they are used in hypothesis testing just as the Z-statistic is
t = r√(n – 2) / √(1 – r²)
(r = correlation coefficient; r² = coefficient of determination; n – 2 = degrees of freedom)
Use t-statistics to determine whether the correlation of x and y is significantly different from zero; i.e. the null hypothesis is that x and y are uncorrelated:
· H0: ρ = 0
· H1: ρ ≠ 0
Once the t-value is computed, look it up in a table of critical t-values for the given number of degrees of freedom and significance level; if the computed t-value is greater in absolute value than the critical t-value then reject the null hypothesis.
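A sketch tying the correlation coefficient, R², and the t-test together on illustrative paired data:

```python
from math import sqrt

# Illustrative paired observations
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]
n  = len(xs)

mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)

r  = sxy / sqrt(sxx * syy)           # correlation coefficient
r2 = r * r                           # coefficient of determination
t  = r * sqrt(n - 2) / sqrt(1 - r2)  # t statistic with n - 2 df

# Critical t for 4 df at the 5% level (two-tailed) is about 2.776
print(f"r = {r:.4f}, R^2 = {r2:.4f}, t = {t:.2f}")
```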
Linear Regression Analysis
The relationship between x and y can be shown on a regression line – the best-fit straight line through a set of data points. The regression line is found by the least-squares principle and is based on the equation for a straight line:
Y = a + bX
where: b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and a = ȳ – b × x̄
Non-linear series (e.g. a curvilinear series) can be plotted by taking logs (NB – a curvilinear series is an example of compounding and is exponential, Y = a × b^X):
log(Y) = log(a) + X × log(b)
Once the linear relationship has been established, the standard error of estimate can be calculated using the sum of squared errors (SSE) method:
SEE = √( SSE / (n – 2) ) = √( Σ(yᵢ – y′)² / (n – 2) )
where y′ is the value of y for each value of x from the regression line and yᵢ is the actual value of y for each x
Could also use the computational form: SEE = √( (Σy² – aΣy – bΣxy) / (n – 2) )
In order to determine how good the regression is (i.e. how closely variation in y is correlated with variation in x) we use the coefficient of determination R²:
R² = 1 – SSE / SStotal
where SSE = Σ(yᵢ – y′)² and SStotal = Σ(yᵢ – ȳ)² is y's total sum of squares
As SSE tends to 0, R² tends to 1, telling us that the observations of y for a given x fall closer and closer to the regression line.
Also, since r = ±√R², when R² = 1, r = ±1; and as r tends to 0, the standard error of estimate tends to sy (the standard deviation of y)
The following assumptions underpin linear regression:
· Normality: for each observation of x the distribution of dependent y variables is normal
· Linearity: the mean of each dependent observation y lies on the regression line
· Homoskedasticity: the variability of y doesn’t change with x
· Statistical Independence: the dependent observations of y are unrelated to one another
Confidence intervals are used in regression analysis to determine the range in which a dependent variable y lies for a given value of the independent variable x and to determine the probability that it applies to the whole group:
Confidence Interval: y′ ± t × SEE × √( 1/n + (x₀ – x̄)² / Σ(x – x̄)² )
The prediction interval is very similar but applies only to a specific dependent variable and not to the entire group:
Prediction Interval: y′ ± t × SEE × √( 1 + 1/n + (x₀ – x̄)² / Σ(x – x̄)² )
The prediction interval is wider than the confidence interval since we are less confident predicting a specific value of y from x than predicting the mean value of y for the group
As SSE increases the confidence and prediction intervals widen – i.e. the error factor (unsystematic risk) increases and confidence in using the regression model to predict y falls
Steps for using the confidence and prediction intervals:
· First compute the intercept and the slope for the data set (from equations for the b and a constants)
· Compute the standard error of estimate from: SEE = √( SSE / (n – 2) )
· Then, clarify the confidence and prediction interval that we are trying to compute – i.e. what data are we seeking for which a confidence or prediction interval is needed
· Find the critical t-Statistic for the given significance level and degrees of freedom (n – 2)
· Compute the confidence and prediction intervals using the equations above (NB the small-sample adjustment terms under the square root are seldom required, so in practice use: confidence interval ≈ prediction interval ≈ y′ ± t × SEE)
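A sketch of the regression workflow above on illustrative data, dropping the small-sample adjustment terms as the notes suggest:

```python
from math import sqrt

# Illustrative data set
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.7]
n  = len(xs)

# Slope and intercept from the least-squares formulas
mx, my = sum(xs) / n, sum(ys) / n
b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
a = my - b * mx

# Standard error of estimate: sqrt(SSE / (n - 2))
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
see = sqrt(sse / (n - 2))

# Interval around a prediction; critical t for 6 df at the 5% level
# (two-tailed) is about 2.447; adjustment terms are dropped
x0 = 5.5
y_hat = a + b * x0
t_crit = 2.447
print(f"y_hat = {y_hat:.2f} +/- {t_crit * see:.2f}")
```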
Time Series
A data set which is a function of time – e.g. a stock price series
Key components:
· Secular trend: smooth, long term trend embedded in a time series
· Cyclical trend: intermediate-term variation (> 1 year) – e.g. the business cycle
· Seasonal variation: shorter-term variation (< 1 year) patterns that may repeat over time
· Irregular variation – episodic variation that is definable but unpredictable
The linear trend equation is the same as the basic regression equation but t replaces x:
y = a + bt
where: b = Σ(t – t̄)(y – ȳ) / Σ(t – t̄)² and a = ȳ – b × t̄
Moving averages smooth out the variability of a data series by calculating and plotting a series of simple averages. In order to apply a moving average the data series must have a Linear Trend (T) and a Rhythmic Cycle (C) – the moving average smoothes out the cyclical and irregular components of a time series.
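A minimal moving-average sketch on an illustrative series:

```python
# Simple moving average to smooth a time series (illustrative data)
series = [10, 12, 11, 14, 13, 16, 15, 18, 17, 20]
window = 3

smoothed = [
    sum(series[i : i + window]) / window
    for i in range(len(series) - window + 1)
]
print(smoothed)   # one averaged value per window
```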
PVs, FVs, Annuities and Holding Periods
Future Value: FV = PV × (1 + r)^n
Present Value: PV = FV / (1 + r)^n
(where r = the interest rate per period and n = the number of periods)
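A two-line numeric sketch of compounding and discounting (illustrative rate and horizon):

```python
# Compounding and discounting (illustrative figures)
r, n = 0.08, 5          # 8% per period, 5 periods

pv = 1000.0
fv = pv * (1 + r) ** n          # FV = PV * (1 + r)^n
back = fv / (1 + r) ** n        # PV = FV / (1 + r)^n, recovers 1000.00

print(f"FV = {fv:.2f}, PV = {back:.2f}")
```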