Mr. J Gallagher
Key Information – Statistics
Statistics in the world today:
- Weather reports
- Football league table
- Music Charts
Two types of data **See handout
Population – entire group
Sample – group from population
Primary data – 1st Hand information (Collected by person who will use it)
Secondary Data – 2nd hand information (Not collected by person who will use it)
e.g. Internet, newspaper, etc.
Ways to carry out a survey:
- Interview
- Telephone call
- Online / posted questionnaire
Qualities of a good questionnaire:
- Use clear and simple language
- Be as brief as possible
- Accommodate all possible answers
- Make questions useful and relevant
Distribution of data:
Negative skew (left skew), symmetric, positive skew (right skew)
Scatter Graphs and Correlation:
Correlation Coefficient:
The correlation coefficient, r, measures the strength of the linear relationship between two variables, from -1 (strong negative) to 1 (strong positive). If r = 0 then there is no correlation.
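The notes do not say which formula is used for the correlation coefficient. The sketch below assumes the usual Pearson formula and shows the two extreme cases; the function name and the data are made up for illustration.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient for paired data (assumed formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data: hours studied and marks obtained
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 6, 8, 10]

r_positive = correlation_coefficient(hours, marks)      # perfect positive: 1.0
r_negative = correlation_coefficient([1, 2, 3], [3, 2, 1])  # perfect negative: -1.0
```

Real data rarely gives exactly 1 or -1; values close to 0 suggest little or no linear correlation.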
Averages
Mean = $\frac{\sum x}{n}$ (sum of all the values divided by the number of values)
Mode = Most Common
Median = Middle Number
Range = Maximum Value – Minimum Value
Interquartile Range = Q3 – Q1
Q1 = lower quartile
Q2 = second quartile (the median)
Q3 = upper quartile
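As a quick check of the definitions above, here is a small sketch using Python's standard `statistics` module. The data set is invented, and quartile conventions vary slightly between textbooks, so the quartile values may differ from a hand method on other data.

```python
import statistics

# Hypothetical data set, already sorted for readability
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)        # 42 / 7 = 6
mode = statistics.mode(data)        # most common value: 4
median = statistics.median(data)    # middle number: 5
data_range = max(data) - min(data)  # 11 - 2 = 9

# quantiles(..., n=4) returns the three quartiles Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                       # 9 - 4 = 5
```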
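Standard deviation – a measure of the typical spread of the values about the mean
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
x = each value of the variable
$\bar{x}$ = mean
n = number of values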
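A worked sketch of the standard deviation, assuming the population form that divides by n (some courses divide by n - 1 for a sample instead). The data values are hypothetical.

```python
from math import sqrt
import statistics

def standard_deviation(values):
    """Square root of the average squared distance from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

# Hypothetical data: mean is 5, squared deviations add up to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = standard_deviation(data)         # sqrt(32 / 8) = 2.0

# The standard library's pstdev uses the same divide-by-n formula
library_sigma = statistics.pstdev(data)
```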
Normal Distribution:
Higher Level
Observational Study – the researcher collects the information of interest but does not influence events.
Parameter – is a numerical measurement describing some characteristic of a population. It is a fixed (unknown) number.
Statistic – is a numerical measurement describing some characteristic of a sample. The statistic can change from sample to sample.
Simple Random Sample – a sample of size n is selected in such a way that every possible sample of size n from the population has an equal chance of being selected (thus avoiding bias).
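A simple random sample could be drawn as in the sketch below. The population of ID numbers is hypothetical, and the seed is only there so the example gives the same sample each time it is run.

```python
import random

# Hypothetical population: student ID numbers 1 to 100
population = list(range(1, 101))

rng = random.Random(42)  # seeded only for reproducibility

# sample() picks k distinct members; every set of 10 is equally likely
srs = rng.sample(population, k=10)
```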
Cluster Sampling – the population is divided into sections or clusters. Then some of those clusters are randomly selected and all members from these clusters are chosen.
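Cluster sampling might look like the following sketch, where the clusters are hypothetical class groups. Whole clusters are chosen at random, and then every member of each chosen cluster goes into the sample.

```python
import random

# Hypothetical clusters: four class groups in a school
clusters = {
    "Class A": ["Aoife", "Ben", "Cian"],
    "Class B": ["Dara", "Emma", "Finn"],
    "Class C": ["Grace", "Hugh", "Isla"],
    "Class D": ["Jack", "Kate", "Liam"],
}

rng = random.Random(1)  # seeded only for reproducibility

# Step 1: randomly select some of the clusters
chosen_clusters = rng.sample(sorted(clusters), k=2)

# Step 2: take all members of the selected clusters
cluster_sample = [member for name in chosen_clusters for member in clusters[name]]
```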
Stratified Random Sampling – first divide the population into at least two different subgroups so that the individuals or subjects within each subgroup share the same characteristics.
Then a simple random sample is drawn from each subgroup and combined to form the full sample.
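The two steps of stratified random sampling can be sketched as follows. The subgroups are hypothetical, and taking the same fraction from each subgroup (proportional allocation) is an assumption; the notes only say that a simple random sample is drawn from each.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample from each subgroup and combine them."""
    combined = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per subgroup
        combined.extend(rng.sample(members, k))
    return combined

# Hypothetical subgroups: 60 first years and 40 second years
strata = {
    "first_year": [f"F{i}" for i in range(60)],
    "second_year": [f"S{i}" for i in range(40)],
}

rng = random.Random(7)  # seeded only for reproducibility
full_sample = stratified_sample(strata, 0.1, rng)  # 6 first years + 4 second years
```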
Quota Sampling – non-probability sampling method (no randomization – open to mistakes).
Convenience Sampling – non-probability sampling in which the sample is chosen in the most convenient way, from those who are easily accessible.
Ethical Issues:
- Informed Consent
- Confidentiality (no names)
Steps in a Statistical Investigation:
- Pose a question
- Collect data
- Present the data
- Analyze the data
- Interpret the data
Reliability:
- Large enough sample
- Random selection from population
- Equal chance of being selected
- High response rate
Percentiles – values that divide a data set into 100 equal parts.
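A small sketch of percentiles using the standard `statistics` module. The scores are invented, and the module's interpolation rule may not match every textbook method on other data sets.

```python
import statistics

# Hypothetical data: the scores 1, 2, ..., 99
scores = list(range(1, 100))

# n=100 gives the 99 cut points P1 to P99 that split the data into 100 parts
percentiles = statistics.quantiles(scores, n=100)

p50 = percentiles[49]  # 50th percentile (the median): 50
p90 = percentiles[89]  # 90th percentile: 90
```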
z-score – number of standard deviations a given value x is above or below the mean of a given data set.
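The definition translates directly into z = (x - mean) / standard deviation. A minimal sketch, with made-up test scores:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical test: mean mark 60, standard deviation 5
z_above = z_score(70, 60, 5)  # 2.0, two standard deviations above the mean
z_below = z_score(50, 60, 5)  # -2.0, two standard deviations below the mean
```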
Paired (Bivariate) Data – data in which each value of one variable is paired with a value of another, e.g. the height and weight of the same person.
Outlier – individual value which falls outside the overall pattern.
Causal Relationship – when a change in one variable causes a change in the other.
Margin of Error (E) – is the maximum likely difference between the sample proportion ($\hat{p}$) and the population proportion, p.
Formula: $\hat{p} - E < p < \hat{p} + E$
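The interval around the sample proportion can be sketched as below. The notes do not preserve the expression for the margin of error, so this example assumes the common 95% approximation of 1/√n for a sample of size n; treat that as an assumption rather than the handout's own formula. The survey figures are hypothetical.

```python
from math import sqrt

def proportion_interval(successes, n):
    """Interval for the population proportion p, given a sample of size n."""
    p_hat = successes / n   # sample proportion
    margin = 1 / sqrt(n)    # ASSUMED margin of error (95% approximation)
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 240 out of 400 people answer "yes"
low, high = proportion_interval(240, 400)  # roughly 0.55 < p < 0.65
```

A larger sample gives a smaller margin of error, which is one reason a large enough sample matters for reliability.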
Hypothesis – is a claim or statement about a property of a population.
Null Hypothesis (H0) – the statement about the population that is assumed to be true unless the sample data gives strong evidence against it.