Mr. J Gallagher

Key Information – Statistics

Statistics in the world today:

  • Weather reports
  • Football league table
  • Music Charts

Two types of data **See handout

Population – entire group

Sample – group from population

Primary data – 1st Hand information (Collected by person who will use it)

Secondary Data – 2nd hand information (Not collected by person who will use it)

e.g. Internet, newspaper, etc.

Ways to carry out a survey:

  • Interview
  • Telephone call
  • Online / posted questionnaire

Qualities of a good questionnaire:

  • Use clear and simple language
  • Be as brief as possible
  • Accommodate all possible answers
  • Make questions useful and relevant

Distribution of data:

  • Negative skew (left skew)
  • Symmetric
  • Positive skew (right skew)

Scatter Graphs and Correlation:

Correlation Coefficient:

Measure of the strength of correlation, r, between -1 (strong negative) and 1 (strong positive). If r = 0 then no correlation.
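
As a rough illustration of how a correlation coefficient is worked out from paired data, here is a minimal sketch. It assumes the usual (Pearson) coefficient, which these notes do not name, and the hours and marks are made-up values.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient for paired data (Pearson form, assumed)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and test marks for five students
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]
r = correlation(hours, marks)  # about 0.99, a strong positive correlation
```

A value this close to 1 means the points on the scatter graph lie very nearly on a rising straight line.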

Averages

Mean = (sum of all values) ÷ (number of values)

Mode = Most Common

Median = Middle Number

Range = Maximum Value – Minimum Value

Interquartile Range = Q3 – Q1

Q1 = lower quartile

Q2 = second quartile

Q3 = upper quartile
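
The averages and measures of spread above can be checked with Python's standard `statistics` module. This is only a sketch: the data list is made up, and quartile conventions differ between textbooks and software, so Q1 and Q3 may not match a hand method for every data set.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # made-up values, already in order

mean = statistics.mean(data)      # 6
mode = statistics.mode(data)      # 4 (most common)
median = statistics.median(data)  # 5 (middle number)
data_range = max(data) - min(data)  # 11 - 2 = 9

# quantiles(n=4) returns the three cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4)  # 4.0, 5.0, 9.0
iqr = q3 - q1  # 5.0
```

Here Q2 is the same number as the median, as expected.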

Standard deviation – average spread from the mean

σ = √( Σ(x − x̄)² / n )

x = variable (each data value)

x̄ = mean

n = number of values
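
A minimal sketch of the standard deviation calculation, assuming the form that divides by n (the number of values), since the notes define n but do not say which version is intended. The data values are made up.

```python
import math

def standard_deviation(values):
    """Standard deviation, dividing by n (assumed form)."""
    n = len(values)
    mean = sum(values) / n
    # Average of the squared distances from the mean, then square root
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values, mean = 5
sigma = standard_deviation(data)  # 2.0
```

Squaring each distance from the mean stops positive and negative differences from cancelling out.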

Normal Distribution:

Higher Level

Observational Study – the researcher collects the information of interest but does not influence events.

Parameter – is a numerical measurement describing some characteristic of a population. It is a fixed (unknown) number.

Statistic – is a numerical measurement describing some characteristic of a sample. The statistic can change from sample to sample.

Simple Random Sample – a sample of size n is selected in such a way that every possible sample of size n from the population has an equal chance of being selected (thus avoiding bias).
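
One way to draw a simple random sample is sketched below using Python's `random.sample`, which picks without replacement so that every possible group of the chosen size is equally likely. The student names and the seed are made up for illustration.

```python
import random

# Hypothetical population of 100 students
population = [f"student_{i}" for i in range(1, 101)]

rng = random.Random(1)  # fixed seed only so the example is repeatable
sample = rng.sample(population, 10)  # simple random sample of size n = 10
```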

Cluster Sampling – the population is divided into sections or clusters. Then some of those clusters are randomly selected and all members from these clusters are chosen.
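
Cluster sampling could be sketched as follows. The class names and members are hypothetical, and choosing two clusters is an arbitrary choice for the example.

```python
import random

# Hypothetical clusters: each class is one cluster
clusters = {
    "class_A": ["Aoife", "Brian", "Ciara"],
    "class_B": ["Dara", "Emma", "Finn"],
    "class_C": ["Grace", "Hugh", "Isla"],
    "class_D": ["Jack", "Kate", "Liam"],
}

rng = random.Random(2)  # fixed seed only so the example is repeatable
chosen = rng.sample(sorted(clusters), 2)  # randomly select 2 clusters

# Every member of each chosen cluster goes into the sample
sample = [member for name in chosen for member in clusters[name]]
```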

Stratified Random Sampling – first divide the population into at least two different subgroups so that the individuals or subjects within each subgroup share the same characteristics.

Then a simple random sample is drawn from each subgroup and combined to form the full sample.
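
A minimal sketch of stratified random sampling. It assumes each subgroup contributes the same fraction of its members (proportional allocation), which the notes do not require, and the year groups are made up.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample from each subgroup and combine them."""
    sample = []
    for members in strata.values():
        # Sample size for this subgroup (at least 1)
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical subgroups (strata)
strata = {
    "first_years": [f"F{i}" for i in range(60)],
    "sixth_years": [f"S{i}" for i in range(40)],
}
sample = stratified_sample(strata, 0.1, random.Random(3))  # 6 + 4 = 10 people
```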

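Quota Sampling – non-probability sampling method in which the researcher fills a set quota from each subgroup (no randomization – open to bias).

Convenience Sampling – non-probability sampling in which subjects are chosen in the most convenient way, i.e. whoever is most easily accessible.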

Ethical Issues:

  • Informed Consent
  • Confidentiality (no names)

Steps in a Statistical Investigation:

  • Pose a question
  • Collect data
  • Present the data
  • Analyze the data
  • Interpret the data

Reliability:

  • Large enough sample
  • Random selection from population
  • Equal chance of being selected
  • High response rate

Percentiles – values that divide a data set into 100 equal parts.
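
As a sketch, the percentile cut points of a data set can be found with `statistics.quantiles` using n=100. The data here are simply the whole numbers 1 to 100, and other software may use a slightly different convention.

```python
import statistics

data = list(range(1, 101))  # made-up data: 1, 2, ..., 100

# 99 cut points that divide the data into 100 equal parts
cuts = statistics.quantiles(data, n=100)

p50 = cuts[49]  # 50th percentile, the same as the median (50.5)
p90 = cuts[89]  # 90th percentile: about 90% of the values lie below it
```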

z-score – number of standard deviations a given value x is above or below the mean of a given data set.
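
The z-score definition translates directly into a one-line calculation. The mark, mean and standard deviation below are made-up numbers.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical test: mean mark 60, standard deviation 5
z_high = z_score(70, 60, 5)  # 2.0, two standard deviations above the mean
z_low = z_score(55, 60, 5)   # -1.0, one standard deviation below the mean
```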

Paired (Bivariate) Data – data which come in pairs, with two values recorded for each individual.

Outlier – individual value which falls outside the overall pattern.

Causal Relationship – if a change in one variable causes a change in another.

Margin of Error (E) – is the maximum likely difference between the sample proportion (p̂) and the population proportion, p.

Formula: p̂ − E < p < p̂ + E
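
A worked sketch of the interval for the population proportion. It assumes the common approximation of 1/√n for the margin of error at the 95% level, which these notes do not state, and the survey numbers are made up.

```python
import math

def proportion_interval(successes, n):
    """Interval for the population proportion p, assuming margin of error 1/sqrt(n)."""
    p_hat = successes / n          # sample proportion
    margin = 1 / math.sqrt(n)      # assumed 95% margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 240 of 400 people answer "yes"
low, high = proportion_interval(240, 400)  # roughly 0.55 to 0.65
```

A larger sample gives a smaller margin of error, so the interval narrows.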

Hypothesis – is a claim or statement about a property of a population.

Null Hypothesis (H0) – the statement about the population that is assumed to be true unless the sample gives strong evidence against it.