Ch 1.1 Population and Samples

I. What is Statistics?

Statistics is the science of collecting and analyzing (numerical) data (definition taken from The Oxford American Dictionary).

Usually it involves collecting partial information (a sample) from a population, and using it to make generalizations (inference) about the population.

Ex 1. Sue wants to know the mean height of undergraduate students at NC State University. Since she doesn’t have the resources to measure every student, she chooses to measure 100 randomly selected students at the university.

Ex 2. A GE engineer wants to know the average lifetime of the company’s 13-W energy-saving light bulbs produced by a new procedure. A random sample of some number of light bulbs is needed. Suppose data on the lifetimes of 30 such light bulbs were collected.

II. Some statistical terms:

Data: A collection of facts or observations

Variable: A characteristic of the object (or individual) in the population

Univariate data: Data in which there is only one variable

Bivariate data: Data in which there are exactly two variables

Multivariate data: Data in which there are more than two variables

Population: A collection of objects (or individuals) about which we would like to make inference

Sample: A subset of the population of interest
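
The difference between univariate, bivariate, and multivariate data can be sketched in a few lines of Python. The records, field names, and values below are made up purely for illustration and do not come from either example in these notes.

```python
# Hypothetical records: each dictionary describes one individual.
# Field names and values are illustrative assumptions only.
students = [
    {"height_in": 64.0, "weight_lb": 130.0, "age": 19},
    {"height_in": 70.5, "weight_lb": 165.0, "age": 21},
    {"height_in": 67.0, "weight_lb": 150.0, "age": 20},
]

# Univariate data: one variable recorded per individual.
univariate = [s["height_in"] for s in students]

# Bivariate data: two variables recorded per individual.
bivariate = [(s["height_in"], s["weight_lb"]) for s in students]

# Multivariate data: more than two variables recorded per individual.
multivariate = [(s["height_in"], s["weight_lb"], s["age"]) for s in students]
```

In each case the individuals are the same; what changes is how many variables are recorded for each one.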

Ex 1. In Sue’s study,

The data is: 100 students’ heights

The variable of interest is: (students’) height

The data set is: a set containing 100 students’ heights

The population of interest is: NCSU undergraduate students

The sample is: 100 selected NCSU undergraduate students

Ex 2. In the GE study,

The data is: 30 lifetimes

The variable of interest is: lifetime of GE’s light bulbs

The data set is: a set of 30 lifetimes

The population of interest is: GE’s 13-W energy-saving light bulbs

The sample is: 30 light bulbs

III. Branches of Statistics

  1. Producing data: Sampling design, experiment design
     • Collect data to answer specific questions by sampling or experimentation.
  2. Describing data: Descriptive statistics
     • Deals with the presentation of the data, that is, summarizing the data with numerical and graphical methods.
  3. Making inference: Inferential statistics
     • Use information from a sample to draw conclusions about a population.
     • One key aspect of inferential statistics is that there is some amount of uncertainty associated with using sample data to draw conclusions about a population.

Ex 1. (Sue’s example)

  1. Sue can follow a random sampling scheme to select the 100 students. Such a sampling scheme is designed to make the selected students representative of NCSU undergraduate students.
  2. Sue can use methods in descriptive statistics to summarize the information from the 100 students (i.e., her sample), for example by reporting the average height of the 100 students.
  3. Sue can use techniques in inferential statistics to draw conclusions about the overall population of undergraduate students at NCSU based on the information obtained from her sample.
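
Selecting a random sample and then summarizing it can be sketched in Python as follows. This is only a minimal illustration: the population is simulated (the notes give no actual data), the population size and the parameters of the simulation are arbitrary assumptions, and `draw_simple_random_sample` is a hypothetical helper name. Simple random sampling is one possible scheme; the notes do not say which scheme Sue uses.

```python
import random
import statistics


def draw_simple_random_sample(population, n, seed=None):
    """Draw n items without replacement, each subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: simulated heights (in inches) for 20,000 students.
# The size, center, and spread are assumptions made only for this sketch.
rng = random.Random(0)
population_heights = [rng.gauss(66.0, 4.0) for _ in range(20000)]

# Producing data: select 100 individuals at random.
sample_heights = draw_simple_random_sample(population_heights, 100, seed=1)

# Describing data: summarize the sample with its average.
sample_mean = statistics.mean(sample_heights)
print(f"Sample mean height: {sample_mean:.1f} inches")
```

In a real study Sue would sample students from a roster and then measure them; here the measurements are sampled directly to keep the sketch short.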

Suppose that the average height of the 100 students was 65 inches. Sue may estimate that, based on her sample, the average height of all undergraduate students at NCSU is also 65 inches, with a possible error of 1.1 inches (that is, 65 ± 1.1).
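
The notes do not say how a possible error such as Sue's is obtained. One common choice, assumed here purely for illustration, is an approximate 95% margin of error of 1.96 · s / √n, where s is the sample standard deviation and n is the sample size. The function name `mean_with_margin` and the five heights below are hypothetical.

```python
import math
import statistics


def mean_with_margin(sample, z=1.96):
    """Return (sample mean, approximate margin of error z * s / sqrt(n)).

    z = 1.96 corresponds to roughly 95% confidence for large samples;
    this formula is an assumption, not the method used in the notes.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    margin = z * s / math.sqrt(n)
    return mean, margin


# Hypothetical heights in inches, kept tiny so the arithmetic is easy to check.
heights = [62, 64, 65, 66, 68]
mean, margin = mean_with_margin(heights)
print(f"{mean:.1f} ± {margin:.2f}")
```

With only five values the large-sample multiplier 1.96 is not really appropriate; the small sample is used only so the calculation can be followed by hand.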

Ex 2. The GE engineer can do the same thing as Sue.

  • In this class, we’ll concentrate on descriptive statistics and inferential statistics.
  • Big picture of the class: (also see syllabus)
