Ch 1.1 Population and Samples
I. What is Statistics?
Statistics is the science of collecting and analyzing (numerical) data (taken from The Oxford American Dictionary).
Usually it involves collecting partial information (a sample) from a population, and using it to make generalizations (inference) about the population.
Ex 1. Sue wants to know the mean height of undergraduate students at NC State University. Since she doesn’t have the resources to measure every student, she chooses to measure 100 randomly selected students at the university.
Ex 2. A GE engineer wants to know the average lifetime of the company’s 13-W energy-saving light bulbs produced by a new procedure. A random sample of light bulbs is needed. Suppose data on the lifetimes of 30 such light bulbs were collected.
II. Some statistical terms:
Data: a collection of facts or observations
Variable: a characteristic of an object (or individual) in the population
Univariate data: data with only one variable
Bivariate data: data with exactly two variables
Multivariate data: data with more than two variables
Population: the collection of objects (or individuals) about which we would like to make inferences
Sample: a subset of the population of interest
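The difference between univariate, bivariate, and multivariate data can be sketched in Python. This is only an illustration: every value below is made up, and the helper name variables_per_individual is hypothetical, not part of any library.

```python
# All values below are made up for illustration; they are not real measurements.

# Univariate data: one variable (height in inches) recorded per student.
univariate = [64.0, 67.5, 70.2, 62.8]

# Bivariate data: two variables (height in inches, weight in pounds) per student.
bivariate = [(64.0, 130), (67.5, 155), (70.2, 180), (62.8, 120)]

# Multivariate data: more than two variables (height, weight, age in years).
multivariate = [(64.0, 130, 19), (67.5, 155, 21), (70.2, 180, 20), (62.8, 120, 18)]


def variables_per_individual(dataset):
    """Return how many variables are recorded for each individual."""
    first = dataset[0]
    # A bare number means a single variable; a tuple holds one value per variable.
    return len(first) if isinstance(first, tuple) else 1


for name, dataset in [("univariate", univariate),
                      ("bivariate", bivariate),
                      ("multivariate", multivariate)]:
    print(name, variables_per_individual(dataset))
```

In each case the number of individuals (four students) is the same; only the number of variables recorded per individual changes.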
Ex 1. In Sue’s study,
The data is: 100 students’ heights
The variable of interest is: (students’) height
The data set is: a set containing 100 students’ heights
The population of interest is: NCSU undergraduate students
The sample is: the 100 selected NCSU students
Ex 2. In the GE study,
The data is: 30 lifetimes
The variable of interest is: lifetime of GE’s light bulbs
The data set is: a set of 30 lifetimes
The population of interest is: GE’s 13-W energy-saving light bulbs
The sample is: the 30 selected light bulbs
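The relationship between a population and a sample can be sketched in Python. This is a minimal sketch under stated assumptions: the population size of 25,000 is made up, and the students are represented only by hypothetical ID numbers. It uses random.sample from the standard library, which draws a simple random sample without replacement.

```python
import random

random.seed(0)  # fixed seed so that rerunning the script selects the same sample

# Hypothetical population: ID numbers for 25,000 undergraduates.
# The size 25,000 is an assumption for illustration, not a real enrollment figure.
population = list(range(1, 25001))

# Simple random sample of 100 students, drawn without replacement,
# so no student can be selected twice.
sample = random.sample(population, k=100)

print("population size:", len(population))
print("sample size:", len(sample))
```

Sue would then measure the height of each of the 100 selected students; those 100 heights make up her data set.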
III. Branches of Statistics
- Producing data: Sampling design, experiment design
  - Collect data to answer specific questions by sampling or experimentation.
- Describing data: Descriptive statistics
  - Deals with the presentation of the data: summarizing the data with numerical and graphical methods
- Making inference: Inferential statistics
  - Use information from a sample to draw conclusions about a population
  - One key aspect of inferential statistics is that there is some amount of uncertainty associated with using sample data to draw conclusions about a population
Ex 1. (Sue’s example)
- Sue can follow a random sampling scheme to select the 100 students. Such a sampling scheme is designed to make the selected students representative of NCSU undergraduate students.
- Sue can use methods in descriptive statistics to summarize the information from the 100 students (i.e., her sample), for example by reporting the average height of the 100 students.
- Sue can use techniques in inferential statistics to draw conclusions about the overall population of undergraduate students at NCSU based on the information obtained from her sample.
Suppose that the average height of the 100 students was 65 inches. Based on her sample, Sue may estimate that the average height of all undergraduate students at NCSU is also 65 inches, with a possible error of 1.1 inches (that is, 65 ± 1.1 inches).
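Sue's descriptive and inferential steps can be sketched in Python. The notes do not say how the possible error of 1.1 was obtained; the sketch below assumes one common approach, an approximate 95% margin of error computed as 1.96 times the standard error of the mean. The heights are simulated, not real measurements, and the mean of 65 and standard deviation of 5.6 used to simulate them are assumptions chosen only for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so that the simulated sample is reproducible

# Simulated heights (in inches) for 100 students. These are NOT real data:
# the mean 65 and standard deviation 5.6 are assumptions for illustration.
sample_heights = [random.gauss(65, 5.6) for _ in range(100)]

# Descriptive statistics: summarize the sample itself.
n = len(sample_heights)
sample_mean = statistics.mean(sample_heights)
sample_sd = statistics.stdev(sample_heights)  # sample standard deviation

# Inferential statistics: estimate the population mean with a margin of error.
# Assumption: approximate 95% margin of error = 1.96 * (s / sqrt(n)).
standard_error = sample_sd / math.sqrt(n)
margin_of_error = 1.96 * standard_error

print(f"sample mean: {sample_mean:.1f} inches")
print(f"estimate of population mean: {sample_mean:.1f} ± {margin_of_error:.1f} inches")
```

The sample mean describes only the 100 students measured; the margin of error expresses the uncertainty in using that sample mean to estimate the mean height of the whole population.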
Ex 2. The GE engineer can do the same thing as Sue.
- In this class, we’ll concentrate on descriptive statistics and inferential statistics.
- Big picture of the class: (also see syllabus)