PS260 Lecture1 Fall2006.doc
HINT: You may find it helpful to type your own notes in class and combine them with my notes.
1) Now that I have given you a copy of the syllabus (which is also available at ) we will deal with statistical software issues.
a) By now you should have Openstat4 on your own computer. Let me know if there are problems with installation.
b) Log into your computers and start up Openstat4. You should see the spreadsheet. Type in the following:
subj.   variable 1   variable 2
1       10           1
2       20           2
3       30           4
4       40           4.5
5       50           6
6       60           7
c) Now go to analysis and choose correlation. Run a correlation coefficient. If there is a non-chance relationship between two paired sets of numbers, the correlation will tend to be high. In this case you can see that as one variable increases, there is a roughly linear and proportional increase in the other variable. The correlation should be about 0.99. If there are enough pairs of data, such a strong statistical relationship would be unlikely to occur by chance.
d) Now exit from OS4 (Openstat4) and open Microsoft Excel. Note the spreadsheet format. Go to page 1 of the Excel book and follow the instructions. We want to make sure that you have installed the data analysis tool pack. (Most likely you will need the Microsoft Office disk.)
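If you want to check the correlation from step c) outside of Openstat4, the calculation can be sketched in a few lines of Python. This is only an illustration, not part of the Openstat4 or Excel instructions. It assumes the table in step b) reads as variable 1 = 10, 20, 30, 40, 50, 60 and variable 2 = 1, 2, 4, 4.5, 6, 7; the function name pearson_r is made up for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Assumed reading of the table typed into the spreadsheet in step b)
variable_1 = [10, 20, 30, 40, 50, 60]
variable_2 = [1, 2, 4, 4.5, 6, 7]

r = pearson_r(variable_1, variable_2)
print(f"r = {r:.3f}")
```

The closer the printed value is to +1, the more nearly the paired numbers fall on a straight line that rises from left to right.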
2) Now close Excel and I will lecture on some basic statistical concepts.
My first topic is MEASURES OF CENTRAL TENDENCY
If I give you a test, some people will get high scores, some will get low scores, and most will get scores that are in the mid-range.
ASSIGNMENT: Read the section of your on-line text by Miller that deals with measures of central tendency.
The measures of central tendency are the mean, median, and mode
MEAN
MEDIAN
MODE
You also need to know the term BIMODAL
In class I will define each of the above and give examples.
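As a preview of the class examples, here is a minimal sketch of the three measures using Python's built-in statistics module. The test scores are hypothetical and were made up only for this illustration.

```python
from statistics import mean, median, multimode

# Hypothetical test scores (made up for illustration)
scores = [70, 75, 80, 80, 85, 90, 95]

score_mean = mean(scores)        # sum of the scores divided by how many there are
score_median = median(scores)    # middle score when the scores are sorted
score_modes = multimode(scores)  # most frequent score(s), returned as a list

print("mean:", round(score_mean, 2))
print("median:", score_median)
print("mode(s):", score_modes)

# A BIMODAL set of scores has two values tied for most frequent
bimodal_scores = [60, 65, 65, 65, 70, 85, 90, 90, 90, 95]
bimodal_modes = multimode(bimodal_scores)
print("modes of the bimodal set:", bimodal_modes)
```

In the first set the mean is a little above the median because the high scores pull it upward. In the second set, multimode returns two values, which is what BIMODAL means.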
Once I define those basic terms (which you need to memorize), I will talk about the STANDARD DEVIATION
The STANDARD DEVIATION is a measure of the degree to which scores tend to fall close to or far away from the mean. When the standard deviation is small, scores are relatively close to the mean. When the standard deviation is large, scores tend to fall far away from the mean.
If you REALLY understand the above concepts you understand the heart of statistics. If you do not understand, you will find the material which follows to be exceedingly difficult.
======As Bill Gates likes to say: “Now let's drill down a little bit.” ======
TOPIC:
A MEASURE OF DISPERSION AROUND THE MEAN -
The Standard Deviation.
The Standard Deviation is a measure of the degree to which scores tend to fall close to or
far away from the mean of the distribution. If many scores fall close to the mean, the standard
deviation is small. If they fall far away from the mean, the standard deviation will be large.
Here is a BELL SHAPED CURVE with a standard deviation of 12 shown (32 - 20 = 12).
Here are the same data, but now we go out 2 standard deviations.
Here we go out 1.96 standard deviations.
(Show how to calculate the standard deviation and make up some examples)
(Show how to graph the data)
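One way to work through a made-up example is sketched below in Python. Both sets of scores are hypothetical and were chosen to have the same mean but different spread. The sketch divides by N (the "population" formula); some texts and programs divide by N - 1 instead (the "sample" formula), so both are shown. Check which one your text uses.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Two hypothetical sets of scores with the same mean (50)
close_scores = [48, 49, 50, 51, 52]   # scores bunched near the mean
spread_scores = [30, 40, 50, 60, 70]  # scores far from the mean


def sd_by_hand(scores):
    """Standard deviation using the divide-by-N formula, step by step."""
    m = mean(scores)
    squared_deviations = [(x - m) ** 2 for x in scores]
    return sqrt(sum(squared_deviations) / len(scores))


sd_close = sd_by_hand(close_scores)
sd_spread = sd_by_hand(spread_scores)

print("close scores:  SD =", round(sd_close, 2))
print("spread scores: SD =", round(sd_spread, 2))

# The library functions: pstdev divides by N, stdev divides by N - 1
print("pstdev:", round(pstdev(spread_scores), 2))
print("stdev: ", round(stdev(spread_scores), 2))
```

The two sets have identical means, yet the second standard deviation is ten times the first, which is exactly the "close to or far away from the mean" idea described above.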
Show the normal curve division 34%, 13.6%, 2.4%
Show how IQ scores are used with the SD.
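The areas under the normal curve can be computed rather than memorized. The sketch below uses Python's statistics.NormalDist to find the share of scores in each slice of the curve. The IQ part assumes a mean of 100 and a standard deviation of 15, which is a common convention for IQ scales but is not stated in these notes.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, SD 1

# Share of scores in each slice on one side of the mean
mean_to_1sd = z.cdf(1) - z.cdf(0)   # between the mean and 1 SD
one_to_2sd = z.cdf(2) - z.cdf(1)    # between 1 SD and 2 SD
beyond_2sd = 1 - z.cdf(2)           # more than 2 SD from the mean

print(f"mean to 1 SD: {mean_to_1sd:.1%}")
print(f"1 SD to 2 SD: {one_to_2sd:.1%}")
print(f"beyond 2 SD:  {beyond_2sd:.1%}")

# Share of scores within 1.96 SD of the mean, counting both sides
within_196 = z.cdf(1.96) - z.cdf(-1.96)
print(f"within 1.96 SD on both sides: {within_196:.1%}")

# IQ example, ASSUMING mean = 100 and SD = 15
iq = NormalDist(mu=100, sigma=15)
iq_85_to_115 = iq.cdf(115) - iq.cdf(85)   # within 1 SD of the mean
iq_above_130 = 1 - iq.cdf(130)            # more than 2 SD above the mean
print(f"IQ between 85 and 115: {iq_85_to_115:.1%}")
print(f"IQ above 130:          {iq_above_130:.1%}")
```

The three one-sided slices add up to 50%, half of the curve. Going out 1.96 SD on both sides covers about 95% of the scores, which is why that particular number keeps appearing in statistics.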
The following is a portion of a normal curve handout you will get in class.
Homework:
1) Find the standard deviation of these two sets of scores:
Set 1   Set 2
2       1
2       2
2       2
4       8
4       8
4       4
6       5
6
6
2) Graph the data.
3) THOUGHT QUESTION - If I have the IQ scores of a sample of 5 FPC students, am I likely to have a good measure of IQ for FPC students in general?
- Go over standard deviation assignment.
======
The basic experiment: I have 25 students in section 1 of PS101 and 28 students in section 2. I teach one class using a projection of the notes on the wall. I teach the other class by using notes and graphs on the chalkboard. I want to know if the two teaching techniques lead to different scores on the first multiple choice test.
<draw distribution of group1 & group 2 and vary the overlap>
As you can see, when there is little overlap between the two distributions, you should be convinced that the performance of the two groups significantly differs. MOST STATISTICS RELY ON THE SIZE OF THE MEAN DIFFERENCE VERSUS THE AMOUNT OF OVERLAP OF THE DISTRIBUTIONS.
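The idea of "mean difference versus overlap" can be made concrete with a small sketch. The statistic used here is a two-sample t (the unequal-variance form, often called Welch's t), which is one common choice and not necessarily the one used later in this course. The scores are hypothetical, and each group has only 5 scores to keep the example short.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(group_a, group_b):
    """Mean difference divided by a measure of spread (Welch's t)."""
    mean_diff = mean(group_b) - mean(group_a)
    # variance() divides by N - 1; a larger variance means a wider
    # distribution and therefore more overlap between the groups
    spread = sqrt(variance(group_a) / len(group_a)
                  + variance(group_b) / len(group_b))
    return mean_diff / spread


# Hypothetical scores: the means differ by 10 points in both cases
tight_1 = [78, 79, 80, 81, 82]     # narrow distributions, little overlap
tight_2 = [88, 89, 90, 91, 92]

wide_1 = [60, 70, 80, 90, 100]     # wide distributions, lots of overlap
wide_2 = [70, 80, 90, 100, 110]

t_little_overlap = two_sample_t(tight_1, tight_2)
t_much_overlap = two_sample_t(wide_1, wide_2)

print("little overlap: t =", round(t_little_overlap, 2))
print("much overlap:   t =", round(t_much_overlap, 2))
```

The mean difference is the same 10 points in both comparisons. Only the overlap changes, and the statistic shrinks as the overlap grows.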
TOPIC: correlation
A positive correlation is the situation where increases in the value of one variable are associated with increases in the values of the other variable.
A negative correlation is the situation where increases in the value of one variable are associated with decreases in the value of the other variable.
(Do a plot of life span vs. # cigarettes smoked.)
(Do a plot of height versus weight to illustrate a positive correlation.)
Note: Correlations range in size from -1 to +1. Correlations with values around ±0.20
tend to be "not statistically significant." Correlations that are close to ±1.0 tend to be statistically significant.
Note: Sample size has a big effect on the statistical significance of a correlation. With
5 pairs of data, a correlation of +0.80 is not statistically significant. With 1000 pairs of data, a correlation of -0.18 is statistically significant.
NOTE: A statistically significant correlation means that the relationship is unlikely to be due to chance.
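The effect of sample size can be checked with the usual t test for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below is an illustration only. It assumes a two-tailed test at the 0.05 level, and the critical values are copied from a standard t table rather than computed.

```python
from math import sqrt


def t_for_correlation(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Two-tailed critical t values at the 0.05 level, from a standard t table
# (keyed by degrees of freedom, n - 2)
CRITICAL_T = {3: 3.182, 998: 1.962}

results = {}
for r, n in [(0.80, 5), (-0.18, 1000)]:
    t = t_for_correlation(r, n)
    df = n - 2
    # Significant if t is farther from zero than the critical value
    significant = abs(t) > CRITICAL_T[df]
    results[(r, n)] = (t, significant)
    print(f"r = {r:+.2f}, n = {n}: t = {t:.2f}, df = {df}, "
          f"significant = {significant}")
```

The large correlation from 5 pairs falls short of its critical value, while the small correlation from 1000 pairs clears its critical value easily.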
(Do a plot of height versus weight)
Assignment due on Monday: Make sure you can
1) Access Excel and the data analysis tool pack.
2) Download and read the main text (Miller).
3) Download Openstat4, enter numbers, and do a simple analysis such as a correlation.