COURSE SCHEDULE-Week 8
3/17-3/24
Welcome Back!! Your grades are posted on SOCS under the Stat 115 profile!
Monday 3/17: Probability I.
(1) Pre-class assignment Quiz over 3/7 assignment from Lesson 13. See Course schedule Week-7 for details.
GROUP HOMEWORK CORRECTION: Group homework assignment for 3/24
Please delete Lesson 14 MCS-5 and replace it with Lesson 14 MRB-5 and MRA-6.
Pre-class assignment for 3/21: (1) Work on the group homework assignment for 3/24 so that you can ask questions.
(2) Work on Sampling Simulation I. The sheet is attached at the end of this document. I will collect these on Thursday and they will be part of your project grade.
(3) Lesson 15, pages 15-1 and 15-2 only.
(a) Do the Quiz 15.1.3.
(b) Do the Quiz 15.2.3.
Thursday 3/21: Probability II. –Random Variable
Pre-class assignment for 3/24:
Lesson 15
(1) 15.3.1 (a) Use the uniform distribution to assign probabilities to sub-intervals of [0,1]. Draw the density curve!
(b) What is the chance of picking a point from the interval [0, .25]? How about [.25, .5]? And [.5, .75]?
(c) Why do you think this is called the uniform density?
(d) Now draw the standard normal density curve with mean = 0 and st. dev. = 1.
(e) Compare the probabilities corresponding to the above intervals for this case.
P(0, .25) = , P(.25, .5) = , P(.5, .75) = ? If you picked a point from this distribution, which intervals are more likely to be hit: the ones close to the middle or the ones further away from it?
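If you want to check your hand calculations for parts (b) and (e), here is a minimal Python sketch. It is only one possible way to do it: the function names (uniform_prob, std_normal_cdf, std_normal_prob) are made up for this illustration, and the normal probabilities are computed from the error function in Python's standard math module rather than read from a table.

```python
import math

def uniform_prob(a, b):
    """P(a <= X <= b) for X uniform on [0, 1]: the length of the overlap with [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

def std_normal_cdf(z):
    """Area under the standard normal curve (mean 0, st. dev. 1) to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def std_normal_prob(a, b):
    """P(a <= Z <= b) for a standard normal Z."""
    return std_normal_cdf(b) - std_normal_cdf(a)

# The three intervals from parts (b) and (e).
intervals = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75)]
for a, b in intervals:
    print(f"[{a}, {b}]  uniform: {uniform_prob(a, b):.4f}  "
          f"standard normal: {std_normal_prob(a, b):.4f}")
```

Run it only after you have drawn the curves and made your own guess, then compare the two columns.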
(2) 15.3.2 Give an answer to the question "What makes statistical estimation possible?" based on this icon. Let's think about this in terms of sampling. Imagine taking a sample of size 10, then one of size 100, and then one of size 1000 from a given population. Which of the corresponding sample means would be the better estimate of the population mean?
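The thought experiment in (2) can also be tried on a computer. The sketch below is an illustration under stated assumptions, not part of the lesson: the population is a made-up list of 10,000 numbers between 0 and 10, the samples are drawn with replacement, and the helper names (sample_mean, spread_of_means) are invented here. For each sample size it repeats the sampling 200 times and reports how much the sample means vary.

```python
import random
import statistics

rng = random.Random(115)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values spread evenly between 0 and 10.
population = [rng.uniform(0, 10) for _ in range(10_000)]
true_mean = statistics.fmean(population)

def sample_mean(n):
    """Mean of one sample of size n, drawn with replacement."""
    return statistics.fmean(rng.choices(population, k=n))

def spread_of_means(n, repeats=200):
    """Standard deviation of `repeats` sample means, each from a sample of size n."""
    return statistics.stdev(sample_mean(n) for _ in range(repeats))

print("population mean:", round(true_mean, 3))
for n in (10, 100, 1000):
    print(f"size {n:>4}: one sample mean = {sample_mean(n):.3f}, "
          f"spread of sample means = {spread_of_means(n):.3f}")
```

Look at how the spread changes as the sample size grows, and use that to support your answer.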
(3) 15.3.4 is a bit fuzzy about what we call the Law of Large Numbers (=LLN).
It defines the LLN as follows: "averages of selected outcomes taken from a given distribution will settle down to the true mean of the distribution in the long run."
That is all right, but it is a consequence of the following theorem, which is what most other references call the LLN. Please learn this one for the Test:
"If an event has probability p of happening on a single observation and we repeat the observation a great number of times then the long term relative frequency of the event happening will get closer and closer to the actual probability p."
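To see this statement in action, here is a small simulation sketch. It assumes an event with probability p = 0.5 (like heads on a fair coin) and uses a made-up helper, relative_frequencies, which repeats the observation many times and records the relative frequency at a few checkpoints.

```python
import random

def relative_frequencies(p, checkpoints, seed=0):
    """Repeat an observation with success probability p and return
    {number of observations: relative frequency so far} at each checkpoint."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    hits = 0
    results = {}
    for n in range(1, max(checkpoints) + 1):
        if rng.random() < p:  # the event happens with probability p
            hits += 1
        if n in wanted:
            results[n] = hits / n
    return results

freqs = relative_frequencies(p=0.5, checkpoints=[10, 100, 1000, 100_000])
for n, f in freqs.items():
    print(f"after {n:>6} observations: relative frequency = {f:.4f}")
```

Notice that the theorem talks about the relative frequency (a proportion), not about the counts evening out. Keep that in mind for the examples that follow.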
The following items illustrate a common misunderstanding of the LLN. Think about why the given reasoning is incorrect! (Are the trials independent? How is the idea of the "long run" misunderstood?)
(4) Mrs. Jones has five girls and she is pregnant again.
Mrs. Jones: I do hope that our next baby isn't another girl.
Mr. Jones: My dear, after five females, it's bound to be a boy.
Is Mr. Jones right?
(5) Many gamblers think that they can win at roulette by waiting until there is a long run of red numbers, then betting on black. Will such a system work?
(6) I quote from a lecture on fire safety: "One in ten Americans will experience destructive fire this year. I know that you will say: 'I have lived in my house for 25 years and never experienced any fire.' But this only means that you are moving not further away from a fire but closer to one."
What is wrong?
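For the roulette system in (5), you can test the gamblers' idea with a simulation. The sketch below is only an illustration: it assumes an American wheel (18 red, 18 black, 2 green pockets, which the question does not specify), independent spins, and a made-up function black_rate_after_red_run that bets on black only after a run of reds.

```python
import random

def black_rate_after_red_run(run_length, spins, seed=0):
    """Fraction of bets on black that win when we bet only after
    at least `run_length` reds in a row. Returns (win rate, number of bets)."""
    rng = random.Random(seed)
    # Assumed wheel: 18 red, 18 black, 2 green pockets.
    pockets = ["red"] * 18 + ["black"] * 18 + ["green"] * 2
    streak = 0  # current run of consecutive reds
    bets = wins = 0
    for _ in range(spins):
        outcome = rng.choice(pockets)  # each spin is independent of the past
        if streak >= run_length:       # the "system" says: bet on black now
            bets += 1
            if outcome == "black":
                wins += 1
        streak = streak + 1 if outcome == "red" else 0
    rate = wins / bets if bets else float("nan")
    return rate, bets

rate, bets = black_rate_after_red_run(run_length=5, spins=500_000)
print(f"bets placed after 5 reds in a row: {bets}")
print(f"share of those bets won: {rate:.4f}")
print(f"chance of black on any single spin: {18 / 38:.4f}")
```

Compare the last two numbers and ask yourself what waiting for the run of reds actually bought the gambler.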
Sampling simulation I.
Lesson 10: MRA 3.
1. Take 20 samples of size 5% from the "Perc" population. Select the variable. Go to Manip/Sample. Enter the sample size 5%, select Sample with replacement, and at the bottom enter 20 for the number of samples.
2. Take 20 samples of size 25%.
Perform the following steps for both the size 5% and size 25% samples.
3. Set Calculation/ Options to show only means and standard deviations.
4. Calculate the means for the 20 samples. Select all samples. Go to Calculate/summaries/As Variables.
5. Edit the name of your Mean variable by simply clicking on its name. Change it to Mean 5% or Mean 25%, as appropriate.
HAND-IN PART Due 3/21 Thursday (Work on your own!)
6. Draw a histogram of the 20 sample means. Compare the distributions of the sample means for the 5% case and the 25% case. Which one has the larger spread? Which one is more symmetric?
7. Calculate the mean and standard deviation for the 20 sample means. Record the values:
(a) What was the highest and lowest individual sample mean of the 20 samples of size 5%?
(b) What was the highest and lowest individual sample mean of the 20 samples of size 25%?
(c) What is the mean for all your sample means of size 5%?
(d) What is the mean for all your sample means of size 25%?
(e) What is the standard deviation of the sample means of size 5%?
(f) What is the standard deviation of the sample means of size 25%?
(g) In which case was the mean of the 20 sample means closer to the population mean 3.06%?
(h) How do your findings about the standard deviation agree with your findings about the histograms in part (d)?
8. A) What can you say about the variation from sample to sample for the larger versus the smaller samples?
How about the range?
B) If you had to use a sample to estimate the population mean (3.06%) would you select a sample of size 12 or 48? Why?
C) What can you say about the STANDARD DEVIATION of those sample means for the larger versus the smaller samples?
D) What can you say about the MEAN of all the 20 sample means in both cases?
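If you would like to repeat the experiment outside the course software, the steps of the worksheet can be sketched in Python. This is a rough stand-in, not the software's own procedure: the "Perc" data are not included here, so the population below is 200 made-up values, and the helper names (sample_means, summarize) are invented for this illustration. Samples are drawn with replacement, 20 at a time, as in steps 1 and 2.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical stand-in for the "Perc" population (200 made-up values).
population = [rng.expovariate(1 / 3.0) for _ in range(200)]

def sample_means(fraction, num_samples=20):
    """Means of `num_samples` samples, each of size `fraction` of the
    population, drawn with replacement."""
    n = max(1, round(fraction * len(population)))
    return [statistics.fmean(rng.choices(population, k=n))
            for _ in range(num_samples)]

def summarize(means):
    """Lowest, highest, mean, and standard deviation of a list of sample means."""
    return {
        "low": min(means),
        "high": max(means),
        "mean": statistics.fmean(means),
        "sd": statistics.stdev(means),
    }

print("population mean:", round(statistics.fmean(population), 3))
for fraction in (0.05, 0.25):
    summary = summarize(sample_means(fraction))
    rounded = {key: round(value, 3) for key, value in summary.items()}
    print(f"samples of size {fraction:.0%}: {rounded}")
```

The numbers will not match your hand-in values, since the population is different, but the pattern between the 5% and the 25% samples should look familiar.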