Double Warmup: Intro to HT
Name: ______________________

In class, we did a couple of micro studies where we assessed whether or not the names “Tim” and “Bob” were more readily attached to one of two faces:

The crux of trying to “show” something like this is gathering numerical data that makes it appear to be more true than not. But how to do that?

Suppose you’re a researcher, and you think that more people will call the bearded guy “Tim”. You might propose something – like, “I think that 75% will call him ‘Tim’”, or “I think that 85% will.” But where are those numbers going to come from? Intuition? Guessing? Magic 8-Ball? Nah – there’s enough of that in the media.

So, statisticians designed a system that deals with this in an indirect way: assume, for starters, that people don’t have any preference for one face or name over the other.

  1. (1 point) If that were true, then what percentage of the time would they call the bearded man “Tim”?

Makes sense! I mean, if people really didn’t care who was whom, they’d just “toss a coin” in their mind, and then decide based on that. Let’s stop here and do just that!

Now, of course, you know that that last probability is an average – sometimes, even if people don’t have a preference, a group of people might guess a “little” high or a “little” low. But the question really is this: does the data we’ve collected support the belief that people don’t have a preference for names and/or faces, or that they do?
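If you'd like to see that "little high, little low" wobble for yourself, here is a minimal Python sketch. It assumes each person really does toss a fair coin in their head; the function name `count_tim` and the seed are made up for illustration and aren't part of the class activity.

```python
import random

def count_tim(group_size, rng):
    """Count how many people in one group say "Tim" when each decides by a fair coin toss."""
    return sum(rng.random() < 0.5 for _ in range(group_size))

rng = random.Random(1)  # arbitrary fixed seed, only so the run is repeatable

# Five pretend groups of 25 people with no preference at all.
for group in range(1, 6):
    tims = count_tim(25, rng)
    print(f"Group {group}: {tims} of 25 said Tim ({tims / 25:.0%})")
```

Each group lands somewhere near half, but rarely exactly on it. Change the seed and the counts shift around again.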

So, in our class, we asked 25 people to name these gentlemen. 23 out of the 25 called the bearded man “Tim”.

  2. (1 point) Based on these results, complete the following sentence by selecting the right verb (copy and paste is fine):

It appears / does not appear that people have a preference for the names “Tim” and “Bob” with respect to these two faces.

But how would you quantify that? Think about that for a moment, and then answer the following:

  3. (1 point) Suppose you had to quantify the belief you just thought about. In one or two sentences, describe how you would or could do this. I know this is tough! Give it a shot, anyway.

Well, one way we could do it is to run the experiment again, and see if, indeed, we got similarly extreme results. That’s great, and the cornerstone of statistical research!

But what if, indeed, we needed to make a call based on our data, and our data alone? Well, what statisticians strive to do is combine both these ideas: check our data’s “unusualness” versus lots and lots of simulated experiments. Let’s start!

Head over to the simulator site, find the “One Proportions” link, and click on it. Let’s stop here and regroup!

So what we’ll do is check how often we would see results like we got – that is, leaning strongly toward calling the bearded guy “Tim” – if, indeed, people didn’t show a preference at all for either of the two names.

Set the sheet to run 10000 trials. When it’s done, you’ll see the sampling distribution (look familiar?). Then, in the center section of the page, set up the counter as shown at right, and then click “Count”.

  4. (1 point) The results will pop up in red under that area – it’s a probability, stating the average rate at which you should see 23 or more heads out of 25 (or, 23 or more people calling the bearded guy “Tim”). What probability did you get?

Let’s regroup and discuss!
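For the curious, here is a rough Python sketch of the kind of thing the simulator is doing under the hood. This is an assumption about the general idea, not the simulator's actual code: the function name `simulate_tail`, its parameters, and the seed are invented for this example.

```python
import random

def simulate_tail(n_people, observed, n_trials, seed=0):
    """Fraction of simulated no-preference experiments with `observed` or more "Tim" answers."""
    rng = random.Random(seed)  # arbitrary seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        # One pretend experiment: every person tosses a fair coin.
        tims = sum(rng.random() < 0.5 for _ in range(n_people))
        if tims >= observed:
            hits += 1
    return hits / n_trials

estimate = simulate_tail(n_people=25, observed=23, n_trials=10_000)
print(f"Estimated chance of 23 or more out of 25: {estimate}")
```

Like the simulator, this gives an estimate that can change a little from run to run (here, from seed to seed).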

  5. Choose the conclusion you find easiest to believe based on these 25 results (again, copy and paste is fine):

It appears / does not appear that people have a preference for the names “Tim” and “Bob” with respect to these two faces.

Did your answer change from number 2?

  • If so, good! I’m guessing you now see just how unusual these results we got were, assuming that people were guessing at the names (in fact, on average it would take about 100,000 trials to see even one result as extreme as, or more extreme than, ours – see how I got that number?).
  • If not, good! I’m guessing you already had an idea of how unusual these results were – and now you have a quantification of the results! More on that to come!
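One way to check a "how many trials would it take" figure is to skip the simulation and count outcomes directly. The sketch below assumes fair coin tosses, so every sequence of 25 tosses is equally likely; the variable names are just for illustration.

```python
from math import comb

n = 25
# Number of toss sequences with exactly 23, 24, or 25 heads.
favorable = sum(comb(n, k) for k in range(23, n + 1))
total = 2 ** n  # all equally likely sequences of 25 fair tosses

p_exact = favorable / total
print(f"{favorable} of {total} sequences -> probability {p_exact:.2e}")
print(f"That is about 1 in {round(1 / p_exact):,} experiments")
```

Only 326 of the 33,554,432 possible sequences are this lopsided, which works out to roughly 1 in 100,000.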

Now, remember the second activity we did? We asked 60 folks the same question. In this case, 37 of the 60 called the bearded guy “Tim”.
  6. (1 point) Repeat question 4 in the simulator, but use the new data, as shown at right. What probability did you get this time?

  7. (1 point) Choose the conclusion you find easiest to believe based on these 60 results:

It appears / does not appear that people have a preference for the names “Tim” and “Bob” with respect to these two faces.

So, we now have conditional probabilities, don’t we?!?!? In both questions 4 and 6, you arrived at probabilities. These probabilities are of the form:

The chance that we got the results we saw in our data (or even more extreme results), assuming that the people have no preference in name selection.

  8. (1 point) Take your answers from questions 4 and 6 and complete the following!
  • The chance that we got the results we saw in our sample of size 25 (or even more extreme results), assuming that the people have no preference in name selection, was approximately ______.
  • The chance that we got the results we saw in our sample of size 60 (or even more extreme results), assuming that the people have no preference in name selection, was approximately ______.
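Since simulated answers wobble a bit from run to run, you may want exact values to compare yours against. Here is a minimal sketch using the binomial formula, assuming each person independently says "Tim" with probability 0.5; the helper name `exact_tail` is hypothetical.

```python
from math import comb

def exact_tail(n_people, observed, p=0.5):
    """P(observed or more "Tim" answers) when each person says "Tim" with probability p."""
    return sum(
        comb(n_people, k) * p**k * (1 - p) ** (n_people - k)
        for k in range(observed, n_people + 1)
    )

# The two class samples: 23 of 25, and 37 of 60.
for n_people, observed in [(25, 23), (60, 37)]:
    print(f"{observed} of {n_people}: {exact_tail(n_people, observed):.6f}")
```

Your simulator answers should land close to these, though probably not exactly on them.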

Here’s a new term – each of those probabilities you just calculated is called a “P-value” (LOTS more on them to come). P-values are a “decision engine”, if you will, to help you choose between two opposing ideas. In our case, the two opposing ideas (as you already have figured out, I’m sure!) are “People have no preference in these names and faces” and “People HAVE a preference in these names and faces.”

  9. (1 point) Remember, you’re still pretending to be this researcher. Which of those two ideas would most likely be your claim, before running the experiment[1]?
  10. (1 point) Would a “larger” or “smaller” P-value make it easier to believe this research claim?
  11. (1 point) Try, as best you can, to explain your last answer!

[1] This might seem like a silly research question – but that’s just to explain the mathematics. It could just as easily be a study of a drug’s effectiveness, or a safety measure’s failure rate, or the impact of life experience on educational results.