PART C - SAMPLING

Understanding Research Methods By Mildred Patten

SAMPLING (Biased or Unbiased)

Infer or generalize to population from samples

Define population

Draw sample or conduct census

Poor sampling=poor inferences (two main concerns)

Sample size

Sample selection method (biases)

Unbiased (every population member has an equal chance of being included – names in a hat)

Common biased methods (fail to identify all population members, convenience samples, volunteers)

Simple Random Sampling

Simple random requires names in a hat OR use of a table of random numbers

Each population member must be assigned a number with the same number of digits; use a random starting place in the table

Random samples are subject to error

Called random sampling error (when the sample differs from the population)

Sample size affects random sampling error

Increase sample size to decrease sampling error

Bias is the result of NONRANDOM or SYSTEMATIC ERRORS

Sample size does not affect biases; biases are the result of something the researcher did
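
The points above on simple random sampling and random sampling error can be sketched in Python. This is a minimal illustration, not a procedure from the text: the population of 10,000 scores is made up, and the function names `simple_random_sample` and `mean_abs_sampling_error` are hypothetical. It draws names "from a hat" with the standard `random` module and compares the average sampling error for a small and a large sample.

```python
import random
import statistics

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def mean_abs_sampling_error(population, n, trials=500, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    pop_mean = statistics.mean(population)
    errors = [
        abs(statistics.mean(rng.sample(population, n)) - pop_mean)
        for _ in range(trials)
    ]
    return statistics.mean(errors)

# Hypothetical population: 10,000 made-up test scores.
_rng = random.Random(42)
population = [_rng.gauss(100, 15) for _ in range(10_000)]

small = mean_abs_sampling_error(population, 25)    # small samples
large = mean_abs_sampling_error(population, 400)   # larger samples
print(f"n=25: {small:.2f}   n=400: {large:.2f}")
```

The larger sample should show the smaller average error. Nothing in this sketch models bias: if the list handed to `simple_random_sample` left out part of the population, a bigger `n` would not repair that.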

SYSTEMATIC SAMPLING (or Nth sampling)

Use a randomly ordered list if possible (alphabetical is sometimes used)

Select a random starting point

N is equal to the number in the population divided by the number in the desired sample (population size ÷ sample size)

Make sure to go completely through the list so all members have an equal chance of selection
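
A minimal sketch of the steps above, assuming the list is already in a suitable order. The function name `systematic_sample` is hypothetical. When the population size is not an exact multiple of the sample size, this version returns slightly more members than requested rather than stopping before the end of the list.

```python
import random

def systematic_sample(ordered_list, sample_size, seed=None):
    """Select every Nth member after a random start within the first interval."""
    # N = population size divided by desired sample size
    interval = len(ordered_list) // sample_size
    if interval < 1:
        raise ValueError("sample_size cannot exceed the population size")
    # Random starting point somewhere in the first N members
    start = random.Random(seed).randrange(interval)
    # Step through the whole list, taking every Nth member
    return ordered_list[start::interval]

# Hypothetical list of 30,000 student ID numbers; a sample of 300 gives N = 100.
students = list(range(30_000))
sample = systematic_sample(students, 300, seed=3)
print(len(sample), sample[:3])
```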

Stratified Random Sampling

Random sampling is unbiased, but sampling error can occur

To be more precise and reduce sampling error, sometimes use stratified sampling

Population is divided into relevant strata (multiple strata are often used such as age, sex, location)

Usually proportional – the same percentage is drawn at random from each stratum, so the sample mirrors each stratum’s share of the population

Stratification ensures small subgroups are correctly represented in the sample
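
Proportional stratified sampling can be sketched as follows. The function name `proportional_stratified_sample`, the `stratum_of` callback, and the urban/rural/suburban population are all illustrative assumptions. Each stratum's quota is rounded, so the total may differ from the requested size by a member or two when the proportions do not divide evenly.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(members, stratum_of, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)

    total = len(members)
    sample = []
    for group in strata.values():
        # Quota = sample size times the stratum's share of the population
        quota = round(sample_size * len(group) / total)
        sample.extend(rng.sample(group, quota))
    return sample

# Hypothetical population: 60% urban, 30% rural, 10% suburban.
members = (
    [("urban", i) for i in range(600)]
    + [("rural", i) for i in range(300)]
    + [("suburban", i) for i in range(100)]
)
strat_sample = proportional_stratified_sample(members, lambda m: m[0], 100, seed=1)
```

With a sample of 100, the small suburban stratum is guaranteed 10 members, which a simple random sample would only deliver on average.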

Other Methods Of Sampling

Cluster – randomly select groups, not individuals

Beware: members within a cluster tend to be more alike than the population as a whole, so a large number of clusters (and possibly stratification) may be needed

Purposive – select those with desired info

Not random and generalizations are dangerous

Snowball – to locate subjects hard to find (addicts)

Bias is presumed in these, but best available

Multistage – stratify by urban, suburban, rural counties (clusters), then randomly select houses and individuals within them
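
The multistage idea above can be sketched as nested random draws. Everything here is a made-up illustration: the function name `multistage_sample`, the shape of the sampling frame, and the counts of counties, houses, and residents are assumptions, not figures from the text.

```python
import random

def multistage_sample(counties, counties_per_stratum, houses_per_county, seed=None):
    """counties maps stratum -> {county name -> list of households};
    each household is a list of individuals."""
    rng = random.Random(seed)
    chosen = []
    for stratum, county_map in counties.items():
        # Stage 1: randomly select clusters (counties) within each stratum
        for county in rng.sample(sorted(county_map), counties_per_stratum):
            # Stage 2: randomly select households within the chosen county
            for household in rng.sample(county_map[county], houses_per_county):
                # Stage 3: randomly select one individual within the household
                chosen.append((stratum, county, rng.choice(household)))
    return chosen

# Hypothetical frame: 3 strata x 5 counties x 20 households x 3 residents.
frame = {
    stratum: {
        f"{stratum}-county-{c}": [
            [f"{stratum}-{c}-{h}-{p}" for p in range(3)] for h in range(20)
        ]
        for c in range(5)
    }
    for stratum in ("urban", "suburban", "rural")
}
picked = multistage_sample(frame, counties_per_stratum=2, houses_per_county=4, seed=7)
print(len(picked))
```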

Topics 20-23 REVIEW QUESTIONS

What is the first step in sampling?

What is the best way to draw an unbiased sample?

Identify the following sampling methods:

JC, Kingsport, and Bristol all have 30,000 people in them and I draw a sample of 66 customers from each city

I randomly select 15 businesses in each of the three cities and send them surveys to give to their employees

I get a list of all 30,000 students at UT and select every 100th student to get a sample of 300 students

I ask everyone I know in chemotherapy to provide the name of someone else who is also receiving chemotherapy so I can interview them

Sampling and Demographics

Demography – study of people and populations

Characteristics of participants (age, sex, income, marital status, etc.)

Describe participants so reader can judge usefulness of results

Can compare to population to see if representative

Can track response rates by demographics to see if bias in responding

Sample Size

Elimination of sample bias is most crucial element

Adequacy of sample size is second most crucial

Increasing sample size increases precision

Meaning results vary little from sample to sample

Larger is better, but there is a diminishing return

Larger sample does nothing to eliminate bias
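
The diminishing return can be made concrete with the standard error of a sample mean, which for a simple random sample is the standard deviation divided by the square root of the sample size. The sketch below assumes a hypothetical standard deviation of 15; the function name `standard_error` is illustrative. Because of the square root, quadrupling the sample only halves the standard error.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for a simple random sample of size n."""
    return sd / math.sqrt(n)

sd = 15.0  # hypothetical population standard deviation
for n in (30, 120, 480, 1500):
    print(n, round(standard_error(sd, n), 2))
```

The standard error describes random sampling error only. A biased selection method shifts every sample in the same direction, and no value of `n` in this formula corrects for that.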

SAMPLE SIZE: A closer look

How many needed depends on:

Money and time available (more subjects=more $/time)

How often the effect being measured occurs (rare=more subjects)

Variability in population (much variation=more subjects)

Size of difference in groups (small diff=more subjects)

Pilot study (20-100 subjects) gives insight including expected return rates

It’s relative, but 1500 is the upper limit, with about 30 the lower limit. See Table 2, Recommended Sample Sizes

Topics 24-26 REVIEW QUESTIONS

How can you increase your sampling accuracy?

I want to determine TC students’ attitudes about the on-line library and ask my students to complete a survey. What type of sample is this, and are there any reasons not to trust it?

How many subjects is the absolute minimum needed to have faith in results? What about the maximum?

PART D - INSTRUMENTATION

Introduction To Validity

Valid instrument measures what it is supposed to

performs the function it purports to

Tests/instruments are valid for a particular purpose

Must first state the purpose of the measure

It’s a matter of degree – how valid is it?

Reasons For Invalidity

Testing only a sample of the construct studied (some samples better than others)

Some constructs/traits are elusive

how to measure honesty/cheerfulness – hard to get the full essence

Hard to quantify constructs BUT replication is impossible if you do not

Judgmental Validity

Content validity – Judgments on appropriateness of content (used especially with achievement tests)

3 Parts to content validity

Broad sample content (cover entire content)

Important material emphasized (more test items on it)

Different skill levels covered (knowledge, comprehension, application, analysis, synthesis, evaluation)

Face validity – judge whether, on its face, the instrument measures what it’s intended to

Occasionally face validity is intentionally low (to hide purpose of research)

Empirical Validity

Criterion-related validity – comparisons between a measure and some criterion

Predictive validity – criterion occurs in future

Concurrent validity – criterion occurs now

Validity coefficient (0 to 1.0); closer to 1.0 is better (means high/high and low/low relationship)

- Also can go to -1.0 (a high/low and low/high relationship)

- Rarely are validity coefficients 1.0, too much variation in complex constructs
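
A validity coefficient of this kind is a correlation coefficient. As a minimal sketch, the Pearson correlation between scores on a measure and scores on the criterion can be computed as below. The function name `pearson_r` and the admission-test and GPA figures are made up for illustration, in the spirit of a predictive validity check.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: admission test scores and later first-year GPA.
test_scores = [52, 61, 70, 48, 83, 75, 66, 58]
later_gpa = [2.4, 2.9, 3.1, 2.2, 3.8, 3.3, 3.0, 2.7]
r = pearson_r(test_scores, later_gpa)
print(round(r, 2))
```

A value near 1.0 means high scores go with high criterion values and low with low; a value near -1.0 means high goes with low.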

JUDGMENT-EMPIRICAL (Construct Validity)

Construct validity uses both judgment and empirical methods

Measure hypothetical constructs (not concrete, constructed)

Cannot see, touch, feel, but can see indicators of them (honesty, love, fear, depression)

Infer construct existence by observing collection of indicators

To test, correlate the measure with some other measure of the construct, or test to see if the expected effects of the construct are present. Example: correlate a depression measure with a happiness scale you develop; also correlate your scale with a count of smiles/hour. Each test is an indirect measure of validity, so use a series of tests to establish construct validity

In establishing validity of scale or test, look at all measures

Judgment – content/face

Empirical – predictive/concurrent

Judgment/Empirical – construct

Topics 27-30 REVIEW QUESTIONS

Establishing validity requires what?

Name and explain the four types of validity:

RELIABILITY (and its relationship to validity)

Reliable = consistent (take two measures and compare them to see if consistent)

Highly reliable measures may be valid OR not

Validity is most important, but to be useful must have valid AND reliable measurements

Can have reliability without validity, but not validity without reliability

Text example - shooting at targets

Measures of Reliability

Compare two measures and calculate a reliability coefficient

Correlation between two quantitative measures

Closer to 1.0 the better (.80 is high; .50 can be useful)

Test/retest (test same group with same instrument at two times – needs to be enduring trait)

Inter-rater/observer (consistency between 2 raters)

Parallel or equivalent forms (2 forms of instrument)

Internal consistency (split-half; intra-rater; Cronbach’s alpha)
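
As one example of an internal consistency coefficient, Cronbach's alpha can be computed from item-level scores with the usual formula: k/(k-1) times (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items. The sketch below is illustrative only; the function name `cronbach_alpha` and the six respondents' answers are made up.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding k item scores."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))  # regroup scores by item
    sum_item_vars = sum(statistics.variance(col) for col in columns)
    total_var = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Hypothetical responses: 6 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

As with the other reliability coefficients, values closer to 1.0 indicate that the items are measuring consistently with one another.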

Topics 31-33 REVIEW QUESTIONS

What does reliability mean?

Name 4 ways to establish reliability:

Can you have reliability without validity?

Can you have validity without reliability?

Norm-Referenced v. Criterion-Referenced Tests

NRTs compare individual performance with group (TCAP)

Items are intentionally of medium difficulty

Shows how a local group differs from the norm

CRTs measure whether an individual has met performance standards

Item difficulty of little concern; instead want examinee to meet a set level of performance

Describes what examinee knows and doesn’t know

Measures of Optimum Performance

Achievement tests – measure knowledge and skills acquired (measure effectiveness of instruction to see if objectives met)

Objectively scored - multiple choice (provides snapshot of achievement)

Subjectively scored - essays, performance, products (for reliability need standard scoring system such as checklist or rating scale Excellent/Good/Fair/Poor)

Aptitude tests – predict achievement (SAT/college)

Intelligence tests – predict general achievement

Measures of Typical Performance

Examinees’ best performance is measured with achievement/aptitude/intelligence tests

Typical performance desired for personality traits (attitudes/interests/disposition)

Social desirability bias is a problem (and Hawthorne or guinea pig effect)

Anonymous surveying helps

Direct unobtrusive observation useful

Projective techniques can be used (Ink blots; Most employees…); Note that these require expertise

Likert most common attitude scale (SA to SD)

One statement per topic; multiple statements cover all components of the construct; mix neg/positive statements

Topics 34-36 REVIEW QUESTIONS

Is the driving part of the test you take to get a driver’s license a norm-referenced or criterion-referenced test?

If I want to find out how much people cheat on their taxes, what type of bias might occur in doing surveys?

When using a Likert scale, several questions are asked about one topic, then the answers are totaled. Why is this better than asking just one question?