Chapter 6: Validity Concepts

Definition

•  Validity: accuracy; the degree to which evidence supports the interpretation of test performance

Face Validity

•  Degree to which a test superficially appears to measure the intended domain

•  e.g., a math test full of computation problems looks, on its face, like a measure of math ability

Establishing Construct Validity

•  Content Validity

•  Criterion-related Validity

•  Comprehensive evaluation of the theoretical framework for a test

Content-related Validity

•  Systematic examination of test content against the domain it is meant to cover

•  Scores should be free from the influence of irrelevant variables

Content-related validity: Process

•  Complete an examination of the literature

•  Generate an adequate sampling of the “item universe”

•  Domain must be proportionately represented in test

Content-related Validity: Procedure

•  Domain in consideration must be fully described

•  Description of procedures for item appropriateness & representativeness

•  Cover subject matter and objectives of testing

Content Validity Ratio (CVR)

•  Content validity can be quantified with Lawshe's (1975) Content Validity Ratio:

•  CVR = (n_e - N/2) / (N/2)

•  n_e = number of panelists who agree an item is essential

•  N = total number of panelists

CVR Example

•  Gonzalez Anxiety Scale has 50 items

•  20 experts rate each item

•  Rating scale: not essential, somewhat essential, or essential

•  What is the CVR if 9 panelists rate item 1 essential?

•  Table provided on p. 179

•  Should we keep item 1?
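The CVR computation for the slide example can be sketched in Python (the function name is illustrative; the decision threshold itself comes from Lawshe's table on p. 179, not from the code):

```python
# Content Validity Ratio (Lawshe, 1975): CVR = (n_e - N/2) / (N/2)

def cvr(n_essential: int, n_panelists: int) -> float:
    """Content Validity Ratio for a single item."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Slide example: 20 experts, 9 rate item 1 "essential"
print(cvr(9, 20))  # -0.1
```

A negative CVR means fewer than half the panelists rated the item essential, so item 1 falls below any minimum value in the table and would likely be dropped.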

Content Validity: Limitations

•  Biases

•  Cultural relativism

•  Level of expertise of the panelists

Criterion-related validity

•  Index of relationship between test and criterion

•  A criterion should be similar to the test, reliable, and valid

•  SAT predicts college performance (GPA)

Two kinds of Criterion Validity

•  Concurrent

•  Predictive

–  The distinction is based on when the criterion is measured (temporal relationship)

Concurrent Criterion-related Validity

•  Test and criterion are measured at roughly the same time

•  Used when it is impractical to wait for a later evaluation of the criterion

–  e.g., a diagnostic measure to generate diagnostic impression

Predictive Criterion-related Validity

•  Test and criterion are compared over a period of time

•  Used in Decision Theory

–  e.g., A job abilities test is used to predict job performance

CRV: Limitations

•  Possible problems from "criterion contamination" (the criterion measure is influenced by knowledge of test scores)

•  Coefficient affected by range of the sample

•  Homogeneous vs. heterogeneous sample

Construct-related validity

•  Extent to which a test measures a theoretical construct

•  Construct: an unobservable psychological trait or attribute

Construct-related validity: Process

•  Theoretical relationships specified

•  Empirical relationships examined

•  Empirical evidence interpreted

Construct Validity: Techniques

•  Convergent validation

•  Discriminant validation

•  Factor Analysis

•  Multitrait-Multimethod Matrix

•  Reliability

Convergent Validation

•  A test should correlate highly with another test that is theoretically related

–  e.g., a math test and numerical reasoning test

Discriminant Validation

•  A test ought not to correlate with a theoretically unrelated test

–  e.g., a self-esteem test and a comprehension test

Factor Analysis

•  Descriptive statistical technique

•  Analyzes the underlying factors/dimensions of the test

•  Factorial validity

Internal Consistency

•  Consider homogeneity of a test

•  Subtests (or items) correlate with test total score

•  Provides evidence that the test measures a single concept

Predicted Change Over Time

•  Examining pre and post test scores

•  Assessing predicted change after an experimental intervention

–  e.g., a depression intervention should improve (change) scores on a depression scale

Predicted Differences Between Distinct Groups

•  Analyzing scores of contrasted groups

•  Depressed sample scores should differ from the non-depressed sample

Multitrait Multimethod Matrix (MTMM Matrix)

•  Campbell & Fiske (1959)

•  Correlation of 2 or more traits by 2 or more methods

•  Methods: self-report vs. spousal ratings vs. peer observations

•  Traits: job satisfaction vs. marital satisfaction vs. self-satisfaction

Reliability Coefficients

•  Monotrait monomethod

–  Same trait, same method

Validity Coefficients

•  Squaring the validity coefficient (r_xy²) gives the proportion of criterion variance accounted for by the test (predictor)
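A minimal numeric illustration, with a hypothetical coefficient:

```python
# Proportion of criterion variance accounted for by the predictor: r_xy squared
r_xy = 0.60                         # hypothetical validity coefficient
variance_explained = r_xy ** 2
print(f"{variance_explained:.0%}")  # 36%
```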

Monotrait Heteromethod

•  Same trait, different method

Heterotrait monomethod

•  Different trait, same method

Heterotrait heteromethod

•  Different trait, different method
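The four cell types can be labeled mechanically. A sketch with hypothetical traits and methods (the names are illustrative only):

```python
from itertools import combinations

# Each variable is a (trait, method) pair -- hypothetical examples
variables = [
    ("anxiety", "self-report"),
    ("anxiety", "peer-rating"),
    ("depression", "self-report"),
    ("depression", "peer-rating"),
]

def cell_type(a, b):
    """Classify an MTMM matrix cell by trait and method agreement."""
    trait = "monotrait" if a[0] == b[0] else "heterotrait"
    method = "monomethod" if a[1] == b[1] else "heteromethod"
    return f"{trait}-{method}"

for a, b in combinations(variables, 2):
    print(a, b, "->", cell_type(a, b))
```

Monotrait-monomethod cells (a variable paired with itself) form the reliability diagonal, so they do not appear among the distinct pairs above.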

Validity Coefficient Magnitude

•  Nature of the group

•  Variability in gender, age, education, race

•  The magnitude of the validity coefficient can vary across groups

Sample range

•  Homogeneity v. heterogeneity of the sample

•  The wider the range of scores (variability), the higher the correlation

•  Comparison of extremely different (contrasted) groups

Test Reliability

•  A validity coefficient is limited by the reliability of the test and the reliability of the criterion

•  An unreliable test is an invalid test

•  r_xy = validity coefficient

•  r_xx = test reliability

•  r_yy = criterion reliability

•  Theoretical ceiling: r_xy ≤ √(r_xx × r_yy)
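This limit can be expressed as r_xy ≤ √(r_xx · r_yy); a sketch with hypothetical reliabilities:

```python
import math

# A validity coefficient cannot exceed sqrt(r_xx * r_yy),
# the ceiling imposed by test and criterion reliability.

def max_validity(r_xx: float, r_yy: float) -> float:
    """Theoretical upper bound on r_xy given the two reliabilities."""
    return math.sqrt(r_xx * r_yy)

# Hypothetical reliabilities
print(round(max_validity(0.81, 0.64), 2))  # 0.72
```

Even with a perfect underlying relationship, measurement error in either the test or the criterion caps the observable validity.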

Test-criterion Relationship

•  The test-criterion relationship is assumed to be linear with equal variances

•  Homoscedasticity means equal variances

•  Curvilinear relationships or unequal variances (heteroscedasticity) violate these assumptions

Test Bias

•  Constant or systematic error in a test

•  A consideration when looking at cross-cultural issues

•  Is the test fair to all groups?

Differential Validity

•  Evaluate differences between validity coefficients across samples using cross-validation

•  Analysis could reveal shrinkage (a drop in the coefficient in the new sample)

Predictive Validity Coefficient Error

•  Margin of error to be expected in an individual's predicted criterion score

•  Is there error in test validity?

•  Perform a Standard Error of Estimate (SEE)

Standard Error of Estimate (SEE)

•  SEE = s_y × √(1 − r_xy²)

•  s_y = standard deviation of criterion scores

•  r_xy² = squared validity coefficient

•  Example: Y = 70, s_y = 10, r_xy = .80

•  What is SEE?
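Working the slide example in Python (SEE = s_y × √(1 − r_xy²)):

```python
import math

# Standard error of estimate: SEE = s_y * sqrt(1 - r_xy**2)

def see(s_y: float, r_xy: float) -> float:
    """Expected margin of error around a predicted criterion score."""
    return s_y * math.sqrt(1 - r_xy ** 2)

# Slide example: s_y = 10, r_xy = .80
print(round(see(10, 0.80), 2))  # 6.0
```

So a predicted criterion score of Y = 70 carries an expected error of about ±6 points.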

Decision Theory: Cronbach & Gleser (1965)

•  Criterion related-predictive validity

•  Expectancy data used in job selection testing

•  How well does a test predict job performance?

Possible Outcomes

•  1) Valid acceptance: true positive

•  2) Valid rejection: true negative

•  3) False negative: invalid rejection

•  4) False positive: invalid acceptance
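The four outcomes can be sketched as a decision function (purely illustrative):

```python
# Classify a selection decision against the actual outcome.

def outcome(predicted_success: bool, actual_success: bool) -> str:
    """Label one of the four decision-theory outcomes."""
    if predicted_success and actual_success:
        return "valid acceptance (true positive)"
    if not predicted_success and not actual_success:
        return "valid rejection (true negative)"
    if not predicted_success and actual_success:
        return "false negative"
    return "false positive"

print(outcome(True, True))   # valid acceptance (true positive)
print(outcome(True, False))  # false positive
```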

Incremental Validity

•  Base rate: the proportion of successful outcomes without use of the test

•  Cut-off score: the test score used to accept or reject candidates

•  Incremental validity is the increase in predictive validity, over the base rate, because of a test
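A minimal numeric sketch of that definition (all figures hypothetical):

```python
# Incremental validity: hit rate with the test minus the base rate.
base_rate = 0.50           # proportion successful when selecting without the test
hit_rate_with_test = 0.65  # proportion successful among those above the cutoff

incremental_validity = hit_rate_with_test - base_rate
print(f"incremental validity = {incremental_validity:.2f}")  # 0.15
```

A test is worth using only when this gain over the base rate justifies its cost.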

Validity Summary

•  Examiner is interested in obtaining information about:

–  Examinee's knowledge of a particular domain

–  Amount of the construct possessed by examinees in a specified domain

–  Examinee's likely performance