Chapter 6: Validity Concepts
Definition
• Accuracy: validate the interpretation of test performance
Face Validity
• Degree to which a test superficially appears to measure its intended domain
– e.g., a math test
Establishing Construct Validity
• Content Validity
• Criterion-related Validity
• Comprehensive evaluation of the theoretical framework for a test
Content-related Validity
• Systematic examination
• Free from irrelevant variable influence
Content-related validity: Process
• Complete an examination of the literature
• Generate an adequate sampling of the “item universe”
• Domain must be proportionately represented in test
Content-related Validity: Procedure
• Domain in consideration must be fully described
• Description of procedures for item appropriateness & representativeness
• Cover subject matter and objectives of testing
Content Validity Ratio (CVR)
• Content validity can be quantified
• $CVR = \dfrac{n_e - N/2}{N/2}$
• $n_e$ = number of panelists who agree an item is essential
• $N$ = total number of panelists
CVR Example
• Gonzalez Anxiety Scale has 50 items
• 20 experts rate each item
• not essential, somewhat essential, and essential
• What is the CVR if 9 panelists rate item 1 essential?
• Table provided on p. 179
• Should we keep item 1?
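The example above can be worked through with a short sketch. It assumes the usual form of the ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating the item essential and N is the panel size. The function name `content_validity_ratio` is illustrative and not from the slides.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Return CVR = (n_e - N/2) / (N/2) for a single item."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half_panel = n_panelists / 2
    return (n_essential - half_panel) / half_panel


# Item 1 of the example: 9 of 20 panelists rate it essential.
cvr_item_1 = content_validity_ratio(n_essential=9, n_panelists=20)
print(round(cvr_item_1, 2))
```

A negative CVR means fewer than half of the panelists rated the item essential, so item 1 would fall below any positive minimum value in a table of critical CVRs. On that reading, the item is a candidate for revision or removal.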
Content Validity: Limitations
• Biases
• Cultural relativism
• Level of expertise of the panelists
Criterion-related validity
• Index of relationship between test and criterion
• A criterion should be similar to the test, reliable, and valid
• SAT predicts college performance (GPA)
Two kinds of Criterion Validity
• Concurrent
• Predictive
– Based on temporal (time) estimates
Concurrent Criterion-related Validity
• Test and criterion are measured at roughly the same time
• Impractical to wait for a secondary evaluation
– e.g., a diagnostic measure to generate diagnostic impression
Predictive Criterion-related Validity
• Test and criterion are compared over a period of time
• Used in Decision Theory
– e.g., A job abilities test is used to predict job performance
CRV: Limitations
• Possible problems from "criterion contamination"
• Coefficient affected by range of the sample
• Homogeneous vs. heterogeneous sample
Construct-related validity
• Extent to which a test measures a theoretical construct
• Construct: psychological trait
Construct-related validity: Process
• Theoretical relationships specified
• Empirical relationships examined
• Empirical evidence interpreted
Construct Validity: Techniques
• Convergent validation
• Discriminant validation
• Factor Analysis
• Multitrait-Multimethod Matrix
• Reliability
Convergent Validation
• A test should correlate highly with another test that is theoretically related
– e.g., a math test and numerical reasoning test
Discriminant Validation
• A test ought not to correlate with a theoretically unrelated test
– e.g., a self-esteem test and a comprehension test
Factor Analysis
• Descriptive statistical technique
• Analyzing the factors/dimensions of the test
• Factorial validity
Internal Consistency
• Consider homogeneity of a test
• Subtests (or items) correlate with test total score
• Provides evidence that the test measures a single concept
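The item-total check described above can be sketched as follows. The item scores are made-up values for five examinees, and `pearson_r` is a hand-written helper rather than a library call, so treat this as an illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data: each list holds one item's scores for five examinees.
item_scores = {
    "item_1": [1, 2, 3, 4, 5],
    "item_2": [2, 2, 3, 5, 4],
    "item_3": [5, 1, 4, 2, 3],
}

# Total score per examinee = sum of that examinee's item scores.
total_scores = [sum(scores) for scores in zip(*item_scores.values())]

item_total_r = {
    name: pearson_r(scores, total_scores)
    for name, scores in item_scores.items()
}
for name, r in item_total_r.items():
    print(name, round(r, 2))
```

Items with a low item-total correlation (item_3 in this made-up data) are the ones that weaken the case that the test measures a single concept.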
Predicted Change Over Time
• Examining pre and post test scores
• Assessing predicted change after an experimental intervention
– e.g., a depression intervention should improve (change) scores on a depression scale
Predicted Differences Between Distinct Groups
• Analyzing scores of contrasted groups
• Depressed sample scores should differ from the non-depressed sample
Multitrait Multimethod Matrix (MTMM Matrix)
• Campbell & Fiske (1959)
• Correlation of 2 or more traits by 2 or more methods
• Methods: self-report vs. spousal ratings vs. peer observations
• Traits: job satisfaction vs. marital satisfaction vs. self-satisfaction
Reliability Coefficients
• Monotrait monomethod
– Same trait, same method
Validity Coefficients
• Squaring the validity coefficient gives the proportion of criterion variance accounted for by the test (predictor)
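A short worked example of this squaring step, assuming an illustrative validity coefficient of .80:

```latex
r_{xy} = .80 \quad\Rightarrow\quad r_{xy}^{2} = (.80)^{2} = .64
```

Under that assumption the test accounts for 64% of the variance in the criterion, and the remaining 36% is left unexplained.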
Monotrait Heteromethod
• Same trait, different method
Heterotrait monomethod
• Different trait, same method
Heterotrait heteromethod
• Different trait, different method
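The four kinds of MTMM cell can be told apart by two questions: is the trait the same, and is the method the same? A minimal sketch, with `classify_mtmm_cell` as an illustrative name and the traits and methods taken from the examples above:

```python
def classify_mtmm_cell(trait_a: str, method_a: str,
                       trait_b: str, method_b: str) -> str:
    """Name the MTMM cell for the correlation between two measures."""
    same_trait = trait_a == trait_b
    same_method = method_a == method_b
    if same_trait and same_method:
        return "monotrait-monomethod"      # reliability coefficient
    if same_trait:
        return "monotrait-heteromethod"    # same trait, different method
    if same_method:
        return "heterotrait-monomethod"    # different trait, same method
    return "heterotrait-heteromethod"      # different trait, different method


pairs = [
    ("job satisfaction", "self-report", "job satisfaction", "self-report"),
    ("job satisfaction", "self-report", "job satisfaction", "spousal ratings"),
    ("job satisfaction", "self-report", "marital satisfaction", "self-report"),
    ("job satisfaction", "self-report", "marital satisfaction", "spousal ratings"),
]
for pair in pairs:
    print(pair, "->", classify_mtmm_cell(*pair))
```

In Campbell and Fiske's approach, the monotrait-heteromethod correlations are expected to be the largest of the three validity-related cell types, and the heterotrait-heteromethod correlations the smallest.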
Validity Coefficient Magnitude
• Nature of the group
• Variability in gender, age, education, race
• Validity coefficient tends to decrease across groups
Sample range
• Homogeneity v. heterogeneity of the sample
• The wider the range of scores (variability), the higher the correlation
• Comparison of extremely different (contrasted) groups
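The effect of sample range on the validity coefficient can be illustrated with a small sketch. The scores are fabricated for illustration (a criterion that tracks the test score plus a fixed alternating error), and `pearson_r` is a hand-written helper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical heterogeneous sample: test scores 1..10.
test_scores = list(range(1, 11))
error = [2, -2] * 5  # fixed, made-up error pattern
criterion = [x + e for x, e in zip(test_scores, error)]

r_full = pearson_r(test_scores, criterion)

# Homogeneous (range-restricted) sample: only examinees scoring 7 or above.
kept = [(x, y) for x, y in zip(test_scores, criterion) if x >= 7]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print("full range:      ", round(r_full, 2))
print("restricted range:", round(r_restricted, 2))
```

With the same underlying relationship, the restricted sample yields a much smaller coefficient than the full-range sample. This is why validity coefficients computed on a preselected, homogeneous group tend to understate the test-criterion relationship.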
Test Reliability
• A validity coefficient is limited by the reliability of the test and reliability of the criterion
• An unreliable test is an invalid test
• $r_{xy} \le \sqrt{r_{xx}\, r_{yy}}$
• $r_{xy}$ = validity coefficient
• $r_{xx}$ = test reliability
• $r_{yy}$ = criterion reliability
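The ceiling that reliability places on validity can be sketched numerically. This assumes the classical result that the validity coefficient cannot exceed the square root of the product of the two reliabilities. The reliability values below are illustrative, and `max_validity` is a hypothetical helper name.

```python
from math import sqrt


def max_validity(r_xx: float, r_yy: float) -> float:
    """Upper bound on r_xy given test reliability r_xx and criterion reliability r_yy."""
    for value in (r_xx, r_yy):
        if not 0.0 <= value <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return sqrt(r_xx * r_yy)


# Illustrative values: test reliability .81, criterion reliability .64.
ceiling = max_validity(r_xx=0.81, r_yy=0.64)
print(round(ceiling, 2))
```

Under these assumed values, no validity coefficient above roughly .72 could be observed, however well the test measures the construct. If either reliability is 0, the ceiling is 0, which is the sense in which an unreliable test is an invalid test.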
Test-criterion Relationship
• The relationship is assumed to be linear, with equal variances
• Homoscedasticity means equal variances
• A curvilinear relationship or unequal variances (heteroscedasticity) violates these assumptions
Test Bias
• Constant or systematic error in a test
• A consideration when looking at cross-cultural issues
• Is the test fair to all groups?
Differential Validity
• Evaluate differences between the validity coefficients using cross-validation
• Analysis could reveal shrinkage
Predictive Validity Coefficient Error
• Margin of error to be expected in an individual's predicted criterion score
• Is there error in test validity?
• Compute the standard error of estimate (SEE)
Standard Error of Estimate (SEE)
• $SEE = s_y \sqrt{1 - r_{xy}^2}$
• $s_y$ = standard deviation of criterion scores
• $r_{xy}^2$ = square of the validity coefficient
• Example Y = 70, sy = 10, rxy = .80
• What is SEE?
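The example can be worked through with a short sketch. It assumes the standard formula SEE = s_y * sqrt(1 - r_xy^2), and it treats Y = 70 as the predicted criterion score, which is an interpretation of the slide. The function name is illustrative.

```python
from math import sqrt


def standard_error_of_estimate(s_y: float, r_xy: float) -> float:
    """SEE = s_y * sqrt(1 - r_xy**2)."""
    if s_y < 0:
        raise ValueError("s_y must be non-negative")
    if not -1.0 <= r_xy <= 1.0:
        raise ValueError("r_xy must lie between -1 and 1")
    return s_y * sqrt(1 - r_xy ** 2)


predicted_y = 70   # assumed to be the predicted criterion score
see = standard_error_of_estimate(s_y=10, r_xy=0.80)

# Band of one SEE on either side of the predicted score.
lower, upper = predicted_y - see, predicted_y + see
print(round(see, 2), round(lower, 2), round(upper, 2))
```

With these values the SEE is 6, so the band of one SEE around the predicted score runs from about 64 to 76. If prediction errors are roughly normal, about two thirds of actual criterion scores would be expected to fall in that band.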
Decision Theory (Cronbach & Glaser, 1965)
• Predictive criterion-related validity
• Expectancy data used in job selection testing
• How well does a test predict job performance?
Possible Outcomes
• 1) Valid acceptance: True positive
• 2) Valid rejection: True Negative
• 3) False negative
• 4) False positive
Incremental Validity
• Base rate
• Cut-off score
• Incremental validity is the increase in predictive validity, over the base rate, because of a test
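The four outcomes and the idea of incremental validity can be tied together in a small sketch. The counts are hypothetical. The base rate is taken as the proportion of all applicants who would succeed, and incremental validity as the success rate among those accepted by the test minus that base rate. The class and property names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionOutcomes:
    true_positive: int    # accepted and successful (valid acceptance)
    true_negative: int    # rejected and would have failed (valid rejection)
    false_negative: int   # rejected but would have succeeded
    false_positive: int   # accepted but unsuccessful

    @property
    def total(self) -> int:
        return (self.true_positive + self.true_negative
                + self.false_negative + self.false_positive)

    @property
    def base_rate(self) -> float:
        """Proportion who succeed when the test is not used for selection."""
        return (self.true_positive + self.false_negative) / self.total

    @property
    def success_rate_among_accepted(self) -> float:
        """Proportion who succeed among those above the cut-off score."""
        return self.true_positive / (self.true_positive + self.false_positive)

    @property
    def hit_rate(self) -> float:
        """Proportion of correct decisions (valid acceptances and rejections)."""
        return (self.true_positive + self.true_negative) / self.total

    @property
    def incremental_validity(self) -> float:
        return self.success_rate_among_accepted - self.base_rate


# Hypothetical counts for 100 applicants at one cut-off score.
outcomes = SelectionOutcomes(true_positive=40, true_negative=30,
                             false_negative=20, false_positive=10)
print(round(outcomes.base_rate, 2),
      round(outcomes.success_rate_among_accepted, 2),
      round(outcomes.incremental_validity, 2))
```

In this made-up case, 60% of applicants would succeed without the test and 80% of those accepted by the test succeed, so the test adds about 20 percentage points over the base rate. Moving the cut-off score changes all four counts and therefore the trade-off between false positives and false negatives.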
Validity Summary
• Examiner is interested in obtaining information about:
– Examinee's knowledge of a particular domain
– Amount of a construct possessed by the examinee
– Examinee's likely performance