
Supplemental Materials

Reflecting on Explanatory Ability: A Mechanism for Detecting Gaps in Causal Knowledge

by D. Johnson et al., 2016, JEP: General

Power Analysis and Sample Size Determination

Experiments 1–8

In Rozenblit and Keil (2002), means from Study 3 (Figure 4) and standard deviations from Study 2 (standard errors depicted in Figure 3) were interpolated from the graphs. To detect a similar decrease in overestimation from before (M = 4.7, SD = 2.41) to after (M = 3.5, SD = 2.41) explanation generation, with power of 0.80 at alpha = .05, for independent samples, an estimated sample size of n = 64 is required for each condition. Sample sizes for each condition ranged from 55 to 65 across studies, depending on the rate at which volunteer participants signed up on Amazon’s Mechanical Turk. No participants were excluded from analyses in any experiment.
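The sample size estimate for Experiments 1–8 can be approximated with the standard normal-approximation formula for comparing two independent means. The sketch below is illustrative only: the software used for the original calculation is not stated here, the function name `n_per_group` is hypothetical, and the code assumes a two-tailed test with a common standard deviation in both conditions. With the interpolated values above, this approximation gives 64 participants per condition.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mean1, mean2, sd, alpha=0.05, power=0.80):
    """Approximate per-group n for an independent-samples comparison.

    Assumptions (not stated in the source): two-tailed test, equal SDs,
    and a normal approximation to the t distribution.
    """
    d = abs(mean1 - mean2) / sd                      # standardized effect size (Cohen's d)
    z = NormalDist()                                 # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)               # critical value for two-tailed alpha
    z_power = z.inv_cdf(power)                       # quantile corresponding to desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)  # round up to whole participants


# Interpolated values from Rozenblit and Keil (2002), as reported above
n_exp_1_8 = n_per_group(4.7, 3.5, 2.41)
print(n_exp_1_8)
```

An exact calculation based on the noncentral t distribution can differ from this approximation by a participant or so per condition.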

Experiment 9

A between-subjects comparison on position extremity scores from before (M = 1.41, SE = .07, interpolated SD = .56) to after (M = 1.19, SE = .08, interpolated SD = .64) explanation generation was provided in Fernbach et al. (2013). To detect a similar decrease with power of 0.80 at alpha = .05, for independent samples, an estimated sample size of n = 112 is required for each condition.

Table S1.

Descriptive statistics for understanding rating response time(s) in Experiments 1, 2, and 5

Condition / Unguided / REA / Explanation Generation
M (SD) / M (SD) / M (SD)
Experiment 1
vacuum cleaner / 2.60 (1.00) / 3.29 (1.75) / 3.76 (1.66)
Experiment 2
Mean of 8 objects / 3.48 (0.99) / 4.07 (2.13) / -
Experiment 5
Mean of 4 objects / 3.00 (1.13) / 3.96 (3.49) / 4.47 (1.50)

Note. Experiment 1, N = 189; Experiment 2, N = 117; Experiment 5, N = 176.

Table S2.

Descriptive statistics for understanding ratings by object/policy in Experiments 2–5 and 7–9

Condition / Unguided / REA-noncausal / REA-causal / Explanation Generation
M (SD) / M (SD) / M (SD) / M (SD)
Experiment 2
piano key / 5.00 (1.45) / 3.86 (1.78) / - / -
smoke detector / 4.81 (1.61) / 4.51 (1.73) / - / -
VCR / 4.91 (1.73) / 4.29 (1.69) / - / -
ear buds / 4.57 (1.73) / 4.05 (1.94) / - / -
gas stove / 5.05 (1.40) / 3.76 (1.79) / - / -
treadmill / 5.19 (1.32) / 4.44 (1.51) / - / -
Polaroid camera / 4.05 (1.85) / 3.41 (1.95) / - / -
power drill / 4.50 (1.69) / 3.54 (1.87) / - / -
Experiment 3
printer / 5.19 (1.29) / 5.02 (1.54) / 4.10 (1.61) / -
treadmill / 5.16 (1.60) / 4.87 (1.46) / 4.05 (1.37) / -
power drill / 4.31 (1.61) / 4.15 (1.80) / 3.61 (1.55) / -
gas stove / 5.17 (1.43) / 4.58 (1.47) / 4.19 (1.74) / -
Experiment 4
spray bottle / 5.67 (1.38) / - / 4.58 (1.76) / -
printer / 5.44 (1.54) / - / 4.22 (1.81) / -
can opener / 5.38 (1.60) / - / 4.61 (1.64) / -
treadmill / 5.32 (1.55) / - / 4.72 (1.78) / -
umbrella / 5.26 (1.80) / - / 4.39 (1.89) / -
power drill / 5.46 (1.44) / - / 4.05 (2.10) / -
candle / 5.65 (1.55) / - / 4.94 (2.01) / -
gas stove / 5.39 (1.45) / - / 4.64 (1.86) / -
Experiment 5
vacuum cleaner / 5.40 (1.36) / - / 4.53 (1.32) / 4.02 (1.67)
Velcro / 5.62 (1.28) / - / 5.47 (1.33) / 4.72 (1.65)
computer mouse / 5.19 (1.73) / - / 3.95 (1.84) / 3.29 (1.83)
reading glasses / 5.76 (1.34) / - / 5.10 (1.29) / 3.84 (1.65)
Experiment 7 / Unguided / REA-causal / Total Steps / Explanation
printer / 4.63 (2.00) / 3.75 (2.01) / 3.49 (1.93) / 3.58 (1.47)
treadmill / 4.95 (1.76) / 3.76 (1.92) / 3.36 (1.99) / 3.35 (1.70)
power drill / 4.61 (1.86) / 3.67 (1.94) / 2.98 (2.01) / 3.07 (1.68)
gas stove / 4.95 (1.94) / 4.13 (1.91) / 3.80 (2.16) / 3.45 (1.81)
Experiment 8 / Unguided-5 / Unguided-20 / REA-5 / REA-20
printer / 5.09 (1.56) / 4.76 (1.82) / 3.42 (1.55) / 3.71 (1.82)
treadmill / 5.41 (1.57) / 4.58 (1.77) / 3.71 (1.81) / 3.39 (1.86)
power drill / 5.20 (1.66) / 4.36 (1.80) / 3.37 (1.78) / 3.00 (1.75)
gas stove / 5.25 (1.63) / 4.60 (1.92) / 3.86 (2.07) / 3.73 (1.74)
Experiment 9 / Unguided / REA-causal
teacher pay / 4.90 (1.48) / 4.29 (1.73) / - / -
cap and trade / 3.70 (1.83) / 2.85 (1.78) / - / -

Note. It is important to note that the means in Table S2 are computed between-subjects, whereas the means used for inferential statistics in the main text are computed between objects, but within-subjects.

Pilot study

A sample of 88 participants (66% female, 34% male; age: M = 39.95, range 18 to 68) was recruited from Amazon’s Mechanical Turk and received a small payment for participation. Participants rated 24 objects on their complexity (defined by how intricately parts work together to make the object function), estimated number of hidden parts, estimated number of total parts, and familiarity on a 1 to 7 scale. Descriptive results for all objects are shown in Table S3.

Table S3.

Descriptive statistics for complexity, hidden parts, total parts, and familiarity

Dimension / Complexity / Hidden Parts / Total Parts / Familiarity
Object / M (SD) / M (SD) / M (SD) / M (SD)
Experiment 5
vacuum cleaner / 4.70 (1.47) / 5.57 (1.07) / 5.59 (1.17) / 5.59 (1.43)
computer mouse / 4.25 (1.69) / 5.02 (1.86) / 3.65 (1.38) / 6.01 (1.28)
Velcro / 2.00 (1.26) / 1.25 (0.72) / 1.33 (1.05) / 5.88 (1.32)
reading glasses / 2.38 (1.45) / 1.36 (1.12) / 1.31 (0.65) / 5.97 (1.34)
Other Experiments
Polaroid camera / 5.41 (1.42) / 5.14 (1.37) / 5.53 (1.15) / 4.60 (1.58)
computer mouse / 4.25 (1.69) / 5.02 (1.86) / 3.65 (1.38) / 6.01 (1.28)
piano key / 4.14 (1.38) / 4.73 (1.70) / 4.33 (1.91) / 4.99 (1.66)
light bulb / 4.06 (1.49) / 4.22 (1.74) / 2.78 (1.18) / 5.48 (1.59)
cell phone / 6.38 (1.24) / 6.47 (1.09) / 6.35 (1.06) / 5.61 (1.66)
Velcro / 2.00 (1.26) / 1.25 (0.72) / 1.33 (1.05) / 5.88 (1.32)
VCR / 4.81 (1.48) / 5.80 (1.20) / 5.69 (1.17) / 4.85 (1.75)
electric fan / 3.49 (1.30) / 3.35 (1.58) / 3.73 (1.07) / 5.76 (1.36)
spray bottle / 2.05 (1.28) / 1.42 (0.75) / 1.55 (0.76) / 6.07 (1.23)
umbrella / 1.88 (1.03) / 1.39 (0.75) / 1.89 (1.11) / 6.34 (0.98)
ear buds / 4.36 (1.63) / 4.53 (1.61) / 3.88 (1.38) / 5.33 (1.71)
stapler / 2.61 (1.27) / 3.47 (1.47) / 2.97 (1.22) / 6.30 (1.07)
flush toilet / 3.70 (1.32) / 4.68 (1.61) / 4.14 (1.53) / 6.01 (1.47)
gas stove / 4.25 (1.46) / 4.82 (1.47) / 4.98 (1.60) / 5.19 (1.78)
can opener / 2.34 (1.12) / 1.41 (0.88) / 1.98 (1.04) / 6.13 (1.28)
reading glasses / 2.38 (1.45) / 1.36 (1.11) / 1.31 (0.65) / 5.97 (1.34)
power drill / 4.80 (1.20) / 5.25 (1.14) / 5.13 (1.11) / 4.34 (1.79)
candle / 1.43 (0.93) / 1.13 (0.45) / 1.05 (0.21) / 6.40 (0.85)
treadmill / 5.19 (1.17) / 5.27 (1.38) / 5.58 (1.34) / 4.74 (1.72)
lock and key / 2.70 (1.34) / 3.76 (1.78) / 2.74 (1.26) / 5.78 (1.47)
microwave / 5.40 (1.39) / 5.78 (1.33) / 5.84 (1.08) / 5.33 (1.75)
printer / 5.58 (1.28) / 5.89 (1.13) / 5.95 (0.99) / 5.23 (1.71)
vacuum cleaner / 4.70 (1.47) / 5.57 (1.07) / 5.59 (1.17) / 5.59 (1.44)
smoke detector / 4.36 (1.52) / 5.68 (1.39) / 4.22 (1.31) / 4.60 (1.75)

Note. N = 88

Correlations between ratings. When averages were computed across all 24 objects, extremely high correlations were found between complexity and hidden parts, r(86) = .95, p < .001, between complexity and total parts, r(86) = .94, p < .001, and between hidden parts and total parts, r(86) = .94, p < .001. In contrast, familiarity appeared more independent from complexity, r(86) = -.73, p < .001, hidden parts, r(86) = -.70, p < .001, and total parts, r(86) = -.71, p < .001.

Experiment 5, matching on familiarity. To match objects on familiarity while manipulating object complexity, a dependent t-test was performed on mean complexity ratings of two high-complexity objects (vacuum, computer mouse) and two low-complexity objects (Velcro, reading glasses). The test revealed a substantial difference, t(87) = 24.62, p < .001, d = 2.63. There was no difference between these sets of objects on familiarity ratings.

Pictures used in Experiments 1–8

Vacuum

Velcro

Computer Mouse

Reading Glasses

Piano Key

Smoke Detector

VCR

Ear Buds

Gas Stove

Treadmill

Polaroid Camera

Power Drill