Supplemental Material

Appendix A: Word List

Abstract / Concrete
CONSCIENCE / POLITICS / KEYS / SANDALS
HAZARD / ADHERENCE / ICE* / RUBY
FOLLY / LOYALTY / MAGAZINE / RUG
TOPIC / PROPORTION / LEAF / TOOL
METHOD / COWARDICE / BLOSSOM / TONGUE
AFFECTION / WORTH / CEREAL* / TOES
JUDGMENT / TREATMENT / BENCH / VIOLIN
INSTANT / DUTY / TRIPOD / LIBRARY
QUALITY / CONFESSION / LADDER* / PLASMA
APTITUDE / CONDITION / PRINTER / APRON
TALENT / ADVANTAGE / CHARCOAL / RIBBON
LUCK / INNOCENCE / RADISH / CIRCUS
BOREDOM / SPAN / WAX / ELEVATOR
REGRET / AWARENESS / HAMMOCK / STEREO
BRAVERY / OUTCOME / CURTAIN / PENNY
BET / ZEAL / MUSTARD / CAVE
ILLUSION / FORESIGHT / MISSILE* / BELT
BEING / PRESTIGE / BEARD / SQUARE
ENVY / PURPOSE / LIME* / MUD
DOMAIN / WONDER / PUZZLE / JAM
MANNER / CAUSE / CACTUS / PARCEL
BLESSING / DECEIT / ARMOR / SYRINGE
SPIRIT / EFFORT / DASHBOARD / BUNNY
PREFERENCE / EXPRESSION / BEETLE / COMEDIAN
APPROVAL / EXTREME / CLOG / KANGAROO
THEORY / FAILURE / DIAPER / TROUT
PATIENCE / FORFEIT / FRACTURE / SHACK
WILL / FUTURE / GLOBE / BARK
ENIGMA / AMBITION / HINGE / PUPIL
BELIEF / ABUNDANCE / KITE / SKILLET
CONTEXT / INFLUENCE / MOWER / SPLINTER
FANTASY / LACK / MOTOR / GOWN
TEMPTATION / MASTERY / PADDLE / SHERIFF
PROTOCOL / RARITY / PEDAL / ACCORDION
HATRED / NONSENSE / PATIO* / CAKE
DISCRETION / OBSESSION / ROOSTER / HARBOR
FRAUD / PLEASURE / ERASER / DENIM
MERCY / RELIEF / STAPLE / EAGLE
ADVICE / CHAOS / TANGERINE / RIBS
INTEREST / TERROR / TORTOISE / LASH

Note. * Items removed from analysis because they were at ceiling on recognition accuracy.

Appendix B: Mediator Coding Scheme

MEDIATOR AT STUDY

Study Mediator Produced—score for each instance of a 3 repetition item.

Yes (1)—generated a verbal report of a mediator that in principle could be a basis for associative encoding. Ideally, the mediator should include both cue and target, but tokens for the words (especially for abstract items) are sufficient.

No (0)—no words linking the cue and target are provided. A repeating of the cue and/or target, even with non-descriptive other words, is not sufficient, nor is an elaboration on the concepts represented by the cue or target only. Also scored No are attempts to create a mediator that indicate only one of the words may have been imaged (i.e., partial “mediators”) and word salad.

e.g. CACTUS-PARCEL: “Willy’s it’s one of my favorite restaurants”

e.g. BET-ZEAL: “two people making a bet”

MEDIATOR AT RECALL

Items coded as No (0) for Study Mediator Produced should not get scored at mediator recall. Italics below indicate special cases within a mediator category.

Omission

Yes (1) —no mediator reported at recall even though one was generated at study.

e.g. “I don’t remember my image”, “I’ve got nothing”, no response within time allotted

No (0)—anything else

Verbatim

Yes (1)—verbatim or functionally verbatim (change in tense, part of speech, articles, preposition, or pronouns, and elaboration are allowed). Flipping of subject and predicate, or parts of one of those, is allowed. Only one has to be verbatim if there were multiple generated at study.

Functionally verbatim:

e.g. BLOSSOM-TONGUE: “One of those tongue tattoos with a flower blossom”  “A tongue tattoo that had a blossom flower”

Verbatim with unrecalled target:

e.g. MISSILE-BELT: “There was a missile in her belt”  “There was a missile in her belt”

Recalled “pocket” as target.

No (0)—anything else

Gist

Yes (1)—preserves the main idea/meaning of the study mediator. Omission of non-vital descriptors allowed; change in tense, spelling, part of speech, articles, or pronoun allowed; synonyms and elaboration allowed. Only has to be gist consistent with one mediator if multiple were generated at study.

e.g. MANNER-CAUSE: “Having good manners and it being the cause of your mother's happiness”  “A mother being pleased because her child used good manners”

e.g. BEETLE-COMEDIAN: “I see this huge funny comedian that is sitting in a small beetle car”  “And it was the big fat comedian in the small car”

No (0)—anything else

Partial

Yes (1)—omission or substitution of cue or target (or both) or token of either while maintaining important content of the mediator; omission of vital descriptors or of verbs crucial to understanding how the cue and target relate to each other.

e.g. FRACTURE-SHACK: “ he fractured a bone walking in to the shack”  “he fractured a bone”

Gist/Partial

e.g. STAPLE-EAGLE: “the eagle couldn't fly because she had a staple in her wing”  “the eagle got a staple in her wing”

No (0)—anything else

Commission

Yes (1)—mediating (linking) words are new, such that the main idea/meaning of the mediator is changed

e.g. DOMAIN-WONDER: “she wondered about the domain”  “the domain was unknown”

Gist/Commission

e.g. FANTASY-LACK: “ a giant library where someone was taking out all of the fairy tale books and burning them”  “a person in a library throwing all of the fantasy fiction books off the shelves”

Partial/Commission

e.g. AFFECTION-WORTH: “two people showing affection and it being worth it”  “people pleased because they showed affection”

Gist/Partial/Commission

e.g. MISSILE-BELT: “I pictured a belt with missiles on it and the guy could easily fire it at enemies”  “ I pictured a belt and it had missiles it can fire off at me”

No (0)—anything else

Intrusion (coded but not analyzed)

Yes (1)—a type of commission error that includes an intra-list study mediator not generated at study for the pair in question. Mediators or parts of mediators from those reported at recall do not count.

e.g. MUSTARD-CAVE: “I see some random cave man eating a hotdog”  “This is the birthday party” (birthday party was part of the mediator generated for PATIO-CAKE)

Gist/Intrusion—none generated

Partial/Intrusion—none generated

Commission/Intrusion

e.g. HINGE-PUPIL: “there was a hinge in her pupil”  “her fracture was fixed with a hinge” (fracture was part of the mediator for FRACTURE-SHACK)

Gist/Partial/Intrusion—none generated

Gist/Commission/Intrusion

e.g. DIAPER-TROUT: “I imagine a trout wearing a diaper”  “I imagine a trout in a diaper with clogs on” (a kangaroo wearing clogs was the mediator for CLOG-KANGAROO)

Partial/Commission/Intrusion

e.g. KITE-SKILLET: “the kite was near the skillet”  “the magazine was near the kite” (magazine was part of the mediator for MAGAZINE-RUG)

No (0)—anything else

Hybrid (coded but not analyzed)

Yes (1) –a type of commission/intrusion error that includes descriptors or phrases from multiple study mediators

e.g. STAPLE-EAGLE: “ I'm thinking of how rats are a staple part of an eagles diet”  “I remember a pedal operating a stapler”

Partial/Commission/Intrusion/Hybrid

e.g. RADISH-CIRCUS: “a clown juggling radishes at a circus”  “a cactus with radish blossoms on it” (cactus and blossom are both cues from other trials)

No (0)—anything else

Appendix C: Target Recall, Strategy Recall, FOK, and Target Recall-FOK Resolution

The study generated data on all measures for a full set of 80 items (40 concrete and 40 abstract paired-associates). As noted in the main paper, it is more informative to evaluate FOKs for unrecalled items only, because this analysis parallels the definition of the feeling-of-knowing as monitoring of the availability of information in memory that cannot currently be retrieved (i.e., information that could not be recalled). However, some interesting phenomena arise when the data on cued recall and the FOKs and strategy recall data for all items, including recalled items, are considered. This supplemental appendix reports on the relevant variables for this experiment for all items that had been studied, including those that generated successful recall.

Cued Recall and Strategy Recall

Table 1 reports mean cued recall as a function of concreteness and repetition. To parallel analyses run in the main paper on recognition memory, we conducted a mixed model analysis of item-level recall outcomes (success, failure) using a generalized mixed multi-level model estimated in SAS PROC GLIMMIX, using a logit link function. The model specified a random effect on intercepts (individual differences in associative recall) as well as a random residual variance component. We do not report all details of the results here. Items varied in recallability, as reflected in the effect of Cue, F(78, 3294) = 2.42, p < .001. As would be expected from the literature (e.g., Paivio, 2007; Rowe & Schnore, 1971) and our recent work (Hertzog, Fulton, Mandviwala, & Dunlosky, 2013), Concreteness had a strong effect on cued recall, F(1, 3307) = 148.45, p < .001, that persisted over the 7-day delay, even though levels of recall were lowered by the long retention interval. Repetition had the largest influence on cued recall, F(1, 3335) = 581.03, p < .001. The concreteness X repetition interaction was not reliable in the log-odds space, F < 1, even though it is clear from Table 1 that the linear difference in recall between concrete and abstract pairs was greater for items presented three times.
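The contrast between the probability scale and the log-odds scale can be illustrated with the raw cell means from Table 1. The following is a minimal sketch, not the GLIMMIX analysis itself: it applies the logit transform to the reported cell means (an assumption made for illustration; the model operates on item-level outcomes, so its fitted values will differ), and the names `recall`, `logit`, and `concreteness_effect` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Raw mean target recall from Table 1, keyed by (item type, presentations).
recall = {
    ("abstract", 1): 0.02,
    ("concrete", 1): 0.11,
    ("abstract", 3): 0.35,
    ("concrete", 3): 0.67,
}


def concreteness_effect(presentations, scale=lambda p: p):
    """Concrete minus abstract recall on the chosen scale."""
    concrete = scale(recall[("concrete", presentations)])
    abstract = scale(recall[("abstract", presentations)])
    return concrete - abstract


# Differences on the probability (linear) scale.
prob_diff_1 = concreteness_effect(1)
prob_diff_3 = concreteness_effect(3)

# Differences on the log-odds scale.
logit_diff_1 = concreteness_effect(1, logit)
logit_diff_3 = concreteness_effect(3, logit)

print(f"probability scale: {prob_diff_1:.2f} vs {prob_diff_3:.2f}")
print(f"log-odds scale:    {logit_diff_1:.2f} vs {logit_diff_3:.2f}")
```

On the probability scale the concrete-abstract gap grows from about .09 to about .32 with repetition, whereas on the log-odds scale it is roughly 1.80 for once-presented items and 1.33 for thrice-presented items. The gap therefore does not widen in log-odds, which fits the absence of a reliable interaction in the logit model.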

We also analyzed the ordinal encoding strategy recall variable described in the main paper, setting errors to 1, partial mediator recall to 2, gist mediator recall to 3, and verbatim recall to 4. The analysis was conducted at the item level using SAS PROC MIXED with a random variance component for intercepts, reflecting individual differences in mean levels of mediator recall. Items varied in degree of strategy recall, F(78, 3372) = 2.41, p < .001. Controlling for these item differences, concreteness affected strategy recall, F(1, 3374) = 262.18, p < .001, with better strategy recall for concrete items, marginal M = 2.12, SE = 0.06, relative to abstract items, M = 1.64, SE = 0.06 (Hertzog, Fulton, et al., 2013). Scaled as Cohen’s d, the effect in SD units was d = 0.51. The largest effect on strategy recall was associated with Repetition, F(1, 3373) = 1809.10, p < .001, with items presented three times generating much better strategy recall than items studied only once (marginal M = 2.51, SE = 0.06, versus M = 1.26, SE = 0.03, respectively, d = 1.36). Concreteness and Repetition also interacted in producing strategy recall, F(1, 3373) = 77.59, p < .001. Concrete items presented three times for study generated strategy recall that was, on average, gist recall of the mediator (M = 2.88, SE = 0.06), in contrast to abstract items presented once (M = 1.14, SE = 0.06), which on average could not be recalled. Thus, repeated presentations amplified the concreteness effect (d = 0.23 for items presented once versus d = 0.80 for items presented three times). A concreteness effect on strategy recall is fully consistent with Hertzog et al. (2013), while also showing that the effect persists after a one-week retention interval. The repetition effect is new and demonstrates that increased study opportunities dramatically increase long-term access to studied mediators.
As treated in detail in the main paper, these effects were also observed for unrecalled items alone, setting the stage for strategy recall to be a cue that influences FOKs for unrecalled items and carries some of the effects of repetition on FOKs.
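The ordinal scoring described above (errors = 1, partial = 2, gist = 3, verbatim = 4) can be sketched as a simple lookup. This is only an illustration under stated assumptions: the category labels, the `ORDINAL_SCORE` table, and the example codes are hypothetical, and the sketch does not address how compound codes from Appendix B (e.g., Gist/Partial) were resolved, which is described in the main paper.

```python
from statistics import mean

# Ordinal values as stated in the text; the label strings are assumed.
ORDINAL_SCORE = {
    "error": 1,
    "partial": 2,
    "gist": 3,
    "verbatim": 4,
}


def strategy_recall_score(category):
    """Map a coded mediator-recall category to its ordinal score."""
    return ORDINAL_SCORE[category.lower()]


# Hypothetical codes for five items from one participant.
codes = ["verbatim", "gist", "error", "partial", "gist"]
scores = [strategy_recall_score(code) for code in codes]

print(scores)
print(mean(scores))
```

A mean near 3 on this scale corresponds to gist-level recall on average, and a mean near 1 indicates that mediators generally could not be recalled, which is how the marginal means reported above can be read.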

FOK Magnitude and FOK Resolution for All Items

To evaluate influences on FOK magnitude, we ran a multi-level model with Cue, Concreteness, and Repetition as independent variables and item-level FOKs as the dependent variable. The model included a random intercept to capture individual differences in mean FOKs. This model paralleled the ones reported in the main paper for unrecalled items (see Table 2a for the raw cell means in the 2 X 2 factorial design). There were reliable effects of all independent variables: Cue, F(78, 3372) = 2.27, p < .001, Concreteness, F(1, 3373) = 234.75, p < .001, Repetition, F(1, 3372) = 1685.09, p < .001, as well as a concreteness X repetition interaction, F(1, 3372) = 71.31, p < .001. Concrete items received higher FOKs (marginal M = 52.34, SE = 2.33) than abstract items (marginal M = 46.52, SE = 2.37), d = 0.23. Items presented three times generated higher FOKs (marginal M = 55.95, SE = 2.32) than items presented once (marginal M = 42.91, SE = 2.38), d = 0.50. The interaction effect reflected an amplification of the concreteness effect for items presented three times, d = 0.35, relative to one presentation, d = 0.10, consistent with the effects of these variables on cued recall reported above.

As might be expected, then, cued recall was a potent influence on FOKs. As seen in Table 2b and Table 2c, FOKs differed dramatically as a function of whether an item was recalled, with much higher FOKs for items with correctly recalled targets.

A traditional method in metacognitive research for evaluating this relationship is to compute FOK resolution with respect to cued recall as measured by Goodman-Kruskal gamma correlations between the two variables. Table 3 shows that FOKs were highly correlated with cued recall (see also Eakin & Hertzog, 2012a). Indeed, the FOKs in this respect behave much like delayed judgments of learning (Eakin & Hertzog, 2012b; Rhodes & Tauber, 2011). Interestingly, providing three encoding opportunities reduced the recall-FOK gamma correlations, F(1, 43) = 12.14, p = .001, with near-perfect gammas for once-presented items (M = .95, SE = .02), but not for thrice-presented items (M = .86, SE = .02), d = 0.51. This is the opposite of the pattern that one observes for gamma correlations with recognition memory accuracy (Hertzog, Dunlosky et al., 2010; see also below). The main paper demonstrates that repetition increases access to encoding strategy information for unrecalled targets, which in turn seems to influence FOKs. In contrast, recall success is the dominant cue accessed for once-presented items. It is likely, then, that the reduction in gamma correlations seen in Table 3 is attributable to the emergence of strategy recall information, which influences FOKs in addition to recall outcomes, more so for items presented three times.

Clearly, successful target recall is a potent influence on FOKs. At the same time, the overall set of results, especially those accounting for FOKs for unrecalled items, shows that multiple cues influence FOKs, and that the accessibility of strategy recall in particular is enhanced when items are studied multiple times (for further treatment of a multiple-cue utilization perspective, see Hertzog, Hines, & Touron, 2013).

Table 1

Target Recall and Strategy Recall as a Function of Concreteness and Repetition

Target Recall
Abstract / Concrete
Mean / SE / Mean / SE
1 Presentation / .02 / .01 / .11 / .01
3 Presentations / .35 / .01 / .67 / .01
Strategy Recall
Abstract / Concrete
Mean / SE / Mean / SE
1 Presentation / 1.16 / .03 / 1.37 / .03
3 Presentations / 2.15 / .03 / 2.88 / .03

Table 2a

Mean FOK (all items) as a Function of Concreteness and Repetition

Abstract / Concrete
Mean / SE / Mean / SE
1 Presentation / 26.65 / 1.13 / 31.99 / 1.07
3 Presentations / 56.21 / 1.10 / 77.42 / 1.06

Table 2b

Mean FOK (recalled items) as a Function of Concreteness and Repetition

Abstract / Concrete
Mean / SE / Mean / SE
1 Presentation / 82.97 / 4.01 / 82.62 / 1.82
3 Presentations / 84.03 / 1.14 / 91.52 / .72

Table 2c

Mean FOK (unrecalled items) as a Function of Concreteness and Repetition

Abstract / Concrete
Mean / SE / Mean / SE
1 Presentation / 25.34 / 1.06 / 25.69 / 1.06
3 Presentations / 40.75 / 1.28 / 47.87 / 1.76

Table 3

Fitted Mean Goodman-Kruskal Gamma Correlations of FOKs with Target Recall for All Items from the Mixed Model Analysis

Abstract / Concrete
N / G / SE / N / G / SE
1 Presentation / 44 / .96** / .04 / 44 / .95** / .02
3 Presentations / 44 / .85** / .02 / 44 / .86** / .03

**p < .001

Note. G = Goodman-Kruskal gamma correlation. N indicates the number of persons in each cell with computable G correlations. SE is the fitted standard error for fitted least-squares means of the G correlations (see main text for further details).

Appendix D: FOK-Recognition Memory Resolution

As indicated in the main paper, we evaluated FOK-recognition memory relationships using multi-level models. It is traditional in metacognition research to evaluate the resolution of metacognitive judgments by analyzing within-person ordinal correlations of judgments with validating outcomes in a two-stage process, first computing Goodman-Kruskal correlations of judgments and outcomes for each person, and then treating these values as test statistics in a second stage (see Dunlosky & Metcalfe, 2009 for an introductory treatment; Gonzalez & Nelson, 1996). For this reason, we provide information on gamma correlations from this experiment in this supplemental appendix. The correlations were computed by excluding items that produced ceiling recognition memory, thus paralleling the analyses reported in the main paper.
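The two-stage procedure described above can be sketched as follows. This is a minimal illustration, not the analysis code used for this experiment: the function name, the participant data, and the use of a simple mean in the second stage are all assumptions (the reported analyses used a mixed model in SAS PROC MIXED). Gamma is computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant item pairs, with tied pairs ignored.

```python
from itertools import combinations
from statistics import mean


def goodman_kruskal_gamma(judgments, outcomes):
    """Return gamma for one person, or None if it cannot be computed."""
    concordant = 0
    discordant = 0
    for (j1, o1), (j2, o2) in combinations(zip(judgments, outcomes), 2):
        product = (j1 - j2) * (o1 - o2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie on at least one variable; skip the pair.
    if concordant + discordant == 0:
        # E.g., every item was recognized (ceiling): no untied pairs exist.
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: (FOK ratings, recognition accuracy) per participant.
participants = {
    "p1": ([80, 20, 60, 40], [1, 0, 1, 0]),
    "p2": ([70, 30, 50, 90], [1, 1, 0, 0]),
    "p3": ([65, 35, 55, 75], [1, 1, 1, 1]),  # ceiling on recognition
}

# Stage 1: one gamma per person.
gammas = {pid: goodman_kruskal_gamma(f, a) for pid, (f, a) in participants.items()}

# Stage 2 (simplified): summarize the computable gammas across persons.
computable = [g for g in gammas.values() if g is not None]
mean_gamma = mean(computable)

print(gammas)
print(mean_gamma)
```

The third hypothetical participant shows why gamma can be non-computable: when every item is recognized there are no untied pairs, so that person contributes no value to the second stage.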

Gamma correlations (G) revealed reliable within-person correlations of FOKs with recognition memory accuracy for previously unrecalled items, G = .34, SE = .05, p < .001. When data were broken down into the 2 X 2 Concreteness X Repetition cells (see Table 4), a repeated measures model in SAS PROC MIXED revealed a reliable effect of repetition, F(1, 42) = 5.21, p = .03, d = 0.55. Resolution was higher for thrice-presented items (marginal mean G = .44, SE = .11) than for once-presented items (G = .17, SE = .06). This result replicated earlier findings of higher resolution for multiply-presented items during encoding (Hertzog, Dunlosky, & Sinclair, 2010). The main effect of Concreteness was non-significant, F(1, 42) = 1.51, p = .23, d = 0.25.

However, Table 4, below, also reveals a potential issue for interpreting these results. Given high recognition memory performance for thrice-presented items, gamma correlations could not be computed for a large number of participants in the cells involving thrice-presented concrete items. In contrast to standard implementations of repeated-measures analysis using the general linear model, which deletes cases with any missing G correlations, the mixed model analysis we employed makes use of all available data under missing-at-random assumptions. For example, if ceiling effects in recognition for thrice-repeated concrete items make G non-computable in that cell for a given individual, the three G’s from the other cells in the 2 X 2 matrix for that person would still contribute to the estimated marginal means and significance tests. This approach requires making a missing-at-random assumption, however.

Given the skewed marginal distributions of recognition accuracy, it was not possible to further divide items by encoding recall to use gamma correlations to test the hypothesis that encoding recall accounted for the relationship of FOKs to recognition accuracy. These outcomes are yet another reason why the multi-level regression approach provided superior information about FOKs and their relationships to other cues; the use of gamma correlations was simply inadequate for these purposes given this design and the resulting data.

Table 4

Fitted Mean Goodman-Kruskal Gamma Correlations of FOKs with Recognition Memory Accuracy for Unrecalled Items from the Mixed Model Analysis

Abstract / Concrete
N / G / SE / N / G / SE
1 Presentation / 41 / .22* / .08 / 36 / .13 / .10
3 Presentations / 35 / .24* / .10 / 8 / .64* / .21

* p < .05

Note. G = Goodman-Kruskal gamma correlation. N indicates the number of persons in each cell with computable G correlations. SE is the fitted standard error for fitted least-squares means of the G correlations (see text for further details).

Appendix E: FOK-Confidence Judgment (CJ) Relationship

This appendix reports on FOK-CJ relationships for correctly recognized items that were analyzed using multi-level models in the main paper. Our earlier work focused on using traditional Goodman-Kruskal gamma correlations to evaluate these relationships. To provide a point of reference to earlier work, we report the relevant gamma correlations here. The correlations were computed by excluding items that produced ceiling recognition memory, thus paralleling the analyses reported in the main paper.