Online Supplemental Material

Stopping Rule for Data Collection

Based on past experience with cleansing research, we aimed at 50 participants per between-subjects condition for continuous dependent measures (Experiments 1 & 3) and at 30 when the same design was repeated (Experiment 4). For categorical dependent measures, we aimed at 60 participants per between-subjects condition (Experiment 2). Variation in final sample sizes was due to different show-up rates for experiments run in the lab and to the availability of participants for experiments run outside the lab, as detailed below.

Experiment 1: We aimed at 100 participants in 10 experimental sessions scheduled on three days. Only 89 came to participate, and the first wave of data collection ended with the last session. To achieve the pre-determined sample size, we collected data from 14 more participants in a second wave, for a final N = 103.

Experiment 2: We aimed at 240 participants and gave seven research assistants three weeks to approach students on campus. By the end, 242 had participated.

Experiment 3: We aimed at 200 participants in 20 experimental sessions scheduled on two days. Only 188 came to participate, and the first wave of data collection ended with the last session. To achieve the pre-determined sample size, we collected data from 21 more participants in a second wave, for a final N = 209.

Experiment 4: We aimed at 120 participants in 10 experimental sessions scheduled on two days. Only 107 came to participate, and the first wave of data collection ended with the last session. To achieve the pre-determined sample size, we collected data from 19 more participants in a second wave, for a final N = 126.
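
The sample-size arithmetic for the three lab experiments can be checked with a short tally. The following is a minimal sketch in Python, not part of the original analyses. The targets and wave counts are taken from the descriptions above. The number of between-subjects conditions is not stated directly and is inferred here as the target divided by the per-condition goal. The dictionary keys and the function name summarize are illustrative.

```python
# Targets and wave counts as reported above for the lab experiments (1, 3, 4).
experiments = {
    1: {"target": 100, "per_condition": 50, "first_wave": 89, "second_wave": 14},
    3: {"target": 200, "per_condition": 50, "first_wave": 188, "second_wave": 21},
    4: {"target": 120, "per_condition": 30, "first_wave": 107, "second_wave": 19},
}


def summarize(exp):
    """Derive shortfall, final N, and surplus from the reported counts."""
    final_n = exp["first_wave"] + exp["second_wave"]
    return {
        # Assumption: conditions = target / per-condition goal (inferred, not stated).
        "conditions": exp["target"] // exp["per_condition"],
        # Participants missing after the scheduled sessions.
        "shortfall": exp["target"] - exp["first_wave"],
        "final_n": final_n,
        # Participants beyond the target after the second wave.
        "surplus": final_n - exp["target"],
    }


summary = {number: summarize(exp) for number, exp in experiments.items()}

for number, s in summary.items():
    print(
        f"Experiment {number}: conditions={s['conditions']}, "
        f"shortfall={s['shortfall']}, final N={s['final_n']}, surplus={s['surplus']}"
    )
```

In each case the second wave covered the first-wave shortfall and slightly exceeded the target, which is why the final sample sizes are somewhat larger than planned.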

Items for Cleaning Product Evaluation in All Experiments

(1) “How much do you like this antiseptic wipe?” (-5 = dislike very much, 5 = like very much); (2) “How much do you like its packaging and design?” (-5 = dislike very much, 5 = like very much); (3) “What is the maximum price you are willing to pay for this can of antiseptic wipes? Check the range in which your maximum price falls.” (less than $3.00, $3.00 - $3.99, $4.00 - $4.99, $5.00 - $5.99, $6.00 - $6.99, $7.00 - $7.99, $8.00 or more).

Other Measures

Participants reported their current mood (0 = very bad, 10 = very good), rated the valence (1 = negative, 9 = positive) and difficulty (1 = easy, 9 = difficult) of the scrambled sentence task (SST), and provided demographic information after the primary measures in Experiments 1, 3, and 4 and before the choice behavior in Experiment 2. Mood, SST valence, and SST difficulty were unaffected by any of the manipulations in Experiments 1 (ps ≥ .083), 3 (ps ≥ .058), and 4 (ps ≥ .096). In Experiment 2, mood was unaffected by the manipulations (ps ≥ .272), whereas SST valence was higher in the cleansing (vs. no-cleansing) condition (5.54 vs. 5.17, F(1, 217) = 5.33, p = .022) and SST difficulty was higher when the health goal (vs. control) was primed (3.27 vs. 2.70, F(1, 221) = 6.03, p = .015). These differences could not account for our hypothesized effect, however: neither SST valence (B = .02, SE = .11, t = .21, p = .653) nor SST difficulty (B = -.09, SE = .08, t = 1.12, p = .361) predicted choice (granola vs. chocolate bar), and including SST valence and SST difficulty as additional predictors in the logistic regression model did not reduce the significance of the predicted interaction effect (B = .38, SE = .14, t = 2.66, p = .008).