Supplemental Materials

Negative Emotional Stimuli Enhance Vestibular Processing

by N. Preuss et al., 2015, Emotion

Introduction

In this document, we provide Jags/Stan code for all models used in our data analysis. For both response accuracy and response time, we first identified the best-fitting models by comparing DIC (Deviance Information Criterion) for Jags models, and WAIC (Watanabe-Akaike information criterion) for Stan models. Models with smaller DIC should be preferred over models with greater DIC. WAIC is an improvement over DIC for Bayesian models. It is computed using the log pointwise posterior predictive density and a correction term (Gelman et al., 2013a,b); it can be interpreted as an approximation to cross-validation and is numerically more stable than DIC.

For further analysis, we used the models with the lowest DIC/WAIC values.

Accuracy Models

We estimated both random coefficients (random intercepts & random slopes for each participant) models and random intercept models. Deviance information criteria (DIC) were smaller for the random intercept models than for the random coefficients models. Therefore, we reported the results of the random intercept model. We drew 2000 samples from the posterior distribution using 3 chains, with 5000 iteration each, with a burn-in period of 1000 samples for each chain. We used every third sample in order to minimize auto-correlation of the samples. Each chain was monitored for convergence using the (Gelman and Rubin, 1992) statistic. Values of close to 1 indicate convergence; this was achieved for all parameters.

For further details on implementation please see Gelman & Hill (2007).

Parameter estimation for Accuracy Data

We fit the following logistic regression model to the response accuracy data. Each response was modeled as being drawn from a Bernoulli distribution with rate parameter giving the probability of a correct response. We predicted the logit of using the following linear model:

where is the effect of the neutral image category, is an additive effect for image category , is the effect of motion intensity and is an additive participant effect.

We placed non-informative priors on all image effects and on the effect of motion intensity. Participant effect was given prior centred at 0, with a uniform prior between 0 and 100 on the between-participant standard deviation.

Jags code

model{
for (i in 1:length(response)) {
response[i] ~dbern(theta[i])
# linear model on rate parameter theta
logit(theta[i]) <-(b0 +u[subject[i], 1]) +
b.snake *snake[i] +
b.threat *threat[i] +
b.erotic *erotic[i] +
b.sports *sports[i] +
b.mutilation *mutilation[i] +
b.intensity *intensity[i]
}
# priors on fixed effects
b0 ~dnorm (0, .001)
b.snake ~dnorm (0, .001)
b.threat ~dnorm (0, .001)
b.mutilation ~dnorm (0, .001)
b.sports ~dnorm (0, .001)
b.erotic ~dnorm (0, .001)
b.intensity ~dnorm (0, .001)
# priors on participant effects
for (j in 1:J) { # J is the number of participants
u[j, 1] ~dnorm (0, tau1.subject)
}
tau1.subject <-pow(sigma1.subject, -2)
sigma1.subject ~dunif (0, 100)
}

Bayes Factor for Accuracy Data

In order to calculate a Bayes factor using the Savage-Dickey density ratio (Lee & Wagenmakers, 2014), we fit the following model to the accuracy data, using a similar model the above, but with the images aggregated into 2 categories, negative and positive/neutral.

In this model, the coefficient is thus an indicator variable which takes the value for mutilation, snake and threat images, and the value for objects, sports and erotic content. is thus an additive effect of negative images, and we tested the order-restricted hypothesis that effect was zero, versus the alternative hypothesis that was drawn from a Half-Cauchy distribution with mean and scale :

Jags code

model{
for (i in 1:length(response)) {
response[i] ~dbern(theta[i]) # accuracy
# linear model on rate parameter theta
logit(theta[i]) <-(b0 +u[id2[i], 1]) +
b.negative *negative[i] +
b.intensity *intensity[i]
}
# additive participant effect
for (j in 1:J) { # J is the number of participants
u[j, 1] ~dnorm (0, tau1.subject)
}
tau1.subject <-pow(sigma1.subject, -2)
sigma1.subject ~dunif(0, 100)
# priors on fixed effects
b0 ~dnorm (0, .001)
# coefficient for negative images is defined as effect size * standard
# deviation
b.negative <-delta*sigma
b.intensity ~dnorm(0, .001)
# delta and sigma come from half-Cauchy distributions
# see (Lee & Wagenmakers, 2014) for details
delta ~dnorm(0, lambdadelta)T(0,)
lambdadelta ~dchisqr(1)
sigma <-abs(sigmatmp)
sigmatmp ~dnorm(0, lambdasigma)
lambdasigma ~dchisqr(1)
}

Response time models

We modeled response time using an Ex-Gaussian (exponentially modified Gaussian) distribution in Stan. The Ex-Gaussian results from the convolution of a normal and an exponential distribution, and its probability density function (pdf) is given by:

where erfc is the complementary error function:

The parameters of the Ex-Gaussian are and , the mean and variance of the normal component, and , the rate of the exponential component. In particular, the parameters which are usually of greatest interest are , which shifts the distribution along the x-axis, and , which affects the tail behavior.

After selecting the model with the lowest WAIC value, we fit 2 separate models in order to estimate the effect of the image category on and , using the same linear model as above. In model 1, we predicted using the linear model, while estimating participant-specific and parameters, which were in turn given group-level prior distributions. All priors used in the model were mildly informed to match the scale of the data being modeled.

The fixed effects coefficients (image coefficients) were given Cauchy priors (Gelman et al., 2013a):

and the coefficient of motion intensity was given a normal distribution:

In model 1 (linear model on ), we estimated individual and parameters (for participant ), which were drawn from truncated normal distributions with group level parameters.

In model 2, we predicted the logarithm of using the same linear model. Since must be , it is natural to place a linear model on the logarithm of , rather than on itself. We estimated individual and parameters (for participant ); since is constrained to be > 0, we used a normal distribution for individual and group-level priors.

For all RT models, we drew a total of 1000 samples from the joint posterior distribution, after discarding the initial samples from the burn-in period (4 chains * 250 samples). Each of the 4 chains was checked for convergence using the (Gelman and Rubin, 1992) statistic.

Parameter Estimation With Linear Model on

Stan code

data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
// all parameters given lower bounds are constrained to be positive
real<lower = 0> sigma[n_participants];
real<lower = 0> lambda[n_participants];
vector[n_images] beta;
real bmotion;
real u[n_participants];
real<lower = 0> lambda_mu;
real<lower = 0> lambda_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
model {
real mu[n_trials];
// group-level priors
lambda_mu ~ normal(0, 5);
lambda_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
// priors on fixed effects (image categories)
beta[1] ~ cauchy(0, 2.5); // intercept: neutral images
for (i in 2:n_images){ //additive effects of other image categories
beta[i] ~ cauchy(0, 2.5);
}
// random participant effects
for (j in 1:n_participants){
lambda[j] ~ normal(lambda_mu, lambda_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on mu
mu[k] <- X[k] * beta + u[participant[k]] + bmotion * motion[k];
//response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}
generated quantities {
vector[n_trials] log_lik;
real mu[n_trials];
for (k in 1:n_trials){
mu[k] <- X[k] * beta + u[participant[k]] + bmotion * motion[k];
// used to calculate WAIC
log_lik[k] <- exp_mod_normal_log(y[k],
mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}

Parameter Estimation With Linear Model on

Stan code

data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
// all parameters given lower bounds are constrained to be positive
real mu[n_participants];
real<lower = 0> sigma[n_participants];
vector[n_images] beta;
real bmotion;
real u[n_participants];
real mu_mu;
real<lower = 0> mu_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
model {
real lambda[n_trials];
// group-level priors
mu_mu ~ normal(0, 5);
mu_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
// priors on fixed effects (image categories)
beta[1] ~ cauchy(0, 2.5); //intercept: neutral images
for (i in 2:n_images){ //additive effects of other image categories
beta[i] ~ cauchy(0, 2.5);
}
// random participant effects
for (j in 1:n_participants){
mu[j] ~ normal(mu_mu, mu_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
/* because the lambda parameter must be positive, it makes sense
to fit a linear model on the logarithm scale */
for (k in 1:n_trials){
// linear model on log(lambda)
lambda[k] <- exp(X[k] * beta + u[participant[k]] + bmotion * motion[k]);
//response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}
generated quantities {
vector[n_trials] log_lik;
real lambda[n_trials];
for (k in 1:n_trials){
lambda[k] <- exp(X[k] * beta + u[participant[k]] + bmotion * motion[k]);
// used to calculate WAIC
log_lik[k] <- exp_mod_normal_log(y[k],
mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}

Bayes Factor Models

In order to calculate a Bayes factor using the Savage-Dickey density ratio (Lee & Wagenmakers, 2014), we fit the following models to the RT data, using a similar model the above. The images aggregated into 2 categories, negative and positive/neutral.

Bayes Factor Model for Effect of Negative Images on

Stan code

data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
real<lower = 0> sigma[n_participants];
real<lower = 0> lambda[n_participants];
real alpha;
// for an order-restricted comparison:
real<lower=0> delta;
// real delta; // for undirected comparison:
real<lower = 0> sigmab;
real bmotion;
real u[n_participants];
real<lower = 0> lambda_mu;
real<lower = 0> lambda_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
transformed parameters {
real beta;
// beta is the additive component for negative images
beta <- delta * sigmab;
}
model {
real mu[n_trials];
// priors
lambda_mu ~ normal(0, 5);
lambda_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
alpha ~ normal(0, 2.5);
// Half-Cauchy prior on effect size delta
delta ~ cauchy(0, 1);
sigmab ~ cauchy(0, 1);
for (j in 1:n_participants){
lambda[j] ~ normal(lambda_mu, lambda_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on mu
mu[k] <- X[k, 1] * alpha + X[k, 2] * beta +
u[participant[k]] + bmotion * motion[k];
// response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[k],
sigma[participant[k]],
lambda[participant[k]]);
}
}

Bayes Factor Model for Effect of Negative Images on

Stan code

data {
int<lower = 0> n_trials;
int<lower = 0> n_participants;
int<lower = 0> n_images;
vector<lower = 0>[n_trials] y;
matrix[n_trials, n_images] X;
int<lower = 1, upper = n_participants> participant[n_trials];
int<lower = -1, upper = 1> motion[n_trials];
}
parameters {
real mu[n_participants];
real<lower = 0> sigma[n_participants];
real alpha;
real delta;
real<lower = 0> sigmab;
real bmotion;
real u[n_participants];
real mu_mu;
real<lower = 0> mu_sd;
real<lower = 0> sigma_mu;
real<lower = 0> sigma_sd;
real<lower = 0> u_sd;
}
transformed parameters {
real beta;
// beta is the additive component for negative images
beta <- delta * sigmab;
}
model {
real lambda[n_trials];
// priors
mu_mu ~ normal(0, 5);
mu_sd ~ cauchy(0, 2.5);
sigma_mu ~ normal(0, 1);
sigma_sd ~ normal(0, 1);
u_sd ~ cauchy(0, 2.5);
bmotion ~ normal(0, 2);
alpha ~ normal(0, 2.5);
// Half-Cauchy prior on effect size delta
delta ~ cauchy(0, 1);
sigmab ~ cauchy(0, 1);
for (j in 1:n_participants){
mu[j] ~ normal(mu_mu, mu_sd);
sigma[j] ~ normal(sigma_mu, sigma_sd);
u[j] ~ normal(0, u_sd);
}
for (k in 1:n_trials){
// linear model on log(lambda)
lambda[k] <- exp(X[k, 1] * alpha + X[k, 2] * beta +
u[participant[k]] + bmotion * motion[k]);
// response time is ~ Ex-Gaussian
y[k] ~ exp_mod_normal(mu[participant[k]],
sigma[participant[k]],
lambda[k]);
}
}

References

Gelman, A. and J. Hill (2006). Data analysis using regression and multilevel/hierarchical models, Cambridge University Press

Gelman, A., and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Stat. Sci. 457–472.

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013a). Bayesian Data Analysis, 3rd Edn., eds A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin (Boca Raton, FL: CRC press), 165–196.

Gelman, A., Hwang, J., and Vehtari, A. (2013b). Understanding predictive information criteria for Bayesian models. Stat. Comput. 1–20.

Lee, M. D. and E.-J. Wagenmakers (2013). Bayesian Modeling or Cognitive Science: A Practical Course, CambridgeUniversity Press.