Supplementary material online

Detailed Methods

The experimental set contained six types of speaker-inconsistent utterances: 40 were odd for a male speaker (“Before I leave I always check whether my make up is still in place”), 40 were odd for a female speaker (“I broke my ankle playing football with friends”), 20 were odd for a young speaker (“Every evening I drink some wine before I go to sleep”), 20 were odd for an adult speaker (“I cannot sleep without my teddy bear in my arms”), 20 were odd for a speaker with an ‘upper-class’ accent (“I have a large tattoo on my back”), and 20 were odd for a speaker with a ‘lower-class’ accent (“Every month we go to an opera for an evening out”). The speaker identity (SI) violation always emerged at a single critical word (italicized here; the English translation sometimes requires two words), which was never sentence-final. All sentences were recorded with a consistent and an incongruent speaker (sex manipulation: 4 males and 4 females; age manipulation: 2 young children aged 6 and 8, and 2 adults; accent manipulation: 2 speakers with a Dutch accent typically perceived as ‘lower-class’, and 2 with a Dutch accent typically perceived as ‘upper-class’). Recordings contained no obviously different prosodic contours. Across recordings, the SI congruent and incongruent critical words were matched on acoustic duration (SI congruent: mean = 520 ms, sd = 149 ms; SI incongruent: mean = 524 ms, sd = 140 ms).

For each trial list, 80 SI congruent and 80 SI incongruent utterances (balanced across the six speaker subtypes) were mixed with 192 additional utterances, spoken by four adult female speakers and one adult male speaker. Of these additional utterances, 48 contained a classic lexical semantic (LS) anomaly (e.g., “You wash your hands with horse and water”), and 48 contained a correct control (e.g., “You wash your hands with soap and water”). For purposes unrelated to the current issue, another 48 contained a world-knowledge dependent anomaly (e.g., “You wash your hands with mud and water”, see Hagoort et al., 2004), and a final 48 coherent items were true filler sentences. The three variants of an item (e.g. “You wash your hands with soap/horse/mud and water”) were always spoken by the same speaker. Six pseudo-randomized trial lists were created such that no participant heard the same sentence in more than one variant, each variant was heard by an equal number of participants, the longest consecutive sequence of trials of the same type was two, and, for the speaker-consistency items, the same speaker featured no more than five times in any one trial list.
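Two of the list constraints described above (no run of more than two trials of the same type, and no sentence appearing in more than one variant within a list) can be checked mechanically. The following is a minimal sketch, not the procedure actually used to build the lists; the `(item_id, trial_type)` tuple layout and the function names are hypothetical.

```python
from itertools import groupby


def longest_run(labels):
    """Length of the longest stretch of identical consecutive labels."""
    return max((sum(1 for _ in group) for _, group in groupby(labels)), default=0)


def list_is_valid(trials, max_run=2):
    """Check a candidate trial list.

    trials: sequence of (item_id, trial_type) tuples (hypothetical layout).
    Returns True if no item occurs twice and no trial type occurs more
    than `max_run` times in a row.
    """
    item_ids = [item_id for item_id, _ in trials]
    trial_types = [trial_type for _, trial_type in trials]
    no_repeated_items = len(set(item_ids)) == len(item_ids)
    return no_repeated_items and longest_run(trial_types) <= max_run
```

A candidate ordering that fails the check would simply be reshuffled or repaired before use.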

After the EEG experiment, participants were asked to fill out a Dutch translation of the Empathizing Questionnaire (EQ) and the Systemizing Questionnaire (SQ) (Baron-Cohen et al., 2003; Baron-Cohen and Wheelwright, 2004). These questionnaires each consist of 40 experimental and 20 control items. An example of an experimental EQ item is “I can tell if someone is masking their true emotion”. An example of an experimental SQ item is “I can easily visualize how the motorways in my region link up”. Responses are given on a 4-point scale ranging from ‘strongly agree’ to ‘strongly disagree’. Approximately half of the items are reversed. Participants received 0 for a ‘non-empathic/systematic’ response, whatever the magnitude, and 1 or 2 for an ‘empathic/systematic’ response, depending on the strength of the reply. Consequently, the maximum score on each scale is 80 and the minimum score is zero. The EQ preceded the SQ.
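The scoring rule can be illustrated with a short sketch. The integer coding of responses (1 = ‘strongly agree’ through 4 = ‘strongly disagree’), the key labels, and the function names are assumptions made for illustration; they are not taken from the questionnaire materials.

```python
def score_item(response, reverse_keyed):
    """Score one experimental item.

    response: integer 1..4 (assumed coding: 1 = strongly agree,
              4 = strongly disagree).
    reverse_keyed: True if disagreeing is the empathic/systematic response.
    Returns 2 for a strong, 1 for a mild empathic/systematic response, else 0.
    """
    if reverse_keyed:
        response = 5 - response  # mirror the scale for reversed items
    return {1: 2, 2: 1}.get(response, 0)


def score_scale(responses, keys):
    """Sum item scores over a questionnaire.

    keys[i] is 'forward', 'reverse', or 'control'; control items are not scored.
    """
    total = 0
    for response, key in zip(responses, keys):
        if key == "control":
            continue
        total += score_item(response, reverse_keyed=(key == "reverse"))
    return total
```

With 40 experimental items scored at most 2 each, the total ranges from 0 to 80, as stated above.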

Additional data analysis: Time-frequency analysis

Introduction

To maximize the chance of finding electrophysiological markers of differential sensitivity to social information processing, in addition to the ERP analyses, we explored inter-individual differences in oscillatory brain activity in relation to language processing. Time-frequency (TF) analyses of EEG data reveal changes in the amplitude of oscillatory activity over time, reflecting processes of synchronization or desynchronization of neuronal populations (Pfurtscheller and da Silva, 1999; Rodriguez et al., 1999; Tallon-Baudry and Bertrand, 1999). Although the empirical relationship between oscillatory activity and ERP magnitude in sentence processing remains to be investigated, TF analyses of the EEG provide a complementary window on the investigation of (inter-individual differences in) language processing. Evidence for this follows from recent sentence processing experiments using both analysis techniques in event-related designs (Bastiaansen et al., 2005; Hagoort et al., 2004; Hald et al., 2006). In the Hagoort et al. (2004) study, a dissociation between semantic and world-knowledge information was revealed at the level of oscillatory brain dynamics, which was absent at the level of ERPs. Whereas both types of information elicited similar N400 effects, semantic violations resulted in an increase in theta band power in comparison to a matched control condition, whereas world knowledge violations were associated with an increase in gamma band power.

In the present study we tested for inter-individual differences in oscillatory brain activity in a wide frequency range of 1 to 100 Hz, in relation to the lexical semantic and speaker identity violations. We hypothesized a) that these two manipulations would affect oscillatory activity in different frequency bands, with semantic violations affecting the theta band, and b) that oscillatory activity in these frequency bands would possibly correlate with sex and/or empathy.

Methods

Data from critical trials were analyzed according to the following procedure. After off-line re-referencing of the EEG signals to the mean of the left and right mastoid, they were filtered with a 100 Hz low pass filter. Segments ranging from 1000 ms before to 2000 ms after the acoustic onset of the critical word were baseline-corrected by subtracting mean amplitude in the -500 to 0 ms pre-stimulus interval, and semi-automatically screened off-line for eye movements, muscle artifacts, amplifier blocking, and electrode drifting. Segments containing such artifacts were rejected (10% for LS and 13.4% for SI manipulations, respectively, with no asymmetries across congruent and incongruent conditions).
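The baseline-correction step can be sketched as follows. This is an illustration in Python/NumPy rather than the original analysis code; the array layout (trials × channels × samples), the 500 Hz sampling rate in the usage note, and the function name are assumptions.

```python
import numpy as np


def baseline_correct(epochs, times, window=(-0.5, 0.0)):
    """Subtract the mean amplitude in the pre-stimulus window from each segment.

    epochs: array of shape (n_trials, n_channels, n_samples)  (assumed layout)
    times:  array of shape (n_samples,), in seconds relative to critical-word onset
    window: baseline interval in seconds; -500 to 0 ms as described in the text
    """
    mask = (times >= window[0]) & (times < window[1])
    baseline = epochs[..., mask].mean(axis=-1, keepdims=True)
    return epochs - baseline
```

For segments running from 1000 ms before to 2000 ms after word onset at an assumed 500 Hz, the time axis would be `np.arange(-500, 1000) / 500.0`.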

The data were analyzed using the Fieldtrip open source Matlab toolbox for EEG/MEG-analysis developed at the Donders Institute for Brain, Cognition and Behaviour (http://www.ru.nl/neuroimaging/fieldtrip). In order to reveal event-related changes in power for the different frequency components of the EEG, Time-Frequency (TF) representations of the single trial data were computed by using the multi-taper approach described by Mitra and Pesaran (1999). In order to optimize the trade-off between time- and frequency resolution, TF representations were constructed in two different, partially overlapping frequency ranges (see e.g. Womelsdorf et al., 2006 for a similar approach to multitaper analysis). In the low-frequency range (2–36 Hz), 2-Hz frequency-smoothing and 500 ms time-smoothing windows were used to compute power changes in frequency steps of 2 Hz and time steps of 10 ms. In the high-frequency range (30–100 Hz), power changes were computed in 5-Hz frequency steps and 10 ms time steps, with a 10-Hz frequency smoothing and a 200 ms time-smoothing. The TF representations of the single-trial data were averaged separately for the LS and SI conditions. The resulting average power values were then expressed as the percentage power increase or decrease relative to the power in a 500 ms prestimulus baseline interval.
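The final normalization step, expressing power as a percentage change relative to the prestimulus baseline, can be sketched as below. This is a Python/NumPy illustration, not the FieldTrip code used in the study; the exact placement of the 500 ms baseline window (here -500 to 0 ms) and the function name are assumptions. The frequency axes follow the steps given in the text.

```python
import numpy as np

# Frequency axes as described in the text.
low_freqs = np.arange(2, 37, 2)     # 2-36 Hz in 2-Hz steps
high_freqs = np.arange(30, 101, 5)  # 30-100 Hz in 5-Hz steps


def percent_power_change(power, times, baseline=(-0.5, 0.0)):
    """Express trial-averaged power as percentage change from baseline.

    power: array of shape (..., n_freqs, n_times)
    times: array of shape (n_times,), in seconds relative to word onset
    baseline: assumed baseline window in seconds (500 ms prestimulus)
    """
    mask = (times >= baseline[0]) & (times < baseline[1])
    base = power[..., mask].mean(axis=-1, keepdims=True)
    return 100.0 * (power - base) / base
```

Because the baseline is computed per frequency, the normalization removes the overall decrease of EEG power with increasing frequency, so that low and high bands can be compared on the same scale.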

The statistical significance of the differences between conditions for the observed TF representations of power change was evaluated by a cluster-based random permutation approach (Maris and Oostenveld, 2007). This non-parametric statistical approach corrects for the multiple-comparisons problem. Since we had little a priori knowledge about when and where to expect condition differences, we did not preselect time or frequency windows, nor EEG electrodes, for statistical analysis. The approach naturally takes care of interactions between electrodes, time points and frequency bins by identifying clusters of significant differences between conditions in the time, space and frequency dimensions, and effectively controls the Type-1 error rate in a situation involving multiple comparisons. The procedure is briefly described here (for an elaborate description of the approach, see Maris and Oostenveld, 2007).

First, for every data point (electrode-time-frequency) a simple dependent-samples t-test is performed, resulting in uncorrected p-values. All data points whose p-value does not reach the pre-set significance level of .05 are zeroed. Clusters of adjacent non-zero data points are computed, and for each cluster a cluster-level test statistic is calculated by taking the sum of all the individual t-statistics within that cluster. Next, a null distribution is created. Subject averages are randomly assigned to one of the two conditions 500 times, and for each of these randomizations, cluster-level statistics are computed. For each randomization, the largest cluster-level statistic is entered into the null distribution. Finally, the actually observed cluster-level test statistics are compared against the null distribution, and clusters falling in the highest or lowest 2.5th percentile are considered significant. Two pairwise comparisons were performed: LS correct vs. LS anomalous, and SI correct vs. SI incorrect. To test for inter-individual differences, linear regression analyses with EQ score and SQ score as potential predictors were performed on each electrode-time-frequency point of the TF representations of both SI and LS contrasts. Significance of the regression coefficient was evaluated using the same cluster-based random permutation approach as used in the analysis of power changes.
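The logic of the procedure can be sketched in a simplified form. The example below is an illustration in Python/NumPy, not the FieldTrip implementation: it clusters along a single dimension only (e.g. time at one electrode and frequency) instead of electrode × time × frequency, it reassigns conditions by flipping the sign of each subject's paired difference, and it reports a two-sided p-value from the absolute maximum cluster statistic in place of the separate 2.5% tails. The function names and the `t_thresh` argument (the critical t-value for the chosen significance level, supplied by the caller) are assumptions.

```python
import numpy as np


def _cluster_sums(tvals, t_thresh):
    """Sum t-values over runs of adjacent supra-threshold points, per sign."""
    sums = []
    for sign in (1.0, -1.0):
        run = 0.0
        for t in tvals:
            if sign * t > t_thresh:
                run += t
            elif run != 0.0:
                sums.append(run)
                run = 0.0
        if run != 0.0:
            sums.append(run)
    return sums


def _paired_t(diff):
    """Dependent-samples t-statistic at every data point."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def cluster_permutation_test(cond_a, cond_b, t_thresh, n_perm=500, seed=0):
    """Return a list of (cluster_statistic, p_value) for the observed clusters.

    cond_a, cond_b: arrays of shape (n_subjects, n_points) with subject averages.
    """
    diff = cond_a - cond_b
    observed = _cluster_sums(_paired_t(diff), t_thresh)
    rng = np.random.default_rng(seed)
    null = np.zeros(n_perm)
    for i in range(n_perm):
        # Randomly swap the two conditions within each subject.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        sums = _cluster_sums(_paired_t(diff * flips), t_thresh)
        if sums:
            null[i] = max(abs(s) for s in sums)
    return [(s, (np.sum(null >= abs(s)) + 1) / (n_perm + 1)) for s in observed]
```

Because only the largest cluster statistic of each randomization enters the null distribution, the resulting p-values are corrected for the number of data points tested.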

Results

TF data of two participants were excluded from the analyses due to excessive (muscle) artifacts, which manifest themselves primarily as broadband power increases in the higher frequency bands. For the same reason, electrodes T7 and T8 were excluded from the TF analyses. Supplementary Figure 1 presents the TF representations of the grand average (N = 34) EEG power changes for both the LS and SI manipulations at a representative electrode, Pz. For each manipulation, the raw TF difference (violation minus correct) is shown, with the graphical representation of the results of the randomization analysis for the same electrode. Additionally the scalp topography of the statistically significant power changes is shown at the bottom of the figure. Statistical analysis revealed that relative to the LS correct condition, LS violations elicited one significant positive cluster (p = .003) in the theta band (2-7 Hz) in the 500-1100 ms latency interval, indicating a larger theta band power increase for the semantic violation condition compared to the semantic correct condition (Suppl. Fig. 1A). No significant clusters were obtained in any of the higher frequency bands. In contrast to the results for LS, analyses of SI revealed no significant power changes in any of the frequency bands, neither in the experiment as a whole, nor in the first or second half (Suppl. Fig. 1B). However, as presented in Supplementary Figure 2, additional regression analyses with EQ score as a regressor (N = 25) revealed a marginally significant positive cluster of correlations of gamma band (50-60 Hz) power with EQ score in the 300-900 ms latency interval (p = .088). Additional analyses confirmed that this marginal effect could be ascribed to a significant positive cluster of correlations of gamma band (50-60 Hz) power with EQ score in the 300-900 ms latency interval in the speaker violations (p = .021), which was absent in the speaker congruent condition. 
When testing for speaker identity adaptation effects in the TF domain, the same positive cluster was obtained in the SI incongruent condition in the first half of the experiment (p = .023), but was absent in the second half (Suppl. Fig. 2). In contrast to the SI manipulation, no correlations of EQ score with LS violations were obtained in any of the frequency bands.

---Insert Supplementary figures 1 and 2---

Discussion

Our ERP results clearly indicate that there is a qualitative difference between the integration of semantic and social information into the linguistic context. Although both types elicit similar N400 effects, with similar onset latencies and topographical distributions, a person’s ability to empathize correlates with social information processing but not lexical semantic processing. Moreover, this difference also appears to manifest itself in the oscillatory brain dynamics, where both types of information affect power changes in different frequency bands. The lexical semantic manipulation revealed a theta power (2-7 Hz) increase from 500 to 1100 ms after word onset for the incongruent condition (Suppl. Fig. 1A). This is in accordance with previous studies revealing theta power increases in semantic processing (Bastiaansen et al., 2008; Bastiaansen et al., 2002; Hald et al., 2006). The speaker identity manipulation, in contrast, did not elicit changes in theta power, nor in any of the other frequency bands (Suppl. Fig. 1B). However, when investigating inter-individual differences in empathizing, a marginally significant positive correlation of EQ score with increased gamma band (50-60 Hz) power became evident in the 300-900 ms latency window (Suppl. Fig. 2A and C). Hence, individuals with an empathizing-driven cognitive style not only revealed a larger N400 effect, but also a marginally larger gamma band (50-60 Hz) power increase in the speaker identity contrast, which was the result of a significantly larger gamma band power increase to the speaker identity violations. In contrast, no correlation of empathy with lexical semantic violations was found in any of the frequency bands. These findings, therefore, appear to constitute a difference between pragmatic and semantic processing, with two different electrophysiological effects emerging in the time-frequency domain.

In studies investigating language processing, theta oscillations have previously been linked to retrieval of lexical semantic information from memory (for a recent review see Bastiaansen and Hagoort, 2006). The theta power increase in our lexical semantic incongruent condition, relative to the congruent condition, possibly indicates that, given the preceding sentence context, semantic violations required more retrieval effort than semantically congruent words. The absence of a significant theta power increase in the speaker identity violations relative to the congruent condition may be ascribed to the fact that, given the preceding sentence context, these incongruent critical words were less violating than the lexical semantic incongruent critical words, and as a result required hardly any additional retrieval effort. However, in individuals who empathize to a greater degree, the brain did keep track of these speaker identity violations, as revealed by an increase in gamma band power, reflecting detection of an incompatibility of linguistic content and context-bound stereotypical assumptions about the speaker. Although these results need replication, they would tentatively fit with previous studies across several cognitive domains that report a local increase of gamma power when multiple types of information need to be integrated, such as in integrative processes in perception and language (Hagoort et al., 2004; Luo et al., 2009; Melloni et al., 2007; Rodriguez et al., 1999; Tallon-Baudry and Bertrand, 1999).