Running Head: FACE IDENTITY RECOGNITION IN ASD

Reduced reliance on optimal facial information for identity recognition in Autism Spectrum Disorder

Word count: 5,198

Abstract

Previous research into face processing in Autism Spectrum Disorder (ASD) has revealed atypical biases towards particular facial information during identity recognition. Specifically, a focus on features (or high spatial frequencies) has been reported for both face and non-face processing in ASD. The current study investigated the development of spatial frequency biases in face recognition in children and adolescents with and without ASD, using non-verbal mental age to assess changes in biases over developmental time. Using this measure, the control group showed a gradual specialisation over time towards middle spatial frequencies, which are thought to provide the optimal information for face recognition in adults. By contrast, individuals with ASD did not show a bias to one spatial frequency band at any stage of development. These data suggest that the ‘mid-band bias’ emerges through increasing face-specific experience, and that atypical face recognition performance may be related to reduced specialisation towards optimal spatial frequencies in ASD.

KEYWORDS: Face processing; spatial frequency; autism; development; experience

1. Introduction

1.1 Face processing in Autism Spectrum Disorder

Autism Spectrum Disorder (ASD) is a pervasive neurodevelopmental disorder diagnosed on the basis of impaired development of social interaction and communication, as well as markedly restricted activities and interests (DSM IV-TR, American Psychiatric Association, 2000). Due to the crucial role of faces in the social contexts in which those with ASD are particularly impaired (Joseph & Tager-Flusberg, 2009), face processing has received a great deal of attention in the study of this neurodevelopmental disorder. Withdrawal from social situations was highlighted in the first reports of childhood autism by Kanner (1943), and a lack of interest in faces and sharing information can be traced back to early infancy through retrospective reports and videotapes, as well as prospective studies of children at risk of developing ASD (Johnson, Frith, Siddons & Morton, 1992; see also Elsabbagh & Johnson, 2007, for a review).

In terms of the processing of face identity, individuals with ASD often fall below standardised norms on tests of face recognition (Klin, Sparrow, de Bildt, Cicchetti, Cohen & Volkmar, 1999), possibly due to unusual strategies for processing face stimuli. Indeed, atypical patterns of attention during face processing have been reported in behavioural studies (e.g., Annaz, Karmiloff-Smith, Johnson, & Thomas, 2009; Joseph & Tanaka, 2003; Langdell, 1978; Riby, Doherty-Sneddon, & Bruce, 2008a, 2008b), and during eyetracking (e.g., Falck-Ytter, 2008; Klin, Jones, Schultz, Volkmar, & Cohen, 2002; Pelphrey, Sasson, Reznick, Paul, Goldman & Piven, 2002; van der Geest, Kemner, Verbatern, & van Engeland, 2002). Faces do not capture the attention of individuals with ASD in the same way as in typically-developing controls (Riby & Hancock, 2009), and ASD has been characterised by reduced looking times to people in general, and to faces in particular, in both static and dynamic social scenes (e.g., Klin et al., 2002; Riby & Hancock, 2008; Speer, Cook, McMahon, & Clark, 2007). Finally, some, but not all, brain imaging studies have found slower and less specific neural responses to faces in ASD compared to controls (Grice, Spratling, Karmiloff-Smith, Halit, Csibra, de Haan, & Johnson, 2001; Humphreys, Hasson, Avidan, Minshew, & Behrmann, 2008; McPartland, Dawson, Webb, Panagiotides, & Carver, 2004; Webb, Dawson, Bernier, & Panagiotides, 2006).

Although the research reviewed above indicates atypical face processing in ASD, the reasons underlying this atypicality remain unclear, and even less is known about its emergence over development. To shed light on these issues, in the current study we investigated the use of different spatial frequency bands in face recognition in ASD and controls.

1.2 Spatial frequency biases in face recognition

It is only recently that spatial frequency biases in face recognition have been investigated in ASD (e.g., Boeschoten, Kenemans, van Engeland, & Kemner, 2007; Deruelle, Rondan, Salle-Collemiche, Bastard-Rosset, & Da Fonséca, 2008; Deruelle, Rondan, Gepner, & Tardif, 2004; Leonard, Annaz, Karmiloff-Smith, & Johnson, 2011; Vlamings, Marthe, van Daalen, van der Gaag, Jan, & Kemner, 2010). Different spatial frequencies correspond to varying levels of detail in the visual environment, with low spatial frequencies (LSFs) generally thought to convey information about the global shape and overall contours of visual stimuli, while high spatial frequencies (HSFs) carry information about the more detailed features (Goldstein, 2009). In comparing spatial frequency use for face recognition in ASD and controls, several studies have now reported a greater reliance in ASD on HSFs, as compared to an LSF bias in typically-developing controls (e.g., Boeschoten et al., 2007; Deruelle et al., 2008, 2004; Vlamings et al., 2010). These results are consistent with previous findings of more featural, detailed processing of faces and other visual stimuli in ASD (e.g., Frith, 2003). However, two methodological issues need to be considered when interpreting the above research. First, the developmental dimension is missing: no study has directly compared children and adults using the same experimental procedure, making it unclear how spatial frequency biases might emerge during development. Second, all of the studies included only LSFs and HSFs of the face stimuli presented, making it difficult to assess any other spatial frequency bias that participants might have. 
This second issue turns out to be important, because a large body of literature now suggests that, although LSFs and HSFs are useful and sometimes sufficient for face recognition (e.g., Fiorentini, Maffei, & Sandini, 1983; Halit, de Haan, Schyns, & Johnson, 2006), in adults the optimal band for face recognition consists of middle spatial frequencies (MSFs: between 8 and 24 cycles per face; Costen, Parker & Craw, 1994; Hayes, Morrone & Burr, 1986; Leonard, Karmiloff-Smith & Johnson, 2010; see Ruiz-Soler & Beltran, 2007, for a review). Furthermore, Leonard et al. (2010) found that in typical development, this ‘mid-band bias’ was actually rather late to develop, with 7- and 8-year-old children still relying more on HSFs for face recognition than older children and adults. Previous accounts of a high spatial frequency bias in ASD may therefore depend both on the age at which the individuals were tested and on the lack of stimuli testing the mid-band bias. It is thus critical to investigate the use of middle spatial frequencies for face recognition in ASD within a developmental context.

The above point was addressed by Leonard et al. (2011), who found a surprisingly similar pattern of developing biases for face recognition over chronological age in a developmental comparison of individuals with ASD and typically-developing controls. However, chronological age may not accurately reflect the level of functioning of an individual with ASD, as they can be developmentally delayed even in the relatively stronger domain of visuo-spatial processing (e.g., Joseph, Tager-Flusberg, & Lord, 2002). For this reason, the current study tracked spatial frequency biases for face recognition in relation to non-verbal mental age, with the implication that those with lower non-verbal mental ages will have reduced face-specific experience either because they are younger (as in the controls), or because they are lower-functioning children with ASD, who spend less time looking at faces than their relatively high-functioning counterparts (Riby & Hancock, 2009). The analyses presented in this paper thus assess how variance in mental age affected spatial frequency biases for face recognition, rather than controlling for this variance through chronological age matching. Data from inverted faces were also analysed, providing a stimulus with which neither group would have much experience (see Leonard et al., 2010). In line with previous findings, it was predicted that the control group would show a gradual decrease in the reliance on HSFs, resulting in an MSF bias by adolescence. If the development of the mid-band bias for face recognition relies on increased experience with faces in typically-developing children (Leonard et al., 2010), the ASD group should not show a bias toward MSFs for upright face recognition at any stage. In addition, based on previous work it was predicted that neither group should be biased toward MSFs for inverted faces.

2. Methods

2.1 Participants

Thirty-two males (age range: 7 years 2 months – 15 years 5 months) participated in the study in two separate groups. Previous piloting in typically-developing children found that testing participants below 7 years resulted in a drastically increased drop-out rate, which would likely be even greater in children with autism. The control group consisted of seventeen participants (mean chronological age: 11 years 5 months, SD: 2 years 5 months; mean non-verbal mental age: 10 years 3 months, SD: 2 years 10 months), who had no reported learning difficulties or clinical diagnoses. The remaining fifteen participants (mean chronological age: 10 years 4 months, SD: 2 years 6 months; mean non-verbal mental age: 9 years 7 months, SD: 2 years 11 months) were in the ASD group. All had a UK statement of special needs, with a primary diagnosis of ASD from a trained psychiatrist or paediatrician, using established criteria from the DSM IV-TR. Recent research has yielded a high level of agreement between clinical and research diagnoses (Mazefsky & Oswald, 2006). Therefore, in line with other recently published studies (e.g., Franklin, Sowden, Burley, Notman & Alder, 2008; Williams & Jarrold, 2010), the official diagnosis from an experienced, trained clinician and the non-verbal and face recognition data collected here were considered sufficient background information for the current report.

The two groups did not differ from each other on either chronological age, t(30) = -1.64, p = .11, or non-verbal mental age, t(30) = -.65, p = .52 (see Materials for explanation of the measure of non-verbal mental age). As expected from previous research (e.g., Annaz et al., 2009), the groups differed on the Benton Test of Facial Recognition (see Materials for details), with a significantly lower mean score in the ASD group (M = 18.33; SD = 3.22) than in the controls (M = 20.82; SD = 2.72), t(30) = -2.37, p = .02.
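The Benton comparison reported above can be checked from the summary statistics alone. The following is a minimal sketch, assuming a standard independent-samples t-test with pooled variance; the text does not name the test, but the reported 30 degrees of freedom (15 + 17 - 2) are consistent with it. The function name `pooled_t` is illustrative only.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic from summary statistics.

    Assumes equal variances in the two groups (Student's t-test),
    which is an assumption here, not something stated in the text.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Benton scores as reported: ASD (n = 15) versus controls (n = 17)
t_benton, df_benton = pooled_t(18.33, 3.22, 15, 20.82, 2.72, 17)
print(f"t({df_benton}) = {t_benton:.2f}")  # t(30) = -2.37
```

The sign is negative because the ASD group is entered first, matching the direction of the reported statistic.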

2.2 Materials

Each child was tested on a series of standardised and experimental tasks from Leonard et al. (2011), including Raven’s Standard Progressive Matrices (Raven et al., 2000), which was used as a measure of non-verbal mental age (NVMA) for both groups, and the Benton Test of Facial Recognition (Benton, Sivan, Hamsher, Varney, & Spreen, 1983). Raven’s Matrices are often used for matching purposes in the literature (Mottron, 2004) and are appropriate for a wide age range (Riby et al., 2008a). The Benton test has also been widely used in children with and without neurodevelopmental disorders in previous studies utilising the developmental trajectory approach (e.g., Annaz et al., 2009; Karmiloff-Smith et al., 2004; Thomas, Annaz, Ansari, Scerif, Jarrold, & Karmiloff-Smith, 2009).

Both upright and inverted face stimuli were viewed by participants (see Figure 1 for examples). Only two face identities were presented in order to keep memory demands to a minimum for the youngest children and for those with ASD. The upright face stimuli were adopted from a set produced by Näsänen (1999), and included the original unmasked face images and three masked faces, in which a narrow band of spatial frequencies was masked by noise. One noise mask covering each of 8, 16 or 32 cycles per image was chosen, corresponding to 1.1, 2.2 and 4.4 cycles per degree during presentation, and representing LSF, MSF and HSF masks respectively. A further ‘training stimulus’ was produced for the computerised task, with black bars (subtending 0.1 degree of visual angle) added to the face image using the Windows Paint program. Inverted face stimuli were produced by rotation of the above face images by 180° in Adobe Photoshop. All face stimuli subtended 7 x 7 degrees of visual angle at the viewing distance of approximately 53 cm.
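The stimulus geometry described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the stimuli: it assumes that cycles per degree equal cycles per image divided by the image extent in degrees, and it takes the 7-degree extent and 53 cm viewing distance at face value. With these inputs, simple division gives roughly 1.1, 2.3 and 4.6 cycles per degree, close to but not identical to the reported 1.1, 2.2 and 4.4; the source does not say whether the difference reflects rounding or the approximate viewing distance.

```python
import math

VIEWING_DISTANCE_CM = 53.0  # "approximately 53 cm" in the text
IMAGE_EXTENT_DEG = 7.0      # stimuli subtended 7 x 7 degrees
MASK_CYCLES_PER_IMAGE = {"LSF": 8, "MSF": 16, "HSF": 32}


def extent_cm(angle_deg, distance_cm):
    """Physical size on screen that subtends angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def cycles_per_degree(cycles_per_image, extent_deg):
    """Convert cycles per image to cycles per degree (assumed simple ratio)."""
    return cycles_per_image / extent_deg


image_cm = extent_cm(IMAGE_EXTENT_DEG, VIEWING_DISTANCE_CM)
cpd = {
    band: cycles_per_degree(cycles, IMAGE_EXTENT_DEG)
    for band, cycles in MASK_CYCLES_PER_IMAGE.items()
}

print(f"image side: {image_cm:.2f} cm")
for band, value in cpd.items():
    print(f"{band}: {value:.2f} cycles per degree")
```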

[place Figure 1 about here]

2.3 Procedure

Participants followed the ‘child procedure’ outlined in Leonard et al. (2010), completing a familiarisation/training period with the face identities through a number of games before beginning the computerised task. These games included both naming and memory tasks, for which the child earned points for each correct answer. Once the main task began, trials were blocked so that upright trials always preceded inverted trials. In both sets of trials, a test face (either masked or unmasked) was presented, followed by the two original unmasked faces. Participants had to decide which of the two face identities had been presented on the test trial, demonstrating their choice by pointing to the face on the screen. The positions of the ‘choice stimuli’ (e.g., left or right) were counterbalanced, with the two face identities appearing equally often on both sides of the screen. The duration of stimulus presentation depended on the age and group membership of the participant: The ASD group and younger control children saw the target face for 2 seconds, while control children over the age of ten saw the target face for 0.5 seconds. Extensive piloting with control children revealed these to be the optimal exposure durations for recognition of the target faces. The different durations did not affect the pattern of spatial frequency biases found in previous testing of a group of ten-year-olds (see Leonard et al., 2010). Piloting with children with ASD revealed that the very quick exposure was demotivating for them as they found it too difficult, and that the two-second exposure ensured that they received at least an equal amount of exposure to the face as the control group. 
In addition, when assessed by chronological age, individuals with autism presented a very similar pattern of results to the control group using these different exposure durations (Leonard et al., 2011), suggesting that differences in spatial frequency biases between the two groups in the current study are due to levels of functioning or non-verbal mental age and not due to the differences in target duration.

During the test trials, each of the SF masks was presented a total of 16 times (eight in upright trials, eight in inverted trials), with sixteen unmasked faces randomly presented throughout these trials to provide the baseline measure in each face orientation (producing a total of 64 trials). Trials were initiated by the experimenter and began when participants were judged to be attending to the fixation point. The experimenter recorded the participant’s answer by pressing the appropriate button on a mouse attached to the computer. Upon finishing the computerised task, participants completed the Raven’s Matrices and the short form of the Benton test. All participants were rewarded with a choice of stickers or school merit awards throughout the procedure, and with a certificate when all tasks were completed.
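The trial structure described above can be sketched as a simple list builder. This is a hypothetical reconstruction, not the original experiment script: the identity labels, the even split of the sixteen unmasked trials across orientations, the particular counterbalancing scheme and the within-block shuffling are all assumptions made for illustration.

```python
import random
from collections import Counter

MASKS = ["LSF", "MSF", "HSF", "unmasked"]
ORIENTATIONS = ["upright", "inverted"]  # upright block always comes first
IDENTITIES = ["face_A", "face_B"]       # hypothetical labels for the two faces
TRIALS_PER_CELL = 8                     # eight trials per mask per orientation


def build_trials(seed=0):
    """Return 64 trials: an upright block of 32 followed by an inverted block."""
    rng = random.Random(seed)
    trials = []
    for orientation in ORIENTATIONS:
        block = []
        for mask in MASKS:
            for i in range(TRIALS_PER_CELL):
                target = IDENTITIES[i % 2]
                # Switch the left/right layout every two trials so that each
                # identity is the target equally often on each side.
                left = IDENTITIES[(i // 2) % 2]
                right = IDENTITIES[1 - (i // 2) % 2]
                block.append({
                    "orientation": orientation,
                    "mask": mask,
                    "target": target,
                    "left": left,
                    "right": right,
                })
        rng.shuffle(block)  # assumed: random order within each block
        trials.extend(block)
    return trials


trials = build_trials()
mask_counts = Counter(trial["mask"] for trial in trials)
print(len(trials), dict(mask_counts))
```

Shuffling within each block, rather than across the whole list, keeps the upright trials ahead of the inverted trials as the procedure requires.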

3. Results and Discussion

The means and standard deviations of spatial frequency use for both groups are presented in Table 1. Scores were calculated by subtracting task accuracy from 100% (i.e., achieving 100% accuracy for the LSF mask would demonstrate that the LSF band was not being used in the task, resulting in a ‘use’ score of 0%). The data suggest that the groups differed more in their mean use of each spatial frequency in upright than in inverted trials. However, a mixed analysis of variance (ANOVA) with spatial frequency (SF: LSF, MSF, HSF) mask and stimulus orientation (upright, inverted) as within-subjects factors, and group (ASD, control) as the between-subjects factor revealed only a significant main effect of SF mask, F(2,60) = 18.74, p < .001, ηp2 = .4 (Greenhouse-Geisser corrected statistic reported). Post-hoc pairwise comparisons with Bonferroni corrections revealed that this effect was due to significantly lower use of LSFs (M = 8.32) than MSFs (M = 21.01) or HSFs (M = 27.44), p < .001. No other main effects or interactions were significant (Fs < 3.80, ps > .06).
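The ‘use’ score defined above is a simple transformation of accuracy. A minimal sketch follows, assuming the score is computed per mask and orientation from the eight trials in that cell; the function name `use_score` and the example counts are hypothetical.

```python
def use_score(n_correct, n_trials):
    """Spatial frequency 'use' score: 100% minus percentage accuracy.

    A mask that leaves recognition intact (100% accuracy) yields 0,
    meaning the masked band was not needed for the task.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0 <= n_correct <= n_trials:
        raise ValueError("n_correct must lie between 0 and n_trials")
    accuracy = 100.0 * n_correct / n_trials
    return 100.0 - accuracy


# Hypothetical participant: 8 of 8 correct with one mask, 6 of 8 with another
print(use_score(8, 8))  # 0.0
print(use_score(6, 8))  # 25.0
```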

[place Table 1 about here]

While no significant differences between group means were found using this standard approach, it is important to consider the effect of development on these spatial frequency biases across the wide range of non-verbal mental ages studied. Cross-sectional developmental trajectories are therefore presented throughout the rest of this section, using NVMA as a covariate (see Thomas et al., 2009, for a more detailed explanation of this approach). A 3(SF: LSF, MSF, HSF) x 2(Orientation: upright, inverted) x 2(Group: ASD, control) mixed analysis of covariance (ANCOVA) was first conducted on the upright and inverted data from the two groups, with NVMA as covariate. The within-subjects effects are independent of the covariate, and will be reported from analyses excluding NVMA as a factor. Degrees of freedom may therefore differ between main effects and interactions, and within- and between-subjects factors (see Annaz et al., 2009, for an explanation). Greenhouse-Geisser corrected statistics are reported where necessary due to violations of sphericity.

Analyses revealed that identity recognition was affected differently by the three SF masks, F(2,60) = 18.74, p < .001, ηp2 = .4, and by NVMA, F(1,28) = 4.80, p = .04, ηp2 = .1, but not by group membership, F(1,28) = 1.14, p = .29, ηp2 = .04, or by orientation, F(1,30) = 1.77, p = .19, ηp2 = .1. The effect of SF mask differed with changing NVMA, F(2,56) = 4.38, p = .02, ηp2 = .1, and with group membership, F(2,56) = 4.54, p = .02, ηp2 = .1, but not with orientation, F(2,60) = .82, p = .42, ηp2 = .3. There was a significant three-way interaction between SF mask, NVMA and group, F(2,56) = 4.77, p = .01, ηp2 = .1, suggesting that the use of particular SFs for identity recognition changed with non-verbal mental age differently in the ASD and control groups. No significant interaction was found between group and NVMA, F(1,28) = 2.07, p = .16, ηp2 = .1.

Although no main effect of orientation was found, there were significant interactions between orientation and group, F(1,28) = 5.84, p = .02, ηp2 = .1, and between orientation, SF mask and group, F(2,56) = 3.15, p = .05, ηp2 = .1. There was a non-significant trend towards an interaction between orientation, group and NVMA, F(1,28) = 3.70, p = .07, ηp2 = .1. No significant interactions were found between orientation and NVMA, F(1,28) = .19, p = .67, ηp2 = .01, or between orientation, SF mask and NVMA, F(2,56) = .13, p = .88, ηp2 = .01, but there was a marginally significant interaction between all four factors, F(2,56) = 3.00, p = .06, ηp2 = .1. Examination of Figure 2 confirms the suggestion from these analyses that SF masks affected identity recognition differently in the two groups for upright and inverted faces, and that these differences were further affected by non-verbal mental age. In addition, inspection of within-subjects contrasts revealed a significant linear interaction between the four factors, F(1,28) = 5.02, p = .03, ηp2 = .2, suggesting that a significant four-way interaction may have been masked in the initial analyses by increased variability in one or more of the factors between the two groups (e.g., Annaz et al., 2009; Thomas et al., 2009). For both these reasons, follow-up analyses were conducted within each group in order to clarify the different patterns of spatial frequency biases suggested by these initial results.