An Investigation into Metaphor Use at Different Levels of Second Language Writing

Abstract

Recent studies in linguistics have shown that metaphor is ubiquitous. This has important consequences for language learners, who need to use it appropriately in their speech and writing. This study aims to provide a preliminary measure of the amount and distribution of metaphor used by language learners in their writing across CEFR levels. Two hundred essays written by Greek- and German-speaking learners of English are examined for their use of metaphor. The findings show that the overall density of metaphor increases from CEFR levels A2 to C2. At lower levels, most of the metaphoric items are closed-class, consisting mainly of prepositions, but at B2 level and beyond, the majority of metaphoric items are open-class. Metaphor is used to perform increasingly sophisticated functions at each of the levels. At B2 level significantly more errors start to be perceived in the metaphorically-used words and there is more evidence of L1 influence. Descriptors are provided for CEFR levels A2-C2 regarding the use of metaphor.

1 Introduction

In very broad terms, metaphor involves describing one entity in terms of another unrelated entity (e.g. when women’s careers are described as ‘hitting a glass ceiling’). Studies of metaphor have shown that it performs key functions, such as the signalling of evaluation, agenda management and mitigation. It is also used to convey humour, to refer to shared knowledge and to denote topic change (Cameron 2003; Semino 2008). An ability to use it appropriately can thus contribute to a language learner’s communicative competence (Littlemore and Low 2006a, 2006b), and is therefore likely to be a key indicator of a language learner’s ability to operate at different levels of proficiency as defined by the Common European Framework of Reference for Languages (CEFR). The CEFR, which forms part of a wider European Union initiative, is a series of descriptions of language abilities which can be applied to any language and can be used to set clear targets for achievements within language learning. It has now become an internationally-recognised way of benchmarking language ability. There are six levels (A1, A2, B1, B2, C1 and C2), each of which contains a series of ‘can-do’ statements, which describe the various functions that one would expect a language learner to perform in reading, writing, listening and speaking, at that level. The focus of this study is on writing ability. The statements for writing ability are shown in Figure 1:

Insert Figure 1 here

In these statements we can see a clear progression in terms of the complexity of functions that a learner is expected to perform and we might thus expect their use of metaphor to both change and increase across the different levels. For example, at Level A1, one might expect very little use of metaphor, whereas at Level C2 one might expect learners to use metaphor to convince and persuade as well as to link their ideas to one another. To date, there has been no detailed investigation into how a learner’s use of metaphor develops across these different levels. Such a study would be useful as it could contribute descriptors pertaining to the use of metaphor which could then be used in training materials. Its findings would be of interest to organisations involved in language assessment, as they could be incorporated into the marking criteria for their written examinations.

Cambridge ESOL has produced English examinations that cover the top five of these levels (A2-C2). In this article we describe a study, funded by Cambridge ESOL[i], which used the Cambridge ESOL corpus of exam scripts produced by successful students at each of these levels, to meet the following aims:

  • To identify features of metaphor that distinguish writing at the different CEFR levels A2-C2, as measured by the Cambridge Exams, ‘Key English Test’ (KET) (A2), ‘Preliminary English Test’ (PET) (B1), ‘First Certificate in English’ (FCE) (B2), ‘Certificate in Advanced English’ (CAE) (C1) and ‘Certificate of Proficiency in English’ (CPE) (C2)
  • To provide descriptors relating to metaphor-use in writing that could be incorporated into the different CEFR descriptors for each level of English

2 The objectives of the study

Our study had five objectives. Our first objective was to measure the amount of metaphor produced across CEFR levels A2 to C2. The most widely-used approach to metaphor identification is the Pragglejaz Group (2007) Metaphor Identification Procedure (MIP), which involves identifying as metaphor any lexical unit that has the potential to be processed metaphorically. The analyst begins by identifying all the lexical units in the text. For each lexical unit, he or she then establishes its meaning in context and decides whether it has a more basic contemporary meaning in other contexts. Basic meanings tend to be more concrete, more precise, or more closely related to bodily action. If a more basic meaning can be identified, the analyst decides whether its meaning in the text can be understood in comparison with this more basic meaning. If this is the case then the lexical unit is marked as being ‘metaphorically-used’. This technique was subsequently developed by Steen et al. (2010), giving rise to the ‘MIPVU’, which amongst other things treats the metaphorically-used words within similes as a kind of metaphor. We used a slightly adapted version of the MIPVU and we give reasons for employing this technique in Section 4.

Our second objective was to explore the extent to which learners make use of open- and closed-class metaphorical items across the five CEFR levels investigated. The reason for including this objective was that different patterns have been observed concerning the ways in which learners of English use closed-class items, such as prepositions (Nacey, 2010). We were interested to see whether this would be the case at the different levels of the CEFR.

Our study’s third objective was to investigate the quantity, size and distribution of metaphor clusters produced by learners at each of the levels. It has been observed that people tend to produce metaphors in clusters, that these clusters serve important communicative functions (Cameron and Low 2004; Cameron and Stelma 2004), and that some of the most communicatively effective clusters are those that contain mixed metaphors (Kimmel 2010). One might expect development in the production of metaphor clusters in learners’ writing at the different levels.

It is important to look not just at the amount of metaphor that is being used but at what learners use metaphor for in their essays, in other words, what functions it is used to perform. Our fourth objective was therefore to assess the ways in which the learners’ use of metaphor contributes qualitatively to a learner’s ability to perform the relevant functions at each level.

Our study’s fifth objective was to conduct an initial exploration into the accuracy rates for metaphor, and the extent to which the use of metaphor appeared to be affected by the L1 background of the learners, at each of the levels. One might also expect metaphor errors to be due, to some extent, to L1 background, and one might expect the amount of L1 influence to decrease gradually across the different levels as the learners acquire an understanding of the ways in which metaphor is used in the target language. Alternatively, as Kellerman (1987) has shown for idioms, L1 influence on metaphor use may peak at the beginning and advanced stages of learning.

3 Research Questions

The objectives listed above translate into the following research questions:

In two sets of Cambridge ESOL exam scripts (one produced by Greek-speaking learners of English and one produced by German-speaking learners of English):

  1. In what ways does the amount of metaphor produced vary across CEFR levels A2 to C2?
  2. In what ways does the use that learners make of open-class metaphorical items resemble or differ from that which they make of closed-class metaphorical items across the different CEFR levels?
  3. In what ways does the distribution of metaphor clusters vary across CEFR levels A2 to C2?
  4. In what ways do the functions performed by the metaphors vary across CEFR levels A2 to C2 and how closely do these functions relate to the CEFR descriptors?
  5. To what extent do learners use metaphor ‘incorrectly’ and how is their use of metaphor affected by their L1 background?

4 Methodology

One hundred essays written by Greek learners of English (twenty at each level) and one hundred essays written by German learners of English (twenty at each level) were selected from the Cambridge ESOL database of anonymised examination scripts, produced by students who had been successful in their examinations (KET, PET, FCE, CAE and CPE). As far as possible, attempts were made to extract essays on related subjects in order to minimise the impact of topic type in our results. We used the same search terms to extract essays from the corpus at each of the five levels. We chose the words ‘politician’, ‘politics’, ‘government’, ‘economy’, ‘measures’ and ‘environment’. These reflect topics that have been shown to involve a substantial amount of metaphor (Semino, 2008). They are also broad enough to encompass a wide variety of essays, allowing us to extract sufficient data at each level. The selection of a single topic renders our results less generalizable than we would have liked. However, as the main objective of the study was to compare metaphor use across the different levels it was necessary to control, to some extent, for topic. Given that we only examined twenty essays at each level, a random topic selection would have meant that different topics were being covered at the different levels, and this would have skewed the data.

As we saw in Section 2, in order to identify all potentially metaphorically-used lexical units in the essays, we used an adapted version of the MIPVU Metaphor Identification Procedure (Steen et al. 2010), which is based on the Pragglejaz Group (2007) Metaphor Identification Procedure. Some useful features of the MIPVU for our particular project are that it includes ‘direct metaphors’ (i.e. similes and the like) as well as ‘implicit metaphors’, such as the use of ‘this’ and ‘that’ or pronouns such as ‘it’ or ‘one’ to refer back to metaphorically used words (e.g. The path she took was indeed the right one), and ‘possible personifications’ (such as ‘the department needs to act’). All of these features have been found to vary across languages, and present considerable challenges to learners.

One difference between our approach and the MIPVU was that, unlike the MIPVU, we chose to break phrasal verbs and multiword items into their constituent parts if their meanings were deemed to be partially motivated by the basic senses of their constituents. We did this for three reasons. Firstly, in the MIPVU there are no clear criteria for defining exactly what constitutes a multiword item, and strong cases have been made in the literature for viewing decomposability as a continuum rather than as an either/or phenomenon (Howarth, 1998; McCarthy et al., 2003). Secondly, language learners often make mistakes within phrasal verbs and multiword items, suggesting that they may not always be learning them as fixed phrases, and that they may at times be treating them as novel compounds (MacArthur and Littlemore, 2011). Thirdly, there is a significant body of research, drawing on a wide range of methodologies, whose findings indicate that the metaphoricity underlying language is more salient for language learners than it is for native speakers, particularly when it sits within multiword items (Cieślicka, 2006; Cooper, 1999; Littlemore, 2001, 2010; Matlock and Heredia, 2002; Siyanova-Chanturia et al., 2011).

Another difference between our approach and the MIPVU was that while the MIPVU does not cross word-class boundaries based on the assumption that each word in discourse is connected to a specific referent, we included items that involved a change in word class. For example, in our approach, ‘snaked’ would count as a metaphor, even though it has a different word class in its basic sense. Here we follow Deignan’s (2005) work, which shows that metaphorical senses often differ formally from their literal counterparts.

At this point, it is worth reiterating that both the MIP and the MIPVU are intended for use as methods for the identification of potential metaphors and that it is impossible to tell, simply by looking at a text, whether a metaphor was produced intentionally or not. Suggestions have been put forward as to how textual cues might be employed to indicate possible deliberateness (Krennmayr, 2011; Steen, 2008, 2010, 2011) but these cues are by no means watertight and the notion of ‘deliberate’ metaphor remains contentious as it is virtually impossible to determine whether any human act is the product of conscious, deliberate thought (Deignan, 2011; Gibbs, 2011; Müller 2011).

The issue of deliberate metaphor production becomes even more acute when investigating language produced by L2 users, as they may not even know the basic meaning of the metaphorically-used words that they are employing, which makes it somewhat problematic to encode them as metaphor (MacArthur and Littlemore, 2011). Nacey’s (2010) approach to this problem was to take all ‘deviant’ language at face value and to code it as ‘metaphor’ where appropriate according to the MIPVU, even though those producing the language may not have viewed it as metaphor themselves. Nacey then went back through her data in order to identify possible reasons behind these apparent metaphors and found a number of explanatory factors including transfer, over-extension and creative language use. This is the approach that we employed in order to answer Research Question 5.

To maximise reliability across raters, we followed the Pragglejaz Group’s (2007) procedure of using group discussion to reach a consensus in cases where there was disagreement. The research team (which comprised the four authors of this paper) was divided into pairs and each pair was given a section of the corpus to analyse. Each member of the pair identified the metaphors independently using the MIPVU procedure and then met to discuss any cases where there was disagreement. A third member of the team was brought in when agreement could not be reached. Extended discussions took place until agreement could be reached in all cases. In order to ensure consistency, decisions were recorded and the whole corpus was searched again for instances where those metaphors occurred.

Once the metaphorically-used words had been identified, we calculated the metaphoric density of the texts produced at each level by dividing the number of metaphorically-used words by the total number of words at that level and multiplying the result by one hundred. As the data consisted of independent samples, we employed a series of Mann-Whitney U tests to establish whether any of the differences across levels were statistically significant. We focussed on word tokens rather than word types. The metaphors were then categorized into open and closed-class items. We used these figures to calculate the proportions of metaphors that comprised open and closed-class items at each level. A search for metaphor clusters was conducted using a time series analysis. A span size of 20 words was selected and the metaphoric density across the words in this span was calculated. This result was placed at the mid-point (the 10th word). The span was then shifted one word down, and the metaphoric density calculated for the next twenty-word span. The result was placed at the mid-point (the 11th word), and so on until the end of the text was reached. The technique allowed us to produce metaphoric density charts, such as the following:

Insert Figure 2 here.

Figure 2. Illustration of a moving metaphoric density chart for a CAE essay written by a German learner of English
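
The density calculation and the moving twenty-word span described above can be sketched in a few lines of Python. This is a minimal illustration and not the tooling used in the study: the function names `metaphoric_density` and `moving_density`, and the representation of an essay as a list of True/False flags (one per word token, True where the token was coded as metaphorically used), are assumptions introduced for the example.

```python
from typing import List, Tuple


def metaphoric_density(flags: List[bool]) -> float:
    """Percentage of word tokens coded as metaphorically used.

    `flags` is a hypothetical representation: one boolean per word token.
    """
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


def moving_density(flags: List[bool], span: int = 20) -> List[Tuple[int, float]]:
    """Metaphoric density over a span that is shifted one word at a time.

    Each result is paired with the 1-indexed mid-point of its span, so the
    first twenty-word span is reported at word 10, the next at word 11, etc.
    """
    series = []
    for start in range(len(flags) - span + 1):
        window = flags[start:start + span]
        density = 100.0 * sum(window) / span
        midpoint = start + span // 2  # 1-indexed position of the mid-point
        series.append((midpoint, density))
    return series


# Toy text of 25 tokens: five metaphorically used words, then twenty that are not.
example = [True] * 5 + [False] * 20
print(metaphoric_density(example))
print(moving_density(example)[:3])
```

Plotting the density values against their mid-points would give a chart of the kind illustrated in Figure 2. Texts shorter than the span yield an empty series in this sketch.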

The metaphor cluster that appears at point B in the above chart was as follows:

If a girl develops in a way to like dolls and languages and hate computer and maths this is just fine - but one should not "push" her in any direction. This widely spread pattern of thinking is mirrored in German politics.

(German learner of English, CAE: C1)

This technique informed our qualitative analysis, by allowing us to identify stretches of text with high localised metaphoric density. The next stage was to decide what percentage of metaphor to use as a ‘cut-off’ point in our definition of a metaphor cluster. Previous studies (e.g. Cameron and Stelma, 2004) have used the ‘sudden onset’ of metaphor as their main identification criterion for a metaphor cluster. Under this approach, the spike that appears at 121 words in Figure 2 would be a candidate for consideration as a metaphor cluster because it follows a long period of relatively low-level metaphor use, even though the actual metaphoric density of this spike is relatively low (10%). However, in our study, we wanted to compare the use of metaphor clusters across levels, so we needed to identify a standard starting point in terms of metaphoric density. In order to do this, we conducted manual examinations of the metaphoric density charts for a number of essays at each of the five levels, and analysed them alongside the essays themselves. We looked at clusters at 5% intervals until we reached a level where we could discern visible metaphor use above and beyond the sorts of highly conventionalised metaphorical uses of prepositions. We agreed that the most ‘meaningful’ level to start at was 30%. Taking 30% density as our starting point, we calculated the number and distribution of clusters that appeared at each level in both data sets. We then conducted a manual search of the metaphors that appeared both within and outside the clusters to establish how learners were using metaphor at each of the levels. We looked for uses of metaphor that had not appeared in our data at previous levels, focussing both on the forms of metaphor and on the functions it was being used to perform at each level.

Since we also coded direct metaphor in our dataset, there may be concern that metaphor density figures are artificially inflated because each lexical unit that is part of a simile is counted (see Dorst 2011; Krennmayr 2011). However, in our data direct comparisons tended not to be elaborate, and frequently consisted of only one word from the source domain, which makes this issue less of a concern.
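
The 30% cut-off used to define a metaphor cluster can be applied to a moving-density series along the following lines. The sketch is illustrative only: the function name `find_clusters` is hypothetical, the input is assumed to be a list of (mid-point, density) pairs, and treating each unbroken run of mid-points at or above the cut-off as a single cluster is an assumption, since the procedure for merging adjacent high-density spans is not specified here.

```python
from typing import List, Tuple


def find_clusters(
    series: List[Tuple[int, float]], cutoff: float = 30.0
) -> List[Tuple[int, int, float]]:
    """Group consecutive mid-points whose density meets the cut-off.

    Returns one (first_midpoint, last_midpoint, peak_density) tuple per run.
    Merging unbroken runs into one cluster is an assumption of this sketch.
    """
    clusters = []
    run: List[Tuple[int, float]] = []
    for position, density in series:
        if density >= cutoff:
            run.append((position, density))
        elif run:
            clusters.append((run[0][0], run[-1][0], max(d for _, d in run)))
            run = []
    if run:  # close a run that reaches the end of the text
        clusters.append((run[0][0], run[-1][0], max(d for _, d in run)))
    return clusters


# Invented density values, used only to show the grouping behaviour.
toy_series = [(10, 5.0), (11, 30.0), (12, 35.0), (13, 10.0), (14, 40.0)]
print(find_clusters(toy_series))
```

Counting the tuples returned for each essay, and then aggregating by level, would give the number and distribution of clusters per CEFR level under these assumptions.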