Manual for the Jamaican component (ICE-JA)

Ingrid Rosenfelder

Susanne Jantos

Nicole Höhn

Christian Mair

Department of English

University of Freiburg, Germany

June 2009

Table of contents

1. Introduction

1.1. History of ICE-JA

1.2. Research based on ICE-Jamaica

1.3. English in Jamaica

2. Markup conventions

2.1. Markup tags used in both components (written and spoken)

2.2. Markup tags used in the spoken component

2.3. Markup tags used in the written component

2.4. Other transcription conventions

2.5. Encoding of special characters

3. References

4. Table of Contents

4.1. General structure of ICE

4.2. ICE-Jamaica corpus texts

S1A – Private Dialogues – direct conversations (S1A-001 – 090)

S1A – Private Dialogues – distant conversations (S1A-091 – 100)

S1B – Public Dialogues – class lessons (S1B-001 – 020)

S1B – Public Dialogues – broadcast discussions (S1B-021 – 040)

S1B – Public Dialogues – broadcast interviews (S1B-041 – 050)

S1B – Public Dialogues – parliamentary debates (S1B-051 – 060)

S1B – Public Dialogues – legal cross-examinations (S1B-061 – 070)

S1B – Public Dialogues – business transactions (S1B-071 – 080)

S2A – Unscripted Monologues – spontaneous commentaries (S2A-001 – 020)

S2A – Unscripted Monologues – unscripted speeches (S2A-021 – 050)

S2A – Unscripted Monologues – demonstrations (S2A-051 – 060)

S2A – Unscripted Monologues – legal presentations (S2A-061 – 070)

S2B – Scripted Monologues – broadcast news (S2B-001 – 020)

S2B – Scripted Monologues – broadcast talks (S2B-021 – 040)

S2B – Scripted Monologues – speeches (not broadcast) (S2B-041 – 050)

W1A – Non-printed – Non-professional writing – student untimed essays (W1A-001 – 010)

W1A – Non-printed – Non-professional writing – student examination essays (W1A-011 – 020)

W1B – Non-printed – Correspondences – social letters (W1B-001 – 015)

W1B – Non-printed – Correspondences – business letters (W1B-016 – 030)

W2A – Printed – Informational (learned) – humanities (W2A-001 – 010)

W2A – Printed – Informational (learned) – social sciences (W2A-011 – 020)

W2A – Printed – Informational (learned) – natural sciences (W2A-021 – 030)

W2A – Printed – Informational (learned) – technology (W2A-031 – 040)

W2B – Printed – Informational (popular) – humanities (W2B-001 – 010)

W2B – Printed – Informational (popular) – social sciences (W2B-011 – 020)

W2B – Printed – Informational (popular) – natural sciences (W2B-021 – 030)

W2B – Printed – Informational (popular) – technology (W2B-031 – 040)

W2C – Printed – Informational (reportage) – press news report (W2C-001 – 020)

W2D – Printed – Instructional – administrative/regulatory (W2D-001 – 010)

W2D – Printed – Instructional – skills/hobbies (W2D-011 – 020)

W2E – Printed – Persuasive – press editorials (W2E-001 – 010)

W2F – Printed – Creative – novels/short stories (W2F-001 – 020)

1.  Introduction

1.1. History of ICE-JA

Work on the Jamaican subcomponent of ICE started in the early 1990s, when first contacts were established between Sidney Greenbaum, coordinator of the ICE project, and Kathryn Shields-Brodber at the University of the West Indies (Mona, Jamaica), with the help of Prof. Christian Mair (University of Freiburg). Once an informal collaboration on the ICE-Jamaica project had begun, most of the written material and some of the spoken texts were collected between 1994 and 1996 by Prof. Mair and Andrea Sand (University of Freiburg). Further material for the corpus was collected in the early 2000s by Lars Hinrichs and Dagmar Deuber (University of Freiburg), and in 2005 by Hubert Devonish (University of the West Indies, Mona, Jamaica) and students, who contributed most of the spontaneous spoken material. From 2005 to 2008, the German Research Foundation (Deutsche Forschungsgemeinschaft) funded the completion of the corpus and the investigation of changing language norms in Jamaica on the basis of the collected data as grant MA 1652/4.

Over the years, the following people have contributed to the completion of the ICE-JA corpus by recording, compiling, transcribing, and/or proofreading texts for the corpus:

- in Freiburg:

Christian Mair

Dagmar Deuber

Andrea Sand

Lars Hinrichs

Ingrid Rosenfelder

Susanne Jantos

Nicole Höhn

Andrea Moll

Samuel Harding

Christine Wender

Jennifer Beck

Jennifer Hudson

Julia Pauli

Bianca Kossmann

Birgit Waibel

Heike Fiedler

Andreas Sedlatschek

Nicole Knäble

Laura Rocholl

Michaela Hilbert

Tobias Maier

Tamsin Sanderson

Stefanie Rapp

Lisa Wild

Luminita Trasca

Tina Schurreit

- in Mona:

Hubert Devonish

Kathryn Shields-Brodber

Michelle Bryan-Ennis

Andre Sherriah

Daidrah Smith

Kedisha Williams

Audene Henry

Lauri-Ann Clarke

Tereka Brown

Indra Vincent

Catilda Frazer

Michelle Stewart

Karen Carpenter

Gayon Williams

The Jamaican Language Unit

1.2. Research based on ICE-Jamaica

Deuber, Dagmar (2009a). "'The English we speaking': morphological and syntactic variation in educated Jamaican speech." Journal of Pidgin and Creole Languages 24: 1-52.

Deuber, Dagmar (to appear, 2009b). "Caribbean ICE corpora: some issues for fieldwork and analysis." In: Marianne Hundt, Daniel Schreier & Andreas Jucker (eds.), Corpora: Pragmatics and Discourse – Papers from the 29th International Conference on English Language Research on Computerized Corpora (ICAME 29). Amsterdam: Rodopi. 425-450.

Deuber, Dagmar (to appear). "The creole continuum and individual agency: Approaches to stylistic variation in Jamaica." In: Lars Hinrichs & Joseph Farquharson (eds.), New Approaches to Caribbean Language Variation. Amsterdam: Benjamins.

Deuber, Dagmar (in progress). Style and standards in English in the Caribbean: Morphological and syntactic variation in Jamaica and Trinidad. Habilitation (post-doc thesis). University of Freiburg, Germany.

Höhn, Nicole (2008). "Discourse styles in spoken British English: a corpus-based study." Recherches Anglaises et Nord Américaines (RANAM) 41: Variability and Change in Language and Discourse. Strasbourg: Universités des Sciences Humaines. 161-186.

Höhn, Nicole (in progress). Quotatives and discourse markers in spoken Jamaican English. Ph.D. dissertation. University of Freiburg, Germany.

Jantos, Susanne (to appear, 2009). "Agreement in educated Jamaican English: A corpus-based study of spoken usage in ICE-Jamaica." In: Anja Wanner & Heidrun Dorgeloh (eds.), Approaches to Syntactic Variation and Genre. Berlin: Mouton de Gruyter.

Jantos, Susanne (2009). Agreement Variation in educated Jamaican English: A Corpus Investigation of ICE-Jamaica. Ph.D. dissertation. University of Freiburg, Germany.

Mair, Christian (2007). "English in North America and the Caribbean." In: Christopher F. Laferl & Bernhard Pöll (eds.), Amerika und die Norm. Literatursprache zwischen Tradition und Innovation. Tübingen: Niemeyer. 3-23.

Mair, Christian & Sandra Mollin (2007). "Getting at the standards behind the standard ideology: what corpora can tell us about linguistic norms." In: Sabine Volk-Birke & Julia Lippert (eds.), Anglistentag 2006 Halle: Proceedings. Trier: Wissenschaftlicher Verlag. 341-353.

Mair, Christian (to appear, 2009a). "Corpus linguistics meets sociolinguistics: studying educated spoken usage in Jamaica on the basis of the International Corpus of English (ICE)." In: Lucia Siebers & Thomas Hoffmann (eds.), World Englishes: Problems, Properties, Prospects. Amsterdam: Benjamins.

Mair, Christian (to appear, 2009b). "The consequences of migration and colonialism I: pidgins and creoles." In: Peter Auer (ed.), Language and Space. HSK – Handbücher zur Sprach- und Kommunikationswissenschaft/ Handbooks of Linguistics and Communication Science. Berlin: Mouton de Gruyter.

Mair, Christian (to appear, 2009c). "Corpus linguistics meets sociolinguistics: the role of corpus evidence in the study of sociolinguistic variation and change." In: Antoinette Renouf & Andrew Kehoe (eds.), Corpus Linguistics: Refinements and Reassessments – Proceedings of the 2007 ICAME Conference – Stratford-upon-Avon. Amsterdam: Rodopi. 7-35.

Mair, Christian (in progress). "Chattin' patois – face-to-face and on the web: contrasting strategies of localisation in Jamaican English."

Rosenfelder, Ingrid (to appear, 2009). "Rhoticity in educated Jamaican English: an analysis of the spoken component of ICE-Jamaica." In: Lucia Siebers & Thomas Hoffmann (eds.), World Englishes: Problems, Properties, Prospects. Amsterdam: Benjamins.

Rosenfelder, Ingrid (2009). Sociophonetic variation in educated Jamaican English: An analysis of the spoken component of ICE-Jamaica. Ph.D. dissertation, University of Freiburg, Germany.

1.3. English in Jamaica

English has been spoken in Jamaica since 1655, when a British expedition led by Penn and Venables took the island from the Spanish (Lalla & D’Costa 1990: 7). Jamaica remained a British colony until it gained political independence in 1962. During this period it became an important producer of sugar cane and coffee and developed a culture of large plantations. The importation of large numbers of slaves from Africa, together with intensive language contact between them and speakers of regional varieties of English, led to the development of a creole language, Jamaican Creole (locally usually referred to as patois or dialect), on the island in the seventeenth century (Lalla & D’Costa 1990: 16).

Today, English remains the official language of Jamaica (Devonish 1986: 24). However, it is not the mother tongue of the large majority of the population, who speak Jamaican Creole as their first language (Patrick 2004: 408). The current language situation in Jamaica is characterized by the presence of a (post-) creole continuum (DeCamp 1961, 1971), in which a creole language (in this case, Jamaican Creole) coexists with its corresponding lexifier language (English), with fine gradations in intermediate varieties in between. The endpoints of this continuum are usually labelled basilect (designating the creole variety) and acrolect (designating its lexifier language), while intermediate varieties constitute one or several mesolect(s).

English enjoys high prestige in Jamaica (Akers 1981: 9). In terms of ethnicity, use of English “is still associated with the elite, which up to approximately fifty years ago consisted of mainly the white and near-white members of the population” (Christie 2003: 2). This historical background is also responsible for the high social and economic status attributed to speakers proficient in English (Akers 1981: 8/9). Moreover, use of English correlates with the level of education attained: “[t]he ability to read and write Standard English remains the mark of an educated person” (Christie 2003: 39). Creole, on the other hand, is associated with rural origins and low levels of formal education or even illiteracy (Akers 1981: 8/9). With respect to ethnicity and social class, it is typically spoken by “the poorest members of the society, who are mostly black”, these being “labourers, small farmers, domestic helpers, small craftsmen and others belonging to the same social class as these” (Christie 2003: 2, see also Akers 1981: 8). Owing to these associations, prejudices exist within Jamaican society against the use of Jamaican Creole, which in some quarters may still be perceived as “indicating a lack of intelligence” (Christie 2003: 5), as well as “illiteracy and ignorance” (Christie 2003: 39). At the same time, however, Jamaican Creole is also seen as “a symbol of national identity” indicating national pride, and there is evidence for a change in attitudes towards the two language varieties.

2.  Markup conventions

Markup of the texts included in the corpus generally follows the principles outlined in the ICE markup manuals (Nelson 2002a, 2002b). Textual markup encodes “features of the original text that would otherwise be lost” (Greenbaum 1996: 7), such as formatting information for the written texts, or pauses and sections of overlapping speech in the spoken material. The markup uses a system of tags based on SGML (Standard Generalized Markup Language), with each tag enclosed in angle brackets. Tags which enclose a stretch of text come in pairs of the form <xxx>…</xxx>, consisting of an opening and a closing tag, whereas a single tag <xxx> marks a particular point in the transcription (Nelson 2002a: 3, 2002b: 4).
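
The difference between paired tags and single (point) tags can be illustrated with a short script. The following Python sketch is not part of the ICE tools: the function names tokenize and classify and the sample line are invented for illustration, and the regular expression assumes that a tag name never contains whitespace or angle brackets.

```python
import re

# A tag is "<", an optional "/", a name without whitespace or angle
# brackets, and ">". This covers <#>, <$A>, <,,>, <quote> and </quote>.
TAG = re.compile(r"<(/?)([^<>\s]+)>")

def tokenize(line):
    """Split a line into ('open' | 'close' | 'text', value) tokens."""
    tokens = []
    pos = 0
    for match in TAG.finditer(line):
        if match.start() > pos:
            tokens.append(("text", line[pos:match.start()]))
        kind = "close" if match.group(1) else "open"
        tokens.append((kind, match.group(2)))
        pos = match.end()
    if pos < len(line):
        tokens.append(("text", line[pos:]))
    return tokens

def classify(tokens):
    """Return (paired, point): tag names that occur with a closing tag,
    and tag names that occur only as single tags."""
    opened = {name for kind, name in tokens if kind == "open"}
    closed = {name for kind, name in tokens if kind == "close"}
    return opened & closed, opened - closed

# Invented sample line, not taken from the corpus.
sample = "<$A> <#> Well <,> I <.> th </.> think so"
paired, point = classify(tokenize(sample))
print(sorted(paired), sorted(point))
```

In the sample, the incomplete-word tag is recognised as a paired tag, while the speaker, text-unit and pause tags are recognised as point tags.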

An overview of all tags used in the spoken and written components of the ICE-Jamaica corpus can be found in Tables 1 and 2, respectively.

Markup tags used in the spoken component
<#> / text unit
<I>...</I> / subtext
<$A>, <$B>, etc / speaker identification
<{>...</{> / overlapping text
<[>...</[> / overlapping text strings
<w>...</w> / orthographic word (initial or final apostrophe)
<X>...</X> / extra-corpus text
<?>...</?> / uncertain transcription
<O>...</O> / untranscribed text, anthropophonics, etc.
<.>...</.> / incomplete word
<}>...</}> / text normalization
<->...</-> / text normalization: deletion
<+>...</+> / text normalization: insertion
<=>...</=> / text normalization: original normalization
<&>...</&> / editorial comment, background noises etc.
<(>...</(> / discontinuous word
<)>...</)> / normalized discontinuous word
<@>...</@> / changed (anonymized) name or word
<,> / short pause (one syllable)
<,,> / long pause (more than one syllable)
<quote>...</quote> / quotation
<mention>...</mention> / words mentioned as words
<foreign>...</foreign> / foreign words
<indig>...</indig> / indigenous words
<unclear>...</unclear> / unclear transcription

Table 1: Markup tags used in the spoken component of the corpus (Nelson 2002a: 11).
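
As an illustration of how the tags in Table 1 might be processed, the following Python sketch groups the text units of a transcript by speaker. It is a minimal sketch under stated assumptions, not a tool distributed with the corpus: the function name units_by_speaker and the sample transcript are invented, each speaker tag is assumed to precede the text-unit markers of that speaker's turn, and the decision to drop extra-corpus text, editorial comments (assumed to be tagged <&>...</&>) and untranscribed text is a choice of this sketch.

```python
import re
from collections import defaultdict

SPEAKER = re.compile(r"<\$([A-Z]+)>")          # <$A>, <$B>, etc.
# Spans whose content is left out of the plain text (a choice of this sketch).
DROPPED = [
    re.compile(r"<X>.*?</X>", re.DOTALL),      # extra-corpus text
    re.compile(r"<&>.*?</&>", re.DOTALL),      # editorial comment
    re.compile(r"<O>.*?</O>", re.DOTALL),      # untranscribed text
]
ANY_TAG = re.compile(r"</?[^<>\s]+>")          # any remaining tag

def units_by_speaker(transcript):
    """Map each speaker label to a list of tag-free text units."""
    for pattern in DROPPED:
        transcript = pattern.sub(" ", transcript)
    # Splitting on a pattern with one group yields
    # [before, speaker, turn, speaker, turn, ...].
    parts = SPEAKER.split(transcript)
    result = defaultdict(list)
    for speaker, turn in zip(parts[1::2], parts[2::2]):
        for unit in turn.split("<#>"):
            text = " ".join(ANY_TAG.sub(" ", unit).split())
            if text:
                result[speaker].append(text)
    return dict(result)

# Invented sample transcript, not taken from the corpus.
sample = (
    "<I>\n"
    "<$A> <#> So you went <,> yesterday <#> Was it <?> busy </?>\n"
    "<$B> <#> Yes <&> laughter </&> <,,> very <X> not part of the corpus </X>\n"
    "</I>\n"
)
print(units_by_speaker(sample))
```

Pause tags and the tags around uncertain transcriptions are removed while the words themselves are kept. A study of pauses or hesitation would need to retain these tags rather than strip them.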

Markup tags used in the written component
<#> / text unit
<I>...</I> / subtext
<l> / line break
<p>...</p> / paragraph
<h>...</h> / heading
<w>...</w> / orthographic word (initial or final apostrophe)
<X>...</X> / extra-corpus text
<?>...</?> / uncertain transcription
<unclear>...</unclear> / unclear transcription
<O>...</O> / untranscribed text
<.>...</.> / incomplete word
<}>...</}> / text normalization
<->...</-> / text normalization: deletion
<+>...</+> / text normalization: insertion
<=>...</=> / text normalization: original normalization
<&>...</&> / editorial comment
<(>...</(> / discontinuous word
<)>...</)> / normalized discontinuous word
<@>...</@> / changed (anonymized) name or word
<sb>...</sb> / subscript
<sp>...</sp> / superscript
<ul>...</ul> / underline
<it>...</it> / italics
<bold>...</bold> / boldface
<typeface>...</typeface> / change of typeface
<roman>...</roman> / roman type
<smallcaps>...</smallcaps> / small capitals
<footnote>...</footnote> / footnote
<fnr>...</fnr> / reference to footnote
<space> / orthographic space
<quote>...</quote> / quotation
<del>...</del> / deleted text
<marginalia>...</marginalia> / marginalia
<mention>...</mention> / words mentioned as words
<indig>...</indig> / indigenous words
<foreign>...</foreign> / foreign words

Table 2: Markup tags used in the written component of the corpus (Nelson 2002b: 15/16).
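
The structural tags in Table 2 can be used in a similar way to recover headings and paragraphs from a written text. The following Python sketch is again only an illustration under stated assumptions: the function name blocks and the sample text are invented, headings and paragraphs are assumed not to be nested inside one another, and the choice of which spans to discard (extra-corpus text, editorial comments, untranscribed text, deleted text, footnotes and marginalia) is made for this example only.

```python
import re

# Headings and paragraphs, assumed not to be nested in one another.
BLOCK = re.compile(r"<(h|p)>(.*?)</\1>", re.DOTALL)
# Spans whose content is discarded (a choice of this sketch).
DROPPED = re.compile(
    r"<(X|&|O|del|footnote|marginalia)>.*?</\1>", re.DOTALL
)
SPACING = re.compile(r"<(l|space)>")      # line break, orthographic space
ANY_TAG = re.compile(r"</?[^<>\s]+>")     # any remaining tag

def blocks(text):
    """Return a list of ('heading' | 'paragraph', plain text) pairs."""
    text = DROPPED.sub(" ", text)
    result = []
    for match in BLOCK.finditer(text):
        kind = "heading" if match.group(1) == "h" else "paragraph"
        body = SPACING.sub(" ", match.group(2))
        # Typographic tags are removed without inserting a space, so that
        # a tagged stretch inside a word stays attached to that word.
        body = ANY_TAG.sub("", body)
        result.append((kind, " ".join(body.split())))
    return result

# Invented sample text, not taken from the corpus.
sample = (
    "<I>\n"
    "<h> <bold> Water quality </bold> </h>\n"
    "<p> <#> The samples were <it>not</it> filtered <l> before testing. "
    "<del> old wording </del> <&> page break </&> </p>\n"
    "</I>\n"
)
for kind, body in blocks(sample):
    print(kind, "|", body)
```

Line-break and space tags are replaced by a space, whereas typographic tags such as italics or boldface are simply removed, so the output contains running text only.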