Dr. Sharon Armon-Lotem

Spontaneous production

Spontaneous production data (CHILDES)

Sample Excerpt of an English Transcript

@Begin

@Participants: ODA: Odaya: bilingual+SLI subject. EXP: Experimenter.

@Age of ODA: 6;9

@Sex of ODA: female

@Birth Order: 2

@Lang: early 2nd language (L1-English) [sequential bilingual]

@Date: May 2005

@Location: Gan Dekalim.

1.  *EXP: let’s talk about Purim.

2.  *EXP: did you have a party at your gan?

3.  %com: code switching.

4.  *ODA: no.

5.  *EXP: you didn’t have a party?

6.  *ODA: no.

7.  *EXP: so where did you have a party on Purim?

8.  *ODA: no.

9.  *ODA: my sister hala party.

10.  %com: had a

11.  *ODA: she in gan.

12.  %com: missing verb ‘is’

13.  *EXP: really?

14.  *ODA: yeah.

15.  *EXP: how old is she?

16.  *ODA: she um six six.

17.  %com: missing verb ‘is’

18.  *EXP: what did she dress up as?

19.  *ODA: what?

20.  *EXP: was she a princess?

21.  *ODA: a princess?

22.  *ODA: no.

23.  *ODA: she was a bunny.

24.  *EXP: a bunny?

25.  *ODA: yeah.

26.  *EXP: and what were you?

27.  *ODA: I was a bride, kala

28.  *EXP: really?

29.  *ODA: yeah.

30.  *EXP: and what did you wear?

31.  *ODA: some flowers and a white dress.

32.  *EXP: was it big?

33.  *ODA: big?

34.  *ODA: oh of course.

35.  *EXP: what color was it?

36.  *ODA: color, color, white.

37.  *EXP: and where did you go on Purim?

38.  *EXP: to whose house did you go?

39.  *ODA: to my safta’s house.

40.  %com: code switching

41.  *EXP: what did you do there?

42.  *ODA: there I put the taxposot.

43.  %com: code switching

44.  *ODA: and then I sam the manot for all the people.

45.  %com: code switching

46.  *ODA: and then I went lishmoa the megilla

47.  %com: code switching

48.  *ODA: and then miklaxat

49.  %com: code switching

@End

CHILDES Database - http://childes.psy.cmu.edu/

·  Longitudinal – diary studies, naturalistic samples, controlled naturalistic samples

·  Cross-sectional – CDI, naturalistic samples, controlled naturalistic samples

How to collect spontaneous samples?

1.  It is best to choose a child you already know. Otherwise, you might need to spend some time on getting acquainted.

2.  It is best to record the child when there are no other kids around, so there will be fewer distractions. Recording two kids at the same time will make the transcription part more difficult.

3.  The older the child the easier it is. If you decide to record a younger child, you might want to record her/him in interaction with a parent or a caregiver. If you decide to record an older child, it is best to focus on a narrative. That is, ask the child to tell you about something that happened to her/him.

4.  Try to avoid dialogues that involve labeling, such as “What’s this?” “This is a …”. We want the recording to be rich.

5.  Try to avoid recording songs, poems, rhymes or stories told from a book. This material does not necessarily give an accurate picture of the child’s knowledge, and you will not be able to use it in the following steps.

How to analyze spontaneous samples

Quantitative measurements

·  Count words (type and token frequency), morphemes, clauses, utterances

·  Measure MLU

·  Compare the number of utterances to the number of clauses
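
The word counts and MLU listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the course or by CLAN: MLU is computed in words rather than morphemes (morpheme counts require a morphological analysis), every `*ODA:` line counts as one utterance, and the function name `count_measures` is hypothetical.

```python
import re

def count_measures(lines, speaker="ODA"):
    """Return utterance, token and type counts and MLU (in words)
    for one speaker in a list of CHAT-style lines."""
    prefix = f"*{speaker}:"
    utterances = []
    for line in lines:
        if line.startswith(prefix):
            text = line[len(prefix):].lower()
            # Keep runs of letters and straight apostrophes; punctuation is not a word.
            words = re.findall(r"[a-z']+", text)
            # Assumption: the unintelligible marker xxx is not counted as a word.
            words = [w for w in words if w != "xxx"]
            if words:
                utterances.append(words)
    tokens = [w for utt in utterances for w in utt]
    return {
        "utterances": len(utterances),
        "tokens": len(tokens),          # every word occurrence
        "types": len(set(tokens)),      # different words
        "mlu_words": len(tokens) / len(utterances) if utterances else 0.0,
    }

# A few lines taken from the sample transcript.
sample = [
    "*EXP: what did she dress up as?",
    "*ODA: what?",
    "*ODA: she was a bunny.",
    "*ODA: some flowers and a white dress.",
    "*ODA: color, color, white.",
]
print(count_measures(sample))
# {'utterances': 4, 'tokens': 14, 'types': 11, 'mlu_words': 3.5}
```

Counting clauses or morphemes would need hand coding or a morphological tier, so they are left out of this sketch.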

Qualitative measurements

·  Lexical distribution - kind and number of different words: types/tokens

·  Semantic content - favored classes of words

·  Morphological structure - inflectional affixes, binyan patterns, roots, etc.

·  Phonetic shape - comparison with normative, endstate usage

·  Syntactic level – the simple clause: use of word order, agreement and tenses, sentence structure (subject, verb, object, adverbials), subjects, questions and negation, etc. How complex are the simple clauses and phrases?

·  Syntactic level – the complex clause: which ones, which complementizers are used, how often

·  Language use: use of tenses, communicative competence, appropriateness of language use.

·  The emerging narrative: structure, connectivity, etc.

·  Errors

The effect of sample size on error estimates

Reading: Rowland, C. F., Fletcher, S. L. and Freudenthal, D. 2008. How big is big enough? Assessing the reliability of data from naturalistic samples. In H. Behrens (ed.), Corpora in Language Acquisition Research: History, Methods and Perspectives. Amsterdam/New York: John Benjamins Publishing Company, pp. 1-24.

Small samples cannot capture:

·  Infrequent errors

·  Short-lived errors

·  Error rates in low-frequency structures

How to calculate error rate?

Divide the number of errors by the number of contexts in which the error could have occurred.

But

·  high-frequency items dominate the outcome

·  collapsing the data over time masks the development

·  ignoring subsystems could mask important factors
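
The calculation and the first caveat can be illustrated with a short Python sketch. The counts below are invented for illustration and do not come from any study; the verb labels and the name `error_rate` are hypothetical. The sketch compares the pooled error rate with the rate for each item separately.

```python
def error_rate(errors, contexts):
    """Errors divided by the contexts in which the error could have occurred."""
    if contexts == 0:
        raise ValueError("no contexts: the error rate is undefined")
    return errors / contexts

# Hypothetical (errors, contexts) counts per verb; not real data.
counts = {
    "go":    (2, 100),  # high-frequency item, few errors
    "bring": (3, 5),    # low-frequency item, many errors
    "catch": (2, 5),    # low-frequency item, many errors
}

# Pooled rate: all errors over all contexts, regardless of item.
pooled = error_rate(
    sum(e for e, _ in counts.values()),
    sum(c for _, c in counts.values()),
)

# Per-item rates: one rate for each verb.
per_item = {verb: error_rate(e, c) for verb, (e, c) in counts.items()}

print(f"pooled: {pooled:.3f}")
for verb, rate in per_item.items():
    print(f"{verb}: {rate:.2f}")
```

With these made-up counts the pooled rate is 7/110, about 6%, even though two of the three verbs are wrong in 40% or more of their contexts: the frequent item dominates the outcome. The same comparison can be run per recording period to avoid collapsing the data over time.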

Solutions

·  If you plan to record 4 hours per month, do them all in one week

·  If you target a particular structure/error, try to estimate the frequency of the error to determine the number of hours necessary to capture it (Rowland et al. 2008, page 12, figure 2)

·  Combine different samples (matched by Brown stages) to enlarge the corpus and apply statistical measures
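
One rough way to turn an estimated error frequency into a number of recording hours is sketched below in Python. It rests on an assumption that is not made in this handout: that the error occurs independently at a constant average rate per hour (a Poisson model), so the chance of recording it at least once in h hours is 1 - exp(-rate × h). The function names and the example rate are hypothetical, and the figure in the reading should be preferred over this back-of-the-envelope estimate.

```python
import math

def prob_capture(rate_per_hour, hours):
    """Chance of recording at least one instance in the given hours,
    assuming a constant average rate (Poisson model)."""
    return 1.0 - math.exp(-rate_per_hour * hours)

def hours_needed(rate_per_hour, target_prob=0.95):
    """Smallest whole number of hours giving at least target_prob
    of recording one instance, under the same assumption."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    if not 0 < target_prob < 1:
        raise ValueError("target_prob must be between 0 and 1")
    return math.ceil(-math.log(1.0 - target_prob) / rate_per_hour)

# Hypothetical example: the error is estimated to occur once every two hours.
rate = 0.5
print(round(prob_capture(rate, 4), 3))  # chance of catching it in 4 hours
print(hours_needed(rate))               # hours for a 95% chance
```

Under these assumptions, four hours give roughly an 86% chance of capturing the error at least once, and six hours are needed for a 95% chance.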

Productivity

·  Brown (1973) – 90% of obligatory contexts

·  Berman & Armon-Lotem (2003) – same morpheme appears on a few verbs and same verb appears with more than one morpheme

·  Vocabulary size and productivity (page 22, table 5)
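
The first two criteria can be written as simple checks, sketched here in Python. The sketch reads Brown's criterion as the share of obligatory contexts in which the morpheme is supplied, and the second criterion as "the morpheme occurs on at least a few different verbs, and at least one of those verbs also occurs with another morpheme". The threshold of three verbs for "a few", the function names and the example data are assumptions made for illustration, not part of the cited works.

```python
from collections import defaultdict

def brown_productive(correct, obligatory, threshold=0.90):
    """True if the morpheme is supplied in at least 90% of obligatory contexts."""
    return obligatory > 0 and correct / obligatory >= threshold

def contrastive_productive(pairs, morpheme, min_verbs=3):
    """True if the morpheme appears on at least min_verbs different verbs
    and one of those verbs also appears with a different morpheme.
    min_verbs=3 is an assumed reading of 'a few'."""
    verbs_with = {verb for verb, m in pairs if m == morpheme}
    morphemes_by_verb = defaultdict(set)
    for verb, m in pairs:
        morphemes_by_verb[verb].add(m)
    varied = any(len(morphemes_by_verb[verb]) > 1 for verb in verbs_with)
    return len(verbs_with) >= min_verbs and varied

# Hypothetical (verb, morpheme) observations from a sample.
pairs = [
    ("walk", "-ed"),
    ("jump", "-ed"),
    ("play", "-ed"),
    ("play", "-ing"),
]
print(contrastive_productive(pairs, "-ed"))   # three verbs, 'play' varies
print(contrastive_productive(pairs, "-ing"))  # only one verb
print(brown_productive(19, 20))               # 95% of obligatory contexts
```

In this invented sample, "-ed" meets the second criterion and "-ing" does not, because it is attested on a single verb only.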

Appendix - Transcription instructions

·  The transcription is phonetic/phonemic.

·  Throughout the transcription, we keep the text line as close as possible to the way it was uttered, except for some corrections of pronunciation (which are marked, of course) made to keep the text comprehensible.

·  In order to achieve uniformity and clarity, we use the CHILDES conventions.

1. HEADER
Each file starts with a header in the following form:

·  Participants: XXX is ------, the subject. YYY is ------, the interviewer (mark relation to the subject). ZZZ is ------ (other participants).

[XXX, YYY, and ZZZ are the first three letters of the name]

·  Age of XXX:

·  Sex of XXX:

·  Date:

·  Situation: circumstances of recording

·  Comment: Enter all comments that apply across the entire file, such as general comments about the pronunciation of certain sounds or words, e.g., “The child rarely pronounces [h].”

2. TEXTLINE

·  Every utterance is a separate text-line. If it continues across line boundaries, the following lines should be indented.

·  Each text-line starts with an asterisk '*' followed by the three initial letters of the speaker's name/identification in CAPITAL letters, then a colon ':', e.g. *HAG: for Hagar, *MOT: for mother, etc.

·  Each text-line should end with a dot (period), exclamation mark (!), or a question mark (?).

·  Words should be separated by a space (in Hebrew, this includes ve, she, le, be, ke, me).

·  Each text-line is followed by any commentary relevant for that line, as shown below.

Special Symbols: punctuation marks are used as in written languages. In addition, a few other symbols from CLAN are used:

. = end of utterance
, = short pause
... = longer pause, or trailing off, hesitation (within an utterance)
: = introducing quotation
" = quotation
? = question
! = exclamation
^ = compound (smixut xavura), e.g. beyt^ha-sefer
- = separates determiner from the word, e.g. ha-bayit, ba-bayit
xxx = unintelligible string


3. COMMENTARY TIERS
These are qualitative notes – com, sit, pho, err, etc.
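
The line conventions in sections 1-3 can be checked mechanically. The Python sketch below is a simplified illustration, not a replacement for CLAN's own checking: it assumes every utterance fits on one line (indented continuation lines are not handled), it treats lines starting with `@`, `*` and `%` as header, text-line and commentary tier respectively, and the name `check_transcript` is hypothetical.

```python
import re

# Asterisk, three capital letters, colon, then the utterance text.
SPEAKER_LINE = re.compile(r"^\*([A-Z]{3}):\s*(.*)$")
TERMINATORS = (".", "?", "!")

def check_transcript(lines):
    """Return a list of (line_number, message) problems found."""
    problems = []
    previous_kind = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith("@"):
            kind = "header"
        elif line.startswith("%"):
            kind = "comment"
            # A commentary tier must follow a text-line or another tier.
            if previous_kind not in ("text", "comment"):
                problems.append((number, "commentary tier without a preceding text-line"))
        elif line.startswith("*"):
            kind = "text"
            match = SPEAKER_LINE.match(line)
            if match is None:
                problems.append((number, "speaker code must be three capital letters and a colon"))
            elif not match.group(2).endswith(TERMINATORS):
                problems.append((number, "text-line must end with . ? or !"))
        else:
            kind = "other"
            problems.append((number, "line does not start with @, * or %"))
        previous_kind = kind
    return problems

sample = [
    "@Begin",
    "*EXP: and what were you?",
    "*ODA: I was a bride, kala",
    "%com: code switching",
    "*ODA: yeah.",
    "@End",
]
print(check_transcript(sample))
# [(3, 'text-line must end with . ? or !')]
```

The check flags the third line of this small example because the utterance has no final period, question mark or exclamation mark.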