The Development of Input Surrounding Action Verbs in Speech to Infants and Toddlers

Pence dissertation proposal 1

The Development of Input Surrounding Action Verbs in Speech to young language learners

Khara Pence

This dissertation explores how action verbs are presented to English learners of two different ages in terms of their descriptive and acoustic properties. More specifically, it addresses the following questions: (1) How are action verbs presented in speech to infants in comparison to speech to adults? (2) To what extent does input with respect to action verbs differ for 14- to 16-month-olds (early-lexical infants) and 21- to 23-month-olds (advanced-lexical infants)?

Why are action verbs so difficult to learn?

Several hypotheses exist for why verb learning is such a difficult task. One such hypothesis emphasizes the conceptual complexity surrounding action verbs in relation to the relative simplicity of object nouns (Gentner, 1982; Gentner & Boroditsky, 2001). Actions are ephemeral, and thus are more difficult to label than concrete, tangible, and permanent objects in the environment. With the exception of continuous actions, it is highly likely that by the time a verb has been uttered, the action to which it refers will no longer be perceptually available. Complicating matters further is that many verbs are polysemous. For example, the Merriam-Webster Online Dictionary contains 3 entries for the noun bottle, and an extraordinary 42 entries for the verb run (Merriam-Webster Incorporated, 2002). Golinkoff and her colleagues propose three additional reasons that verbs are conceptually more complex than object nouns (Golinkoff, Jacquet, Hirsh-Pasek, & Nandakumar, 1996). First, in order to learn a verb, one must uncover that verb’s semantic components (e.g., causation, manner, path) and then select the corresponding surface elements. As Talmy (1985, p. 57) explains, this is no straightforward task given that “a combination of semantic elements can be expressed by a single surface element, or a single semantic element by a combination of surface elements. Or again, semantic elements of different types can be expressed by the same type of surface element, as well as the same type by several different ones.” As a first step in overcoming this challenge, English-learning infants have been found to attend to manner and path in motion events (Pulverman, Sootsman, Golinkoff, & Hirsh-Pasek, 2002). This ability is crucial because verbs are often distinguished according to these very event components, and infants may benefit from analyzing events when they are able to attend to those aspects of events that may be lexicalized as verbs.
Manner and path represent only two of a host of commonly lexicalized semantic elements, leaving infants with a great many event concepts to conquer. Second, verbs may be conceptually more complex than nouns because of the difficulty associated with detecting invariant features across actions and events. Because actions contain a greater array of semantic components than objects, they necessarily encompass a greater amount of potential variability (see Talmy, 1985, for a discussion of commonly lexicalized semantic elements). The ability to detect invariant features is a prerequisite for categorization, and since verbs label categories of action, verb learning depends upon the ability to detect invariant features across actions and events. New research is uncovering how infants are able to detect certain invariant properties across actions and events. By 9 months, infants are able to detect changes in the manner of a novel action as well as in the agent performing the action, and are able to use those invariant features as a basis for category formation (Salkind et al., 2002). At the same age, however, infants are not able to use the rate at which a novel action is performed as a basis for category formation (S. J. Salkind, personal communication, November 29, 2003). Third, just as verbs are complicated in that a single verb can describe several different actions or events, many different verbs can be used to describe a single action. For example, the act of eating may be labeled with the verbs “devour,” “munch,” “consume,” or “ingest.”

Verbs might also be difficult to learn because their acquisition may depend on an understanding of the nouns that surround them in utterances. Gillette, Gleitman, Gleitman, and Lederer (1999) propose that verb learning depends on knowledge of nouns, as they enable the acquisition of clause-level syntax that can be used as a bootstrap for new lexical items. Evidence from Naigles and Hoff-Ginsberg (1998) supports this claim. They demonstrated that the acquisition of children’s first verbs depends not only upon the frequency and positioning of these verbs, but also upon the syntactic environments in which they are presented: the more varied the environments surrounding a verb, the easier it is for children to acquire it. Thus, it follows that since children learning verbs benefit from understanding the words surrounding them, verbs should be acquired only after a sufficient number of nouns are comprehended. Also, in order to use a verb correctly, children must master the argument structure required by that verb. For example, transitive verbs must take a direct object, while ditransitive verbs require a direct object and an indirect object.

In addition to being conceptually complex and advanced in terms of the nouns and syntax prerequisite to their acquisition, verbs do not appear to be the focus of communicative interactions with infants and toddlers (at least for English speakers). In English, mothers frequently elicit noun production but rarely elicit verbs (Goldfield, 2000). Even in languages where argument dropping is licensed (e.g., Japanese, Korean, Mandarin Chinese), nouns have been found to predominate in the discourse in many instances (but see examples below of how this is not always the case). In one pro-drop language, Japanese, mothers often use objects to engage their infants in social routines (even though they may not label these objects as consistently as American mothers) (Fernald & Morikawa, 1993). Tardif, Shatz, and Naigles (1997) uncovered conflicting evidence upon examining another set of pro-drop languages. They found that in Italian (as in English) caregivers’ speech was more noun oriented, while in Mandarin, caregivers’ speech was more verb oriented. Camaioni and Longobardi (2001) later posited that these differences might be attributed to the diverse morphological environments in which Italian verbs are presented (in comparison to the morphologically transparent environments of Mandarin Chinese verbs). Researchers are also beginning to examine how context might mediate the extent to which nouns and verbs are used in input to young children. For example, two separate studies confirm that English-speaking mothers emphasize nouns in both book-reading and toy-play contexts, while Korean mothers emphasize nouns in book-reading contexts and focus more on actions in toy-play contexts (Gopnik, Choi, & Baumberger, 1996; Choi, 2000). Tardif, Gelman, and Xu (1999) similarly found that Mandarin- and English-speaking mothers used more noun types than verb types in book-reading contexts, but used more verb types than noun types when given toys to play with.

While nouns appear to predominate in speech to children in a variety of languages and communicative contexts, there are some cases in which verbs are emphasized over nouns. For example, Italian-speaking mothers have been found to produce verb types and tokens more frequently than noun types and tokens and to place verbs in initial or final position more than nouns (Camaioni & Longobardi, 2001). Korean mothers have been reported to engage in activity-oriented discourse and provide more action verbs than English-speaking mothers (Choi & Gopnik, 1995), and Mandarin-speaking mothers have been found to produce roughly twice as many verb tokens as noun tokens and to place verbs in salient positions (utterance-initial and utterance-final) more frequently than nouns (Tardif, 1993). Most striking is that the input children receive may be related to the age and rate at which they acquire verbs. For example, Korean children experience a verb spurt around 19 months during which they acquire 10 or more verbs over a one-month period (Choi & Gopnik, 1995). English-speaking children do not experience a similar verb spurt. Mandarin-speaking children also produce more action words than object words at 22 months (Tardif, 1996). It is quite possible that the linguistic environment, including the sheer frequency of verbs, their positioning, and the simplicity of their morphological environments, might propel children’s production of verbs in certain languages, despite their inherent conceptual complexities. Conversely, a reduced focus on action verbs by English-speaking parents, when combined with conceptual and syntactic intricacies, could explain why verbs are so difficult to acquire.

Returning again to nouns, in addition to their relative cognitive and structural simplicity and the ways in which they are focused in discourse, nouns in English appear to benefit from prominent acoustic properties. In communicative situations with children, caregivers tend to place object nouns in salient utterance-final positions and on exaggerated pitch peaks (Fernald & Morikawa, 1993; Tardif, Shatz, & Naigles, 1997). There is a limited amount of research demonstrating that caregivers highlight object nouns in speech to their children in ways that might promote their acquisition. Fernald and Mazzie (1991) had mothers describe articles of clothing in a picture book to their 14-month-olds. They found that when introducing object nouns for the first time, mothers placed these words in utterance-final position and focused them acoustically with primary stress. The few researchers who have tested these tendencies for highlighting object nouns empirically in comprehension tasks have reported positive results. Shady and Gerken (1999) found that young two-year-old children were better able to comprehend familiar object nouns placed in utterance-final position than those presented in medial position, while one- and two-year-olds in a separate study were also able to learn the names of novel lexical items best when they were placed in utterance-final position in the context of ID speech (Golinkoff, Hirsh-Pasek, & Alioto, 1995). Furthermore, Golinkoff and Alioto (1995) demonstrated that English-speaking adults were better able to learn new nouns in Mandarin Chinese when those words were presented in ID speech and placed in utterance-final position as opposed to medial position. In general, caregivers (especially English-speaking caregivers) tend to use object-focused discourse and highlight nouns acoustically in their speech.
According to the few available comprehension studies, the way in which object nouns are presented in ID speech appears beneficial for word learning. Research examining how action verbs are presented in ID speech is largely absent from the literature. It may be that verbs are not featured prominently in the input until parents intuit their children’s readiness for these more complicated relational terms. This possibility, coupled with the conceptual and structural complexities surrounding verbs, would explain why verbs are so difficult for English-speaking children to acquire.

Experiment 1

The first experiment sought to characterize the presentation of action verbs to 14- to 16-month-olds in terms of their descriptive and acoustic properties. There were two main goals in this experiment. The first goal was to establish that mothers’ speech to infants is different from speech to adults when actions are the focus of attention. The second goal was to determine whether mothers emphasize action verbs acoustically in infant-directed speech in comparison to a matched set of object nouns when actions are the focus of attention. The age range for Experiment 1 was selected upon consideration of two factors. First, Fernald and Mazzie (1991) examined the speech of mothers to their 14-month-old infants. It was found that at 14 months, nouns were used prominently, both in terms of placement and acoustic highlighting. In order to replicate these findings for action verbs, it was important to select a very similar age range to rule out the possible effects of developmental differences that would influence action verb use in the input. Second, infants of this age are characterized as early-lexical (Fenson et al., 1994; Gogate, Bahrick, & Watson, 2000) because they are steadily adding new words to their vocabulary. Because this experiment is concerned with how action verbs per se are used in speech to infants, it was important to select an age where infants would be likely to consider lexical items as units of analysis.

Hypotheses

Hypothesis 1: Speech to early-lexical infants should contain a larger number of utterances per story than speech to adults. Because these early-lexical infants are acquiring new words, mothers may treat the storybook reading as a teaching opportunity. It is predicted that mothers will introduce new vocabulary to their infants and elaborate where necessary. Mothers should provide very little detail in the adult-directed stories, as they should recognize the actions in the storybook to be familiar to adult listeners.

Hypothesis 2: Speech to early-lexical infants should contain a smaller MLU in words than speech to adults. Speech to infants generally contains utterances that are shorter and simpler than speech to adults (Bohannon & Marquis, 1977; Hayes & Ahrens, 1988). This finding should hold for the stories/descriptions in this experiment.
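
As an illustration of the measure in Hypothesis 2, the following minimal sketch computes MLU in words as the total number of word tokens divided by the number of utterances. It assumes that each story has already been transcribed and segmented into utterances, and that words are separated by whitespace. The function name and the sample utterances are hypothetical and are not drawn from the study's materials.

```python
def mlu_in_words(utterances):
    """Mean length of utterance in words.

    Assumes one string per utterance and whitespace-separated words;
    empty utterances are ignored.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("at least one non-empty utterance is required")
    return sum(lengths) / len(lengths)


# Hypothetical transcripts, for illustration only.
id_story = ["look at the baby", "she is jumping", "jumping"]
ad_story = ["the girl is jumping on the bed while her mother watches"]

print(mlu_in_words(id_story))  # 8 words over 3 utterances
print(mlu_in_words(ad_story))  # 11 words over 1 utterance
```

Under Hypothesis 2, the value for the infant-directed story would be expected to be smaller than the value for the adult-directed story.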

Hypothesis 3: Speech to early-lexical infants should contain a smaller MLU in morphemes than speech to adults. Speech to infants is morphologically less complex than speech to adults. In fact, infants have been found to process speech best at a level just above what they are able to produce (similar to the notion of the Zone of Proximal Development, Vygotsky, 1978). We would expect the utterances in these stories/descriptions to adhere to this pattern.

Hypothesis 4: Speech to early-lexical infants should contain a smaller type-token ratio than speech to adults. Because speech to infants contains a significant amount of repetition, we would expect a smaller type-token ratio in ID stories/descriptions. We would expect less repetition of words in the AD stories/descriptions, and thus a higher type-token ratio.
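
The type-token ratio in Hypothesis 4 can be sketched in the same way: the number of distinct word forms (types) divided by the total number of words (tokens). The sketch below assumes a simple tokenization into lower-cased alphabetic strings and counts inflected forms as separate types. Both choices are illustrative assumptions rather than the coding scheme of this study, and the sample utterances are hypothetical. Because type-token ratios tend to fall as samples grow longer, a comparison across conditions would ordinarily use samples of comparable length.

```python
import re


def type_token_ratio(utterances):
    """Number of distinct word forms divided by total word tokens.

    Assumption: tokens are runs of letters or apostrophes, lower-cased;
    inflected forms (e.g., "jump", "jumping") count as different types.
    """
    tokens = [w for u in utterances for w in re.findall(r"[a-z']+", u.lower())]
    if not tokens:
        raise ValueError("at least one word token is required")
    return len(set(tokens)) / len(tokens)


# Hypothetical transcripts, for illustration only.
id_story = ["the baby is jumping", "jumping", "see the baby jumping"]
ad_story = ["the baby is jumping on the bed"]

print(type_token_ratio(id_story))  # 5 types over 9 tokens
print(type_token_ratio(ad_story))  # 6 types over 7 tokens
```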

Hypothesis 5: Speech to early-lexical infants should contain a larger number of target verbs per story than speech to adults. Again, because early-lexical infants are beginning to acquire new words, we would expect mothers to treat the storybook reading as an opportunity to teach their infants new words. With explicit instructions to describe the actions and activities taking place in the storybook, it is predicted that mothers will include a large number of verbs in their stories. Considering that the actions and activities in the storybook are familiar to adults, mothers should provide fewer verbs in the adult-directed condition.

Hypothesis 6: Speech to early-lexical infants should contain a larger proportion of repetition or restatement of target verbs per story than speech to adults. Speech to infants contains a significant amount of repetition. We should expect to find that mothers repeat or restate the verbs they use, especially with explicit instructions to describe the actions and activities in the storybook.

Hypothesis 7: Speech to early-lexical infants should contain a larger proportion of verbs in utterance-final position than speech to adults. Utterance-final position is a beneficial position for parsing lexical items. One reason is that utterance-final words are bounded by a pause, designating the lexical item as a unit of analysis. A second reason is that utterance-final words benefit from recency effects; they may be more easily recalled than utterance-initial words.
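
The proportion described in Hypothesis 7 can be illustrated with a short sketch that counts how many target-verb tokens occur as the last word of an utterance. It assumes that target verbs are identified by matching exact word forms against a supplied list; the function name, the target list, and the sample utterances are hypothetical.

```python
import re


def utterance_final_proportion(utterances, target_verbs):
    """Proportion of target-verb tokens that are the last word of an utterance.

    Assumption: targets are matched by exact (lower-cased) word form.
    Returns 0.0 when no target-verb tokens occur at all.
    """
    targets = {v.lower() for v in target_verbs}
    total = 0
    final = 0
    for utterance in utterances:
        words = re.findall(r"[a-z']+", utterance.lower())
        for position, word in enumerate(words):
            if word in targets:
                total += 1
                if position == len(words) - 1:
                    final += 1
    return final / total if total else 0.0


# Hypothetical transcript, for illustration only.
story = ["look she is jumping", "jumping on the bed", "can you see her jumping"]
print(utterance_final_proportion(story, ["jumping"]))  # 2 of 3 tokens are final
```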

Hypothesis 8: Speech to early-lexical infants should contain target verbs that are placed in more diverse syntactic contexts than speech to adults. Naigles and Hoff-Ginsberg (1998) found a relationship between the diversity of syntactic environments in which verbs were presented and their subsequent acquisition. We would expect mothers to use more diverse syntactic environments when speaking to their early-lexical infants than when speaking to adults who are already familiar with the actions and activities in the storybook.

Hypothesis 9: Speech to early-lexical infants should contain target verbs (in utterance-final position) that are elongated in comparison to nouns (also in utterance-final position). Speech to infants generally contains vowels that are lengthened in comparison to speech to adults (Bernstein-Ratner, 1984; Albin & Echols, 1996). In a case where actions are the intended focus, verbs are expected to exhibit this type of acoustic highlighting.

Hypothesis 10: Speech to early-lexical infants should contain target verbs (in utterance-final position) uttered in wider pitch ranges than nouns (also in utterance-final position). Exaggerated pitch in infant-directed speech may serve to attract and maintain infants’ attention (Fernald, 1984, 1985, 1989, 2000; Fernald & Simon, 1984; Fernald & Kuhl, 1987). Exaggerated pitch may also serve to highlight focused lexical items for infants. Fernald and Mazzie (1991) found that target nouns in a storybook were placed on exaggerated pitch peaks. We would expect to find focused verbs to be placed on exaggerated pitch peaks in these stories/descriptions as well.
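
One way the pitch range of a word might be quantified for Hypothesis 10 is sketched below. The sketch assumes that fundamental frequency (F0) values in hertz have already been extracted for each target word, with unvoiced frames coded as 0, and it expresses the range in semitones, 12 × log2(F0 max / F0 min). The choice of semitones, the function name, and the sample values are assumptions made for illustration; the text above does not specify the unit or the extraction procedure.

```python
import math


def pitch_range_semitones(f0_hz):
    """Pitch range of a word in semitones: 12 * log2(max F0 / min F0).

    Assumption: f0_hz holds F0 samples in hertz, with unvoiced frames
    coded as 0 (these are excluded before the range is computed).
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("at least two voiced F0 samples are required")
    return 12 * math.log2(max(voiced) / min(voiced))


# Hypothetical F0 tracks (Hz), for illustration only.
verb_f0 = [0, 200, 400, 300]  # a one-octave excursion
noun_f0 = [220, 330, 0]

print(pitch_range_semitones(verb_f0))
print(pitch_range_semitones(noun_f0))
```

A semitone scale is used in this sketch because it makes ranges comparable across speakers with different baseline pitch; a range in hertz would be the simple difference between the maximum and minimum values.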