Phoneme frequencies and acquisition of lingual stops in Japanese*

Kiyoko Yoneyama

Mary E. Beckman

Jan Edwards

Ohio State University

[*]We thank the Japanese speakers who participated in the experiments; Mieko Kawashima, Noriko Kobayashi, and Rumiko Yoneyama for providing contacts; Takashi Otake for lending recording equipment; and Hiroko Nakamura, Hiromi Takayama, Satoko Katagiri, and Kuniko Yasu for assisting in analysis. This work was supported by an Ohio State University Summer Interdisciplinary Research Fellowship in Cognitive Science to Kiyoko Yoneyama, and by National Institute for Deafness and Other Communicative Disorders grant DC02932 to Jan Edwards. Address for correspondence: Jan Edwards, Department of Speech and Hearing Science, Ohio State University, 110 Pressey Hall, 1070 Carmack Road, Columbus, OH 43210, USA. e-mail:

ABSTRACT

Cross-language differences in phoneme acquisition are interesting because they challenge some theories that posit linguistic or biological universals of acquisition. This paper examines acquisition of word-initial /t/ and /k/ in Japanese. Analyses of an on-line dictionary and a child-directed speech sample showed that /k/ is more frequent than /t/. A cross-sectional study of 47 children aged 2;3 to 5;3 found that /k/ is acquired earlier than /t/. Also, errors occurred most often before /e/ (the least frequent vowel). Here substitutions included [k] for /t/ and [tʃ] for both stops. ([tʃ] is the allophone of /t/ before /i/.) These results differ from English, where /t/ is more frequent than /k/, [tʃ] is not an allophone of /t/, /t/ is acquired before /k/ word-initially, and both [k] for /t/ and affricate substitutions are rare. These differences between English and Japanese suggest that lexical phoneme frequency and allophony play a role in phoneme acquisition.

INTRODUCTION

A long-standing challenge in phonological development research is to explain the order of phoneme acquisition, within and across languages. Researchers agree on the general patterns that occur in phoneme acquisition in English and several other well-studied languages that are closely related to English. For example, although there is much individual variability (see Vihman, 1993, and other literature reviewed there), English-acquiring children generally produce labial and alveolar stops and nasals early on, while stable production of most fricatives and affricates does not emerge until later (see, inter alia, Prather, Hedrick, & Kern, 1975; Smit, Hand, Freilinger, Bernthal, & Bird, 1990; Kent, 1992). Similar patterns have been observed for French (Chevrie-Muller & Lebreton, 1973), German (Ellsen, 1991; Fox & Dodd, 1999), Dutch (de Houwer & Gillis, 1997), and Spanish (Dinnsen, 1992). Some substitution patterns are also similar across these related languages. For example, [t] for /k/ is a very common error among English-speaking children with phonological disorder (e.g., Ingram, 1976), and Möhring (1938) reports this as among the most common errors among German school children. Mowrer & Burger (1991) even found that children acquiring Xhosa, a language with a very different phonological system from any of these European languages, had most difficulty with /s, , r/, sounds that are also late-acquired and frequently misarticulated in English-speaking children. Another regularity in phoneme acquisition across languages is that aspirated stops, and especially prevoiced or prenasalized stops, are generally acquired later than voiceless unaspirated stops. 
That is, regardless of how stop categories are distributed in a language, children generally produce short lag stops early on (these correspond to the “voiced” stops /b/, /d/, /g/ in English in initial position, but to the voiceless stops /p/, /t/, /k/ in French) and then acquire aspirated and prevoiced stops later (e.g., Kewley-Port & Preston, 1974; Macken & Barton, 1980a, b; Allen, 1985).

However, researchers have also noted some differences in order of phonemes acquired and in substitution patterns across languages. For example, there is some evidence that /l/ is acquired earlier by French-speaking children as compared to English-speaking children (Chevrie-Muller & Lebreton, 1973; Vihman, 1993). Similarly, Pye, Ingram, & List (1987) show that children acquiring Quiché (a Mayan language) produce both liquids and affricates (/l/ and /tʃ/) in their early phoneme inventories, while these sounds are among the last to be acquired by English-speaking children. So & Dodd (1995) also observed that Cantonese-acquiring children produce affricates relatively early, in a [ts] for /s/ substitution pattern. Hua & Dodd (2000) show that in Putonghua (the standard variety of Mandarin Chinese spoken in the PRC), alveopalatal affricates and alveopalatal and velar fricatives are among the earliest acquired sounds. In fact, if the criterion of 90 percent correct is used for acquisition, then these affricates and fricatives are acquired earlier than /p/ in Putonghua, a result which contrasts with English, where /p/ is among the earliest acquired sounds and is acquired much earlier than lingual affricates and fricatives. Error patterns also differed in Putonghua as compared to English. The second most frequent error pattern was “backing”, with 65 percent of the children substituting a more posterior constriction for a dental place of articulation (e.g., [] for the initial /s/ in /sua/). In English, by contrast, “fronting” of /s/ to [θ] is more common and can persist quite late in children with misarticulated /s/ (e.g., Weismer & Elbert, 1982; Baum & McNutt, 1992).
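The text does not spell out how a "90 percent correct" criterion is applied, so the following is only a minimal sketch under one simple assumed reading: a sound counts as acquired at the youngest age group in which at least 90 percent of the children produce it correctly. The function name, the age groups (in months), and the outcomes are all hypothetical and are not data from any of the studies cited.

```python
def age_of_acquisition(results_by_age, threshold=0.9):
    """Return the youngest age group meeting the criterion, or None.

    `results_by_age` maps an age group (here, months) to a list of booleans,
    one per child: True if that child produced the target sound correctly.
    The 0.9 default mirrors the "90 percent correct" criterion; how a single
    child's production is scored as correct is left outside this sketch.
    """
    for age in sorted(results_by_age):
        outcomes = results_by_age[age]
        if outcomes and sum(outcomes) / len(outcomes) >= threshold:
            return age
    return None


# Hypothetical outcomes for one target sound (invented for illustration).
results = {
    24: [True] * 6 + [False] * 4,   # 60 percent of children correct
    30: [True] * 8 + [False] * 2,   # 80 percent of children correct
    36: [True] * 9 + [False] * 1,   # 90 percent of children correct
}
acquired_at = age_of_acquisition(results)
```

Under this reading, comparing the returned age for two sounds (for instance an affricate and /p/) gives the relative order of acquisition; a stricter or looser threshold can change that order, which is why the criterion has to be stated.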

Ingram (1988) divides theories of phonological acquisition into two categories: functional and maturational. Functional theories explain phonological acquisition in terms of specifically linguistic universals, such as the principle of contrast and the inventory of distinctive features available to the language-acquiring child, while maturational theories explain phonological acquisition in terms of biological and physiological constraints during development. A maturational account can explain differences across individual children acquiring the same language, since these could be “an expression of their biological differences” (Locke, 1988: 665). However, a “pure” maturational account would have trouble explaining cross-language differences, since biological variation across individuals in a population should not differ systematically across different populations of children when these populations are defined culturally, by the languages that they are acquiring. A functional account would have similar trouble explaining cross-language differences in phoneme acquisition, if it claims that particular consonants or vowels have certain specifically linguistic but universal properties.

One very influential functional account that invokes specifically linguistic universal properties is that of Jakobson (1941/1968), who claimed that children babble the sounds of all languages as part of an innate repertoire of humanoid vocalizations, and then cease babbling as they begin to produce real words, after which phonemes are acquired in a relatively fixed order that maximizes sequential and paradigmatic opposition. Specifically, phonemes in the infant’s first words are an “unmarked” consonant and vowel that define a maximal opposition found in all human languages: the sequential opposition between a non-lingual unaspirated stop and an open vowel, as in /pa/ (i.e., /ba/, if the child is acquiring English). Subsequent phonemes will also be acquired in terms of maximal oppositions. For consonants, the first opposition will be oral versus nasal and the second will be labial versus dental. For vowels, the first opposition will be low versus high, and so on. Thus, if a child first produced an unaspirated oral stop such as French /p/ or English /b/ and a low back /a/, then the next consonant acquired would be /m/ and the next vowel acquired would be /i/. Each successive opposition that is acquired by the child affects a smaller and smaller sound class, with the most “marked” sounds (that is, the least common sounds across languages) acquired last. Thus, the order of acquisition mirrors phonological patterns across languages, such as the ubiquity of unaspirated stops among the world’s consonant systems, and the implicational universal stating that if a language has nasal consonants it also will have oral stops.

Much of Jakobson’s proposal has been disproved. Empirical research has shown that children do not babble the sounds of all languages, that babbling does not cease before the onset of word production, and that there is much overlap between the sounds produced in first words and in concurrent babbling (e.g., Vihman, Macken, Miller, Simmons, & Miller, 1985; Locke, 1989). However, Jakobson’s claim that the order of phoneme acquisition should be understood in terms of specifically phonological universals influenced much subsequent work on order of phoneme acquisition. For example, Dinnsen’s recent work on order of phoneme acquisition (Dinnsen, 1992; Dinnsen, Chin, Elbert, & Powell, 1990) sets up five levels of phoneme inventories, going from least complex to most complex. The child moves from a lower level to a higher level inventory as additional contrasts are acquired. Thus, the difference between level A and level B is that level B adds a voicing contrast, and the difference between level D and level E is that level E adds a stridency contrast and/or a laterality contrast.

Like Jakobson’s earlier account, Dinnsen’s functional account invokes the same notion of “markedness” that phonologists have used to understand phoneme inventories across languages. Contrasts which involve “unmarked” sounds are ones that occur in all or many languages, whereas “marked” sounds occur more rarely. Dinnsen’s developmental hierarchy is reminiscent of Lindblom & Maddieson’s (1988) account of cross-language differences in adult phoneme inventories. That is, “unmarked” sounds are ones that involve easier gestures or simpler gestural coordination, and languages should use simple contrasts before they resort to more complex ones. Therefore, languages with smaller phoneme inventories contrast only “unmarked” features and simple feature combinations, whereas more “marked” or elaborate sounds, such as palatalized velars, ejectives, or pre-nasalized stops, occur only in languages with larger consonant inventories that also include the less marked sounds. If markedness is understood as an unexplained phonological primitive, then Dinnsen’s account is quite different in conception from a maturational account. On the other hand, if markedness universals have a biological explanation, as suggested by Lindblom & Maddieson (1988), then a functional account such as Dinnsen’s is conceptually close to some maturational accounts. By either interpretation of markedness, however, a functional account that invokes markedness universals predicts that acquisition order also should be fairly invariant across languages.

One of the earliest proponents of a maturational explanation of the order of phoneme acquisition was Locke (1983), who proposed that the earliest acquired sounds will be those that are most perceptually salient and easiest to produce. Since both perception and production play a role, a sound may be acquired late either if it is difficult to hear or if it is difficult to say. For example, the low amplitude of the inter-dental fricative and its confusability with /f/ may explain the late acquisition of /θ/ in English, even though it should be easy to produce relative to /s/, whereas the fine motor precision required to direct the air stream against the incisors in the alveolar fricative may explain the late acquisition of /s/ relative to /t/.

A number of other researchers also have considered the order of phoneme acquisition in terms of physiological constraints. For example, Kent (1992) reinterprets Dinnsen’s (1992) typology in terms of motor development. He proposes that level B inventories emerge when the child has acquired consistent control over the fine timing of a laryngeal gesture relative to a coordinated supra-laryngeal gesture in order to contrast voiced and voiceless obstruents, and that level E inventories emerge when the child has acquired the more precise lingual control needed to posture the tongue for a strident sound or for a lateral or retroflex sound. Thus, if acquisition order is universal, the universals are due to a universal biological hierarchy of the motor control needed to coordinate two gestures or to produce particular gestures, as in Lindblom & Maddieson’s (1988) account of adult phoneme inventories.

MacNeilage & Davis (1990) also consider phonological acquisition in terms of motoric constraints, which they use to explain several universals of adult language as well. Speech is an inherently rhythmic motor activity, and babbling emerges at a stage when many other rhythmic behaviors also begin to appear (Thelen, 1981). MacNeilage and Davis propose that certain phonological patterns common in babbling (the predominance of CV syllable sequences and the co-occurrence of coronal consonants with front vowels, of dorsal consonants with back vowels, and of labial consonants with open vowels) can be explained by this emergence of rhythmic behavior at six months. A CVCVCV sequence would occur if the child phonates while simply wagging the jaw up and down. The co-occurrence patterns can also be explained by the fact that the tongue rests on the jaw. If the tongue is relatively low for a preceding or subsequent open vowel, the lips are likely to contact each other on the jaw’s upswing before any part of the tongue contacts the roof of the mouth. If the tongue is bunched up for a close vowel, on the other hand, adjacent consonants are likely to involve lingual contact, which will be front (coronal) around a front vowel and back (dorsal) around a back vowel, because the child is controlling the up/down movement of the jaw, not the front-back movement. MacNeilage & Davis (1990) predicted that the place co-occurrence patterns attested in babbling also would be attested at greater than expected frequency in early word productions and even in adult lexicons, predictions that were borne out later in longitudinal studies of six children (Davis & MacNeilage, 1995) and in on-line dictionary studies of stop-vowel sequences in adult lexicons for ten languages (MacNeilage, Davis, Kinney, & Matyear, 2000). 
Thus, in this maturational account, specifically linguistic universals, such as the prevalence of CV syllables across languages and even constraints that refer to lexical frequencies of specific consonant-vowel combinations, are linked to the more general emergence of rhythmic behavior in the middle of the first year.
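To make "greater than expected frequency" concrete, the sketch below computes an observed-to-expected ratio for each consonant-vowel pairing, taking the expected count to be what independence of consonant place and vowel class would predict (row total times column total, divided by the grand total). This is one common way to quantify co-occurrence and is an assumption here, not necessarily the exact procedure of the studies cited; the function name and the counts are hypothetical.

```python
from collections import Counter


def observed_expected_ratios(cv_pairs):
    """Map each (consonant_place, vowel_class) pair to observed / expected.

    `cv_pairs` is a list of (consonant_place, vowel_class) tuples, one per
    CV sequence in the sample. A ratio above 1 means the pairing occurs more
    often than independence of the two categories would predict.
    """
    pair_counts = Counter(cv_pairs)
    consonant_counts = Counter(c for c, _ in cv_pairs)
    vowel_counts = Counter(v for _, v in cv_pairs)
    total = len(cv_pairs)
    ratios = {}
    for (c, v), observed in pair_counts.items():
        expected = consonant_counts[c] * vowel_counts[v] / total
        ratios[(c, v)] = observed / expected
    return ratios


# Hypothetical counts (invented for illustration, not data from any study).
pairs = (
    [("coronal", "front")] * 6 + [("coronal", "back")] * 2
    + [("dorsal", "front")] * 2 + [("dorsal", "back")] * 6
)
ratios = observed_expected_ratios(pairs)
```

In this toy sample each pairing has an expected count of 4, so coronal-front and dorsal-back (observed 6) come out at 1.5 and the two mismatched pairings (observed 2) at 0.5, the direction of effect that the jaw-oscillation account predicts.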

Lexical frequencies also play a role in some recent functional accounts, but the explanatory link is in the other direction. For example, Stemberger & Bernhardt (1999) suggest that high-frequency sounds and sequences should be acquired earlier, because they “require fewer resources” to learn and to access from memory once they are learned. Insofar as a particular sound or pattern is universally more frequent in adult languages, then, it will consistently be acquired earlier and might function as a universal “default” pattern at a later stage of acquisition when the child has acquired contrasting sounds or patterns that are less frequent. For example, “velar (and even labial) consonants may be replaced by alveolars ([t] for /k/), the most frequent place of articulation in adult languages” (Stemberger & Bernhardt, 1999, p. 419). In this example, the more frequent “default” place is claimed to be universal, in keeping with Jakobson’s (1941) earlier claim that /t/ is acquired before /k/ and that velar fronting is observed in acquisition of all languages “everywhere and at all times” (Jakobson, 1941/1968, pp. 46-47). Insofar as lexical frequencies are not universal, however, this kind of functional account can begin to explain systematic cross-language differences in order of phoneme acquisition.

For example, if /t/ and /l/ are acquired early in Quiché but late in English, can this be related to differences in the frequency with which these two sounds occur in the two languages? Pye et al. (1987) compared the frequency of occurrence of word-initial /l/ and /t/ in the 500 most frequently produced words of five young Quiché-speaking children (ages 1;7 to 3;0) to patterns of acquisition in English-speaking children, and found that word-initial /l/ and /t/ were much more frequent in these Quiché-speaking children than in the words used by English-speaking children at the same age. Along similar lines, Ingram (1988) discusses the acquisition of /v/, which is a late-acquired sound in English and has a low frequency of occurrence in words commonly used by young English-speaking children. By contrast, based on diary studies of a single child in each language, /v/ has a much higher frequency of occurrence in the initial vocabularies of Swedish, Estonian, and Bulgarian children, and also is acquired earlier in these languages. The earlier acquisition of /l/ in French might also be related to lexical frequency, given the high token frequency of /l/ which appears in word-initial position in many forms of the direct article (l’, le, la, les).

Ingram (1988) described the difference between /v/ in English as compared to Swedish, Estonian, and Bulgarian in terms of “functional load,” rather than lexical frequency of occurrence. Functional load is the relative importance of a sound within a language’s phonological system of contrasts, that is, the extent to which it functions to differentiate one word from another (Martinet, 1952). Functional load obviously is related to frequency of occurrence, since a low-frequency sound cannot contrast many words. However, the two notions are not identical, particularly if frequency of occurrence of a sound in a particular language is calculated in terms of token frequency rather than type frequency. For example, /ð/ in English occurs in several high-frequency words such as the, this, that, but it has a low functional load because it can be replaced by /d/ without affecting meaning. However, Ingram’s (1988) operational definition of functional load was in terms of lexical type frequency. That is, as in Pye et al. (1987), he calculated the functional load of the target sound by counting the number of distinct words beginning with it in the child language sample.
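The distinction between token frequency and type frequency can be illustrated with a small sketch that counts word-initial segments both ways: once over every word token in a sample, and once over the set of distinct words. The function name, the representation of words as tuples of segments, and the toy sample are assumptions made for illustration; the sample is invented and is not data from any of the studies cited.

```python
from collections import Counter


def initial_frequencies(tokens):
    """Return (type_counts, token_counts) keyed by word-initial segment.

    `tokens` is a list of transcribed word tokens, each a tuple of segments.
    Token frequency counts every occurrence of a word; type frequency counts
    each distinct word once, as in a count of the number of distinct words
    beginning with the target sound.
    """
    token_counts = Counter(word[0] for word in tokens if word)
    type_counts = Counter(word[0] for word in set(tokens) if word)
    return type_counts, token_counts


# Hypothetical toy sample: a few very frequent /ð/-initial function words
# and several different /d/-initial content words.
sample = [
    ("ð", "ə"), ("ð", "ə"), ("ð", "ə"), ("ð", "ɪ", "s"),
    ("d", "ɔ", "g"), ("d", "ʌ", "k"), ("d", "æ", "d"),
]
type_counts, token_counts = initial_frequencies(sample)
```

In this sample /ð/ leads /d/ in token frequency (4 versus 3) but trails it in type frequency (2 versus 3), which is the kind of divergence between the two measures that the paragraph describes.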

It is not surprising that lexical phoneme frequency might play a role in phoneme acquisition by one- to three-year-old children, given that cross-linguistic research has shown similar influences on infant babbling. For example, de Boysson-Bardies, Hallé, Sagart, & Durand (1989) found that measurements of vowel formants in vocalizations produced by Arabic-, French-, English-, and Cantonese-acquiring 10-month-old infants differed in ways that reflected differences in vowel frequencies in the lexicons of the ambient languages. Similarly, de Boysson-Bardies & Vihman (1991) and de Boysson-Bardies, Vihman, Roug-Hellichius, Durand, Landberg, & Arao (1992) found that transcriptions of consonants in babbling in a longitudinal study of French-, English-, Swedish-, and Japanese-acquiring children beginning at 9 months and ending at 14 to 20 months (when the child had acquired 25 words) reflected cross-language differences in the relative frequency of different consonants in the ambient adult languages. For example, the French-acquiring infants produced relatively more labial sounds than did the English-, Swedish-, and Japanese-acquiring infants, and this is in keeping with the relatively greater frequency of labial consonants in French. In a related analysis, Vihman (1992) found that the English-acquiring infants produced more monosyllabic and more consonant-final forms in their babbling, relative to French-acquiring infants. These differences also reflect the frequencies of the different prosodic word shapes in the two languages.