Structures behind grammaticalization<1>
Elly van Gelderen
Centre for Advanced Study, Oslo, and Arizona State University
August 2005 version
There are many linguistic changes where words lose meaning and gain grammatical function. This grammaticalization often involves a full phrase becoming one word, or a verb becoming an auxiliary. The current paper provides a characterization of structural ways to examine grammaticalization. Within a Minimalist framework, it uses the Head Preference and Late Merge Principles for this purpose. Thus, it assumes that change is cyclical and provides grammar-internal reasons for this. It also uses Feature Grammaticalization to account for cross-linguistic similarities.
1 Introduction
Grammaticalization involves the loss of semantic and phonological information and the increase of grammatical function (see e.g. Heine & Reh 1984; Traugott & Heine 1991). Well-known examples include verbs changing to auxiliaries and prepositions to complementizers. Many of the accounts of the last 20 years are functional, but recently there have been attempts to account for grammaticalization in a formal, structural way. Thus, Roberts & Roussou (2003) and van Gelderen (2004) discuss changes of lexical heads to functional heads (e.g. verbs to modals) in structural terms, and van Gelderen (2004) adds the change from specifier to head (e.g. demonstratives to articles). Simpson & Wu (2002) discuss specifier to higher specifier changes (e.g. negative DPs to negatives).
In this paper, I’ll show that several kinds of grammaticalization processes can be given a formal account and add examples to those given in the literature. I will also focus on why, cross-linguistically, certain lexical words develop into certain grammatical categories and will call this phenomenon Feature Grammaticalization. It has long been recognized that languages change from synthetic to analytic and back again (e.g. Bopp 1868). This fact has also been denied, e.g. by Jespersen (1922) and more recently by Norde (1997). Hodge (1970) calls this cyclical phenomenon the ‘Linguistic Cycle’. In this paper, I show what some of the structural reasons behind the cycle are, recognizing the difficulties in using the terms synthetic and analytic. I will formulate a way to characterize analytic and synthetic in a Minimalist framework.
The outline is as follows. In section 2, I’ll first provide some background on Minimalist phrase structure and the two Economy Principles compatible with this framework. In section 3, I provide a few examples of the change from phrase to head, phrase to higher phrase, and head to higher head and discuss their relationship. Under this view, grammaticalization is uni-directional, brought about by structural factors. In section 4, I examine whether these principles are relevant to concepts such as synthetic and analytic. The data used come from the Helsinki Corpus, the Old English Dictionary Project, and Middle English electronic texts from the Oxford Text Archive.
2 Some structures
Within the generative tradition (e.g. Chomsky 1986), syntactic structures are built up using general rules, such as the rule that each phrase consists of a head (X in (1)), a complement (ZP in (1)), and a specifier (YP in (1)):
(1) [XP YP [X' X ZP]]
In early work, this schema is quite strict, e.g. specifiers and complements are always full phrases. This changes with the introduction of (minimalist) bare phrase structure in the 1990s (Chomsky 1995). A verb and a pronoun object can merge with each other, as in (2), while one of the two heads projects, in this case V:
(2) [VP [V see] [D it]]
Phrase structures are built using merge and move. ‘Merge’ combines two items, e.g. see and it, of which one projects into a higher level and transmits its categorial features. The VP domain is seen as the thematic layer, i.e. where theta-roles are determined. After functional categories such as I and C are merged to VP, ‘agree’ ensures that features in IP and CP find a noun or verb with matching (active) features to check agreement and Case. Movement to the specifier of IP occurs in those languages that have EPP features, but head-movement may be seen as part of the PF component.
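As an illustration only, the merge operation described above can be sketched as a small Python data structure. The names LexicalItem, Phrase, merge, and bracket are hypothetical, and the sketch makes a simplifying assumption: it records only the category of the projecting item and ignores bar levels, features, agree, and move.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LexicalItem:
    form: str      # the word itself, e.g. "see"
    category: str  # its categorial feature, e.g. "V"


@dataclass(frozen=True)
class Phrase:
    label: str                 # category transmitted by the projecting item
    left: "SyntacticObject"
    right: "SyntacticObject"


SyntacticObject = Union[LexicalItem, Phrase]


def category_of(obj: SyntacticObject) -> str:
    """Return the categorial feature of a word or of a built phrase."""
    return obj.category if isinstance(obj, LexicalItem) else obj.label


def merge(a: SyntacticObject, b: SyntacticObject,
          projector: SyntacticObject) -> Phrase:
    """Combine a and b; the projector (a or b) supplies the label."""
    if projector is not a and projector is not b:
        raise ValueError("the projecting item must be one of the merged items")
    return Phrase(label=category_of(projector), left=a, right=b)


def bracket(obj: SyntacticObject) -> str:
    """Render a structure as a labeled bracketing (assumption: no bar levels)."""
    if isinstance(obj, LexicalItem):
        return f"[{obj.category} {obj.form}]"
    return f"[{obj.label}P {bracket(obj.left)} {bracket(obj.right)}]"


# Example (2): the verb and the pronoun object merge, and V projects.
see = LexicalItem("see", "V")
it = LexicalItem("it", "D")
vp = merge(see, it, projector=see)
print(bracket(vp))  # [VP [V see] [D it]]
```

Passing the projecting item explicitly mirrors the statement that one of the two merged items projects; nothing in the sketch decides which one it should be.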
Using general Minimalist principles, one can argue that checking between two heads, also referred to as incorporation, is more economical than between a specifier and a head. This is formulated in van Gelderen (2004: 11) as (3):
(3) Head Preference Principle:
Be a head, rather than a phrase.
Principle (3) holds for merge (projection) as well as move (checking). The preferred structures are (4a) and (4b) rather than (4c), where FP stands for any functional category and where, for instance, a pronoun is merged in the head position in (4a), moved to it in (4b), but occupies the specifier position in (4c):
(4) a. [FP . [F' pro ...]]
    b. [FP . [F' [F pro F] ...]]
    c. [FP pro [F' F ...]]
As I show below, the Head Preference Principle is relevant to a number of historical changes: whenever possible, a word is seen as a head rather than a phrase. In this way, pronouns change from emphatic full phrases to clitic pronouns to agreement markers and negatives from full DPs to negative adverb phrases to heads.
Within recent Minimalism, there is a second economy principle (see e.g. Chomsky 1995: 348). Merge, as in (2) above, "comes `free' in that it is required in some form for any recursive system" (Chomsky 2001: 3) and is "inescapable" (Chomsky 1995: 316; 378). This means that merge is more economical than (merge and) move. Thus, it is less economical to merge early and then move than to wait as long as possible before merging. In van Gelderen (2004: 12), this is formulated as (5):
(5) Late Merge Principle:
Merge as late as possible
Chomsky (2001: 7-8) reformulates the notions of merge and move as external and internal merge respectively. "Argument structure is associated with external merge (base structure); everything else with internal merge (derived structure)" (p. 8). The latter leaves a copy in place, but is otherwise similar to merge. In this system, internal and external merge are variants of each other. I will argue that internal merge (i.e. earlier move) is still less economical since there is an additional copy in the derivation. For convenience, I will continue to use the term move rather than internal merge.
How does Late Merge account for language change? If non-theta-marked elements can wait to merge outside the VP (Chomsky 1995: 314-5), through external merge, they will do so. I will therefore argue that if, for instance, a preposition is less relevant to the argument structure (e.g. to, for, and of in ModE), it will tend to merge higher (in IP or CP) rather than merge early (in VP) and then move. Why certain words are more appropriate than others will be seen as due to Feature Grammaticalization relevant to both principles (3) and (5). Like (3), Late Merge is argued to be a motivating force of linguistic change, accounting for the change from specifier to higher specifier and head to higher head.
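The economy reasoning behind (5) can be made concrete with a rough sketch. It assumes, for illustration only, that a derivation is a list of steps and that economy is measured by the number of copies left by move, with the number of steps as a tie-breaker. This metric and the names Step, copies_left, and more_economical are hypothetical and are not taken from the works cited here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    operation: str  # "merge" (external merge) or "move" (internal merge)
    item: str       # the element being merged or moved
    position: str   # where it ends up


def copies_left(derivation: List[Step]) -> int:
    """Each move leaves one copy behind; plain merge leaves none."""
    return sum(1 for step in derivation if step.operation == "move")


def more_economical(first: List[Step], second: List[Step]) -> List[Step]:
    """Pick the derivation with fewer copies, then fewer steps (assumed metric)."""
    return min(first, second, key=lambda d: (copies_left(d), len(d)))


# A preposition such as "for" merged early inside VP and then moved to C ...
early_merge = [
    Step("merge", "for", "P inside VP"),
    Step("move", "for", "C"),
]
# ... versus the same element merged late, directly in C.
late_merge = [
    Step("merge", "for", "C"),
]

preferred = more_economical(early_merge, late_merge)
print(preferred is late_merge)  # True
```

Under these assumptions the late-merged derivation is preferred because it contains no move step and therefore no extra copy.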
3 Examples of change due to Economy
In this section, it is shown how (3) and (5) account for changes traditionally referred to as grammaticalization. The change from specifier to (higher) specifier follows from (5), that from specifier to head from (3), and that from head to (higher) head again from (5).
3.1 Specifier to Specifier
Without using Late Merge, Simpson & Wu (2002: 291 ff.) analyze a change in negation in the history of French as in (6). Initially, the negative ne selects a Focus projection below the NegP but above the VP. The negative element pas in the FocP moves to the specifier of NegP, as in (6a). This object then becomes base generated in the specifier of FocP, as in (6b), and subsequently in Spec NegP, as in (6c):
(6) a. [NegP [Neg ne] [FocP [Spec pas_i] [Foc' Foc [VP V t_i]]]]
    b. [NegP [Neg ne] [FocP [Spec pas] [Foc' Foc [VP V NP]]]]
    c. [NegP [Spec pas] [Neg' [Neg ne] [VP V NP]]]
The change from specifier to higher specifier falls under the Late Merge Principle, as in (5) above, since in (6b) there is less movement than in (6a) and the negative is merged latest<2> in (6c). The next step will be for pas to become a head, in accordance with (3). This has presumably happened in varieties of French where ne has disappeared.
3.2 Specifier to head
English negatives provide evidence for the Head Preference Principle in (3) because they change from specifier or full phrase to head. Initially, there is a negative nominal, as in the Old English (7), with a structure as in (9a) below. Next, the negative becomes restricted to na wiht/na thing, as in (8) and represented in (9b). Finally, the negative specifier changes to a single word or head, not, represented in (9c):
(7) Æt nyxtan næs nan heofodman þæt ..
At night not-was no headman who
`At night there wasn't a headman who ...' (Peterborough Chronicle, anno 1010.26, Thorpe's edition)
(8) ne fand þær nan þing buton ealde weallas
not found there no thing except old walls (Peterborough Chronicle, anno 963.18)
The different stages can of course be represented in the same text, as (7) and (8) show. The initial stage is one from specifier to a higher specifier, as shown in (9a), in accordance with (5) above. After the negative phrase becomes generated in the specifier, as in (9b), it can then become a head, as in (9c). Much has been written on this cycle since Jespersen (1916), but by using (3) above, we find a structural explanation:
(9) a. [CP . [C' n-æs_i [NegP [DP_j [D nan] [NP man]] [Neg' [Neg t_i] [VP t_j ...]]]]]
    b. [CP . [C' n-is_i [NegP [A na(w)uht] [Neg' [Neg t_i] ...]]]]
    c. [CP . [C' [C Ø] [NegP [Neg' [Neg not/n't] ...]]]]
In the history of English, as soon as stage (c) is reached, the verb and not are written as one word, as in (10). This is quite frequent in letters such as the 15th century Paston Letters which have benot, darnot, letnot, shalnot, woldnot, and many others. It takes another 300 years before the auxiliaries start to contract with the negative, as in (11). (Both sentences are from the Helsinki Corpus):
(10) Þat we cannot tell of (Wycliffite Sermons, sermo 16, I, 285, c1380)
(11) But I shan't put you to the trouble of farther Excuses, if you please this Business shall rest here. (John Vanbrugh, The Relapse c1680).
In texts that write the forms together, ne is no longer used as a negative head.
The change shown in (9) is a traditional grammaticalization that can be accounted for by two structural principles, (3) and (5) above. This change results in a loss of semantic specificity and phonological weight. Thus, na wiht means ‘no creature’ and is more specific than just the negative marker and the loss of phonology between nawiht and not is obvious. What happens is that the semantic feature [negative] on D is reanalyzed as a grammatical one. I will refer to this as Feature Grammaticalization.
Other instances of specifier to head grammaticalization provided in van Gelderen (2004) involve relative and demonstrative pronouns becoming complementizers and demonstratives becoming articles. Table 1 lists a few of the most common ones without further discussion.
Demonstrative pronoun that to complementizer    Demonstrative pronoun to article
Negative adverb to negation marker              Adverb to aspect marker
Adverb to complementizer                        Pronoun to agreement
______
Table 1: Examples of specifier to head changes
3.3 From head to head
After a phrase becomes a head, further loss of meaning and increase in grammatical function comes about if the head changes to a higher head, one with less lexical content. Another possibility is for the head to disappear, an option I do not discuss in this paper. Like the change from specifier to higher specifier, the change to higher head follows from Late Merge. Clear examples are those where verbs become auxiliaries. Since verbs need to move to higher categories to check their agreement features and since they do not contribute to the theta-roles, they can wait to merge later. Another example of this concerns the preposition for. I will show how features are transformed in this process, in accordance with Feature Grammaticalization.
In the Peterborough Chronicle<3> (henceforth PC and, as before, quoted with the entry year from Thorpe's edition), for is used as a preposition of causation, as in (12) and (13).
(12) þa luuede se kining hit swiðe for his broðer luuen Peada. 7 for his wedbroðeres luuen Oswi. 7 and for Saxulfes luuen þes abbodes
`Then loved the king it much for love of his brother Peada and for his pledge-brother Oswiu and for love of the abbot Saxulf' (PC, anno 656.4).
(13) ouþer for untrumnisse ouþer for lauerdes neode ouþer for haueleste ouþer for hwilces cinnes oþer neod he ne muge þær cumon
`either from infirmity or from his lord's need or from lack of means or from need of any other kind he cannot go there' (PC, anno 675.30).
It is remarkable how many of these involve constructions in which the PP headed by for is preposed, as in (13), (14), and (15):
(14) for mine londe 7 for mine feo. mine eorles fulle to mine cneo
for my land and for my property my earls fell to my knees (Layamon, Caligula 1733-4).
(15) þu ȝef þeseluen for me to lese me fra pine
`you gave yourself to me to release me from pain' (Wohunge 88-9).
According to van Dam (1957: 6), this fronting occurs regularly in OE. In (15), for is ambiguous between P and C, and hence the language learner ends up reanalyzing the P as C, and the DP as a topicalized element. In this connection, it is remarkable that the first instances of that-deletion listed in the OED (entry for that II 10) are as in (16) and (17), from the 14th century, i.e. where a for-phrase has been fronted and can serve as C: