Introduction to the Course s1

Lecture 3: Movement

1 Pre-theoretical Concepts of Movement

The informal idea that elements move about in linguistic expressions is very natural and has probably been around for centuries. All of the following sentences seem to indicate that things move from one place to another in certain constructions:

(1) a the goalie kicked the ball / the ball was kicked by the goalie

b you saw him / who did you see

c you will tell him / will you tell him

d he took a gun from the drawer / from the drawer, he took a gun

e a soldier in full uniform arrived / a soldier arrived in full uniform

It seems that there are numerous cases where sentences with related, if not virtually identical, meanings are connected to each other through the reorganisation of their elements. It is therefore quite an intuitive and natural idea that natural language grammars involve processes by which things move about.

Despite the obviousness of the idea, however, no one really took the idea beyond the informal level and certainly no one considered the consequences of having grammatical rules which change the position of words and phrases in a structure. Thus, we might find informal descriptions such as: ‘to form a question, move the question word to the front of the sentence’, or ‘in the passive, the object moves to the subject position’, but it was never discussed what kind of a grammatical rule might be able to do such a thing.

2 Transformational Grammar

Last week we introduced Chomsky’s formalisation of the structuralists’ notion of Constituent Structure analysis, the Phrase Structure Grammar, and we briefly demonstrated why Chomsky thought that such a grammar was not adequate for modelling natural language phenomena. His main criticism was that Phrase Structure Grammars allow no way for independently generated structures to be connected. This is problematic precisely because of sentences such as those in (1), where it is clear that the sentences in each pair are connected in some way. A phrase structure grammar would generate these sentences as it does all others and hence all sentences should have the same status, each related to any other sentence to exactly the same degree. Note that it is not just sharing the same lexical items that relates two sentences:

(2) a John loves Mary

b Mary loves John

c Mary, John loves

All of these sentences share the same lexical stock, but (2a) and (2c) are more strongly related to each other than either is to (2b).

Chomsky also pointed out that there are ambiguities that do not appear to have a lexical or structural explanation. For example, consider:

(3) the shooting of the hunters

This is ambiguous in terms of whether the hunters are doing the shooting or getting shot. This is clearly not a case of lexical ambiguity as all the words mean the same thing in both interpretations. Yet it does not seem to be a case of structural ambiguity either, as the PP seems to be a modifier of the noun in both cases, presumably analysed by a simple PSG as follows:

(4) [NP [Det the] [N shooting] [PP of the hunters]]

Note that this nominal phrase is related to two very different sentences:

(5) a the hunters shot something

b someone shot the hunters

What this gives us is virtually the opposite to the case where two syntactically different sentences are related, as in the sentences in (1):

(6) Mary was loved by John

John loves Mary

Mary, John loves

(7) the hunters shot something

the shooting of the hunters

someone shot the hunters

What is needed, Chomsky argued in 1957, is a new kind of rule that has the power to relate sentences. Chomsky termed the kind of rule he envisioned a transformational rule. The system that Chomsky described in 1957 was very different from the one he subsequently developed, and as it had little consequence for the subsequent development of transformational grammar, I will deal with it only briefly. Chomsky proposed that we need to identify a subset of a language’s sentences, called the kernel, which are more basic than the others. These were to be generated by a simplified phrase structure grammar. Transformational rules operated on kernel sentences to produce all of the others, thus accounting for relatedness between sentences: a kernel sentence will be related to all those which are produced from it via the transformational rules. Thus, the model looked like the following:

(8) simplified Phrase Structure Grammar → kernel sentences → Transformations → all other sentences

One obvious problem with this model is not only identifying which the kernel sentences are but also justifying why this set should be those selected as more basic than others. Chomsky argued that kernel sentences were active, positive and declarative and that passive, negative and interrogative sentences were all produced by transformation from a kernel sentence. However, there is no real justification for claiming that active, positive declarative sentences are any more basic than others and so the system seemed arbitrary at best. The model did not survive and so we will spend no more time on it.

From the start of the 1960s, Chomsky developed Transformational Grammar along more familiar lines, with each expression of a language associated with a structural description which applied before transformations took place, called a Deep Structure, and a structural change effected by the operations of transformations, called a Surface Structure. Phrase Structure Rules were responsible for forming Deep Structures and Transformations acted on these to form Surface Structures:

(9) Phrase Structure Rules → Deep Structure → Transformations → Surface Structure

Although this meant that no two sentences were directly related to each other by transformations, Chomsky was still able to maintain a connection between different sentences by maintaining that there was a sufficient similarity in their Deep Structures. Thus an active and a passive sentence might both start with an identical Deep Structure, but end up with different Surface Structures through the application of different transformations in each case. Obviously this gets rid of the need to define a set of kernel sentences and so there is no need to explain why one type of sentence should be considered more basic than any other.

3 The Expansion of Transformations

During the 1960s a great number of transformational rules were proposed to account for a wide range of linguistic observations, mainly from English, but also from a steadily expanding set of other languages too. It is worth considering a few of these to demonstrate how they worked and to see what problems they faced.

Let us start with the treatment of the English verbal system that Chomsky first proposed in his 1957 book. As we mentioned last week, Chomsky at first assumed the traditional perspective that the auxiliary verbs form a structural unit with the verb:

(10) [S [NP John] [VP [Verb [Aux has] [V read]] [NP the paper]]]

Thus we start with the following phrase structure rules:

(11) S → NP VP

VP → Verb NP

Verb → Aux V

Aux → (have) (be)

Note that the use of the auxiliaries have and be is not mutually exclusive and is optional. This is represented by the brackets around the two Aux elements in the last rule. But now we have to face the fact that depending on which auxiliary is used a different form of the verb appears:

(12) a John has read the paper

b John is reading the paper

The key to understanding what is going on here can be found by observing the following sentence:

(13) John has been reading the paper

Note that the main verb is in its –ing form, while the auxiliary be is in its perfective form represented by the morpheme –en. Given that the main verb is in the perfective form in (12a) where it follows the auxiliary have, we can conclude that any verb that follows have will be in its perfective form. By extension then, we can say that any verb that follows the auxiliary be will be in its –ing form. Therefore we have an association between have and the morpheme –en, and one between be and the morpheme –ing, and both of these morphemes end up on the following verbal element. To capture the association between the auxiliary verbs and the morphemes, Chomsky proposed the following:

(14) Aux → (have + en) (be + ing)

In other words, when the Aux node is expanded into words, both the auxiliary verb and its associated morpheme are inserted together and this accounts for the association of the two elements. Obviously the insertion of each auxiliary is an option. But the point is that if the option is taken, both the auxiliary and the morpheme will be inserted. The rule set will produce the following sequences of elements:

(15) a John + read + the + paper

b John + have + en + read + the + paper

c John + be + ing + read + the + paper

d John + have + en + be + ing + read + the + paper
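The way rules (11) and (14) jointly produce the four sequences in (15) can be sketched in a few lines of Python. This is only an illustration, not part of Chomsky's formalism: the function names `expand_aux` and `generate`, and the choice to pass the subject, verb and object in as ready-made words, are assumptions made for the example.

```python
from itertools import product

# Rule (14): each auxiliary is inserted together with its morpheme, or not at all.
HAVE_EN = ["have", "en"]
BE_ING = ["be", "ing"]

def expand_aux():
    """Yield every expansion of Aux -> (have + en) (be + ing)."""
    # Iterating over use_be first keeps the output in the same order as (15).
    for use_be, use_have in product([False, True], repeat=2):
        aux = []
        if use_have:
            aux += HAVE_EN
        if use_be:
            aux += BE_ING
        yield aux

def generate(subject, verb, obj):
    """Yield the strings given by S -> NP VP, VP -> Verb NP, Verb -> Aux V."""
    for aux in expand_aux():
        yield [subject] + aux + [verb] + obj

sequences = [" + ".join(s) for s in generate("John", "read", ["the", "paper"])]
for s in sequences:
    print(s)
```

Because have and en are only ever added as a pair (and likewise be and ing), the sketch cannot produce an auxiliary without its morpheme, which is the point of rule (14).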

To account for the fact that the morphemes end up on the following verbal element, Chomsky proposed the following transformation:

(16) structural description: aff + V

structural change: # V + aff #

What this says is that if in a structure we find a sequence where an affix precedes a verb, we move the affix behind the verb and make the two together a single word (indicated by the word boundary symbols ‘#’). Thus applying this rule to (15) we get:

(17) a John + read + the + paper

b John + have + # read + en # + the + paper

c John + be + # read + ing # + the + paper

d John + have + # be + en # + # read + ing # + the + paper

Which, when phonological adjustments have been made, come out as:

(18) a John read the paper

b John has read the paper

c John is reading the paper

d John has been reading the paper

The analysis became known as the ‘affix hopping’ analysis because of the way that the affix hops backwards onto the following verbal element.
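The affix hopping rule in (16) can be illustrated with a small Python sketch. It assumes, purely for the example, that a structure is a flat list of elements and that the sets `AFFIXES` and `VERBS` list everything the rule needs to recognise; neither assumption comes from the original analysis, and the function name `affix_hop` is hypothetical.

```python
# Toy inventories, assumed for this example only.
AFFIXES = {"en", "ing"}
VERBS = {"have", "be", "read"}

def affix_hop(seq):
    """Apply rule (16), aff + V -> # V + aff #, scanning left to right."""
    out = []
    i = 0
    while i < len(seq):
        # Structural description: an affix immediately followed by a verb.
        if i + 1 < len(seq) and seq[i] in AFFIXES and seq[i + 1] in VERBS:
            # Structural change: the affix follows the verb inside one word.
            out.append(f"# {seq[i + 1]} + {seq[i]} #")
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

deep = "John + have + en + be + ing + read + the + paper".split(" + ")
hopped = " + ".join(affix_hop(deep))
print(hopped)
```

Each affix is consumed together with the verb it attaches to, so it hops exactly once. The phonological adjustments that turn the result into the forms in (18) are not modelled here.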

An important part of this analysis was the treatment Chomsky gave to the tense morpheme. Essentially he wanted to give the same treatment to this as the other verbal morphemes and so it should be subject to the affix hopping transformation too. The important observation is that the tense morpheme always appears on the first verbal element and therefore we can surmise that its underlying position is in front of this element so that it can hop backwards onto it:

(19) ed + have ⇒ # have + ed # = had

To achieve this all we need to do is include the tense element in the Aux phrase structure rule:

(20) Aux → T (have + en) (be + ing)

Once this is done, the tense morpheme will be generated at the front of the verbal elements and by the affix hopping rule will become attached to whichever verb follows it. Note that it is an obligatory element and so every sentence will have a tense element in it. To account for infinitives and participles, Chomsky proposed that the infinitival to and the gerundive –ing could also be taken as instances of T. As to is not an affix, it would not undergo affix hopping, but the gerundive morpheme will behave like all the others. Thus we will get:

(21) … ed + read … ⇒ read

… ed + have + en + read … ⇒ had read

… to + have + en + read … ⇒ to have read

… ing + have + en + read … ⇒ having read
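The derivations in (21) can be replayed with a short Python sketch that treats the tense element like any other affix. The spell-out table `SPELL_OUT` is a hypothetical stand-in for the phonological adjustments, covering only the forms needed here, and the function name `hop_and_spell` is likewise an assumption of the example.

```python
# 'to' is deliberately left out of AFFIXES: it is an instance of T but not an affix.
AFFIXES = {"ed", "en", "ing"}
VERBS = {"have", "be", "read"}

# Hypothetical spell-out table standing in for the phonological adjustments.
SPELL_OUT = {
    ("have", "ed"): "had",
    ("have", "ing"): "having",
    ("read", "ed"): "read",
    ("read", "en"): "read",
}

def hop_and_spell(seq):
    """Hop each affix onto the following verb, then spell the pair out as a word."""
    words = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] in AFFIXES and seq[i + 1] in VERBS:
            words.append(SPELL_OUT[(seq[i + 1], seq[i])])
            i += 2
        else:
            # Non-affixes, such as infinitival 'to', stay where they are.
            words.append(seq[i])
            i += 1
    return " ".join(words)

for t in ["ed", "to", "ing"]:
    print(hop_and_spell([t, "have", "en", "read"]))
```

Since to never matches the affix test, it stays in front of have, while ed and ing attach to whichever verb comes first.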

Now let us consider the slightly more complex transformation which is involved in forming the passive. As is well known, there are at least four differences between active and passive sentences: the object of the active corresponds to the subject of the passive; the subject of the active corresponds to the NP inside an optional by phrase which follows the verb in the passive; there is an auxiliary be in the passive; and there is a morpheme (–en) situated on the main verb:

(22) a the goalie saved the ball

b the ball was saved (by the goalie)

Assuming the basic phrase structure rules in (11), we can assume that the basic subject position is before the verb and the basic object position follows it. Therefore in the active no transformation acts to move these elements. Of course, there are transformations which operate in active sentences, affix hopping for example, and so it is not the case that the Deep Structure and the Surface Structure of active sentences are identical. However, something more radical applies with the passive. We might propose the following:

(23) SD: NP1 + Aux + V + NP2

SC: NP2 + Aux + be + en + V (by NP1)

Essentially what this says is that if we find a structure in which there is a subject (NP1), a verb and an object (NP2), we can apply the transformation in which the object replaces the subject, the auxiliary be is inserted followed by the morpheme –en, and the subject can optionally be placed in a by phrase following the verb. Assuming that the passive transformation applies before the affix hopping rule, the passive morpheme will be treated like the others and land on the following main verb.
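As a rough illustration of (23), the following Python sketch splits a Deep Structure into the four terms of the structural description and builds the structural change from them. The `Clause` type, the `passivise` function and the `keep_agent` flag are hypothetical names introduced for the example, and the past tense element ed is assumed to be the only Aux material.

```python
from typing import List, NamedTuple, Optional

class Clause(NamedTuple):
    """A toy Deep Structure split into the terms named in (23)."""
    np1: List[str]            # subject
    aux: List[str]            # Aux material, e.g. ["ed"] for past tense
    verb: str                 # V
    np2: Optional[List[str]]  # object; None for an intransitive verb

def passivise(clause: Clause, keep_agent: bool = True) -> Optional[List[str]]:
    """Apply SC: NP2 + Aux + be + en + V (by NP1), if the SD is met."""
    # The structural description requires an object, so intransitives fail.
    if clause.np2 is None:
        return None
    result = clause.np2 + clause.aux + ["be", "en", clause.verb]
    if keep_agent:
        # The by phrase is optional in (23).
        result += ["by"] + clause.np1
    return result

active = Clause(["the", "goalie"], ["ed"], "save", ["the", "ball"])
passive = passivise(active)
print(" + ".join(passive))
```

The output still contains the sequences ed + be and en + save, so affix hopping has to apply afterwards to yield was and saved, which matches the ordering assumed in the text.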

Although this is a little primitive and there are obviously technical issues to be addressed, even at this point we can point out some positive aspects of the analysis. First, it is predicted that only transitive verbs will be able to passivise. This is because for the transformation to be able to operate, its structural description must be met. This states that there must be an object, and only a transitive verb will have an object. Hence *John was smiled will be ungrammatical because no transformation could produce it. Second, any restrictions which apply to a verb’s subject and object need only be stated with respect to Deep Structures; they will then apply equally to active and passive sentences and will not have to be restated in mirror image for each. So the grammaticality of (24a) has the same source as the grammaticality of (25a), and the ungrammaticality of (24b) has the same source as that of (25b):

(24) a sincerity frightens John

b * John frightens sincerity

(25) a John is frightened by sincerity

b * sincerity is frightened by John

4 The Problem With Transformations

The passive transformation is a good example of what came to be seen as the main problem for this type of analysis. This one transformation apparently has the ability to move or perhaps simply delete the subject, move the object into the vacated subject position and insert various things into the structure, thereby radically changing the whole thing. Of course, at first, this was seen as the most positive aspect of transformations: you can analyse anything with them! But this turns out to be a huge problem: if you can do anything with a transformation, it is impossible to explain why certain things happen in certain constructions in certain languages. In other words, transformations offer a very good way of describing linguistic phenomena, but they cannot explain them.