Minimalism: An overview

There are two stages in Minimalism:

Stage 1: Chomsky 1993, 1995

Stage 2: Chomsky 2000, 2001

COMMON ASSUMPTIONS

Both stages share a number of core assumptions:

(1) No levels of representation other than LF and PF

There are no levels of representation at which separate principles apply, such as D-structure and S-structure. Derivations yield pairs relating the two interfaces, PF and LF.

-Phonetic Form (PF): interface to the articulatory-perceptual system, contains information on how to map specific morpheme combinations to phonemes, and how to phonetically realize these phonemes (e.g. div[ai]ne - div[I]nity,...)

-Logical Form (LF): interface to the conceptual-intentional system, contains what is sometimes called logico-semantic information (e.g. why can A barber from every city hates it mean the same as In every city, there is a barber that hates this city, but A barber from no Western city hates it cannot be paraphrased as In no Western city, there is a barber that hates this city.)

Derivations must converge at PF and LF, i.e. they must satisfy the Principle of Full Interpretation at the interfaces.

(4) a. A derivation D based on a numeration N is successful iff D CONVERGES at the two interfaces LF and PF. (NB: This does not mean that the derivation will yield a well-formed output! Economy principles still have to be factored in at this point.)

b. D converges at an interface I iff D is interpretable at I (informally: is well-formed w.r.t. the requirements imposed by I).

c. If a derivation D does not converge, D is said to CRASH.

In essence, the need for D to converge entails that D must contain only information which can be processed by I. For instance, D should no longer include instructions about phoneme structure once D reaches the LF-interface. For LF, it does not matter whether the subject in a sentence such as John saw Mary is pronounced as [dʒ] or as [tʃ]. Conversely, derivations that converge at PF must be void of semantic information. For PF, it is immaterial which of the two possible interpretations the string Everybody didn't come to the party is assigned.
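The definitions in (4a-c) can be pictured with a small sketch. It is only an illustration under simplifying assumptions: the three feature kinds, the READABLE table and the function names are hypothetical and not part of the theory as stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    kind: str  # "phonological", "semantic", or "formal" (readable at neither interface)


# Which kinds of feature each interface can process (an illustrative assumption).
READABLE = {"PF": {"phonological"}, "LF": {"semantic"}}


def converges_at(interface, representation):
    """(4b): D converges at I iff everything presented to I is interpretable at I."""
    return all(f.kind in READABLE[interface] for f in representation)


def is_successful(pf_representation, lf_representation):
    """(4a): success requires convergence at both interfaces; otherwise D crashes (4c)."""
    return converges_at("PF", pf_representation) and converges_at("LF", lf_representation)


pf = [Feature("phonemes of 'John'", "phonological")]
lf = [Feature("JOHN", "semantic")]
print(is_successful(pf, lf))  # True

# A formal feature that survives to LF unchecked makes the derivation crash.
print(is_successful(pf, lf + [Feature("Case", "formal")]))  # False
```

Note that the sketch says nothing about economy: as the NB in (4a) points out, a convergent derivation may still be blocked by a more economical competitor.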

(2) No direct access to the Lexicon

The derivation does not access the lexicon directly, but works only on a proper subset of the lexicon, called the NUMERATION (in stage 1) or the LEXICAL ARRAY (in stage 2). First, this reduces computational complexity. Second, there is empirical evidence for N/LA from questions surrounding the proper definition of the concept of COMPETITION in syntax.
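One way to picture a numeration is as a collection of lexical items, each paired with the number of times it may still be selected. The sketch below assumes this counter-based view; the function names select and exhausted are hypothetical labels chosen for illustration.

```python
from collections import Counter


def select(numeration, item):
    """Take one occurrence of `item` out of the numeration."""
    if numeration[item] <= 0:
        raise ValueError(f"{item!r} is not available in the numeration")
    numeration[item] -= 1
    return item


def exhausted(numeration):
    """True once every item has been selected as often as the numeration allows."""
    return all(count == 0 for count in numeration.values())


# The derivation sees only these three items, not the whole lexicon.
N = Counter({"John": 1, "saw": 1, "Mary": 1})
select(N, "saw")
select(N, "Mary")
print(exhausted(N))  # False: 'John' has not been selected yet
select(N, "John")
print(exhausted(N))  # True
```

Restricting the derivation to N is what makes comparison between derivations tractable in this picture: only derivations built from the same collection of items compete with each other.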

(3) Structure Building

The derivation uses the core operations MERGE and MOVE. Recently, the number of operations has been reduced even further, as Chomsky re-interprets Move as a specific instance of Merge (viz. internal Merge; classic cases of Merge are referred to as external or root Merge).
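The difference between external and internal Merge can be sketched as follows. This is a minimal illustration that assumes syntactic objects are unordered sets and ignores labels; merge, contains and merge_kind are hypothetical helper names.

```python
def merge(a, b):
    """Form the unordered syntactic object {a, b} (labels are ignored here)."""
    return frozenset([a, b])


def contains(obj, part):
    """True if `part` is `obj` itself or is contained somewhere inside it."""
    if obj == part:
        return True
    if isinstance(obj, frozenset):
        return any(contains(member, part) for member in obj)
    return False


def merge_kind(root, other):
    """A Merge step is internal if `other` is already part of `root`."""
    return "internal" if contains(root, other) else "external"


vp = merge("saw", "what")
print(merge_kind(vp, "John"))  # external: 'John' comes from outside vp
print(merge_kind(vp, "what"))  # internal: 'what' is re-merged from inside vp

# Internal Merge yields a second occurrence of 'what' at the root.
moved = merge("what", vp)
```

On this view Move needs no separate definition: the operation is the same in both cases, and only the source of the merged element differs.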

(4) Checking of uninterpretable features drives operations (Merge of expletive, Move and in stage 2 Agree).

(5) Spell Out

The point in the computation where the derivation splits towards the two interface levels, PF and LF.

(6) Economy

The Minimalist Program (MP; Chomsky 1991-2004) defines the contours of a family of theories of natural language syntax which embody the two central notions of conceptual/methodological and substantive economy. In contrast to earlier manifestations of (transformational versions of) generative grammar (such as the Government and Binding model; Chomsky 1981) theories that follow the guidelines of MP are characterized by two properties:

CONCEPTUAL ECONOMY

Driven by methodological, meta-theoretical considerations: Analyses should employ a minimum of principles and grammatical constructs. A theory that uses fewer ingredients to explain a given set of phenomena is explanatorily more adequate than a theory that uses additional assumptions.

-Minimize the number of levels of representation: DS and SS are eliminated as levels of representation (NB: this also has substantive effects)

-Minimize the number of axioms of the theory...

SUBSTANTIVE ECONOMY

Syntactic derivations for natural language sentences are evaluated by ECONOMY PRINCIPLES. These principles compare sufficiently similar derivations along various dimensions (number of operations, length of movement paths, computational cost, ...) and eliminate all but the most economical one.
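The filtering effect of economy principles can be sketched as a comparison over a candidate set. The sketch assumes that only convergent derivations compete and that cost is measured by the number of operations alone; both the Derivation record and the single-metric comparison are simplifications introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    name: str
    converges: bool
    steps: int  # number of operations; one hypothetical cost metric among several


def most_economical(candidates):
    """Return the convergent derivations with the lowest step count."""
    convergent = [d for d in candidates if d.converges]
    if not convergent:
        return []
    best = min(d.steps for d in convergent)
    return [d for d in convergent if d.steps == best]


candidates = [
    Derivation("D1", converges=True, steps=2),
    Derivation("D2", converges=True, steps=3),
    Derivation("D3", converges=False, steps=1),  # cheap, but it crashes
]
print([d.name for d in most_economical(candidates)])  # ['D1']
```

D3 uses the fewest steps but never enters the comparison because it does not converge; D2 converges but is blocked by the cheaper D1.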

DIFFERENCES

Important differences between the two stages:

(1) The most important difference: Architecture of Grammar. Single Spell-Out point vs. Cyclic Spell-Out

In stage 1 the architecture of GB is retained: the T/Y model. Spell-Out applies at a single point in the derivation:

ARCHITECTURE OF THE GRAMMAR

Lexicon -> Numeration -> Overt Syntax ---> Spell-Out -> covert syntax/LF
(computational system CHL)                     |
                                               v
                                               PF

This means that there are two cycles in the derivation: the derivation before Spell-Out (overt) and the derivation after Spell-Out (covert). If operations also apply at PF, there are three cycles, the third being the derivation at PF.

In stage 2, there is no overt-covert distinction with two independent cycles. The derivation proceeds by phase. At each stage of the derivation a subset LAi is extracted, placed in active memory and submitted to the derivational procedure. The chunk of the derivation that has access to a given subarray is called a phase.

Definition of Phases: phases of a derivation are syntactic objects which are derived by choice of a subarray LAi. CPs and vPs constitute phases while TPs do not. Chomsky proposes that phases must satisfy the strong cyclicity condition in (1):

(1) The head of a phase is "inert" after the phase is completed, triggering no further operations.

According to (1), a phase head cannot trigger Merge or Attract in a later phase. This means that derivations proceed phase by phase.

A further condition is the so-called "phase-impenetrability condition" in (2):

(2) In phase α with head H, the domain of H is not accessible to operations outside α, but only H and its edge.

This forces Move of a phrase inside the phase to Spec,H in order for this phrase to be accessible to operations outside α, i.e. a strong form of Subjacency. For example, in English, an object that undergoes wh-movement first undergoes (covert) object shift in order to be accessible for wh-movement.
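The effect of (2) can be sketched with a toy phase object. The Phase record and its three fields are hypothetical, and the sketch deliberately ignores copies left behind by movement; it only shows why a phrase must reach the edge before a higher head can target it.

```python
from dataclasses import dataclass, field


@dataclass
class Phase:
    head: str
    edge: list = field(default_factory=list)    # specifiers of the phase head
    domain: list = field(default_factory=list)  # material in the complement of the head


def accessible_from_outside(phase):
    """Per (2): only the head and its edge are visible to outside operations."""
    return [phase.head] + list(phase.edge)


def move_to_edge(phase, item):
    """Move `item` from the domain to Spec,H (no copy is kept, a simplification)."""
    phase.domain.remove(item)
    phase.edge.append(item)


vP = Phase(head="v", edge=["John"], domain=["saw", "what"])
print("what" in accessible_from_outside(vP))  # False: 'what' is inside the domain
move_to_edge(vP, "what")
print("what" in accessible_from_outside(vP))  # True: 'what' is now at the edge
```

In this picture, successive-cyclic movement amounts to repeating move_to_edge at every phase that separates a phrase from its final landing site.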

Cyclic Spell-Out

In Chomsky (1995, footnote 50) it is noted that there are [-interpretable] features that may have a PF reflex, for example φ-features on T and v, or Case on DPs. This is problematic if the checking relation is overt: the [-interpretable] feature is checked and erased before Spell-Out, but its phonetic matrix remains.

To resolve the problem, Chomsky (1998: 48) proposes that instead of a single Spell-Out there is cyclic Spell-Out: deleted features are erased, but only after they are sent to the phonological component, possibly at the phase level. Spell-Out applies cyclically in the course of a derivation. Cyclic Spell-Out is contingent on feature-checking operations.

In the previous model, there were two cycles, the cycle of overt and the cycle of covert derivations. With cyclic Spell-Out, there is only one cycle and all operations are cyclic.

(2) A Difference in the Computational Operations: Long Distance Agree

In Stage 1 the computational operations are (Select), Merge, Move, Delete.

Feature checking is the result of either Merge (in the case of an expletive checking EPP) or Move (overt, triggered by Strength, or covert).

In Chomsky (1995), overt movement is XP/head movement, while covert movement is feature movement. The motivation for feature movement is conceptual. Given that the operation Move is triggered by the requirement that some feature F must be checked, the minimal operation should be one that would raise only the feature F. For this reason, Chomsky explores the possibility that Move α be replaced by the operation Move F, F a feature, which he takes to be "more principled" (Chomsky, 1995: 262).

Pied Piping

In this conception of movement, the question arises why the whole category raises along with the raised feature. The answer is that there is "generalized pied-piping" required for convergence:

(3) Economy Condition: F carries along just enough material for convergence

The amount of material carried along should (in an optimal theory) be determined by "bare output conditions"; Chomsky (1995: 262-263) speculates that "...for the most part - perhaps completely - it is properties of the phonological component that require such pied-piping. Isolated features and other scattered parts of words may not be subject to its rules, in which case the derivation is canceled; or the derivation might proceed to PF with elements that are unpronounceable, violating FI...".

(3) provides a rationale for Procrastinate. If pied-piping is required for convergence at PF and given that nothing at all is the least that can be carried along for convergence, then covert movement will always be preferred over overt movement because covert movement will not require any pied-piping.

In Stage 2, the computational operations are: (Select), Merge, Agree, Move

How it works: Part 1 [p. 101]

(I) Select [F] from the universal feature set {F}

(II) Select LEX, assembling features from [F]

(III) Select LA (lexical array) from LEX

(IV) Map LA to EXP, with no recourse to [F] for narrow syntax

How it works: Part 2

a. Merge: "takes two syntactic objects (a, b) and forms K(a, b) from them."

b. Agree: "establishes a relation (agreement, Case-checking) between an LI a and a feature F in some restricted search space (its domain)."

c. Move: combining Merge and Agree. [A-movement if motivated by a phi-feature; A-bar if motivated by a P ["peripheral"]-feature]

Occurrences

-Move creates two occurrences of a single a, where an "occurrence of a" is the full context of a.

-"Chain" is a set of occurrences. If occurrences are "full contexts", we don't need to say that a chain is a sequence, since there will be a containment relation between the contexts that allows us to reconstruct whatever we might have needed the ordering property of a sequence for.

Prioritizing

-Move is more complex than its subcomponents.

-Move is more complex than even its subcomponents together -- since it involves the extra step of determining pied piping.

-Consequently:

(4) Merge or Agree "preempt" Move.

"This yields most of the empirical basis for Procrastinate", p. 102

(3) Triggers for Movement

In stage 1 movement is triggered by “Strength”, understood as a property that PF cannot tolerate (Chomsky 1993) or as something that a derivation cannot tolerate (Chomsky 1995). In Chomsky (1995) Strength induces cyclicity (featural cyclicity) which permits limited countercyclicity as in “tucking in” derivations (Richards).

In stage 2 movement is linked to EPP features. The EPP feature of T is probably universal. EPP features are assigned to phases (CP and vP) when they have an effect on the outcome (successive cyclic movement).

(4) Freezing effects

In stage 1 there is a problem with movement from a Case position to a Case position. This is prohibited, but it is not obvious why.

"...We are now tentatively assuming that if all features of some category α have been checked, then α is inaccessible to movement, whether it is a head or some projection. But if some feature F is yet unchecked, α is free to move. Economy conditions exclude "extra" moves and anything more than the minimal pied-piping required for convergence...." (Chomsky 1995: 266) [see Chomsky 2000 who develops this idea]

Now this has to be stated differently because an element might have checked the Case feature of a lower functional head but still be the closest element for EPP and phi-checking of a higher head. Recall that categorial features and phi-features on DPs are interpretable and hence accessible to further computation.

Here, Chomsky appeals to either the features of the target or to Global Economy, depending on the case he considers:

Two Cases:

(i) Examples like (4) are ungrammatical because the derivation does not converge. The Case feature of the higher T remains unchecked:

(4) *John seems that t is intelligent

"Freezing" in this case is the by-product of the fact that some requirement of the target cannot be satisfied, i.e. its Case feature is not checked.

(ii) In Chomsky (1995: 357) [embedded within a multiple specifier system]: Suppose that we have a derivation in which the object raises overtly to the specifier of vP before T is merged with vP (Icelandic, with Object Shift and Multiple Specifiers):

(5) [vP [spec ostinn_i] [vP [spec margar mys] [v' v [VP ... e_i]]]]

T is merged with vP and there is no expletive in the numeration. In that case, both the shifted object and the subject may raise to T because they both have a D feature and they are both equidistant (multiple specifiers are equidistant from target) from T satisfying Attract Closest. The derivation where the subject raises converges, correctly. However, the derivation where the object raises also converges, incorrectly. To rule out the latter possibility, Chomsky proposes that the derivation in which the object and not the subject raises is blocked by Economy. A derivation in which the subject and not the object raises to T involves two raising operations: First the object raises to v and then the subject raises to T. On the other hand, a derivation in which the object and not the subject raises to T involves three raising operations: First, the object raises to v, and the Case feature of the object and v is checked. Then the object raises to T and it checks and erases the D feature of T. But the subject still needs to check its own Case feature and the N feature of T. Therefore, the more economical derivation (the one with two steps) blocks the less economical one (the one with three steps).

In stage 2, an operation can only be triggered by an uninterpretable feature:

(17) Activity condition

A goal must bear some uninterpretable feature [otherwise it is frozen in place].
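The Activity condition can be sketched as a simple predicate over goals. The Goal record, the feature name "Case" and the helper functions are illustrative assumptions; the sketch only shows how checking the last uninterpretable feature freezes an element in place.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    # Unchecked uninterpretable features; the feature names are illustrative only.
    uninterpretable: set = field(default_factory=set)


def is_active(goal):
    """(17): a goal is visible to a probe only if it bears an uninterpretable feature."""
    return bool(goal.uninterpretable)


def check(goal, feature):
    """Checking removes the feature; a goal with nothing left to check is frozen."""
    goal.uninterpretable.discard(feature)


dp = Goal("John", {"Case"})
print(is_active(dp))  # True: the Case feature is still unchecked
check(dp, "Case")
print(is_active(dp))  # False: the DP is frozen in place
```

This captures the intuition behind the ungrammatical raising example in (4) under Freezing effects: once the Case feature of John is checked in the embedded clause, John is no longer an active goal for the higher T.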
