Noam Chomsky
Lectures on Government and Binding: The Pisa Lectures
1988 FORIS PUBLICATIONS Dordrecht - Holland/Providence RI - U.S.A.
Published by: Foris Publications Holland, P.O. Box 509, 3300 AM Dordrecht, The Netherlands
Sole distributor for the U.S.A. and Canada: Foris Publications USA, Inc., P.O. Box 5904, Providence RI 02903, U.S.A.
First edition 1981; second revised edition 1982; third revised edition 1984; fourth edition 1986; fifth edition 1988
The editors would like to thank Reineke Bok-Bennema for compiling the index.
ISBN 90 70176 28 9 (Cloth)
ISBN 90 70176 13 0 (Paper)
© 1981 Foris Publications - Dordrecht. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Printed in The Netherlands by ICG Printing, Dordrecht.
Studies in Generative Grammar
The goal of this series is to publish those texts that are representative of recent advances in the theory of formal grammar. Too many studies do not reach the public they deserve because of the depth and detail that make them unsuitable for publication in article form. We hope that the present series will make these studies available to a wider audience than has hitherto been possible.
Editors: Jan Koster, Henk van Riemsdijk
See for other books in this series page 373
Table of Contents
Preface
Chapter 1. Outline of the theory of core grammar
Chapter 2. Subsystems of core grammar
    2.1. Levels of representation
    2.2. LF-representation and θ-theory (1)
    2.3. The categorial component and the theories of case and government
    2.4. Empty categories
        2.4.1. Trace and PRO
        2.4.2. Further properties of PRO
        2.4.3. Control theory
        2.4.4. Trace and bounding theory
        2.4.5. The evolution of the notion "trace" in transformational generative grammar
        2.4.6. Some variants and alternatives
    2.5. The base
    2.6. LF-representation and θ-theory (2)
    2.7. Some remarks on the passive construction
    2.8. Configurational and non-configurational languages
    2.9. Modules of grammar
    Notes
Chapter 3. On government and binding
    3.1. The OB-framework
    3.2. The GB-framework
        3.2.1. The concept of government
        3.2.2. Case theory and θ-roles
        3.2.3. The theory of binding
    Notes
Chapter 4. Specification of empty categories
    4.1. NIC and RES(NIC)
    4.2. Basic properties of RES(NIC)
    4.3. The pro-drop parameter (1)
    4.4. The empty category principle (ECP)
    4.5. The pro-drop parameter (2)
    4.6. Recoverability and clitics
    Notes
Chapter 5. Some related topics
    5.1. The theory of indexing
    5.2. Prepositional phrases
    5.3. Modifications of the ECP
    5.4. Complex adjectival constructions
    Notes
Chapter 6. Empty categories and the rule Move-α
Bibliography
Index of Names
General Index
Preface
The text that follows is based on lectures I gave at the GLOW conference and workshop held at the Scuola Normale Superiore in Pisa in April 1979. The material was then reworked in the course of lectures at MIT in 1979-80, where I was fortunate to have the participation of a number of visitors from other institutions in the U.S. and Europe. Since the Pisa meetings, there has been considerable research by a number of linguists within more or less the framework developed in the Pisa discussions, or bearing on this framework of ideas and assumptions. I will not attempt a systematic or comprehensive review of this work, much of which seems to me extremely promising, though there are some allusions to it and some is incorporated directly in the main line of presentation, as will be indicated.
The delay in the preparation of these lectures for publication may lead to some confusion in literature citations. There are in current literature references to the Pisa lectures, often using this term, referring naturally to the actual material presented and discussed at the April 1979 meetings. In a rapidly developing field, in which papers are often partly outdated - sometimes, by work in part based on them - by the time they appear, it is inevitable that passage of over a year will lead to changes and modifications, so that this book, though subtitled "the Pisa lectures," is actually different in certain respects from the original. In an effort to help clarify citations in other works, I will occasionally add footnote comment on differences between this text and the April 1979 lectures.
The material presented here borrows extensively from recent and current work in ways that will not be adequately indicated; specifically, from the work of linguists of the GLOW circle who have created research centers of such remarkable vitality and productivity in France, the Netherlands, Italy and elsewhere. The outstanding contributions of Richard Kayne, both in his own work and in stimulating research of others, deserve special mention. Preparation of the lectures for publication was greatly facilitated by a transcript of the lectures and discussion prepared by Jean-Yves Pollock and Hans-Georg Obenauer, as well as by critical notes by Jan Koster. I am particularly grateful to the participants in the GLOW conference and workshop for their suggestions and criticism, and also to those who took
part in a seminar at the Scuola Normale Superiore in March 1979, where much of the material presented in the April lectures was developed. The seminar began with study of an unpublished paper by Tarald Taraldsen (1978b). Though he was not present, his ideas stimulated much of the investigation of binding theory and the nature of variables that was carried out in the seminar. I owe a special debt to Luigi Rizzi for his original ideas and incisive criticism. Students, colleagues and visitors at MIT are also responsible for many improvements and modifications, only partially indicated. Among many others, Henk van Riemsdijk has been particularly helpful with many ideas and suggestions. I am also much indebted to Joseph Aoun and Dominique Sportiche for very helpful comments on the issues and on the text, and to Jean-Roger Vergnaud for some essential ideas.
I would like to express my gratitude to the Scuola Normale Superiore for having provided me with such excellent conditions for research, and more generally, for the kindness and hospitality accorded both to me and my family during our stay in Pisa in the spring of 1979. I am also grateful to the National Endowment for the Humanities (USA) for their support during this period, when most of this research was completed.
The Pisa lectures were highly "theory internal," in that a certain general theoretical framework was presupposed and options within it were considered and some developed, with scant attention to alternative points of view or the critical literature dealing with the presupposed framework. I have kept to the same format in preparing these lectures for publication. Many criticisms of the general point of view I am adopting are discussed in Chomsky (1980b) and in references cited there, though the discussion is far from exhaustive. See also Piattelli-Palmarini (1980) for discussion and commentary on the general approach rather than on technical developments.
An interesting perspective on the backgrounds of recent work is presented in Newmeyer (1980). I have also discussed this topic, less comprehensively and from a more personal point of view, in the introduction to Chomsky (1955).
The text that follows is divided into six chapters with subsections. While material from the original Pisa lectures is scattered throughout, it is concentrated in chapters 3 and 4. My paper at the GLOW conference (Chomsky, 1979a) is a brief outline of material presented in chapter 3, though again, this material has been subsequently modified and developed, so that references in that paper to the original Pisa lectures do not invariably refer accurately to this text. Some of the material in chapters 3 and 4 is outlined briefly in Chomsky (1979b); and some of the contents of chapters 1 and 2 is presented rather informally in Chomsky (1980c).
Examples are separately numbered in each subsection. I will refer to them simply by number within the subsection in which they appear, and by subsection and number elsewhere. For example, the notation "2.4.1.(7)" refers to example (7) of chapter 2, section 4, subsection 1; the notation "2.4.1" refers to that subsection. Otherwise, notations, conventions and
terminology are fairly standard, except where indicated. Specifically, I will often present syntactic structures in much reduced form so as to focus on the question at hand.
Noam Chomsky
Cambridge, Mass.
Dec. 1980
Chapter 1
Outline of the theory of core grammar
I would like to begin with a few observations about some problems that arise in the study of language, and then to turn to an approach to these questions that has been gradually emerging from work of the past few years and that seems to me to show considerable promise. I will assume the general framework presented in Chomsky (1975; 1977a,b; 1980b) and work cited there. A more extensive discussion of certain of the more technical notions appears in my paper "On Binding" (Chomsky, 1980a; henceforth, OB). The discussion here is considerably more comprehensive in scope and focuses on somewhat different problems. It is based on certain principles that were in part implicit in this earlier work, but that were not given in the form that I will develop here. In the course of this discussion, I will consider a number of conceptual and empirical problems that arise in a theory of the OB type and will suggest a somewhat different approach that assigns a more central role to the notion of government; let us call the alternative approach that will be developed here a "government-binding (GB) theory" for expository purposes. I will then assume that the GB theory is correct in essence and will explore some of its properties more carefully, examining several possible variants and considering their advantages and defects. The ideas developed and explored in various forms in chapters 2-5 will be reformulated from a somewhat more abstract point of view in chapter 6.
In pursuing this course, it is worthwhile to make a distinction between certain leading ideas and the execution of these ideas. Existing languages are a small and in part accidental sample of possible human languages, and of this sample, only a few have been extensively investigated in ways that bear directly on the questions that concern me here. On a more personal note, there are only parts of this work that I am sufficiently familiar with so as to be able to draw upon it. Furthermore, theoretical innovations commonly suggest new ways of looking at comparatively well-studied languages that present them in a different light, or that bring out phenomena that were previously unexamined, or observed but unexplained. In applying these leading ideas, it is always necessary to make a number of empirical assumptions that are only partially motivated, at best. The leading ideas admit of quite a range of possibilities of execution. The discussion that follows is based on certain leading ideas, some of which are only beginning to be investigated seriously in a theoretical framework
of the sort considered here: notions of government, abstract Case,¹ binding, and others. Often I will make some decision for concreteness in order to proceed, though leading ideas may not be crucially at stake in such decisions. The distinction between leading ideas and mode of execution is a rough but nevertheless useful one. In work subsequent to the lectures on which this text is based, other variants of the same or related leading ideas have been pursued, or significant modifications proposed and examined, in important work in progress that I will not be able to discuss adequately here.² The point is, I think, important, sufficiently so that I would like to repeat some remarks published elsewhere on the topic (Chomsky, 1977a, p. 207):
"The pure study of language, based solely on evidence of the sort reviewed here, can carry us only to the understanding of abstract conditions on grammatical systems. No particular realization of these conditions has any privileged status. From a more abstract point of view, if it can be attained, we may see in retrospect that we moved towards the understanding of the abstract general conditions on linguistic structures by the detailed investigation of one or another 'concrete' realization: for example, transformational grammar, a particular instance of a system with these general properties. The abstract conditions may relate to transformational grammar rather in the way that modern algebra relates to the number system. We should be concerned to abstract from successful grammars and successful theories those more general properties that account for their success, and to develop [universal grammar] as a theory of these abstract properties, which might be realized in a variety of different ways. To choose among such realizations, it will be necessary to move to a much broader domain of evidence. What linguistics should try to provide is an abstract characterization of particular and universal grammar that will serve as a guide and framework for this more general inquiry. This is not to say that the study of highly specific mechanisms (e.g., phonological rules, conditions on transformations, etc.) should be abandoned. On the contrary, it is only through the detailed investigation of these particular systems that we have any hope of advancing towards a grasp of the abstract structures, conditions and properties that should, some day, constitute the subject matter of general linguistic theory. The goal may be remote, but it is well to keep it in mind as we develop intricate specific theories and try to refine and sharpen them in detailed empirical inquiry."
It is this point of view that lies behind the rough distinction between leading ideas and execution, and that motivates much of what follows. I think that we are, in fact, beginning to approach a grasp of certain basic principles of grammar at what may be the appropriate level of abstraction.
At the same time, it is necessary to investigate them and determine their empirical adequacy by developing quite specific mechanisms. We should, then, try to distinguish as clearly as we can between discussion that bears on leading ideas and discussion that bears on the choice of specific realizations of them. Much of the debate in the field is, in my opinion, misleading and perhaps even pointless, in that it concerns the choice among specific mechanisms but uses evidence that only bears on leading ideas which the alternative realizations being considered may all share. The search for the appropriate level of abstraction is a difficult one. It is only quite recently that questions of this nature can even be raised in a serious way. My own suspicion is that as research progresses, it will show that many of the most productive ideas are in fact shared by what appear to be quite different approaches. Some of the subsequent discussion relates directly to this question, as does much current work, for example, Burzio (1981), Marantz (1981).
In work of the past several years, quite a broad range of empirical phenomena that appear to have a direct bearing on the theories of government and binding considered here have been examined in a few comparatively well-studied languages. Several theories have been proposed that are fairly intricate in their internal structure, so that when a small change is introduced there are often consequences throughout this range of phenomena, not to speak of others. This property of the theories I will investigate is a desirable one; there is good reason to suppose that the correct theory of universal grammar in the sense of this discussion (henceforth: UG) will be of this sort. Of course, it raises difficulties in research, in that consequences are often unforeseen and what appear to be improvements in one area may turn out to raise problems elsewhere.
The path that I will tentatively select through the maze of possibilities, sometimes rather arbitrarily, is likely to prove the wrong one, in which case I will try to unravel the effects and take a different turning as we proceed. I will be concerned here primarily to explore a number of possibilities within a certain system of leading ideas, rather than to present a specific realization of them in a systematic manner as an explicit theory of UG.
Let us recall the basic character of the problem we face. The theory of UG must meet two obvious conditions. On the one hand, it must be compatible with the diversity of existing (indeed, possible) grammars. At the same time, UG must be sufficiently constrained and restrictive in the options it permits so as to account for the fact that each of these grammars develops in the mind on the basis of quite limited evidence. In many cases that have been carefully studied in recent work, it is a near certainty that fundamental properties of the attained grammars are radically underdetermined by evidence available to the language learner and must therefore be attributed to UG itself.
These are the basic conditions of the problem. What we expect to find, then, is a highly structured theory of UG based on a number of fundamental principles that sharply restrict the class of attainable grammars and narrowly constrain their form, but with parameters that have to be fixed by experience. If these parameters are embedded in a theory of UG that is sufficiently rich in structure, then the languages that are determined by fixing their values one way or another will appear to be quite diverse, since the consequences of one set of choices may be very different from the consequences of another set; yet at the same time, limited evidence, just sufficient to fix the parameters of UG, will determine a grammar that may be very intricate and will in general lack grounding in experience in the sense of an inductive basis. Each such grammar will underlie judgments and understanding and will enter into behavior. But the grammar - a certain system of knowledge - is only indirectly related to presented experience, the relation being mediated by UG.
What seems to me particularly exciting about the present period in linguistic research is that we can begin to see the glimmerings of what such a theory might be like. For the first time, there are several theories of UG that seem to have the right general properties over an interesting domain of fairly complex linguistic phenomena that is expanding as inquiry into these systems proceeds. That is something relatively new and quite important, even though surely no one expects that any of these current proposals are correct as they stand or perhaps even in general conception.
The approaches to UG that seem to me most promising fall within the general framework of the so-called "Extended Standard Theory." Each such approach assumes that the syntactic component of the grammar generates an infinite set of abstract structures - call them "S-structures" - that are assigned a representation in phonetic form (PF) and in LF (read: "logical form," but with familiar provisos³).
The theory of UG must therefore specify the properties of (at least) three systems of representation - S-structure, PF, LF - and of three systems of rules: the rules of the syntactic component generating S-structures, the rules of the PF-component mapping S-structures to PF, and the rules of the LF-component mapping S-structure to LF. Each expression of the language determined by the grammar is assigned representations at these three levels, among others.
Note that the central concept throughout is "grammar," not "language." The latter is derivative, at a higher level of abstraction from actual neural mechanisms; correspondingly, it raises new problems. It is not clear how important these are, or whether it is worthwhile to try to settle them in some principled way.⁴
The empirical considerations that enter into the choice of a theory of PF and LF fall into two categories: grammar-internal and grammar-external. In the first category, we ask how particular assumptions about PF and LF relate to the rules and principles of grammar; in the second, we ask how such assumptions bear on the problem of determining physical form, perceptual interpretation, truth conditions, and other properties of utterances, through interaction of PF, LF and other cognitive systems.
I will have little to say about PF here. Assume it to be some standard form of phonetic representations with labelled bracketing, what I will refer to as "surface structure," adopting one of the several uses of this term. The nature of S-structure and LF, and the rules of grammar determining and relating them, will be the central focus of my concerns here.
UG consists of interacting subsystems, which can be considered from various points of view. From one point of view, these are the various subcomponents of the rule system of grammar. From another point of view, which has become increasingly important in recent years, we can isolate subsystems of principles. I will assume that the subcomponents of the rule system are the following:
(1) (i) lexicon
    (ii) syntax
        (a) categorial component
        (b) transformational component
    (iii) PF-component
    (iv) LF-component
The lexicon specifies the abstract morpho-phonological structure of each lexical item and its syntactic features, including its categorial features and its contextual features. The rules of the categorial component meet some variety of X-bar theory. Systems (i) and (iia) constitute the base. Base rules generate D-structures (deep structures) through insertion of lexical items into structures generated by (iia), in accordance with their feature structure. These are mapped to S-structure by the rule Move-α, leaving traces coindexed with their antecedents; this rule constitutes the transformational component (iib), and may also appear in the PF- and LF-components. Thus the syntax generates S-structures which are assigned PF- and LF-representations by components (iii) and (iv) of (1), respectively. Some properties of these systems and some alternative approaches will be considered below, but a good deal will be presupposed from the published literature.
The subsystems of principles include the following:
(2) (i) bounding theory
    (ii) government theory
    (iii) θ-theory
    (iv) binding theory
    (v) Case theory
    (vi) control theory
Bounding theory poses locality conditions on certain processes and related items. The central notion of government theory is the relation between the head of a construction and categories dependent on it. θ-theory is concerned with the assignment of thematic roles such as agent-of-action,
etc. (henceforth: θ-roles). Binding theory is concerned with relations of anaphors, pronouns, names and variables to possible antecedents. Case theory deals with assignment of abstract Case and its morphological realization. Control theory determines the potential for reference of the abstract pronominal element PRO. Properties of these systems will be developed as we proceed.
These subsystems are closely related in a variety of ways. I will suggest that binding and Case theory can be developed within the framework of government theory, and that Case and θ-theory are closely interconnected. Certain notions, such as c-command, seem to be central to several of these theories. Furthermore, the subsystems of (1) and (2) interact: e.g., bounding theory holds of the rule Move-α (i.e., of antecedent-trace relations) but not of other antecedent-anaphor relations of binding and control theory.
Each of the systems of (1) and (2) is based on principles with certain possibilities of parametric variation. Through the interaction of these systems, many properties of particular languages can be accounted for. We will see that there are certain complexes of properties typical of particular types of language; such collections of properties should be explained in terms of the choice of parameters in one or another subsystem. In a tightly integrated theory with fairly rich internal structure, change in a single parameter may have complex effects, with proliferating consequences in various parts of the grammar. Ideally, we hope to find that complexes of properties differentiating otherwise similar languages are reducible to a single parameter, fixed in one or another way. For analogous considerations concerning language change, see Lightfoot (1979).
A valid observation that has frequently been made (and often, irrationally denied) is that a great deal can be learned about UG from the study of a single language, if such study achieves sufficient depth to put forth rules or principles that have explanatory force but are underdetermined by evidence available to the language learner. Then it is reasonable to attribute to UG those aspects of these rules or principles that are uniformly attained but underdetermined by evidence. Similarly, study of closely related languages that differ in some clustering of properties is particularly valuable for the opportunities it affords to identify and clarify parameters of UG that permit a range of variation in the proposed principles. Work of the past several years on the Romance languages, some of which will be discussed below, has exploited these possibilities quite effectively. Ultimately, one hopes of course that it will be possible to subject proposals concerning UG to a much broader test so as to determine both their validity and their range of parametric variation, insofar as they are valid. Since these proposals concern properties of grammars apart from empirical generalizations, which should be regarded as facts to be explained rather than part of a system of explanatory principles of UG, it is possible to put them to the test only to the extent that we have grammatical descriptions that are reasonably compelling in some domain, a point of logic that some find distasteful, so the literature indicates.
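As a rough illustration of how the rule components listed in (1) feed one another, the following Python sketch traces a single toy derivation: the base yields a D-structure, Move-α maps it to an S-structure with a coindexed trace, and the S-structure is then handed separately to a PF-component and an LF-component. Everything in it is an assumption made for illustration only: the names `Derivation`, `base`, `move_alpha`, `pf_component` and `lf_component`, the flat tuples of words standing in for syntactic structures, the `t_1` trace notation, and the simplification that PF drops traces while LF keeps the S-structure unchanged. None of the principles in (2) is modeled.

```python
from dataclasses import dataclass

TRACE = "t_1"  # hypothetical notation for a trace coindexed with its antecedent


@dataclass(frozen=True)
class Derivation:
    """The representations assigned to one expression (toy word tuples)."""
    d_structure: tuple
    s_structure: tuple
    pf: str
    lf: tuple


def base(lexical_items):
    """(i) + (iia): the base yields a D-structure from lexical items."""
    return tuple(lexical_items)


def move_alpha(d_structure, moved):
    """(iib): front `moved`, leaving a coindexed trace in its original position."""
    i = d_structure.index(moved)
    remainder = d_structure[:i] + (TRACE,) + d_structure[i + 1:]
    return (moved + "_1",) + remainder


def pf_component(s_structure):
    """(iii): toy PF mapping; assumes traces and indices are not pronounced."""
    words = [w[:-2] if w.endswith("_1") else w
             for w in s_structure if w != TRACE]
    return " ".join(words)


def lf_component(s_structure):
    """(iv): toy LF mapping; assumes LF keeps the antecedent-trace relation."""
    return s_structure


def derive(lexical_items, moved):
    """Run the components of (1) in order on one expression."""
    d = base(lexical_items)
    s = move_alpha(d, moved)
    return Derivation(d, s, pf_component(s), lf_component(s))


result = derive(["you", "saw", "who"], moved="who")
print(result.s_structure)  # ('who_1', 'you', 'saw', 't_1')
print(result.pf)           # who you saw
```

The point of the sketch is only the shape of the mapping: PF and LF are both computed from S-structure and not from one another, so nothing in `pf_component` can affect the output of `lf_component`.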
In early work in generative grammar it was assumed, as in traditional grammar, that there are rules such as "passive," "relativization," "question-formation," etc. These rules were considered to be decomposable into more fundamental elements: elementary transformations that can compound in various ways, and structural conditions (in the technical sense of transformational grammar) that are themselves formed from more elementary constituents. In subsequent work, in accordance with the sound methodological principle of reducing the range and variety of possible grammars to the minimum, these possibilities of compounding were gradually reduced, approaching the rule Move-α as a limit. But the idea of decomposing rules such as "passive," etc., remained, though now interpreted in a rather different way. These "rules" are decomposed into the more fundamental elements of the subsystems of rules and principles (1) and (2). This development, largely in work of the past ten years, represents a substantial break from earlier generative grammar, or from the traditional grammar on which it was in part modelled. It is reminiscent of the move from phonemes to features in the phonology of the Prague school, though in the present case the "features" (e.g., the principles of Case, government and binding theory) are considerably more abstract and their properties and interaction much more intricate. The notions "passive," "relativization," etc., can be reconstructed as processes of a more general nature, with a functional role in grammar, but they are not "rules of grammar." We need not expect, in general, to find a close correlation between the functional role of such general processes and their formal properties, though there will naturally be some correlation. Languages may select from among the devices of UG, setting the parameters in one or another way, to provide for such general processes as those that were considered to be specific rules in earlier work. At the same time, phenomena that appear to be related may prove to arise from the interaction of several components, some shared, accounting for the similarity. The full range of properties of some construction may often result from interaction of several components, its apparent complexity reducible to simple principles of separate subsystems. This modular character of grammar will be repeatedly illustrated as we proceed.
When the parameters of UG are fixed in one of the permitted ways, a particular grammar is determined, what I will call a "core grammar." In a highly idealized picture of language acquisition, UG is taken to be a characterization of the child's pre-linguistic initial state. Experience - in part, a construct based on internal state given or already attained - serves to fix the parameters of UG, providing a core grammar, guided perhaps by a structure of preferences and implicational relations among the parameters of the core theory. If so, then considerations of markedness enter into the theory of core grammar.
But it is hardly to be expected that what are called "languages" or "dialects" or even "idiolects" will conform precisely or perhaps even
very closely to the systems determined by fixing the parameters of UG. This could only happen tinder idealized conditions that are never realized in fact in the real world of heterogeneous speech communities. Further more, each actual "language" will incorporate a periphery of borrowings, historical residues, inventions, and so on, which we can hardly expect to and indeed would not want to - incorporate within a principled theory of UG. For such reasons as these, it is reasonable to supposethat UG determines a set of core grammars and that what is actually represented in the mind of an individual even under the idealization to a homogeneous speech c:omniunity would be. a core grammar with a periphery of marked elements and constructions. 6 Viewed against the reality of what a particular person may have inside his head, core grammar is an idealization. From another point of view, what a particular person has inside his head is an artifact resulting from the interplay of many idiosyncratic factors, as contrasted with the more signifi cant reality of UG (an element of shared biological endowment) and core grammar (one of the systems derived by fixing the parameters of UG in one of the permitted ways). We would expect the individually-represented artifact to depart from core grammar in two basic respects: (1) because of the heterogeneous character of actual experience in real speech communities; (2) because of the distinction between core and periphery. The two respects are related, but distinguishable. Putting aside the first factor -i.e., assuming the ideal ization to a homogeneous speech community7 - outside the domain of core grammar we do not expect to find chaos. Marked structures have to be learned on the basis of slender evidence too, so there should be further structure to the system outside of core grammar. 
We might expect that the structure of these further systems relates to the theory of core grammar by such devices as relaxing certain conditions of core grammar, processes of analogy in some sense to be made precise, and so on, though there will presumably be independent structure as well: hierarchies of accessibility, etc. Some examples will be discussed below; see also the references of note 6, and much additional work. These should be fruitful areas of research, increasingly so, as theories of core grammar are refined and elaborated. Returning to our idealized - but not unrealistic - theory of language acquisition, we assume that the child approaches the task equipped with UG and an associated theory of markedness that serves two functions: it imposes a preference structure on the parameters of UG, and it permits the extension of core grammar to a marked periphery. Experience is necessary to fix the values of parameters of core grammar. In the absence of evidence to the contrary, unmarked options are selected. Evidence to the contrary or evidence to fix parameters may in principle be of three types: (1) positive evidence (SVO order, fixing a parameter of core grammar; irregular verbs, adding a marked periphery); (2) direct negative evidence (corrections by the speech community); (3) indirect negative
evidence - a not unreasonable acquisition system can be devised with the operative principle that if certain structures or rules fail to be exemplified in relatively simple expressions, where they would be expected to be found, then a (possibly marked) option is selected excluding them in the grammar, so that a kind of "negative evidence" can be available even without corrections, adverse reactions, etc. There is good reason to believe that direct negative evidence is not necessary for language acquisition,8 but indirect negative evidence may be relevant.9 We would expect the order of appearance of structures in language acquisition to reflect the structure of markedness in some respects, but there are many complicating factors: e.g., processes of maturation may be such as to permit certain unmarked structures to be manifested only relatively late in language acquisition, frequency effects may intervene, etc. It is necessary to exercise some care in interpreting order of appearance. For example, it has been observed that children acquire such structures as "John wants to go" before "John wants Bill to go," and that they do not make such errors as *"John tries Bill to win" (cf. "John tries to win"). It has sometimes been argued that such facts support the conclusion that there is a multiple lexical categorization for such verbs as want, namely, as taking either a VP or a clausal complement, the latter subcategorization perhaps more marked. In fact, there is evidence that the V-VP alternative is the unmarked case for surface structure, but this does not bear on the question of multiple subcategorization. Rather, it relates to a very different question: namely, the correct analysis of the surface structure V-VP at D-structure, S-structure and LF.
I will argue later that this is a structure of the form V-clause at D- and S-structure and at LF, where the clause is invariably of the form NP-VP, with NP = PRO (the empty pronominal element) as an unmarked option in these cases. If so, then the order of acquisition is quite compatible with the preferable assumption that there is only a single categorization: want-clause.10 How do we delimit the domain of core grammar as distinct from marked periphery? In principle, one would hope that evidence from language acquisition would be useful with regard to determining the nature of the boundary or the propriety of the distinction in the first place, since it is predicted that the systems develop in quite different ways. Similarly, such evidence, along with evidence derived from psycholinguistic experimentation, the study of language use (e.g., processing), language deficit, and other sources should be relevant, in principle, to determining the properties of UG and of particular grammars. But such evidence is, for the time being, insufficient to provide much insight concerning these problems. We are therefore compelled to rely heavily on grammar-internal considerations and comparative evidence, that is, on the possibilities for constructing a reasonable theory of UG and considering its explanatory power in a variety of language types, with an eye open to the eventual possibility of adducing evidence of other kinds. Any theory - in particular, a theory of UG - may be regarded ideally
as a set of concepts and a set of theorems stated in terms of these concepts. We may select a primitive basis of concepts in terms of which the others are definable, and an axiom system from which the theorems are derivable. While it is, needless to say, much too early to hope for a realistic proposal of this sort in the case of UG,11 nevertheless, it is perhaps useful to take note of some of the conditions that such a theory should satisfy. In the general case of theory construction, the primitive basis can be selected in any number of ways, so long as the condition of definability is met, perhaps subject to conditions of simplicity of some sort.12 But in the case of UG, other considerations enter. The primitive basis must meet a condition of epistemological priority. That is, still assuming the idealization to instantaneous language acquisition, we want the primitives to be concepts that can plausibly be assumed to provide a preliminary, pre-linguistic analysis of a reasonable selection of presented data, that is, to provide the primary linguistic data that are mapped by the language faculty to a grammar; relaxing the idealization to permit transitional stages, similar considerations still hold.13 It would, for example, be reasonable to suppose that such concepts as "precedes" or "is voiced" enter into the primitive basis, and perhaps such notions as "agent-of-action" if one believes, say, that the human conceptual system permits analysis of events in these terms independently of acquired language. But it would be unreasonable to incorporate, for example, such notions as "subject of a sentence" or other grammatical relations within the class of primitive notions, since it is unreasonable to suppose that these notions can be directly applied to linguistically unanalyzed data. Rather, we would expect that such notions would be defined in UG in terms of a primitive basis that meets the condition of epistemological priority.
The definition might be complex. For example, it might involve some interaction of syntactic configurations, morphology, and θ-roles (e.g., the grammatical subject is the (usual) agent of an action and the direct object the (usual) patient), where the terms that enter into these factors are themselves reducible to an acceptable primitive basis.14 Again, an effort to develop a principled theory of UG is surely premature, but considerations of this sort are nevertheless not out of place. They indicate that we should, for example, be wary of hypotheses that appear to assign to grammatical relations too much of an independent status in the functioning of rule systems. I will return to some examples. Since virtually the origins of contemporary work on generative grammar, a major concern has been to restrict the class of grammars made accessible in principle by UG, an obvious desideratum if UG is to attain explanatory adequacy, or, to put the same point differently, if UG is to account for the fact that knowledge of language is acquired on the basis of the evidence available. The problem can be viewed in a slightly different light when we distinguish between core grammar and marked periphery. Consider the theory of core grammar, assuming it to be decomposable into
the subsystems of rules of (1). It is reasonable to suppose that the rules (iv) of the LF-component do not vary substantially from language to language, and that such variety as may exist is determined by other elements of the grammar; the language learner, after all, has little direct evidence bearing on the character of these rules. While there is variety among the systems associating S-structure and phonetic form (iii), it is plausible to assume that this variety falls within finite bounds. X-bar theory permits only a finite class of possible base systems (ib), and the transformational component, consisting of the single rule Move-α, admits at most a finite degree of parametric variation (perhaps, choice of α, or specification of landing sites in the sense of Baltin (1978, 1979)). The lexicon allows for infinite variety only in the trivial sense that there may be no finite bound on the length of words and morphemes; subcategorization frames and the like are narrowly limited in variety. If these assumptions are correct, then UG will make available only a finite class of possible core grammars, in principle. That is, UG will provide a finite set of parameters, each with a finite number of values, apart from the trivial matter of the morpheme or word list, which must surely be learned by direct exposure for the most part. Depending on the nature of the theory of markedness, there may or may not be an infinite class of possible grammars, but this is an essentially uninteresting question in this connection, since marked constructions will be added by direct evidence (or indirect negative evidence), and can thus proliferate only slowly, raising no questions of principle. The conclusion that only a finite number of core grammars are available in principle has consequences for the mathematical investigation of generative power and of learnability.15 In certain respects, the conclusion trivializes these investigations.
This is evident in the case of the major questions of mathematical linguistics, but it is also true of certain problems of the mathematical theory of learnability, under certain reasonable additional assumptions. Suppose that UG permits exactly n grammars. No matter how "wild" the languages characterized by these grammars may be, it is quite possible that there exists a finite set of sentences S such that systematic investigation of S will suffice to distinguish the n possible grammars. For example, let S be the set of all sentences of less than 100 words in length (a well-defined notion of UG, if the set of possible words is characterized). Then it might be that for each of the n possible grammars there is a decision procedure for these "short" sentences (even if the grammars lack decision procedures in general) that enables the n grammars to be differentiated in S. The grammars may generate non-recursive sets, perhaps quite "crazy" sets, but the craziness will not show up for short sentences, which suffice to select among these grammars. Under this assumption, so-called "language learning" - i.e., selection of a grammar on the basis of finite data - will be possible even if the languages characterized by the grammars have very strange properties.
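The reasoning just given can be illustrated with a toy computation. The Python sketch below is only an illustration under stated assumptions: the two binary parameters (`head_first`, `pro_drop`), the three-symbol vocabulary, and the decision procedure `accepts` are hypothetical and are not proposals made in the text. It enumerates the finitely many grammars that the parameters allow and checks whether a finite set of short sentences suffices to tell them apart.

```python
from itertools import product

# Hypothetical toy parameters (assumptions for illustration, not from the text).
PARAMETERS = {"head_first": (True, False), "pro_drop": (True, False)}
VOCABULARY = ("S", "V", "O")  # subject, verb, object


def grammars():
    """Enumerate every grammar: one value per parameter, so finitely many."""
    names = sorted(PARAMETERS)
    return [dict(zip(names, values))
            for values in product(*(PARAMETERS[n] for n in names))]


def accepts(grammar, sentence):
    """Toy decision procedure, assumed to exist for short sentences."""
    vp = ("V", "O") if grammar["head_first"] else ("O", "V")
    if sentence == ("S",) + vp:
        return True
    # A subjectless clause is accepted only when pro_drop is set.
    return grammar["pro_drop"] and sentence == vp


def short_sentences(max_len):
    """All strings over VOCABULARY of length 1..max_len (the finite set S)."""
    return [s for n in range(1, max_len + 1)
            for s in product(VOCABULARY, repeat=n)]


def signature(grammar, sample):
    """The subset of the sample that the grammar accepts."""
    return frozenset(s for s in sample if accepts(grammar, s))


def distinguishes(sample):
    """True if no two grammars agree on every sentence in the sample."""
    sigs = [signature(g, sample) for g in grammars()]
    return len(set(sigs)) == len(sigs)


def select(target, sample):
    """Grammars consistent with the target's judgments on the sample."""
    judged = signature(target, sample)
    return [g for g in grammars() if signature(g, sample) == judged]
```

In this toy setting the two parameters yield four grammars. Sentences of at most two symbols do not separate them, while sentences of at most three symbols do, so `select` then returns exactly one grammar for any target. Nothing in the sketch depends on how the grammars behave on longer sentences, which is the point of the argument.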
Note that the assumption is not unrealistic, surely no more unrealistic than others standard in the theory of learnability. Thus, contrary to what has often been alleged, there is no conceptual connection between recursiveness and "learnability" (in any empirically significant sense of the latter term) - which is not to deny that one might construct some set of conditions under which a connection could be established, a different matter.16 Hence certain questions of the mathematical theory of learnability are trivialized by the assumption that UG permits only a finite set of grammars, under plausible additional assumptions, such as those just mentioned. Note also that one should be very wary of arguments purporting to favor one linguistic theory over another on grounds of alleged problems concerning learnability. On fairly reasonable assumptions, these questions simply do not arise. But even if correct, the finiteness assumption for UG does not show that investigations in mathematical linguistics or the theory of learnability for infinite classes of grammars are pointless. Rather, it indicates that they proceed at a certain level of idealization, eliminating from consideration the properties of UG that guarantee finiteness of the system of core grammar. One may then ask whether the infinite set of core grammars available under this abstraction from UG is learnable in some technical sense of the word, or what the properties may be of the class of generated languages. Work conducted at this level of idealization might prove to have empirical consequences in an indirect but perhaps significant way. For example, abstracting from properties of UG that guarantee finiteness, the set of possible grammars and languages, now infinite, might be unlearnable in some technical sense. If, on the contrary, the set turns out to be learnable under this idealization, then this is a possibly interesting though rather abstract empirical discovery about the properties of UG.
Furthermore, even for finite classes of grammars that are learnable for a reasonable finite potential data base, significant questions arise in the theory of learnability, specifically, questions relating to the bounds on complexity of sentences that suffice for selection of grammar.17 Similar observations hold of the study of the power of various theories. There have been many allusions in the literature to this topic since the appearance of the badly-misunderstood work of Peters and Ritchie (1973) on the generative power of the theory of transformational grammar. In essence, they showed that in a highly unconstrained theory of transformational grammar, a particular condition on rule application (the survivor property, which, Peters independently argued, was empirically well motivated; cf. Peters (1973)) guarantees that only recursive sets are generated, whereas without this condition any recursively enumerable set can be generated by some grammar. The common misunderstanding is that "anything goes," since without the survivor property or some comparable constraint all recursively enumerable languages have transformational grammars. In fact, the questions do not even arise except under the idealization just noted if UG permits only a finite class of grammars. It might,
for example, turn out that these grammars characterize languages that are not recursive or even not recursively enumerable, or even that they do not generate languages at all without supplementation from other faculties of mind, but nothing of much import would necessarily follow, contrary to what has often been assumed.18 It is worth asking whether the correct theory of UG does in fact permit only a finite number of core grammars. The theories that are being studied along the general lines I will be discussing here do have that property, and I think that it probably is the right property. It may also be worthwhile to investigate the properties of UG under an idealization that would permit an infinite class of grammars, but care must be exercised in considering the implications of results attained in such investigation. One must guard against other fallacies. Early work in transformational grammar permitted a very wide choice of base grammars and of transformations. Subsequent work attempted to reduce the class of permissible grammars by formulating general conditions on rule type, rule application, or output that would guarantee that much simpler rule systems, lacking detailed specification to indicate how and when rules apply, would nevertheless generate required structures properly. For example, X-bar theory radically reduces the class of possible base components, various conditions on rules permit a reduction in the category of permitted movement rules, and conditions on surface structure, S-structure and LF allow still further simplification of rules and their organization. Such reductions in the variety of possible systems are obviously welcome, as contributions to explanatory adequacy. But it is evident that a reduction in the variety of systems in one part of the grammar is no contribution to these ends if it is matched or exceeded by proliferation elsewhere. Thus, considering base rules, transformations, interpretive rules mapping
the output of these systems to phonetic and logical form, and output conditions on PF and LF, it is no doubt possible to eliminate entirely the category of base systems by allowing a proliferation in the other components, or to eliminate entirely the category of transformations by enriching the class of base systems and interpretive rules.19 Shifting the variety of devices from one to another component of grammar is no contribution to explanatory adequacy. It is only when a reduction in one component is not matched or exceeded elsewhere that we have reason to believe that a better approximation to the actual structure of mentally-represented grammar is achieved. The objective of reducing the class of grammars compatible with primary linguistic data has served as a guiding principle in the study of generative grammar since virtually the outset, as it should, given the nature of the fundamental empirical problem to be faced - namely, accounting for the attainment of knowledge of grammar - and the closely related goal of enhancing explanatory power. Other guiding ideas, while plausible in my view, are less obviously valid. It has, for example, proven quite fruitful to explore redundancies in grammatical theory, that is, cases
in which phenomena are "overdetermined" by a given theory in the sense that distinct principles (or systems of principles) suffice to account for them. To mention one example that I will consider below, the theories of Case and binding exhibit a degree of redundancy in the OB-framework in that each suffices independently to determine a substantial part of the distribution of the empty pronominal element PRO: PRO appears in positions that are not Case-marked, and from an independent point of view, in positions that are transparent (non-opaque) in the sense of binding theory; but see chapter 4, for some qualifications. In OB, this is mentioned as a problem (cf. OB, note 30); in chapter 3, I will suggest that it is to be resolved by reduction of both Case and binding theory to the more fundamental concepts of the theory of government. To mention another case, I will suggest that the *[that-trace] filter of Chomsky and Lasnik (1977) is too "strange" to be an appropriate candidate for UG and should be reduced to other more natural and more general principles (cf. Taraldsen (1978b), Kayne (1980a), Pesetsky (1978b)). Similarly, I will suggest that the two binding principles of the OB-system - the (Specified) Subject Condition SSC and the Nominative Island Condition NIC - are implausible because of their form, and should be reduced to more reasonable principles. Much recent work is motivated by similar concerns.
This approach, which has often proven fruitful in the past and is, I believe, in the cases just mentioned as well, is based on a guiding intuition about the structure of grammar that might well be questioned: namely, that the theory of core grammar, at least, is based on fundamental principles that are natural and simple, and that our task is to discover them, clearing away the debris that faces us when we explore the varied phenomena of language and reducing the apparent complexity to a system that goes well beyond empirical generalization and that satisfies intellectual or even esthetic standards. These notions are very vague, but not incomprehensible, or even unfamiliar: the search for symmetry in the study of particle physics is a recent example; the classical work of the natural sciences provides many others. But it might be that this guiding intuition is mistaken. Biological systems - and the faculty of language is surely one - often exhibit redundancy and other forms of complexity for quite intelligible reasons, relating both to functional utility and evolutionary accident. To the extent that this proves true of the faculty of language, then the correct theory of UG simply is not in itself an intellectually interesting theory, however empirically successful it may be, and the effort to prove otherwise will fail. Much research on language has been guided by the belief that the system is fairly chaotic, or that language is so intertwined with other aspects of knowledge and belief that it is a mistake even to try to isolate a faculty of language for separate study. Qualitative considerations based on "poverty of the stimulus" arguments, such as those mentioned above and considered in more detail elsewhere, strongly suggest that this picture is not generally
correct, but it might prove to be correct for large areas of what we think of as phenomena of language. In the cases just mentioned, for example, it could turn out to be the case that the redundancies simply exist, that odd and very special properties such as the *[that-trace] filter or the two binding conditions of the OB system are simply irreducible and must be stipulated with UG, or that these principles are already too abstract and that we must be satisfied with superficial empirical generalizations. It is pointless to adopt a priori assumptions concerning these matters, though one's intuitive judgments will, of course, guide the course of inquiry and the choice of topics that one thinks merit careful investigation. The approach I will pursue here can be justified only in terms of its success in unearthing a more "elegant" system of principles that achieves a measure of explanatory success. To the extent that this aim is achieved, it is reasonable to suppose that the principles are true, that they in fact characterize the language faculty, since it is difficult to imagine that such principles should merely hold by accident of a system that is differently constituted. Considerations of this sort are taken for granted, generally implicitly, in rational inquiry (for example, in the more advanced sciences), and there is little reason to question them in the present context, though it is quite appropriate to do so elsewhere; specifically, in the context of general epistemology and metaphysics.20 But it is worth bearing in mind that this class of rather vague methodological guidelines has a rather different status, and much less obvious validity, than the search for more restrictive theories of UG, which is dictated by the very nature of the problem faced in the study of UG. It is quite possible to distinguish between these concerns.
For example, a theory of UG with redundancies and inelegant stipulations may be no less restrictive than one that overcomes these conceptual defects. Insofar as we succeed in finding unifying principles that are deeper, simpler and more natural, we can expect that the complexity of argument explaining why the facts are such-and-such will increase, as valid (or, in the real world, partially valid) generalizations and observations are reduced to more abstract principles. But this form of complexity is a positive merit of an explanatory theory, one to be valued and not to be regarded as a defect in it. It is a concomitant of what Moravcsik (1980) calls "deep" as opposed to "shallow" theories of mind, and is an indication of success in developing such theories. It is important to distinguish clearly between complexity of theory and complexity of argument, the latter tending to increase as theory becomes less complex in the intuitive sense. There is little point in dwelling on these matters, though I think it is perhaps useful to bear them in mind, particularly, if one hopes to make sense of current tendencies in the study of language. It is not difficult, I think, to detect the basic difference in attitude just sketched in work of the past years, and my personal feeling is that it may become still more evident in the future, as questions of the sort just briefly mentioned come more to the fore, as I expect they will.
Notes
1. Henceforth, I will capitalize the word "Case" when used in its technical sense, along lines suggested originally by Jean-Roger Vergnaud. Cf. OB, Rouveret and Vergnaud (1980), and Vergnaud (forthcoming). See also Babby (1980), van Riemsdijk (1980).
2. See, for example, the references in chapters 4, 5, below.
3. See Chomsky (1980b,c) for some discussion.
4. See Chomsky (1980b,c) for further discussion. See also the final remarks of chapter 1, Chomsky (1965).
5. For some intriguing recent ideas on this subject, see Halle and Vergnaud (1980).
6. Cf. Kean (1975), van Riemsdijk (1978b), George (1980); also many papers in Belletti, Brandi and Rizzi (forthcoming).
7. On the legitimacy of this idealization and the implausible consequences of rejecting it, see Chomsky (1980b, chapter 1).
8. On this matter, see Wexler and Culicover (1980). See also Baker (1979), Lasnik (1979).
9. For a concrete example illustrating this possibility, involving stylistic inversion and the so-called "pro-drop parameter," see Rizzi (1980b).
10. For discussion, see Rizzi, ibid.
11. Cf. Chomsky (1955) for an early, and no doubt premature effort in this direction. For further discussion of the issue, see Chomsky (1965, chapter 1) and (1977a, chapter 1); also Baker (1979), Wexler and Culicover (1980), and other related work.
12. See Goodman (1951).
13. On this matter, see Chomsky (1975, chapter 3).
14. Cf. Marantz (1981), where it is suggested that broader considerations (including developmental factors) are involved. One might, perhaps, take a different tack, and suppose that these or other notions are primitives, linked to notions that meet the condition of epistemological priority by postulates that do not suffice to guarantee definability.
The consequence is indeterminacy of the choice of grammar when the extension of each of the primitives meeting the condition of epistemological priority is fixed. The less fully such notions as those of the theory of grammatical relations, for example, are reducible to primitives meeting this condition, the greater the indeterminacy of grammars selected on the basis of primary linguistic data. There is, however, little reason to suppose that such indeterminacy exists beyond narrow bounds. Insofar as this is true, we should be skeptical about theories with a primitive basis containing concepts that cannot plausibly be assumed to enter into the determination of the primary linguistic data, or about unrealistic assumptions concerning such data (cf. references of note 8). Again, more complex versions of these considerations apply if we turn to accounts of language acquisition that proceed beyond the idealization to instantaneous acquisition, say, along the lines discussed in Chomsky (1975, chapter 3), Marantz (op.cit.).
15. On the latter topic, see Wexler and Culicover (1980) and material reviewed there. Also Pinker (1979).
16. On the question of recursiveness and learnability, see Levelt (1974), Lasnik (1979, 1980), R. Matthews (1979). Also Chomsky (1980b, chapter 3).
17. See Wexler and Culicover, op.cit., for interesting work on this topic.
18. See Chomsky (1965, p. 62), and the references of note 16.
19. As was done, for example, in the earliest generative grammar in the modern sense, which had as its syntactic component a phrase structure grammar (which could perfectly well have been presented as a context-free grammar) with indices to express interrelations among scattered parts of syntactic structures (Chomsky, 1951). The class of such grammars (which generate context-free languages, a fact of minimal significance) is extremely rich in descriptive power and quite uninteresting for this among other reasons.
It was an advance when subsequent work showed that these powerful, but clumsy and unrevealing systems, could be factored into two components (base and transformational), each with quite natural properties.
20. See the references cited in the preface for further discussion.
Chapter 2
Subsystems of core grammar
Let us now turn to the theory of UG, concentrating on core grammar. In chapter 1, several subsystems of rules and of principles were identified (cf. 1.(1), 1.(2)). We now turn to some properties of these. Since the subsystems interact so closely and are so interdependent, perhaps the most satisfactory procedure will be to develop them more or less in parallel, beginning with a first approximation to each and then turning to further refinements, even at the cost of some redundancy and some internal conflict as early approximations are modified.
2.1. Levels of representation
Consider first the subsystems of rules 1.(1). At the most general level, assume UG to have three fundamental components, organized as in (1):
(1)          syntax
               |
          S-structure
           /       \
         PF         LF
The rules of the syntax generate S-structures. One system of interpretive rules, those of the PF-component, associates S-structures with representations in phonetic form (PF); another system, the rules of the LF-component, associates S-structures with representations in "logical form" (LF), where it is understood that the properties of LF are to be determined empirically and not by some extrinsic concern such as the task of determining ontological commitment or formalizing inference; the term "LF" is intended to suggest - no more - that in fact, the representations at this level have some of the properties of what is commonly called "logical form" from other points of view.1 At the most general level of description, the goal of a grammar is to express the association between representations of form and representations of meaning. The system (1) embodies certain assumptions about the nature of this association: namely, that it is mediated by a more abstract S-structure and that the mappings of S-structure onto PF and LF are independent of one another.
The system (1), when its elements are specified, will be a theory of UG, of the language faculty in a narrow sense of this term. It is reasonable to suppose that the representations PF and LF stand at the interface of grammatical competence, one mentally represented system, and other systems: the conceptual system, systems of belief, of pragmatic competence, of speech production and analysis, and so on. Any particular theory of UG, say, a specification of (1), involves empirical assumptions of a somewhat abstract sort concerning legitimacy of idealization, assumptions that may prove incorrect but are unavoidable if we hope to gain some understanding of a system as complex as the language faculty, and more generally, the human mind. I will put these questions aside, and continue with the investigation of (1) in a way that embodies and sharpens one system of assumptions about these questions.2 I assume further that each of the components of (1) - the syntax, the PF-rules and the LF-rules - include rules of the form Move-α, where α is some category, their exact nature and properties to be determined. Thus, among the PF-rules there may be rules of movement, rearrangement, etc., which are sometimes called "stylistic rules"; among the LF-rules is the rule of quantifier movement QR (quantifier rule)3; and in the syntax there is the single rule Move-α that constitutes the transformational component 1.(1iib). For reasons already noted, it is reasonable to suppose that the LF-rules are subject to very little variation among languages, but the option of employing Move-α may or may not be taken in each of the three subsystems, and if taken, may be subject to some parametric variation. Turning first to the syntax, apart from the rule Move-α, which is a potential element of each of the three components, it consists of a base which in turn consists of a categorial component and a lexicon.
The base gene rates D-structures (deep structures) which are associated with S-structure by the rule Move-a. To clarify terminology, I will use the term "surface structure" in something like its original sense, referring to the actual labelled brack�ting of an expression at the level PF. Changing usage over the years has giv.en.rise to a fair degree of confusion� The term "surface · structure" has bee�· ilsed in much recent work to refer to a more abstract representation than th6 ·ac�ual labelled bracketing of an expression, the earlier sense of "surface sttticture," to which I return here. Let us consider now some properties of representations at the levels of surface structure (PF), LF and S-structure. Consider first the sentences (2): ·
·
(2) ·
(i) the students prefer for Bill to visit Paris (ii) ·the students prefer that Bill visit P�rls .
The verb prefer, as an inherent lexical property, takes a clausal complement which has a subject NP and a predicate VP (Bill and visit Paris, respectively, in (2)), and an element (call it "INFL," suggesting "inflection") indicating in particular whether the clause is finite or infinitival. Suppressing the distinction between indicative and subjunctive, we will say that INFL has the values [±Tense], where [+Tense] stands for finite and [-Tense] for infinitival. Following Bresnan (1970, 1972), we assume that a clause (S̄) consists of a complementizer COMP and a propositional component (S); the latter is analyzed as NP-INFL-VP at LF. Thus at the LF level, the sentences (2) are represented as in (3), where in the case of (2i), COMP = for and INFL = [-Tense]; and in the case of (2ii), COMP = that and INFL = [+Tense]:
(3) the students [VP prefer [S̄ COMP [S Bill INFL [VP visit Paris]]]]
The S-structure underlying (3) may be assumed to be identical with (3) in this case, so that the mapping of S-structure to LF is trivial. Similarly, the mapping of S-structure to surface structure is quite straightforward in this case.
Consider next the sentences of (4):
(4) (i) the students want to visit Paris
    (ii) the students wanna visit Paris
    (iii) the students want Bill to visit Paris
    (iv) the students want that Bill visit Paris
Sentence (4iv) is not idiomatic English, but we may assume this to be an accidental gap reflecting properties that are not part of core grammar; thus assume (4iv) to be fully grammatical at the relevant level of abstraction, as in the analogous case of (2ii) and as in languages otherwise similar to English. At the level of LF, then, (4iv) is again of the form (3), with want in place of prefer.
Turning to (4iii), it is analogous to (2i) except that it lacks the COMP for. Consideration of such examples as (5) reveals that this reflects an idiosyncratic property of the verb want in these dialects of English:
(5) (i) the students want very much for Bill to visit Paris
    (ii) what the students want is for Bill to visit Paris
Let us tentatively assume (subject to later discussion) that a rule of the PF-component deletes for directly after want in these dialects, as is also possible directly after prefer, subject to idiosyncratic variation. Then the LF-representation of (4iii) is, again, of the form (3), with want in place of prefer. Examples (4iii) and (4iv) differ at the LF-level in exactly the way that (2i) differs from (2ii), with the choice of COMP = for and INFL = [-Tense] in one case, and COMP = that and INFL = [+Tense] in the other, the regular association of these elements. The underlying S-structures are again identical to the LF-representations. Surface structures are derived by for-deletion in the case of (4iii) and other details that we need not consider.
Now consider the examples (4i) and (4ii). At the level of surface structure, assume these to be represented with the categorial structure of (6i, ii), just as (4iii) is represented with the categorial structure of (6iii):4
(6) (i) the students [α want [β to [γ visit Paris]]]
    (ii) the students [α wanna [γ visit Paris]]
    (iii) the students [VP want [S̄ [S Bill to [VP visit Paris]]]]
Recall that (iii) derives from the underlying S-structure by deletion of the COMP for. Since surface structures observe (phonological) word boundaries, (6i) and (6ii) differ as indicate VP1 being an infinitival verb phrase, as has occasionally been proposed in one or another variant. This amounts to adopting the analysis (24) in essence, replacing it by the considerably clumsier system of rules (27):
(27) (i) VP → COMP VP1
     (ii) S̄ → COMP S
     (iii) S → NP VP1 when COMP = for; S → NP VP2 otherwise
     (iv) VP1 is to-VP and VP2 is Tense-VP
Furthermore, all rules introducing embedded S will have to be extended to include VP, which has the same distribution as S, lexical idiosyncrasies apart. This rule system can be stated in various ways, but it is clearly no genuine alternative to (24), which is to be rejected in favor of the simpler (25). I will return to variants of (24), which raise still further difficulties, in 2.4. Independent and in this case direct evidence in ( i.e., subject of S1; and so on.
Given the projection principle and other properties of syntactic representation, the S-structure of (8) will be something like (9):
(9) [S1 [NP John] INFL [VP be [α1 believe [S2 t' INFL have been [α2 kill t]]]]]
We return to the status of α; each trace is a trace of John. Thus John is [NP,S1], t' is [NP,S2] and t is [NP,α2] in S-structure (where α is something like VP).
Note that the basic properties of the representation (9) are determined by the projection principle, since believe takes a clausal complement as a lexical property. It remains to show that the empty category in (9) is indeed trace, not PRO, assuming the two to be distinct.
We are now assuming, following standard practice, that two factors enter into the determination of θ-role: intrinsic lexical properties of lexical items which are heads of phrase categories (as the verb is the head of VP),
and GFs such as subject, object, clausal complement, head, etc. To assign θ-roles properly in the sentences (7), (8), for example, we must know that they is [NP,S] in (7i), underlying (7ii), that John is [NP,VP] in (7) and that its trace is [NP,α2] in (9), that kill is [V,VP] in (7) (a special case of the more general notion "head") and is head of α2 in (9), etc.
We have been assuming that S-structures are formed from D-structures by the rule Move-α and are thus in effect factored into two components: the base and the transformational component. On this assumption, D-structure is a level of representation at which the GFs relevant to assignment of θ-role and only these have arguments bearing them; henceforth, let us refer to such a GF as a GF-θ. This is the case trivially in (7), and it is true as well for (8), (9) on the assumption that John appears in the position of t in the D-structure underlying (9) and is moved successively to the position of t' and then to its S-structure position.
To summarize: a property of D-structure, following from the projection principle, is that every θ-role determined obligatorily in the D-structure must be filled by some argument with the appropriate GF, and that each argument must fill exactly one θ-role as determined by its GF. In the D-structures underlying (7) and (8), for example, each NP fills one θ-role and each θ-role is properly filled by an NP with the appropriate GF-θ. Thus, D-structure is a direct representation of GF-θ, among other properties. We may think of this property as constituting the θ-criterion for D-structures, as determined by the projection principle.
It is clear that apart from GF-θ represented at D-structure, other GFs are also relevant to LF. In the sentence (9), the trace of John bears the relation object to the abstract predicate kill t just as John bears this relation to kill John in (7), but John in (9) bears the relation [NP,S1] to the sentence (9) itself. Each of these GFs plays a role in determining properties of LF. The former determines the θ-role of John as patient or theme. To illustrate the contribution of the latter to LF, consider the sentences (10):
(10) (i) it seems to each other that they are happy
     (ii) they seem to each other to be happy
Lexical properties of seem indicate that it takes an optional to-phrase and a clausal complement, so by the projection principle and the θ-criterion, the D-structure of both sentences of (10) must be (11) (order aside):
(11) [S1 NP* INFL [VP [V seem] [PP to each other] [S2 they INFL be happy]]]
where INFL in S2 is finite ([+Tense]) in (10i) and infinitival ([-Tense]) in (10ii) and NP* is an empty NP assigned no θ-role and ultimately filled by pleonastic it or the subject of the embedded clause. In both sentences, the θ-role of they is subject of the predicate be-happy, as determined by the
GF of they as [NP,S2] in (11). But in (10ii), they can serve as antecedent for the reciprocal each other, whereas in (10i) it cannot, so that the latter sentence is ungrammatical, with an anaphor lacking an antecedent. The reason, obviously, is that in (10ii), but not (10i), they takes on a secondary GF: [NP,S1] alongside of [NP,S2]. Each of these GFs thus contributes to LF. There are many other kinds of examples illustrating the contribution of such secondary, non-thematic GFs to LF; for example, in many languages, subjects (whether thematic or not) and only subjects serve as antecedents for the reflexive element.
It seems, then, that we have two notions of GF relevant to LF: GF-θ and GF-θ̄, where the former is the notion relevant to assigning θ-role and the latter is relevant to LF (if at all) only in other ways. GF-θ is represented at D-structure; GF-θ̄ at S-structure. As noted earlier, the basic question of how surface structures are related to LF reduces to the question of the relations of S-structure to θ-role assignment. In substantial part, this is the question of how GF-θ̄ representations are related to GF-θ representations. I have been assuming so far that the answer is given by the factoring of S-structure into the two components that generate it: D-structure and Move-α.
Of course, the rule Move-α performs a range of other functions in the grammar: it may appear in the PF- and LF-components; and in the syntax, apart from associating GF-θ and GF-θ̄, it also serves to relate the quasi-quantifier in a wh-phrase to the abstract variable it binds (as in (12)), to express the fact that in (13) the subject of the predicate is here is the abstract phrase a man whom you know and that in (14) an idiomatic interpretation must be constructed involving take advantage of, along with much else:
(12) (i) who did you think would win
     (ii) for which person x, you thought [x would win]
(13) a man is here whom you know
(14) advantage was believed to have been taken of John
One central assumption of transformational grammar, adopted here, is that the rules playing the essential role in assigning GF-θ to elements of surface form are rules of the same kind that serve many other functions in grammar - indeed, the very same rule: Move-α - rather than being rules of some new and distinct type, an important generalization if correct, as I think it is.
In the case of sentence (9), we were led by the projection principle to assume that the rule Move-α applies twice, leaving the two traces t and t', successively. The original position of John indicated by t is relevant to LF by virtue of the GF-θ that it fills, and the final S-structure position of the antecedent of trace may also be relevant, as shown in (10). The GF-θ̄ filled by medial traces such as t' in (9) may also be relevant to LF; for example in the sentence (15), with a D-structure analogous to (11), where the medial trace serves as the antecedent of each other, which requires an
antecedent in the same clause in such cases in accordance with binding theory:
(15) they are likely [t' to appear to each other [t to be happy]]
To put it differently, they in (15) serves as antecedent of each other, via its trace, though it is neither the D-structure nor the S-structure subject of the clause in which each other appears, and thus is not in a position to serve as antecedent in either of these structures.
Note that there is a good sense in which S-structure represents both GF-θ and GF-θ̄. In (9), for example, the antecedent John bears the GF [NP,S1] by virtue of its actual position in (9), and bears the relations [NP,S2] and [NP,α2] by virtue of the positions of its traces t', t, respectively. Suppose we associate with each NP in S-structure a sequence (p1, ..., pn) which, in an obvious sense, represents the derivational history of this NP by successive applications of Move-α; thus p1 is the position of the NP itself; p2 is the position (filled by a trace) from which it was moved to its final position; etc., pn being the position (filled by a trace) occupied by the NP in D-structure. Correspondingly, let us associate with each NP in S-structure the sequence of GFs (GF1, ..., GFn), where GFi is the GF of the element filling position pi in the S-structure configuration: the NP itself for i = 1, a trace in each other case. If NP was base-generated, then GFn is its GF at D-structure, a GF-θ. If NP is a non-argument inserted in the course of a syntactic derivation, then GFn is the GF associated with the position in which it is inserted, a GF-θ̄. Let us call (GF1, ..., GFn) the "function chain" of the NP filling GF1.
I have defined the function chain in terms of successive applications of Move-α, but it can in fact be recovered from S-structure itself, given other properties of syntactic representations; this is a basic assumption of the Extended Standard Theory (EST) as represented in 2.1.(1), with no direct connection between D-structure and LF. We may therefore think of S-structure as an enriched D-structure incorporating the contribution of D-structure to LF, thinking still of D-structure and the rule Move-α as the two components that interact to yield the full S-structure, D-structure being a representation of GF-θ determined by abstracting from the effects of Move-α.
Returning to example (9), John is assigned the function chain (GF1, GF2, GF3), where GF1 is [NP,S1] and GF3 is [NP,α2]. Thus John acquires the θ-role assigned by kill to its object, and is the S-structure subject of the full sentence; no θ-role is assigned to the latter position, which can therefore be filled by idiom chunks, as in (14), or by non-arguments, as in (16):
(16) it was believed that John was killed
Suppose that NP has the function chain (GF1, ..., GFn) in some S-structure. Then we have the following consequences of the θ-criterion and
the projection principle:
(17) (i) if NP is an argument, then GFn is a GF-θ
     (ii) for i ≠ n, GFi is a GF-θ̄
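The two conditions in (17) can be checked mechanically once a function chain is written down. The following is only a minimal sketch under stated assumptions: the names `GF`, `satisfies_17` and the boolean flag `thematic` are hypothetical conveniences introduced here, not notation from the text, and the chain for John is transcribed from the discussion of (9).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GF:
    """A grammatical function in a chain (hypothetical encoding)."""
    label: str       # e.g. "[NP,S1]"
    thematic: bool   # True for a GF-theta, False for a non-thematic GF


def satisfies_17(chain: List[GF], is_argument: bool) -> bool:
    """Check (17i) and (17ii) on a function chain (GF1, ..., GFn)."""
    if not chain:
        return False
    *non_final, final = chain
    # (17i): for an argument, the last GF (its D-structure GF) is thematic.
    if is_argument and not final.thematic:
        return False
    # (17ii): every GF other than the last is non-thematic.
    return all(not gf.thematic for gf in non_final)


# Chain of John in (9): subject of S1, subject of S2, object of kill.
john_chain = [GF("[NP,S1]", False), GF("[NP,S2]", False), GF("[NP,a2]", True)]
# Pleonastic it in (16): a non-argument inserted in a non-thematic position.
it_chain = [GF("[NP,S1]", False)]
# An illicit chain: movement into a position that is itself theta-marked.
bad_chain = [GF("[NP,S1]", True), GF("[NP,VP]", True)]
```

On this encoding the chain of John and the one-membered chain of it both pass, while the last chain fails (17ii), which is the case the text describes as double θ-marking.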
The projection principle yields (i) directly, since it implies that the θ-criterion holds at D-structure, where each chain is of length one. For argument NP, (17ii) follows directly from (17i) by the θ-criterion, for if GFi is a GF-θ then the NP will be doubly θ-marked. If NP is a non-argument, (17ii) holds by virtue of the θ-criterion, for if GFi is a GF-θ then this non-argument will be assigned a θ-role by GFi. We return to some concrete examples in §§ 2.6, 3.2.3.
Under a revision of the notion of θ-role assignment that we will adopt in chapter 3, this derivation of (17ii) from (17i) no longer goes through for the case of NP an argument. But we can reach the same conclusion in a slightly more indirect way, which will stand under later revisions. Suppose GFi to be a GF-θ, for i ≠ n. Then by the projection principle, it is assigned an argument at D-structure, and this argument is "erased" by application of Move-α to this position. But any reasonable version of the principle of recoverability of deletion will require that arguments cannot be erased by substitution; in fact, the target of movement can only be [α e] lacking an index, a non-argument. Hence (17ii) follows from the projection principle and the principle of recoverability of deletion.
It follows that movement must always be to a position to which no θ-role is assigned; and apart from idiom chunks and other expressions that are not "referential" in the appropriate sense and thus do not serve as arguments, movement must be initially from a position in D-structure to which a (genuine) θ-role is assigned. Again, D-structure serves as a representation of θ-role assignment (inter alia). For example, since a θ-role is assigned obligatorily to the complements of a verb in its verb phrase, there can be no movement to a position within VP.19 But since the position of subject, which is not subcategorized by the verb of VP, may lack a θ-role, as in the examples (9), (10), (14), there can be movement from object or subject to subject, though not to the position of a subject assigned a θ-role by its VP. These are general properties of the rule Move-α, or of its counterpart in non-configurational languages, to which we return in § 2.8.
In § 2.4.6, we will consider the question of choosing between the version of EST expressed in 2.1.(1) and an alternative theory that replaces the rule Move-α by a new class of interpretive rules, with the same properties as Move-α, mapping S-structure to LF. The differences between these variants are quite subtle, and it may turn out (as suggested in the remarks in Chomsky (1977a) quoted in chapter 1, above), that they will ultimately be shown to reduce to the same theory at the appropriate level of abstraction. We have evidence in favor of the variant that assumes the existence of D-structure and the rule Move-α, as in 2.1.(1) (or some mere notational variant of it), whenever we can show that some empirical argument relies
on a property of an independent level of D-structure. As we have just seen, the principles (17) rely on such a property; namely, the property that the θ-criterion be satisfied at D-structure. In § 2.6 we will see that the principles (17) have direct empirical consequences (cf. 2.6.(35)-(37)). Therefore, we have an empirical argument supporting 2.1.(1) over an alternative that invokes a new interpretive rule of the LF-component in place of the syntactic rule Move-α, relating two levels of syntactic representation. Other examples of a similar sort will appear as we proceed.
It seems to me that the weight of evidence supports 2.1.(1) over the alternative just mentioned, but it will be observed that the arguments supporting this conclusion are highly theory-internal. Of course, all arguments are theory-internal, but there are important differences of degree. Thus, the behavior of non-arguments provides quite direct evidence that the subject position is obligatory in syntactic representation in English, and the distinction between S and NP with regard to obligatoriness of θ-marked subjects provides indirect evidence, more theory-bound, that the same is true quite generally. It is inevitable that empirical arguments bearing on the choice between theories that are very close in conceptual structure, as in the case we are now considering, will be highly theory-internal. To the extent that questions of this sort can be raised at all and confronted with evidence, we have indication of progress in theoretical understanding.
The GFs discussed in the preceding remarks are those that belong to the thematic complex associated with the head of a construction, in the sense of Rouveret and Vergnaud (1980); namely, subject-of-S and complements of X (X = N, V, A, P). The positions in which these GFs are assigned are sometimes called "argument positions" (cf. note 12), but since I am using the term "argument" in a slightly different way, I will avoid this terminology, referring to them rather as "A-positions" and to the GFs determined in them as "A-GFs." An A-position is one in which an argument such as a name or a variable may appear in D-structure; it is a potential θ-position. The position of subject may or may not be a θ-position, depending on properties of the associated VP. Complements of X are always θ-positions, with the possible exception of i a non-A-GF (Ā-GF) that we may denote "adjunct of COMP." Assume that there are two types of movement rules: substitution and adjunction, the latter always forming a structure of the form [β α β] or [β β α], where α is adjoined to β by Move-α. Then the only GFs are heads, complements, adjuncts and subject.20 A principled approach to the theory of GFs, which I will not undertake here, will begin by defining such general notions as "head," etc., then defining particular GFs in terms of them. Cf. Chomsky (1955) for an outline of such
a theory. In such a theory, we should also express the fact that [NP, X̄] is the GF direct object whether X is N, V or A; and that [NP, NP] and [NP, S] both express the GF subject.
Note that the notation does not suffice for the case in which more than one term of category α appears as a complement, as in double-NP constructions V-NP-NP (give-John-a book). In this case, let us simply use the notation [NP1, VP], [NP2, VP] (primary and secondary object); similarly, in other cases. As in the outline of θ-theory, I am omitting discussion of many other relevant questions: e.g., the status of predicate nominals, topics, heads of relatives.
Note also an ambiguity of the notation in the case of adjuncts: thus the notation "[NP, VP1]" does not distinguish between (18i) and (18ii), the former being a case of adjunction, the latter, direct object:
(18) (i) [VP1 VP2 NP]
     (ii) [VP1 V NP]
The problem can easily be resolved by means of a more careful development of theory and notations in terms of heads, complements and adjuncts. No problems arise within the restricted class of constructions that I will be considering here, so I will simply put these matters aside, for consideration elsewhere.
In § 2.8, I will turn to the question of how these notions can be appropriately generalized to languages in which GFs are not represented configurationally, as they are in English.
2.3. The categorial component and the theories of case and government
A number of assumptions have been made in the foregoing discussion about the categorial component of the base. Let us now review and extend these somewhat.
Assume that the rules of the categorial component meet the conditions of some version of X-bar theory. Specifically, let us assume a variant based on two categories of traditional grammar: substantive ([+N]), including nouns and adjectives, and predicate ([+V]), including verbs and adjectives. Let us refer to substantives and predicates as the "lexical categories." So we have a system based on the features [±N], [±V], where [+N, -V] is noun, [-N, +V] is verb, [+N, +V] is adjective, and [-N, -V] is preposition, the first three being lexical categories. The basic rule for lexical categories is (1), where ... is fixed for all lexical categories in the unmarked case:
(1) X̄ → ... X ... (X = [+N, ±V] or [+V, ±N])
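The feature system [±N], [±V] just described amounts to a small lookup table, and it may help to see it written out. What follows is an illustrative sketch only; the function names `category`, `is_substantive`, `is_predicate` and `is_lexical` are hypothetical labels chosen here, with features encoded as booleans (True for "+").

```python
from typing import Dict, Tuple

# (N, V) feature values mapped to the category names given in the text.
CATEGORIES: Dict[Tuple[bool, bool], str] = {
    (True, False): "noun",          # [+N, -V]
    (False, True): "verb",          # [-N, +V]
    (True, True): "adjective",      # [+N, +V]
    (False, False): "preposition",  # [-N, -V]
}


def category(n: bool, v: bool) -> str:
    """Return the category name for a feature pair [N, V]."""
    return CATEGORIES[(n, v)]


def is_substantive(n: bool, v: bool) -> bool:
    """Substantives are the [+N] categories: nouns and adjectives."""
    return n


def is_predicate(n: bool, v: bool) -> bool:
    """Predicates are the [+V] categories: verbs and adjectives."""
    return v


def is_lexical(n: bool, v: bool) -> bool:
    """Lexical categories are the substantives and the predicates,
    i.e. every category except [-N, -V]."""
    return is_substantive(n, v) or is_predicate(n, v)
```

The condition on X in rule (1), X = [+N, ±V] or [+V, ±N], corresponds to `is_lexical` on this encoding: it holds of noun, verb and adjective, and fails only for preposition.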
That is, I am assuming that in the unmarked case, nouns, verbs and adjectives have the same complement structures. Thus (2i) and (2ii) have the same form in the base, apart from lexical content, as do (3i) and (3ii); and
I assume that (4ii) is derived from (4i), a "transitive adjective," by the same rule of of-insertion that gives the surface forms of (2ii), (3ii), so that adjectives have close to the full range of verbal and nominal complement structures:21
(2) (i) destroy the city
    (ii) destruction of the city
(3) (i) write the book
    (ii) writer (author) of the book
(4) (i) proud John
    (ii) proud of John
Of course, in surface structure, verbal constructions differ from nominal and adjectival constructions in form. I assume that the reasons derive from Case theory.22 The crucial idea is that every noun with a phonetic matrix must have Case; i.e., we assume the principle (5):
(5) *[N α], where α includes a phonetic matrix, if N has no Case
Assuming that Case is assigned to NPs by virtue of the configurations in which they appear and percolates to their heads, (5) follows from the Case Filter (6), which I will assume to be a filter in the PF-component:
(6) *NP if NP has phonetic content and has no Case
Only the empty categories trace and PRO may escape the Case Filter, appearing with no Case. Note that the Case Filter (6) is somewhat more general than (5), since it holds also for NPs that have no lexical N as head, for example, gerunds or clauses (if these are NPs23). Thus in positions in which no Case is assigned, say the subject of an infinitive in the unmarked case, neither gerunds nor NPs with lexical heads can appear, as illustrated in (7), where t is the trace of who:
(7) (i) *it is unclear [S̄ who [S [NP reading books] to interest t]]
    (ii) *it is unclear [S̄ who [S [NP John] to visit t]]
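The Case Filter (6) is a simple conjunction of two conditions, and the pattern in (7) follows from it directly. Here is a minimal sketch, assuming a hypothetical record type `NounPhrase` with a flag for phonetic content and an optional Case value; neither name comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NounPhrase:
    """Hypothetical record for an NP at the point where the filter applies."""
    text: str
    phonetic: bool          # does the NP have phonetic content?
    case: Optional[str]     # e.g. "nominative", "objective", or None


def violates_case_filter(np: NounPhrase) -> bool:
    """Filter (6): *NP if NP has phonetic content and has no Case."""
    return np.phonetic and np.case is None


# Embedded subject position of (7): no Case is assigned there.
john_subject_of_infinitive = NounPhrase("John", phonetic=True, case=None)
gerund_subject_of_infinitive = NounPhrase("reading books", phonetic=True, case=None)
# Empty categories escape the filter even without Case.
pro = NounPhrase("PRO", phonetic=False, case=None)
# A subject of a tensed clause receives nominative Case.
john_tensed = NounPhrase("John", phonetic=True, case="nominative")
```

Both overt subjects in (7) are flagged, while PRO and a Case-marked John are not, which matches the distribution described above.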
In the embedded subject position of (7), we may only have an empty category, either trace or PRO, by virtue of the Case Filter; other considerations to which we return restrict the choice to PRO. Note that the Case Filter gives a partial answer to the question raised earlier of where PRO may or must appear, since it excludes categories with phonetic content from such positions as embedded subject of (7). The Case Filter is much wider in application, however; cf. OB.
In English, only the [-N] categories verb and preposition are Case-assigners. In OB, it is assumed that verbs assign objective Case and that
prepositions assign oblique Case, apart from marked properties. Let us tentatively assume, following a suggestion of Kayne's, that in English both verbs and prepositions assign objective Case, the richer Case systems having been lost (cf. § 5.2). Furthermore, nominative Case is assigned to the subject of a tensed sentence and genitive Case is assigned in the context [NP-X], as in "John's book," "his reading the book." In other languages, categories other than [-N] are Case-assigners, and there are other Cases and other conditions under which Case is assigned, a matter to which we return in § 3.2.2, though not in any comprehensive way.
Case-assignment is closely related to government, a crucial notion to which we will return in chapter 3. Normally, Case is assigned to an NP by a category that governs it. As for the notion "government," for the moment let us simply assume that the potential governors are the categories [±N, ±V] and INFL, that a category governs its complements in a construction of which it is the head (e.g., V governs its complements in VP, etc.), and that INFL governs the sentence subject when it is tensed.24 To illustrate, consider the example (8):
(8) John [INFL [+Tense]] [VP [V think] [S̄ that [S he [INFL [+Tense]] [VP [V leave] [NP his book] [PP [P on] [NP the table]]]]]]
    ("John thought that he left his book on the table")
The matrix verb think governs its complement S̄, but not any element (e.g., he) inside S̄. The embedded verb governs its complements his book and on the table, but does not govern any element (e.g., his or the table) within these categories. Thus, his book and book receive objective Case (the latter, by percolation). The two occurrences of INFL govern John and he, assigning them nominative Case. The preposition on governs and assigns objective Case to its complement the table. The genitive rule assigns genitive Case to the ungoverned element his.
Returning to (7), the embedded subjects are ungoverned and therefore receive no Case. Some languages have marked rules that permit Case to be assigned to the subject of an infinitive in such structures.25 I will not pursue these and other such topics here, though they raise many interesting questions.
Let us now return to the examples (2)-(4). Since only the [-N] categories verb and preposition are Case-assigners, it follows that in the complement ... of X in (1), Case will not be assigned by X for X ≠ [-N]. Then the language requires some other device if it is to allow these complement structures to surface. One device, typical of English-like languages that use prepositions instead of inflectional Case systems, is to insert an empty preposition devoid of semantic content as a kind of Case-marker to permit nominal complements, as in (2ii), (3ii), (4ii). Thus we have the rule (9):
(9) NP → [P of] NP in env.: [+N] __
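Rule (9) can be pictured as a single conditional on the [N] feature of the head. The sketch below is a deliberately crude string-level illustration, assuming a hypothetical function `realize_complement`; it ignores the bracketing questions that the text leaves open and the alternative route through Move-α and genitive Case.

```python
def realize_complement(head: str, head_is_plus_n: bool, complement_np: str) -> str:
    """Apply of-insertion (rule (9)) when the head is [+N].

    A [-N] head (verb or preposition) assigns Case itself, so its NP
    complement surfaces unchanged; a [+N] head (noun or adjective) does
    not, so the semantically empty preposition 'of' is inserted.
    """
    if head_is_plus_n:
        return f"{head} of {complement_np}"
    return f"{head} {complement_np}"


# The pairs (2), (3), (4), with the base forms as input.
examples = [
    realize_complement("destroy", False, "the city"),
    realize_complement("destruction", True, "the city"),
    realize_complement("writer", True, "the book"),
    realize_complement("proud", True, "John"),
]
```

With the base forms of (2)-(4) as input, the function returns the verbal form unchanged and the nominal and adjectival forms with of, as in (2ii), (3ii) and (4ii).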
We leave open the exact formulation of this rule (e.g., does it adjoin ofto NP forming [NP of NP] ? ) ; (cf. note 2l).M. Anderson (1977) shows that as suming ofinsertion, the rule Move-a (under constraints that she discusses) will then yield such expressions as the city's destruction from destruction the city, but not, say, John's gift from gift to John or John's beliefto be afool from belief - John to be a fool (cf. John is believed to be a fool) ; see also Fiengo (1979), Kayne (1980b). Thus, the nominal correspondingto (2i) can surface in on.e of two forms: as (2ii), with ofinsertion, or under Move-a followed by genitive Case-assignment. In either case, the Case Filter is satisfied, but one or the other option must be taken or the filter is violated. 26 Such examples as (2)--{4) raise some technical questions concerning the projection principle. At D-structure, destruction (or, perhaps, its head destroy) subcategorizes and 8-marks the NP object the city in (2i). If the projection principle is valid, this must also be true at S-structure and LF. There is no problem if the ofinsertion rule is an adjunction rule forming the NP [ NP ofNP ], as in one of the options we have just considered, or if the ofphrase .is base-generated as a PP (cf. note 21). If the ofinsertion rule creates a PP, however; we must continue to hold that destruction sub categorizes its NP object at S-structure and LF. Note that still another pos sibility would be to assume that the ofinsertion rule forms a neutralized NP-PP of the form [ - V], in which case, again, no problem arises and this category will share properties of both NP and PP. Little seems to be at stake beyond terminology, so I will drop the matter. The examples (2)-(4) shed some light on the relations among the notions "subcategoriZe," "&-mark," "govern" and "Case-assign." Generallythese coincide, but not always ; · we have already noted the possible dissociation of subcategorization and 8-marking (cf. § 2.2). 
In such examples as (2ii), the NP the city is subcategorized by destruction or its verbal head, θ-marked by this element, governed by the inserted preposition of, and assigned Case by of. As the example indicates, government rather than subcategorization is the relevant notion for Case-assignment - in this case, at S-structure. But government is also the relevant notion for subcategorization in D-structure in case (2i), and generally. Thus the theories of subcategorization, θ-marking and Case all fall within the general theory of government, at least in their essentials.
There has been much recent discussion of further projections of the basic categories to higher bar structures. I will not enter into these issues here, but will simply assume that there are maximal projections with the appropriate number of bars for each category. I will use the notation X^i for X with i bars, and will continue to use NP for the maximal projection of [+N, -V], AP for the maximal projection of [+N, +V], and PP for the maximal projection of [-N, -V]. One debated question is whether the S, S̄ system²⁷ should be regarded as a projection of V, with verbs taken to be heads of clauses, or whether this is a separate system, perhaps with INFL as head. I will assume here that the S, S̄ system is separate and will use the symbol VP for the maximal projection of V, a constituent of S. Some considerations bearing on this decision will arise later on.
I have been assuming the expansion (10) for S in English (cf. § 2.1):
(10)
S → NP INFL VP
The "inflectional" element INFL may, in turn, be [±Tense], i.e., finite ([+Tense]) or infinitival ([-Tense]). If finite, it will, furthermore, have the features person, gender and number; call this complex AGR ("agreement"). The element AGR is basically nominal in character; we might consider it to be identical with PRO and thus to have the features [+N, -V]. If so, then we may revise the theory of government, taking AGR to be the governing element which assigns Case in INFL. Since [+N, -V] is not generally a Case-assigner, we must extend the theory of Case so that [+N, -V, +INFL] is a Case-assigner along with [-N], regarding [+INFL] as basically "verbal," if we take AGR to be nominal. INFL governs the subject if it contains AGR, then assigning nominative Case by virtue of the feature [+INFL]. It now follows that the only governors are categories of the form X^0 in the X-bar system (where X = [±N, ±V]). Subjects are nominative when they agree with the matrix verb - technically, with its inflection. In some languages, e.g., Portuguese, AGR may also appear with infinitives, and the subject is indeed nominative in this case. The question of whether it is AGR or [+Tense] (or perhaps some other property, either a configurational property or one involving features of the verb) that governs and assigns nominative Case is an important one with many consequences. We will return to this topic.
Avoiding questions of markedness, let us assume that INFL may in principle be the collection of features [[±Tense], (AGR)].²⁸ In surface structure, INFL may appear phonetically as part of a verbal affix system, but I will assume here that in S-structure the representation is as in (10). If the S-bar system is a separate system, it might be regarded as a projection of INFL.
Let me stress again that many important questions are begged or simply omitted in the preceding account, some of which will be considered in more detail below.
I present these assumptions here as a concrete basis on which to proceed, pending later modifications. I have also been assuming that there is a rule (11) introducing S̄, where COMP may be the specifier of S̄, or perhaps, as some have argued, the head of S̄:
(11)
S̄ → COMP S
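As an illustration only, the feature assumptions of this section can be restated as a small Python sketch. The names Cat, is_case_assigner and subject_case are hypothetical labels introduced here, not part of the theory. The sketch adopts the tentative assumption made above that AGR, rather than [+Tense], is what assigns nominative Case, a question that the text leaves open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cat:
    """A category as a bundle of the features [±N], [±V], [±INFL] (hypothetical encoding)."""
    n: bool
    v: bool
    infl: bool = False


# The lexical categories in the [±N, ±V] system.
NOUN = Cat(n=True, v=False)   # heads NP
ADJ = Cat(n=True, v=True)     # heads AP
PREP = Cat(n=False, v=False)  # heads PP
VERB = Cat(n=False, v=True)   # heads VP

# AGR is taken to be nominal, [+N, -V], and to bear [+INFL].
AGR = Cat(n=True, v=False, infl=True)


def is_case_assigner(c: Cat) -> bool:
    """[-N] categories assign Case; so does [+N, -V, +INFL] under the extension in the text."""
    return (not c.n) or (c.n and not c.v and c.infl)


def subject_case(tense: bool, agr: bool) -> Optional[str]:
    """Case of the subject given INFL = [[±Tense], (AGR)].

    Assumption: AGR alone is responsible for nominative Case, so the
    value of [±Tense] is deliberately ignored here.
    """
    if agr and is_case_assigner(AGR):
        return "nominative"
    return None
```

Under these assumptions, subject_case(tense=False, agr=True), the configuration of the Portuguese infinitive with AGR, yields nominative, while an infinitival INFL without AGR assigns no Case to its subject.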
What about the structure of COMP? I have been assuming - largely as a matter of execution - a theory such as that of OB and earlier work that postulates two positions in COMP, one that may be filled with a wh-phrase or other category (e.g., PRO) that has been moved to COMP, and one that is [±WH], where [-WH] = that (and analogues in other languages) and [+WH] is the abstract element that appears in direct or indirect questions, and might be base-generated with lexical content in the case of such elements as whether. In English there is also the complementizer for in infinitivals.²⁹ So we have something like (12), where X is a phrase moved to COMP (cf. 2.1.(20)):
(12)
[COMP X {±WH / for}]
The order is not very well-motivated empirically. If (12) is adopted, we might think of it as arising from a rule of adjunction to COMP as suggested in OB, giving the more detailed structure (13) for COMP:
(13)
[COMP X [COMP {±WH / for}]]
If we now assume that c-command is a necessary requirement for the antecedent-trace relation, it follows that the internal COMP must be deleted if a wh-phrase is moved to COMP. Thus the doubly-filled COMP filter of Chomsky and Lasnik (1977) follows without stipulation, as noted by Luigi Rizzi. What of languages that appear to have doubly-filled COMP? It would now be necessary to assume that the overt complementizer is not in COMP, but rather in a pre-S position within S̄, as suggested by Reinhart (1979b), where a theory of bounding is developed in these terms. Further consequences of these ideas are developed by Rizzi in work in progress.³⁰
In the theory outlined in Chomsky and Lasnik (1977), there was good reason to suppose that the wh-phrase in COMP was subject to a rule of free deletion in COMP applying to that and for as well. The latter rule was closely connected to the *[NP-to-VP] filter, which is now largely eliminated in terms of the Case Filter, following a suggestion by J.-R. Vergnaud. In the OB theory, it was proposed that the wh-phrase in infinitives is deleted up to recoverability, along lines suggested by Kayne for French, and it was observed that one might eliminate the residue of the *[NP-to-VP] filter in this way.³¹ This approach has consequences with regard to the rule of free deletion in COMP; namely, the motivation for it is weakened once there is a different source for the deletion of wh-phrases in infinitivals.³² We might therefore turn to a different approach to the structure of COMP that was unacceptable in the framework of Chomsky and Lasnik (1977) but is compatible with the OB approach, namely, that the rule that expands COMP is optional; if it does not apply, then COMP will simply lack a complementizer in declaratives. This has the effect of free deletion of that and for. There are many consequences to this assumption, among them, weakening of the motivation for taking wh-phrases to be in COMP.³³
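The c-command argument concerning (13) can be made concrete with a minimal Python sketch. It assumes, as a labeled simplification, the familiar "first branching node" formulation of c-command, which the passage above does not itself spell out. The Node class, the helper names, and the lexical items who and that are hypothetical choices made only for illustration.

```python
class Node:
    """A bare tree node; parent links are set when children are attached."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """Assumed definition: a c-commands b if neither dominates the other
    and the first branching node dominating a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent  # skip non-branching nodes
    return node is not None and dominates(node, b)


def clause(keep_inner_comp):
    """Build [S-bar [COMP who ([COMP that])] [S NP INFL [VP V t]]]."""
    wh = Node("who")
    trace = Node("t")
    s = Node("S", Node("NP"), Node("INFL"), Node("VP", Node("V"), trace))
    if keep_inner_comp:
        comp = Node("COMP", wh, Node("COMP", Node("that")))
    else:
        comp = Node("COMP", wh)
    Node("S-bar", comp, s)  # kept alive through the children's parent links
    return wh, trace
```

On this formulation the adjoined wh-phrase fails to c-command its trace as long as the internal COMP is present, since the first branching node above it is the outer COMP, which does not dominate the trace. Once the internal COMP is removed, the first branching node is S̄ and the relation holds.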
I will tentatively adopt the assumption that the rule expanding COMP is optional, continuing to assume that wh-movement is to COMP. Thus, tensed clauses may have that or no complementizer in D-structure, and infinitives may have for or no complementizer. We now have no need for a rule of deletion in COMP apart from the wh-phrase of relatives ("the man (who) you saw"), some residual problems concerning for in English, and the trace of wh-phrases in COMP. As for the first of these, assume this to be a marked property of English, perhaps governed by a filter as outlined in Chomsky and Lasnik (1977) or in the different manner developed in Pesetsky (1978b). We return to the other cases.
This approach to COMP bears on the formulation of selectional restrictions between matrix verbs and embedded clauses. There are evidently relations between COMP and INFL (that-tense, for-to), and matrix verbs differ with regard to the complements they take (declarative or interrogative, finite or infinitival).³⁴ If COMP may be empty, we might permit dire