Language and the Lexicon: An Introduction



Pages 257 Page size 336 x 497.28 pts Year 2010




DAVID SINGLETON Associate Professor of Applied Linguistics Trinity College Dublin

A member of the Hodder Headline Group LONDON Co-published in the United States of America by Oxford University Press Inc., New York

First published in Great Britain in 2000 by Arnold, a member of the Hodder Headline Group, 338 Euston Road, London NW1 3BH

Co-published in the United States of America by Oxford University Press Inc., 198 Madison Avenue, New York, NY 10016

© 2000 David Singleton

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronically or mechanically, including photocopying, recording or any information storage or retrieval system, without either prior permission in writing from the publisher or a licence permitting restricted copying. In the United Kingdom such licences are issued by the Copyright Licensing Agency: 90 Tottenham Court Road, London W1P 0LP.

The advice and information in this book are believed to be true and accurate at the date of going to press, but neither the author nor the publisher can accept any legal responsibility or liability for any errors or omissions.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

ISBN 0340 73173 7 (hb)
ISBN 0340 73174 5 (pb)

Production Editor: Anke Ueberberg
Production Controller: Bryan Eccleshall
Cover Design: Terry Griffiths

Typeset in 10/12 Sabon by Saxon Graphics Ltd, Derby
Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall


In memoriam

This book is dedicated to the memory of my much-missed friend Andrew Corrigan, unvanquished jazzman, wordsmith extraordinaire and irreplaceable companion.

Too irreverent for the clergy,
Too irrelevant for the sages,
I tinker on,
Lost in a maze of words
That hide their meanings
In a jovial host
Of irreverence
And irrelevance.

Andrew Corrigan, 'Disbelief Between', October 1973




Contents

1 Introduction: the lexicon - words and more
  1.1 Some preliminary definitions
  1.2 Words and language
  1.3 What's in a word?
  1.4 The domain of the lexicon
  1.5 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

2 Lexis and syntax
  2.1 Colligation
  2.2 The computational perspective
  2.3 The 'London School' perspective
  2.4 The Valency Grammar perspective
  2.5 The Lexical-Functional perspective
  2.6 The Chomskyan perspective
  2.7 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

3 Lexis and morphology
  3.1 The inner life of words
  3.2 Morphemes and allomorphs
  3.3 'Lexical' morphology and inflectional morphology
  3.4 Inflectional morphemes and the lexicon
  3.5 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

4 Lexical partnerships
  4.1 Collocation: the togetherness factor
  4.2 Collocational range
  4.3 Fixed expressions and compounds
  4.4 Collocations and the dictionary
  4.5 Corpora and collocations
  4.6 Creativity and prefabrication in language use
  4.7 Collocations, the lexicon and lexical units
  4.8 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

5 Lexis and meaning
  5.1 Words making the difference
  5.2 Meaning seen as reference or denotation
  5.3 Structuralist perspectives on meaning
  5.4 Componential analysis
  5.5 Cognitive approaches to meaning
  5.6 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

6 Lexis, phonology and orthography
  6.1 Lexis and 'levels of articulation'
  6.2 Phonemes, stresses and tones
  6.3 Lexical phonology as a reflection of lexical grammar and lexical meaning
  6.4 Association between particular sounds and particular (categories of) lexical items
  6.5 Lexis and orthography
  6.6 Orthography as a reflection of lexical grammar and lexical meaning
  6.7 Association between particular written signs and particular (categories of) lexical items
  6.8 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

7 Lexis and language variation
  7.1 Variety is the spice of language
  7.2 Language variation: sociolinguistic perspectives
  7.3 Lexical aspects of geographical variation
  7.4 Lexical aspects of social variation
  7.5 Lexical aspects of ethnic variation
  7.6 Lexical aspects of gender-related variation
  7.7 Lexical aspects of context-related variation
  7.8 Lexical variation, culture and thought
  7.9 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

8 Lexical change
  8.1 Language in motion
  8.2 The comparative method and internal reconstruction
  8.3 Changes in lexical form
  8.4 Changes in lexical meaning
  8.5 Changes in lexical distribution
  8.6 Lexical changes associated with language contact
  8.7 The case of proper names
  8.8 Lexical engineering
  8.9 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

9 Acquiring and processing lexis
  9.1 The 'mental lexicon'
  9.2 Meeting the lexical challenge
  9.3 Before the first words
  9.4 First words and beyond
  9.5 Models of lexical processing
  9.6 L2 dimensions
  9.7 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion

10 Charting and imparting the lexicon
  10.1 Dictionaries and didactics
  10.2 Lexicography: a potted history
  10.3 The lexicon - lexicographer's bane!
  10.4 Approaches to lexis in the language classroom
  10.5 Lexical learning and other aspects of language learning
  10.6 Summary
  Sources and suggestions for further reading
  Focusing questions/topics for discussion





Preface

A number of people have made valuable contributions to the process of which this book is the product. I especially need to acknowledge the absolutely indispensable assistance I received in the initial stages of planning and writing the volume from Lisa Ryan, now Research Fellow in Linguistics at University College, Dublin. Without Lisa's input, some of the early chapters might never have reached any kind of conclusion, never mind a happy one. I am also indebted to Jennifer Pariseau for using some of the time she spent teaching in Thailand in 1998-99 to collect examples in Thai, which she was then good enough to pass on to me for recycling in the pages that follow.

Various other colleagues have shown their usual generosity in agreeing to cast a critical eye over parts of the book relating to areas where their expertise easily outstrips mine. In this connection I should like to thank Nicola McLelland of the Department of Germanic Studies, Trinity College Dublin, Breffni O'Rourke of the Centre for Language and Communication Studies, Trinity College Dublin, Vera Regan of the Department of French, University College Dublin, and Carl Vogel of the Department of Computer Science, Trinity College Dublin. A further very helpful sounding board was provided by the editorial team at Edward Arnold, and in particular by Christina Wipf Perry, to whom I am most grateful indeed not only for the unflagging encouragement she offered but also for her great sensitivity and aplomb in guiding me away from all manner of blind alleys.

At an institutional level my thanks are due to the Trinity College Dublin Arts/ESS Benefactions Fund for financial support of my research in the lexical area and to the Board of Trinity College Dublin for granting me a six-month leave of absence while I was preparing this book.

Finally, I should like most warmly to thank my wife, Emer, and my sons, Christopher and Daniel, for various kinds of help they have given me over the past two years and perhaps above all for their forbearance! I am all too aware that, despite all the advice and assistance I have received from the above, the volume is very far from perfect and contains many faults and failings, for which I am entirely unabashed to recognize my sole responsibility. After all, in the words of the Chevalier de Boufflers, perfectibility is to perfection what time is to eternity.

Greystones, County Wicklow
February 2000

1 Introduction: the lexicon - words and more

1.1 Some preliminary definitions

This book is about the lexicon. Lexicon is the Anglicized version of a Greek word (λεξικόν), which basically means 'dictionary', and it is the term used by linguists to refer to those aspects of a language which relate to words, otherwise known as its lexical aspects. Lexicon is based on the term lexis (λέξις), whose Greek meaning is 'word', but which is used as a collective expression in linguistic terminology in the sense of 'vocabulary'. The study of lexis and the lexicon is called lexicology. In fact, as we shall see in the course of the next 200 pages or so, almost everything in language is related in some way or other to words. We shall also see that, conversely, the lexical dimension of language needs to be conceived of as rather more than just a list of lexical items.


1.2 Words and language

'In the beginning was the Word ...' This opening pronouncement of the Gospel of John in the New Testament may or may not be a true claim about the origins of the cosmos. However, if taken as a statement about where our thinking about language started from (and continues to start from) it is hard to fault. The original version of John's Gospel was written in Greek, and in this version the term used for 'word' is logos (λόγος), which, significantly enough, meant (and in Modern Greek still means) 'speech' as well as 'word'. This kind of association between the concept of word and a more general concept of speech or language is by no means confined to Greek culture. For example, to stay with the Gospel of John for just a little longer, the Latin translation of the above quotation is 'In principio erat Verbum ...', where λόγος is replaced by verbum, an expression which, like the Greek term, was applicable to speech as well as to individual words. Thus, for example, one way of saying 'to speak in public' in Latin was verbum in publico facere (literally, 'word in public to make').

A similar association between 'word' and 'speech' is to be found in many other languages. For example, this dual meaning attaches to French parole, Italian parola and Spanish palabra. Similarly, in Japanese the term kotoba ('word', 'phrase', 'expression') is often abbreviated to koto or goto and is used as a suffix in expressions referring to speech such as hitorigoto o iu (literally, 'by-oneself-word say' = 'talk to oneself') and negoto o iu (literally, 'sleep-word say' = 'talk in one's sleep'); in Swedish the expression en ordets man (literally, 'an of-the-word man') is used to refer to a skilled speaker; and in German one way of saying 'to refuse someone permission to speak' translates literally as 'from someone the word to remove' - einem das Wort entziehen. In English, too, the association between the word and language in use is very much a feature of the way in which linguistic events are talked about in ordinary parlance, as the following examples illustrate:

That traffic warden wants a word with you.
A word in the right ear works wonders.
When you are free for lunch just say the word.
The Prime Minister's words have been misinterpreted by the media.
The wording needs to be revised.

Nor is it particularly surprising that words should loom so large in people's understanding of what language is. After all, words are vital to linguistic communication, and without them not much can be conveyed. For instance, a visitor to a Spanish-speaking country anxious to discover where the toilets are in some location or other may have a perfect command of Spanish pronunciation and sentence-structure, but will make little progress without the word servicios (in Spain) or sanitarios (in Latin America).

It needs perhaps to be added that awareness of words is not limited to literate societies.
The American linguist Edward Sapir, for example, conducted a great deal of fieldwork among Native Americans in the early part of the twentieth century. His goal was the transcription and analysis of Native American languages which had not previously been described. He found that although the Native Americans he was working with were illiterate and thus unaccustomed to the concept of the written word, they nevertheless had no serious difficulty in dictating a text to him word by word, and that they were also quite capable of isolating individual words and repeating them as units.

Interestingly, a child acquiring language appears to develop an awareness of words earlier than an awareness of how sentences are formed. For example, research has shown that children in the age group 2-3½ correct themselves when they make errors with words before they start self-correcting in the area of sentence construction. Thus, examples like the first one given below will begin to appear earlier than examples like the second one cited.

you pick up . . . you take her
(substitution of take for initial word-choice pick up)

The kitty cat is . . . the . . . the spider is kissing the kitty cat's back
(reordering of elements in order to avoid the passive construction The kitty-cat's back is being kissed by the spider)

With regard to the specialist study of language, this too has been highly word-centred. For instance, in phonology, under which heading fall both the sound-structure of languages and the study of such sound-structure, a major focus of attention is the identification of sound distinctions which are significant in a particular language. Anyone with any knowledge of English, for example, is aware that in that language the broad distinction between the 't-sound' and the 'p-sound' is important, whereas no such importance attaches to the distinction between an aspirated t (i.e. a t-sound pronounced with a fair amount of air being expelled) and an unaspirated t-sound (i.e. a t-sound pronounced without such a voluminous expulsion of air). This last distinction is, in English, determined simply by the particular environment in which the t-sound occurs; thus, aspirated t occurs at the beginnings of words like ten, tight and toe, whereas unaspirated t occurs after the s-sound in words like steer, sting and stool. Phonologists talk about environmentally conditioned varieties of the t-sound in a given language as belonging to or being realizations of the /t/ phoneme, and label them as allophones of the phoneme in question. (Notice that the convention in linguistics is for phonemes to be placed between slashes - /t/ -, whereas allophones are placed between square brackets - the transcription of the aspirated allophone of /t/, for example, being [tʰ]).
To return to the role of words in all this, one of the crucial tests for phonemic distinctions is that of lexical differentiation - that is, the test of whether a particular sound distinction differentiates between words. This can be tested by use of minimal pairs - pairs of words which differ in respect of just one sound (pin/tin, top/tot, gape/gate etc.). Distinctions between sound segments which serve to differentiate between words in this way - such as the difference between the English p-sound and the English t-sound - are called phonemic distinctions, whereas distinctions between sound segments which do not differentiate between words - such as degrees of aspiration of English consonants - are described as non-phonemic. It should be noted, incidentally, that in other languages (such as Sanskrit and its modern descendants) the distinction between aspirated and unaspirated consonants, which in English is merely allophonic, is as important in differentiating between words as the distinction between /p/ and /t/ in English.

There are other ways of studying the sounds used in human languages - ways which do not need to refer to phonemes and hence have no particular connection with lexical issues. For example, it is perfectly possible, without getting involved in questions of word differentiation and without any regard to the semantic implications of using one sound rather than another in a particular language, to study the acoustic properties of human speech (in terms of the physics of sound) or the physiological aspects of speech production (the interplay of the lips, the tongue, the vocal cords etc.). These kinds of phenomena and their investigation go under the heading of phonetics. The Greek root phōnē (φωνή - 'sound', 'voice') is shared by both phonetics and phonology, but whereas phonology deals with the sound systems of individual languages (and any universal organizational principles which may emerge from such investigations) - and in doing so uses lexical differentiation as an important reference point - phonetics is concerned with speech sounds without reference to linguistic system and meaning. Thus it can be said that what differentiates phonology from phonetics is an interest in lexical differentiation in the above sense.

At the grammatical level too, the distinction between two major areas of interest essentially revolves around words - although in a somewhat different manner. Grammar has traditionally been seen as having two branches - syntax and morphology - and in both cases the very definition of terms is lexically based. Thus, the term syntax, a derivative of the Greek word syntaxis (σύνταξις - 'putting together in order'), denotes the whole range of regularities which can be observed in the combination of sentence components (and the study of such regularities), and it turns out that these components are largely identifiable as words and groups of words. For example, the distinction between the syntax of statements in English (e.g. John can swim.) and the syntax of questions (e.g. Can John swim?) is, at least from one perspective, a distinction between different ways of ordering words.
The term morphology, for its part, owes its origins to the Greek root morphē (μορφή - 'form', 'shape') and denotes the internal structure of words (and the study thereof) - that is, how words are built up out of basic units (known as morphemes) which may or may not be capable of standing alone as words in their own right (e.g. un-just-ly, de-nation-al-ize, re-enact-ment etc.).

A third area right at the heart of linguists' interests, namely semantics - that is, the domain of meaning (and its investigation) - is also very much bound up with words. Although the coverage of the term semantics (from the Greek sēma (σῆμα) - 'sign', 'signal') extends well beyond the limits of the lexicon, and semanticists certainly do not confine their attention to the meanings of individual words, the lexical level of meaning has always been the starting point for semantic study and theorizing, and remains a focus for debate. Thus, for instance, there is continuing discussion over whether the meaning of a word like man should be seen as an aggregate of the relations between man and words such as animal, woman, child etc., whether it should be treated as decomposable into smaller atoms of meaning (human, male, adult etc.), whether it should be envisaged as some kind of idealized or stereotypical mental image against which actual instances of men are compared, or whether all three approaches should be integrated in some way.
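The minimal-pair test described in this section can be made concrete in a few lines of Python. What follows is only an illustrative sketch: the function name is_minimal_pair and the simplified segment-by-segment transcriptions are assumptions introduced here, not part of any established toolkit, and the test is reduced to 'same number of segments, exactly one of them different'.

```python
def is_minimal_pair(a, b):
    """Return True if two transcriptions differ in exactly one segment.

    Each word is given as a list of sound segments (a simplifying
    assumption: real phonological analysis is not always this tidy).
    """
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


# Simplified, hypothetical transcriptions of words cited in the text.
pin = ["p", "ɪ", "n"]
tin = ["t", "ɪ", "n"]
top = ["t", "ɒ", "p"]
tot = ["t", "ɒ", "t"]

print(is_minimal_pair(pin, tin))  # True: only the first segment differs
print(is_minimal_pair(top, tot))  # True: only the last segment differs
print(is_minimal_pair(pin, tot))  # False: every segment differs
```

A pair that passes this check shows that the two differing sounds can distinguish words, which is precisely the evidence a phonologist looks for when arguing that a distinction is phonemic.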


1.3 What's in a word?

Although, as is clear from the above, the word is central to the way in which non-specialists and specialists alike think about language, defining what a word is poses a problem or two. To begin with, what we mean by word will depend very much on whether we are talking about actual occurrences of any items that might qualify or whether we are intent on grouping or classifying items in some way or other. To illustrate this, let us begin by looking at the chorus from the Beatles' song She Loves You:

She loves you, yeah, yeah, yeah;
She loves you, yeah, yeah, yeah;
She loves you, yeah, yeah, yeah, yeah.

How many words are there in these three lines? If we take actual occurrences of any items - word tokens - as the basis of our count, we shall come up with six words in the first line, six in the second, and seven in the third. That is 19 overall. On the other hand, if we base our count on word types - items with different identities - the overall figure for the entire extract will be just four (she, loves, you and yeah). Similarly, the phrase going, going, gone will be considered a three-word expression on a count of tokens but will be considered to contain only two words (going and gone) on a count of types.

In another sense of word, the sequence going, going, gone may be thought of as containing just one word - the verb go, represented by two of its forms (going and gone). This approach to the notion of the word - seeing it as a 'family' of related forms or as an abstract unit which is realized by one or other of these forms as the linguistic environment demands - calls to mind the concept of the phoneme and its allophones (see above). This linkage with the phoneme idea is expressed terminologically: the notion of the word as a family of forms or as an abstract unit is captured in the term lexeme, while a lexeme's concrete representatives or realizations are referred to as word-forms.
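The three ways of counting just described (tokens, types and lexemes) can be checked mechanically. The short Python sketch below is offered purely as an illustration: the helper tokens and the tiny lookup table LEXEME_OF are assumptions made for this example, and a real lexeme count would need a proper morphological analyser rather than a hand-written dictionary.

```python
import re


def tokens(text):
    """Split text into lower-cased word tokens, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())


chorus = [
    "She loves you, yeah, yeah, yeah;",
    "She loves you, yeah, yeah, yeah;",
    "She loves you, yeah, yeah, yeah, yeah.",
]

per_line = [len(tokens(line)) for line in chorus]
all_tokens = [t for line in chorus for t in tokens(line)]
types = set(all_tokens)

print(per_line)         # [6, 6, 7]
print(len(all_tokens))  # 19 tokens overall
print(sorted(types))    # ['loves', 'she', 'yeah', 'you'] - four types

# Hypothetical, hand-written mapping from word-forms to lexemes.
LEXEME_OF = {"going": "go", "gone": "go"}

phrase = tokens("going, going, gone")
lexemes = {LEXEME_OF[w] for w in phrase}
print(len(phrase), len(set(phrase)), len(lexemes))  # 3 2 1
```

The last line reproduces the point made about going, going, gone: three tokens, two types, one lexeme.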
When we want to refer to a given lexeme in, for instance, a dictionary-entry, we typically do so using just one of its various forms, and the choice of this form, known as the citation form, is determined by convention, which varies from culture to culture and language to language. For example, the citation form of a French verb lexeme is its infinitive form (donner, sortir, prendre etc.), whereas the citation form of a Modern Greek verb is the first-person singular form of the present indicative (káno/κάνω, thélo/θέλω, akúo/ακούω etc.).

We can also see words in different perspectives according to the particular level of linguistic classification we are applying. For example, if we look at the English word thinks from the point of view of the English orthographic (spelling) system we shall see it as a series of letters - t + h + i + n + k + s; if we consider it as a phonological entity we shall perceive it as a sequence of phonemes - /θ/ + /ɪ/ + /ŋ/ + /k/ + /s/ - one of which, /θ/, corresponds to the letters th in the English writing system; if we view thinks in grammatical terms, we shall focus on the fact that we have before us the third-person singular present form of a verb; and if we approach it as a carrier of meaning, we shall be led to relate it to (among other things) the synonyms which can replace it in different contexts, for example:

I think/believe I can do it.
The philosopher's task is to think/cogitate.
I'll think about/consider your suggestion.

Mention of meaning brings us to the distinction which has been drawn between what are termed content words (also called full words or lexical words) and form words (otherwise known as grammatical words, empty words or function words). Words described as content words are those which are considered to have substantial meaning even out of context, whereas words described as form words are those considered to have little or no independent meaning and to have a largely grammatical role. Some examples of content words are: bucket, cheese, president; some examples of form words are: a, it, of. This distinction is not unproblematic, since many so-called form words - such as prepositions like around and towards and conjunctions like although and whereas - are clearly far from empty of semantic content. In any case, we need to be careful with the idea of 'semantic content'. We have to keep in mind that it is a metaphor, and that people, not words, are the sources of meanings, even if words are used as instruments to signal such meanings.
Actually, a more satisfactory way of distinguishing between content words and form words is in terms of set membership: grammatical words belong to classes with more or less fixed membership (at least during any individual speaker's lifetime), while content words belong to open classes whose membership is subject to quite rapid change, as new terms come into being and others fall into disuse. In the light of all that has been said so far in this section, it is hardly surprising that linguists' attempts to provide a general characterization of the word have made reference to quite a wide variety of possible defining properties. The main lines of these different approaches are set out below.

The orthographic approach

In the orthographic approach the word is defined as a sequence of letters bounded on either side by a blank space. This definition works up to a point for languages using writing systems such as the Roman or Cyrillic alphabet, but is not at all useful in relation to languages (like Chinese and Japanese) whose writing-systems do not consistently mark word-boundaries or in relation to language varieties which do not usually appear in written form (e.g. local varieties of Colloquial Arabic) or which have never been written down (e.g. many of the indigenous languages of the Americas). Also, there seems to be something rather odd about defining words in terms of the written medium given that, as we have seen, the word is in no sense a product of literacy, and given that, both in the history of human language and in the development of the individual, written language arrives on the scene well after spoken language. We can note further that defining words in terms of letter-sequences and spaces is very much a form-oriented, token-oriented exercise which takes absolutely no account of more abstract conceptions of the word.
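The orthographic definition is, in effect, what a program does when it splits a string on white space, and trying it out shows both where it works and where it breaks down. The sketch below is an illustration only: the function name orthographic_words is an assumption, and the Chinese sentence (roughly 'I like drinking tea') is an example supplied here rather than one drawn from the discussion above.

```python
def orthographic_words(text):
    """Apply the orthographic definition: a word is whatever stands
    between blank spaces. str.split() with no argument splits on any
    run of white space."""
    return text.split()


english = "The cat stretched her elegant forelegs"
chinese = "我喜欢喝茶"  # illustrative sentence; written without spaces

print(orthographic_words(english))
# ['The', 'cat', 'stretched', 'her', 'elegant', 'forelegs']
print(len(orthographic_words(english)))  # 6
print(len(orthographic_words(chinese)))  # 1
```

For the English sentence the result matches our intuitions, but the Chinese sentence comes back as a single 'word', although speakers would recognize several words in it. The definition also treats The and the as different items and has nothing to say about lexemes, which is the token-oriented limitation noted above.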

The phonetic approach

Another possible way of trying to define the word is to look for some way in which words might be identifiable in terms of the way they sound - irrespective of the particular sound-systems of specific languages. It might perhaps be imagined, for example, that words are separated from each other in speech by pauses. Alas, life is not that simple! In fact, individual words can rarely be pinpointed in physical terms in the ordinary flow of speech, which is in the main a continuous burst of noise. (Anyone who needs to be convinced of this should tune to a radio station broadcasting in a totally unfamiliar language.) Indeed, the lack of phonetic independence of individual words is precisely what explains linguistic changes such as the loss from some words in English of an initial /n/ (because this was felt to belong to the preceding indefinite article, e.g. auger from Old English nafu-gar; apron from Old French naperon) and the addition of a 'stolen' /n/ in some other cases (e.g. a newt from an ewt; a nickname from an eke-name). It is, of course, true that pausing is possible between words, and that linguists in the field working on hitherto undescribed languages may sometimes be able to make use of the 'potential pause' criterion when gathering data from native speakers - as Sapir did (see above) - but, since speakers do not normally pause between words, this criterion has rather limited value.

The phonological approach

At first glance a more promising approach to defining words on the basis of sound is to think in terms of the characteristics of words in particular sound-systems. For example, in some languages - English being a case in point - words tend to have only one stressed syllable, which may occur in various positions (e.g. renew, renewable, renewability etc.). Another instance of a word-related phonological feature is that of vowel-harmony in languages such as Estonian, Finnish, Hungarian and Turkish. In this case the nature of the vowel in the first part of a word determines the choice of vowels in what follows. This is illustrated by the following two Hungarian words: kegyetlen-ség-ük-ben (literally, 'pity-less-ness-their-in' = 'in their cruelty') versus gond-atlan-ság-uk-ban (literally, 'care-less-ness-their-in' = 'in their carelessness'). Also to be considered in this context is the fact that in a given language a particular phoneme or combination of phonemes may be found only rarely or not at all in a specific position in the word; for instance, in English /z/ is seldom to be found at the beginning of words and the 'ng sound' (/ŋ/) never occurs at all in this position.

One problem with phonological characterizations of words is that, of their very nature, they relate to specific languages or, at best, to specific language-types. Also, such characterizations often have to be seen as descriptions of broad tendencies rather than as absolutely reliable; thus, with regard to stress in English, many units that are recognized as words in that language typically do not actually take stress in ordinary speech, e.g. and, but, by, if, the - and, on the other hand, in some groups of words which constitute fixed expressions (see below, Chapter 4) only one main stress is applied in the entire group, e.g. building worker, dancing lesson, lifeboat crew.
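The vowel-harmony pattern in the two Hungarian examples can be verified with a very small program. The sketch below is deliberately crude and rests on assumptions made here for illustration: vowels are simply divided into a 'front' set and a 'back' set, the so-called neutral vowels are lumped in with the front ones, and the function name harmony_class is invented. It therefore illustrates the broad tendency only, in keeping with the caution expressed above about phonological characterizations.

```python
import unicodedata

# Simplified two-way split of Hungarian vowel letters (an assumption:
# neutral vowels such as i and é are here counted as front).
FRONT = set(unicodedata.normalize("NFC", "eéiíöőüű"))
BACK = set(unicodedata.normalize("NFC", "aáoóuú"))


def harmony_class(word):
    """Classify a word as 'front', 'back' or 'mixed' by its vowels."""
    word = unicodedata.normalize("NFC", word.lower())
    vowels = [ch for ch in word if ch in FRONT or ch in BACK]
    if vowels and all(v in FRONT for v in vowels):
        return "front"
    if vowels and all(v in BACK for v in vowels):
        return "back"
    return "mixed"


print(harmony_class("kegyetlenségükben"))  # front
print(harmony_class("gondatlanságukban"))  # back
```

Because the suffixes agree with the stem, a change of harmony class in a stretch of speech is a clue, though not an infallible one, that a word boundary has been crossed.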

The semantic approach

If definitions in terms of sound have their limitations, what about definitions in terms of meaning? Might it not be possible, for example, to define words as the basic units of meaning in language? The answer to this question is unfortunately 'No'. There are admittedly individual units of meaning which are expressed in single, simple words. For example, the English words ant, bottle and shoe are individual and indivisible forms which convey specific individual meanings. However, the relationship between single words and particular meanings is not always quite so straightforward. Let us consider, for instance, the English word teapot. This is written as a single item and can be thought of as denoting a single entity, but, on the other hand, it does actually contain two elements which are words in their own right - tea and pot. Similarly, there are combinations which are not necessarily written as one word, such as: public house, cricket pavilion, ice-cream kiosk etc. Actually, if we think more carefully about the meanings of such combinations we can recognize the semantic contributions of each individual word, but the image which each combination of words first brings to mind is unquestionably that of a single building or type of building.

A further obvious point to be made about the idea of words being minimum units of meaning is that there are actually units below the level of the word which function as semantic units. Reference has already been made to the fact that words may contain units that cannot stand alone as words in their own right. For example, the word un-just-ly has the word just as its core but also contains two elements (un- and -ly) which are vital to its meaning - un- meaning roughly 'not' and -ly meaning something like 'in a ... manner'.

The grammatical approach

The characterization of the word that seems to be least problematic is that which defines words in grammatical terms. The grammatical approach uses the criteria of 'positional mobility' and 'internal stability'. Words are said to be 'positionally mobile' in the sense that they are not fixed to specific places in a sentence. For example, in a sentence like The cat drowsily stretched her elegant forelegs we can re-order the words in various ways without removing or disrupting anything essential:

The cat stretched her elegant forelegs drowsily.
Drowsily the cat stretched her elegant forelegs.
Her elegant forelegs the cat drowsily stretched.

'Internal stability' refers to the fact that within words the order of morphemes remains consistent. Thus the morphemic constituents of, for example, forelegs (fore + leg + s) cannot be re-ordered - so that *sforeleg, *slegfore, *legfores, *foresleg and *legsfore are not possible versions of the word in question. The definition of words as units which are positionally mobile but internally stable works well across languages. However, even this on the whole successful definition needs some qualification. For example, the English definite article the would normally be considered a word, but its positional mobility is distinctly limited. That is to say, except when it is being talked about as an object of study (as it is now), it has to be part of a noun phrase, occurring before the noun and any other elements that are included to qualify the noun; thus, the wolf, the large wolf, the extremely large wolf etc. Interestingly, the words that have such tight restrictions imposed on their possible positions in sentences are typically grammatical words, notably definite articles (the), indefinite articles (a, an) and prepositions (in, on, to, from etc.), which, as we have seen, have traditionally been regarded as lesser species of words, not 'full words'.
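The 'internal stability' criterion can be made concrete with a small Python sketch. It simply enumerates every ordering of the three morphemes of forelegs and stars all but the one attested order; the function names are invented for this illustration and the single attested form is supplied by hand.

```python
from itertools import permutations

MORPHEMES = ("fore", "leg", "s")
ATTESTED = "forelegs"  # the only order of these morphemes that English allows


def orderings(morphemes):
    """Return every ordering of the morphemes, each joined into one string."""
    return ["".join(order) for order in permutations(morphemes)]


def mark(form, attested=ATTESTED):
    """Prefix an impossible form with the conventional asterisk."""
    return form if form == attested else "*" + form


if __name__ == "__main__":
    for form in orderings(MORPHEMES):
        print(mark(form))
```

Of the six logically possible orderings, five come out starred, which is the sense in which the word is internally stable.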

Defining the word: a summary

Having looked at a number of possibilities for defining the word, then, what can we say about this problem? Well, one thing is clear: there is not just one way of looking at words. We can see them as types or tokens; we can see them as lexemes or word-forms; we can see them as orthographic units,



phonological units, grammatical units or semantic units. We can also make a distinction between content words and form words. Regarding the various approaches to providing a general characterization of the word, it is clear that the grammatical approach in this connection is not only the least problematic but also the one that works best across languages. Phonetic and semantic perspectives offer little in the way of definitional criteria, but they do suggest some procedures which may be of use to the field linguist working with informants. As far as orthographic and phonological approaches are concerned, the criteria which emerge from these approaches apply in different ways and degrees to different languages. One result of particular sets of criteria operating differently from language to language is that words in one language may have some characteristics which have little or nothing in common with the characteristics of words in another language. For example, a word in Finnish - with word-stress and vowel-harmony - is rather different from a word in French, a language in which neither word-stress nor vowel-harmony operates. This does not mean, though, that it is inappropriate to use the term word in a cross-linguistic context. Finnish words and French words are recognizable on the basis of other criteria - grammatical criteria, the 'potential pause' criterion etc. - which are not tied to any particular language or language-group.


1.4 The domain of the lexicon

We have seen how the word is not perhaps as easy to characterize as one might have imagined before starting to reflect on this problem. Alas, even when we have arrived at some reasonably satisfying conclusions about how to define words, we are still rather a long way from defining what the lexicon is. As we noted earlier, lexicon is the Anglicized version of a Greek word meaning 'dictionary'. It may be instructive, then, in the context of a discussion of the domain of the lexicon, briefly to consider what kind of information is typically to be found in a dictionary. The following example is drawn - more or less at random - from the pages of the Concise Oxford English Dictionary.

kin /kin/ n. & adj. n. one's relatives or family. predic. adj. (of a person) related (we are kin; he is kin to me) (see also AKIN). □ kith and kin see KITH. near of kin closely related by blood, or in character. next of kin see NEXT. □□ kinless adj. [OE cynn f. Gmc]

What is interesting about such an entry is that, although the focus of the dictionary-maker is obviously on the individual word - in this specific instance on the word kin - a broader range of information seems inevitably to find its way into the picture. Thus, as well as information about the spelling (kin), sound-shape (/kin/) and meaning ('one's relatives or family') of the particular



item in question, we are provided with information about its various grammatical roles - n. [= noun] & adj. [= adjective], some examples of how it is used as a predic. adj. [= predicative adjective] (we are kin; he is kin to me), some examples of expressions in which it occurs (kith and kin, near of kin, next of kin), an example of a word formed by adding a suffix to kin (kinless), and a potted history of kin - OE cynn f. Gmc [= Old English cynn from Germanic]. And so it is generally when one begins to look closely at any given individual word. Other issues simply cannot be kept at bay - especially issues having to do with how the word in question interacts with other elements. Take the very simple and unremarkable word dog, for instance. As soon as we home in on this word we have to recognize that part of its essential profile is that it is both a noun and a verb. Its grammatical categorization in these terms implies that it can appear in sentences like We all pat the dog as well as in sentences like The President was dogged by misfortune. We also have to recognize that dog is a participant in a wide range of frequently occurring combinations, or collocations, with other words, not all of which have meanings which are easily relatable to canineness - dog in the manger (= 'a person who refuses to let others have something for which he/she has no use'), dog's dinner (= 'a mess'), raining cats and dogs (= 'raining hard') etc. One especially interesting aspect of such interaction between a word and its linguistic environment is the way in which the choice of one word may have one set of repercussions in this environment, while the choice of another word - even a word with a fairly similar meaning - may have quite a different set of repercussions. The examples below - from English, French and German respectively - illustrate this point.

We are forbidden to leave the building after midnight.
We are prohibited from leaving the building after midnight.
[choice of forbid entails choice of to + VERB; choice of prohibit entails choice of from + VERBing]

Nous espérons qu'elle chantera. (literally, 'We hope that she will sing.' = 'We hope she will sing.')
Nous voulons qu'elle chante. (literally, 'We want that she sing.' = 'We want her to sing.')
[choice of verb espérer entails choice of future indicative form of verb - chantera - in following clause; choice of verb vouloir entails choice of present subjunctive form of verb - chante - in following clause]

Sie hat mir geholfen. (literally, 'She has me helped.' = 'She (has) helped me.')
Sie hat mich getröstet. (literally, 'She has me comforted.' = 'She (has) comforted me.')
[choice of verb helfen - past participle geholfen - entails choice of dative form of object pronoun - mir; choice of verb trösten - past participle getröstet - entails choice of accusative form of object pronoun - mich]

This discussion of the interplay between lexis and other aspects of language continues in the chapters that follow. However, even from the foregoing brief excursion into this topic we can draw the conclusions that, on the one hand, any plausible conception of the lexicon has to be broad enough in scope to include elements other than just individual words, and that, on the other, aspects of language not customarily thought of as lexical - notably grammatical phenomena - have to be recognized as at least having a lexical dimension.



This chapter has noted the extent to which language is popularly conceived of in terms of words - even in the absence of literacy - and the extent to which awareness of language as words features in child language development. It has also pointed to evidence of 'lexico-centricity' in the way in which linguists have traditionally approached language as an object of study. It has shown that, despite all of this, it is no easy matter to define what a word actually is, illustrating this point by reference to possible phonological, orthographic, semantic and grammatical perspectives on the problem. It has then offered some first thoughts on the proposition that words cannot be seen in isolation from other aspects of language. With regard to the content of the remaining chapters:
• Chapter 2 continues the discussion begun in the present chapter on the relationship between lexis and syntax.
• Chapter 3 looks at the ways in which words are structured.
• Chapter 4 focuses on habitual lexical combinations - collocations.
• Chapter 5 explores various approaches to lexical semantics.
• Chapter 6 examines the relationship between the lexicon and the phonology and orthography of particular languages.
• Chapter 7 scrutinizes the ways in which the lexicon relates to social, regional and situational variation in language.
• Chapter 8 describes and exemplifies different types of lexical change in the historical development of languages.
• Chapter 9 addresses the question of what is involved in the construction of an 'internal' or 'mental' lexicon in the context of the acquisition of a language and also discusses ways in which the mental lexicon might be organized and accessed.
• Chapter 10 surveys the evolution of dictionary-making - lexicography - from its origins down to its very recent, technologically based manifestations and offers an account of how lexis has been treated in the context of language teaching.


Finally, the Conclusion draws together the threads of the various parts of the discussion in some final comments on the expanding perception of the extent and the role of the lexicon.

Sources and suggestions for further reading

See 1.2. Edward Sapir's comments on his work with Native Americans can be found on pp. 33-4 of his book Language: an introduction to the study of speech (New York: Harcourt Brace & World, 1921). The source of the examples of children's self-corrections is a paper by E. V. Clark and E. Andersen entitled 'Spontaneous repairs: awareness in the process of acquiring language', which was presented at the Biennial Meeting of the Society for Research in Child Development, San Francisco, 1979. The paper is summarized and discussed in S. Brédart and J.-A. Rondal, L'analyse du langage chez l'enfant: les activités métalinguistiques (Brussels: Pierre Mardaga, 1982).

See 1.3. The discussion of different approaches to defining the word draws heavily on the relevant sections in: R. Carter, Vocabulary: applied linguistic perspectives (second edition, London: Routledge, 1998); D. Cruse, Lexical semantics (Cambridge: Cambridge University Press, 1986); J. Lyons, Introduction to theoretical linguistics (Cambridge: Cambridge University Press, 1968) and S. Ullmann, Semantics: an introduction to the science of meaning (Oxford: Blackwell, 1962). The examples of lexical change in the section dealing with the phonetic approach to defining the word and the Hungarian examples in the section on the phonological approach are borrowed from Ullmann.

See 1.4. The kin entry in the Concise Oxford Dictionary (eighth edition, edited by R. E. Allen, Oxford: Oxford University Press, 1991) is to be found on p. 650.

Readers in search of further reading matter on some of the issues raised in this chapter may like to consult some or all of the following:
R. Carter, Vocabulary: applied linguistic perspectives (second edition, London: Routledge, 1998);
G. Finch, Linguistic terms and concepts (Houndmills: Macmillan, 2000);
H. Jackson, Words and their meaning (London: Longman, 1988), especially Chapter 1;
M. Lewis, The lexical approach (Hove: Language Teaching Publications, 1993), especially Chapter 5;
J. Lyons, Linguistic semantics: an introduction (Cambridge: Cambridge University Press, 1995), especially Chapter 2;
F. Palmer, Grammar (Harmondsworth: Penguin, 1971), especially Chapter 2;
S. Pinker, The language instinct (New York: William Morrow & Co., 1994), especially Chapter 5;
H. G. Widdowson, Linguistics (Oxford: Oxford University Press, 1996), especially Chapters 3, 4 and 5.

Focusing questions/topics for discussion

1. In this chapter a number of expressions were cited - expressions like I want a word with you - which show that our everyday conception of language is very much bound up with words. Think of some further examples of such expressions - in English or any other languages you know.

2. It was mentioned in the chapter in connection with phonology that lexical differentiation was one of the tests for phonemic distinctions. For example, in the 'minimal pair' tie/dye, the two words are differentiated by the distinction between /t/ and /d/ and by that distinction alone. Which of the following pairs of words are 'minimal pairs' in which lexical differentiation similarly depends on a single phonemic distinction?

beat - peat
breath - breathe
deep - sleep
dot - doll
phoney - pony
role - bowl
scope - rope
wreath - wreathe
witch - filch
wreck - neck

3. We saw in the chapter that the smallest units of meaning are not words but morphemes. For example, in the word unwise there are two morphemes, un- and wise, the second of which is a word but the first of which is not. Try to analyse the following expressions into their constituent morphemes:

antidepressant
bowler
disembarked
encage
hateful
misfire
poetically
resting
unlawful
wedding-bells

4. 'Positional mobility' was presented in the chapter as one of the grammatical criteria for defining words. Put together a list of English words - including both 'content words' and 'form words' - and then examine these words in the light of the 'positional mobility' criterion. Are some of the words more 'positionally mobile' than others? Are the equivalents of these words in other languages you know more or less 'positionally mobile' than the English words, or about the same?



5. It was noted in the chapter that choosing one lexical item may have one set of repercussions on other choices in the sentence in question, while choosing a different item (with a similar meaning) may have a different set of repercussions. Thus, for example: The residents protested against the development plan vs. The residents objected to the development plan. Try to think of some further instances - in English and in any other languages you know - of different lexical choices having different implications for the form of the sentence in which the relevant words are situated.


2 Lexis and syntax 2.1


We saw in the previous chapter that particular syntactic patterns are associated with particular lexical items. This kind of association has sometimes been labelled colligation - from the Latin cum ('with') and ligare ('to tie'), the image underlying this term being that of elements being 'tied together' by, as it were, syntactic necessity. In the past the notion of colligation has tended to be applied to a fairly restricted range of rather 'local' syntactic relationships - such as the relationship between a verb and the form of the verb that follows it (its verbal complement), for example:

She will eat chocolate tonight. [will + VERB]
She wishes to eat chocolate tonight. [wish + to + VERB]
She intends to eat/intends eating chocolate tonight. [intend + to + VERB / intend + VERBing]
She regrets eating chocolate tonight. [regret + VERBing]
She is indulging in eating chocolate tonight. [indulge + in + VERBing]
She is refraining from eating chocolate tonight. [refrain + from + VERBing]

However, the recent trend in linguistics has been towards a much wider conception of the interaction between lexicon and syntax - to the point, indeed, where it is becoming increasingly difficult to pronounce with any confidence on the question of where lexicon ends and syntax begins. In this chapter we shall look briefly at the way in which the relationship between syntax and the lexicon has been approached in a number of different varieties of linguistics, notably computational linguistics, the 'London School', the Valency Grammar tradition, Lexical-Functional Grammar and Chomskyan linguistics.



2.2 The computational perspective

Computational linguistics refers to more or less everything that goes on at the intersection between computer science and the study of language. One dimension of computational linguistics is its interest in the relationship between what computers can do and what we humans do when we acquire and use language. Thus, some computational linguists spend their time trying to model aspects of language acquisition and processing on computers, often with very practical objectives in mind - automatic translation, speech synthesis etc. Another aspect of computational linguistics is the use of computers as an aid in the analysis of language. For example, computers are now widely used in the analysis of very large collections (corpora - singular corpus) of naturally occurring language in order to provide information about the frequency of particular items or the frequency with which certain items co-occur with certain other items. From both kinds of computational linguistics there emerges a strong sense of the difficulty of neatly separating the lexicon from syntax. With regard to the language-modelling aspect of computational linguistics, an interesting instance of such research is the work that is being undertaken at the Laboratoire d'Automatique Documentaire et Linguistique (LADL) in Paris, where the object is to design systems which will enable computers to perform operations (such as machine translation) on texts. The systems that the LADL researchers are endeavouring to put in place have to be capable of recognizing, decoding, selecting and combining words without the online assistance of human speakers. It transpires that the principal problems which emerge from the construction of such electronic lexicons have to do with the difficulty of separating lexis and grammar.
Thus, very annoyingly, from the LADL researchers' point of view, sentences which are identical in structure and perhaps quite close in meaning do not necessarily behave identically when it comes to adjusting them in various ways, such behaviour seeming to be entirely dependent on the particular words used, for example:

Cette question concerne Pierre. ('This question concerns Pierre.')
(Works also in the passive - in both French and English: Pierre est concerné par cette question. 'Pierre is concerned by this question.')

Cette question regarde Pierre. ('This question regards Pierre.')
(Does not work in the passive in either French or English: *Pierre est regardé par cette question. '*Pierre is regarded by this question.')
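The contrast above suggests why an electronic lexicon cannot leave passivization to a general syntactic rule. A minimal Python sketch follows, assuming a hypothetical lexicon format in which each verb sense carries a hand-entered 'passivizes' flag; the dictionary layout and function name are inventions for illustration and are not a description of the LADL systems.

```python
# Hypothetical lexicon: each entry records, for one sense of a verb,
# its past participle and whether that sense allows the passive.
LEXICON = {
    "concerner": {"participle": "concerné", "passivizes": True},
    # regarder in the sense 'concern' only, as in the example above
    "regarder": {"participle": "regardé", "passivizes": False},
}


def passive(subject, verb, obj):
    """Build the French passive; star it if the lexical entry disallows it."""
    entry = LEXICON[verb]
    agent = subject[0].lower() + subject[1:]  # 'Cette question' -> 'cette question'
    sentence = f"{obj} est {entry['participle']} par {agent}."
    return sentence if entry["passivizes"] else "*" + sentence


if __name__ == "__main__":
    print(passive("Cette question", "concerner", "Pierre"))
    print(passive("Cette question", "regarder", "Pierre"))
```

The same string-building step applies to both verbs; only the information stored with the individual word decides whether the result is acceptable.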

With regard to the light shed on the lexis-syntax interface by the use of computer technology as a tool of linguistic analysis, an obvious example to cite



here is the research carried out under the auspices of the Collins Birmingham University International Language Database (COBUILD), which will be discussed at greater length in Chapter 4. The relevant point to emerge from such research with reference to the present context is that there is a strong tendency for particular words or particular senses of words to be associated with particular syntactic structures. For example, the word yield has two main senses - 'give way/submit/surrender' and 'produce'. It turns out that the first sense is almost always associated with uses of the word as an intransitive verb (verb without a direct object), for example:

But we did not yield then and we shall not yield now.
Love yields to business ...
In Sweden the authorities yielded at once to the threats ...

The second sense, on the other hand, is mostly associated with uses of the word as a noun, for example:

... a nuclear shell with a 15 kiloton yield ...
... more fertilizer than Europe to achieve similar yields ...
... Bangladesh's low annual yields ...

A particular approach to syntax which is very widely used in computational linguistics, and in machine translation in particular, especially in Europe, is Head-driven Phrase Structure Grammar (HPSG). Its particular usefulness to computational linguists derives from the fact that it attempts to provide a totally explicit specification of how syntax operates. With regard to the lexicon, HPSG, in common with Valency Grammar and Lexical-Functional Grammar, sees words as extremely rich in grammatical information and as playing a key role in determining the syntactic shape of the sentences in which they occur. This is the sense of head-driven in the expression Head-driven Phrase Structure Grammar. The concept of the structure of the phrase in HPSG is that the head of a given phrase, such as a noun phrase or a verb phrase (i.e. the single word - the noun, the verb etc.
- around which it is built), has attributes out of which crucial properties of the surrounding syntax are derived. For example, the lexical entry for the verb bakes would have to specify that it takes a subject noun, that it may also take a direct object noun and that where both a subject and a direct object are involved the relation between them is that of agent (doer of an action) and patient (undergoer of an action). Accordingly, the head of the verb phrase components of the following sentences determines the legitimacy of the nouns present in the sentence, and also determines their grammatical functions and the relations between them.

Joanna [bakes]. (VERB PHRASE: bakes; HEAD (VERB): bakes)
Joanna [bakes bread]. (VERB PHRASE: bakes bread; HEAD (VERB): bakes)

2.3 The 'London School' perspective

The COBUILD project took its inspiration from the work of a mid-twentieth-century British linguist, J.R. Firth, founder of the so-called 'London School' of linguistics, who took the view that the meaning of a word could be equated with the sum of its linguistic environments, and that, therefore, linguists could essentially find out what they needed to know about a word's meaning by exhaustively analysing its collocations. Firth's general approach to the study of language continues to have echoes in modern linguistics through the work of eminent heirs to the 'London School' tradition such as John Sinclair, the leading light in the COBUILD project, and Michael Halliday. We have already begun to look at Sinclair's work and shall return to it in Chapter 4. With regard to Halliday and his followers, they see lexis and syntax not as separate entities but rather as merely different ends of the same continuum, which they label the lexicogrammar. In the Hallidayan perspective, a lexical distinction such as that between man and woman is seen in terms of the different environments in which they are likely to occur, just as the distinction between, for instance, a count or countable noun (e.g. dog) and a mass noun (e.g. mud) is seen in terms of the different syntactic frames in which these categories can occur. Thus, a count noun can occur after numerals (She has two dogs. He drank three litres of water.) and after quantifiers like several and many (The child had to have several stitches. We've visited Ireland on many occasions.), whereas a mass noun cannot occur after numerals nor after several/many, but can occur after a quantifier such as much (There was too much mud and not enough grass for a decent game. We didn't get much enjoyment out of it.).
Similarly, man but not woman can occur in the close company of a word like prostate (in a sentence such as The poor man had prostate problems.), while woman but not man can occur in the close company of a word like pregnant (in a sentence such as That woman is pregnant.). One argument that has been put against Halliday's contention that syntactic distinctions are not qualitatively different from lexical distinctions is that, whereas syntax is a 'purely' linguistic phenomenon, lexical distinctions are based on the nature of the real world. For example, one can argue that the fact that we do not juxtapose man and pregnant has simply to do with the limitations of male physiology. However, it is also possible to argue that syntactic categories and processes are the way they are at least in part because of how things are in the world. To return to the case of the distinction between count nouns and mass nouns, for instance, it would be perfectly plausible to



say that the reason why we do not normally put numerals in front of words like mud, air, enjoyment, darkness etc. is that the very nature of the substances or experiences to which they refer encourages a perception of them as continuous wholes rather than individual entities, a notion which receives support from the fact that across languages where the count/mass distinction exists, while there are certainly many differences in the detail of classification, the same kinds of substances and experiences tend to be referred to with mass nouns. For example, the translation-equivalents of mud and air in French (boue, air), German (Schlamm, Luft), Spanish (barro, aire) and Modern Greek (λάσπη, αέρας) all (in those senses) normally function as mass nouns. The most sensible position would seem to be that the nature of both the lexicon and the syntax of any given language is determined by an interaction between extra-linguistic reality (the way things are 'outside language') and intra-linguistic reality (the way things are 'inside language').


2.4 The Valency Grammar perspective

Valency Grammar is particularly associated, historically, with German linguistics, but it has a wide influence on thinking about grammar generally. The term valency in this context derives from its application in chemistry, where a given element's valency is defined in terms of its capacity to combine with other elements. In linguistics valency refers to the number and types of bonds syntactic elements form with each other. Valency Grammar traditionally presents the verb as the fundamental or central element of the sentence and focuses on the relationship between the verb and the elements which depend on it (which are known as its arguments, expressions, complements or valents). The relevance of Valency Grammar in the present discussion is that it recognizes the shape of sentence structure as a consequence of lexical choice, that is, the choice of a particular verb with a particular valency. Some examples of verb valencies follow.

Exist, snore, vanish
Verbs like these require only a subject.
Poverty exists.
He was snoring.
The problem vanished.
In traditional terms they are labelled intransitive. In valency terminology they are said to be monovalent, having a valency of 1.

Annoy, damage, scrutinize
Verbs like these require both a subject and a direct object.
You annoy me.
The storm damaged the sea-wall.
We have scrutinized the documents.
In traditional terms they are labelled transitive. In valency terminology they are said to be bivalent, having a valency of 2.

Bestow, give, inform
Verbs like these require a subject, a direct object and one further valent.
The king bestowed a knighthood on him.
Jeremy gave the parcel to his aunt.
The police informed Jack of Jill's safe return.
In traditional terms they are labelled ditransitive. In valency terminology they are said to be trivalent, having a valency of 3.

As has been mentioned, traditionally the notion of valency has been applied to verbs. However, a number of recent approaches to grammar, which take much of their inspiration from Valency Grammar and which are grouped together under the general heading of Dependency Grammar, extend the basic valency idea to other lexical categories such as adjectives and nouns. It is clear, for example, that the valency of the adjective tall (which can 'stand alone' in qualifying a noun) differs from that of the adjective susceptible (which requires something further):

The professor is tall.
The professor is susceptible to pressure.

Similarly with the nouns problem and propensity:

He has a problem.
He has a propensity to violence.
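The idea that a verb's valency fixes how many dependent elements a sentence must supply can be sketched as a simple lookup. The table below records only the valencies stated in this section; the function names and the 'exact match' test are assumptions made for the illustration, and the sketch ignores the type of each valent and any optional elements.

```python
# Valencies as given in the text: 1 = monovalent, 2 = bivalent, 3 = trivalent.
VALENCY = {
    "exist": 1, "snore": 1, "vanish": 1,
    "annoy": 2, "damage": 2, "scrutinize": 2,
    "bestow": 3, "give": 3, "inform": 3,
}
LABELS = {1: "monovalent", 2: "bivalent", 3: "trivalent"}


def label(verb):
    """Return the valency label for a verb listed in the table."""
    return LABELS[VALENCY[verb]]


def is_saturated(verb, valents):
    """True if the number of valents supplied matches the verb's valency."""
    return len(valents) == VALENCY[verb]


if __name__ == "__main__":
    print(label("give"), is_saturated("give", ["Jeremy", "the parcel", "his aunt"]))
    print(label("annoy"), is_saturated("annoy", ["you"]))  # direct object missing
```

On this view the sentence frame is read off the verb's entry, which is the sense in which sentence structure is a consequence of lexical choice.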


2.5 The Lexical-Functional perspective

Lexical-Functional Grammar (LFG) developed in the 1980s as a kind of offshoot of the Chomskyan approach to syntax - one which attempted to bring the theoretical and descriptive treatment of syntax closer to what was known about the psychological processes involved in producing and understanding utterances. As its name suggests, Lexical-Functional Grammar places the lexicon right at the heart of its account of syntax. In LFG every item in the lexicon is seen as coming equipped not only with indicators of how it sounds, how it is written and what it means but also with indicators of the roles of the elements to which it relates in a given sentence (its argument structure) and of the grammatical functions assigned to



these roles. For example, the verbs walk and stroke can be portrayed within this framework as follows:

walk
(subject) - assignment of grammatical function
(agent) - argument structure

stroke
(subject) (object) - assignment of grammatical function
(agent, theme) - argument structure

In walk the argument structure consists merely of an agent argument (the role of doer of an action) which is associated with the subject of the verb, as in:

Eric was walking.
Eric: subject of the verb walk; agent - doer of the walking

In stroke the argument structure consists of an agent argument (the role of doer of an action) which is associated with the subject of the verb and a patient or theme argument (the role of undergoer of the action) associated with the object of the verb, as in:

Jill stroked the cat.
Jill: subject of the verb stroke; agent - doer of the stroking
the cat: object of the verb stroke; theme - undergoer of the stroking

Thus, like HPSG, Valency Grammar and the various forms of Dependency Grammar, LFG presents lexical choice as the shaper of the syntax of any given sentence. A sentence is seen as involving lexical structure, constituent structure (or c-structure) and functional structure (or f-structure). Because each lexical element of a sentence is held to specify an argument structure, the lexical structure of the sentence is seen as determining its constituent structure (the component parts which make up the sentence and how these component parts relate to each other); and, because the various roles (agent, theme etc.) attached to particular lexical items are viewed as associated with grammatical functions (subject, object etc.), functional structure too is seen as dependent on lexical structure.
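The pairing of argument structure with grammatical functions in the entries for walk and stroke can be sketched as follows. The dictionary layout and the analyse function are inventions for this illustration, not LFG notation; the sketch shows only how roles can be read off a lexical entry once the subject and object of a sentence are known.

```python
# Toy lexical entries: grammatical functions paired, in order, with roles.
WALK = {"form": "walk", "functions": ("subject",), "roles": ("agent",)}
STROKE = {"form": "stroke", "functions": ("subject", "object"),
          "roles": ("agent", "theme")}


def analyse(entry, fillers):
    """Map each grammatical function to its (filler, role) pair.

    `fillers` maps function names to the phrases filling them; it must
    supply exactly the functions the lexical entry calls for.
    """
    if set(fillers) != set(entry["functions"]):
        raise ValueError(f"{entry['form']} needs {entry['functions']}")
    return {fn: (fillers[fn], role)
            for fn, role in zip(entry["functions"], entry["roles"])}


if __name__ == "__main__":
    print(analyse(WALK, {"subject": "Eric"}))
    print(analyse(STROKE, {"subject": "Jill", "object": "the cat"}))
```

Supplying an object to walk, or omitting one from stroke, is rejected by the entry itself, which mirrors the claim that lexical choice shapes the syntax of the sentence.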


2.6 The Chomskyan perspective

We come finally to what would until fairly recently have been considered the most syntactic of syntactic models, namely that which is associated with the name of Noam Chomsky. In the very earliest version of the Chomskyan model the lexicon was not recognized as an autonomous component at all; words were considered to be merely the observable elements through which



syntax manifested itself - the outward signs of inward syntax, to borrow a theological expression. However, the evolution of Chomskyan linguistics since its beginnings in the 1950s has consistently been in the direction of ascribing more and more importance to the lexicon. Phenomena which had previously been represented as purely syntactic processes came increasingly to be treated with reference to the lexicon. A good example of this is the case of passivization. In Chomsky's first book, Syntactic Structures (published 1957), syntax was represented as generated in the first place by phrase structure rules of the type:

S → NP + VP [A SENTENCE CONSISTS OF A NOUN PHRASE AND A VERB PHRASE]


The basic or kernel structures which were specified by the phrase structure rules were then subject to various kinds of transformation, including passive transformation. The passive transformation rule looked roughly like this:

ACTIVE SENTENCE: NP1 + V [VERB] + NP2
PASSIVE SENTENCE: NP2 + be + V [VERB] + by + NP1

This can be exemplified as follows:

ACTIVE SENTENCE: The man hit the ball (NP1 + V + NP2)
PASSIVE SENTENCE: The ball was hit by the man (NP2 + be + V + by + NP1)
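As a rough illustration of how purely structural this early rule is, it can be written as a function over pre-segmented constituents. The function name passivize and its parameters are hypothetical, and the sketch assumes the caller supplies the past participle and the form of be, since the rule itself says nothing about how those forms are chosen.

```python
def passivize(np1, participle, np2, be_form="was"):
    """Apply NP1 + V + NP2 -> NP2 + be + V + by + NP1 to ready-made constituents."""
    # The rule only reorders constituents and inserts 'be' and 'by';
    # it consults no lexical information about the verb.
    sentence = " ".join([np2, be_form, participle, "by", np1])
    return sentence[0].upper() + sentence[1:]


print(passivize("the man", "hit", "the ball"))
```

Because nothing in the function looks at the particular verb involved, it would apply just as readily where passivization is in fact doubtful, which is the weakness of a purely syntactic treatment.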

It later came to be recognized by Chomskyans, however, that passivization was not something that could be dealt with simply in terms of a syntactic generalization. Such a generalization might run something like the following: NP2 in the active sentence moves to NP1 position in the passivized sentence and vice versa. This would explain how we get: A picture was taken of Brett by the official photographer from: The official photographer (NP1) took a picture (NP2) of Brett (NP3). It also accounts for the questionable status of ?Brett was taken a picture of by the official photographer, where the noun phrase which is 'moved' to subject position is NP3 in the corresponding active sentence (i.e. not the direct object). However, in some cases non-direct objects can be 'moved' to subject position in passive sentences. For example, all three of the following



sentences are perfectly acceptable, even though in the third sentence John is NP3, and does not represent the direct object of the active version of the sentence but rather the object of a preposition.

They (NP1) took advantage (NP2) of John (NP3).
Advantage was taken of John.
John was taken advantage of.

The only way to explain this seemed to be in terms of a lexical restructuring rule which would allow certain whole expressions like take advantage of optionally to be restructured as a sort of complex transitive verb. Optional restructuring of this kind turns out to be highly idiosyncratic; thus, it works perfectly with take care of (e.g. The job was taken care of) but not so well with take offence at (?The chairman's remarks were taken offence at). Accordingly, specific lexical choice can be seen to determine the possibility or otherwise of lexical restructuring, which in turn determines the permissibility of certain kinds of passivization.

By the early 1980s the lexicon was being seen as having a crucial influence on syntactic structure. The so-called 'Projection Principle' of the 'Government and Binding' (GB) version of Chomskyan syntax current in the 1980s states that the properties of lexical entries 'project onto' the syntax of the sentence - which essentially coincides with the perspective of HPSG, Valency Grammar and LFG in the matter of the lexis-syntax interface.

The Projection Principle can be illustrated as follows. As we have seen, in early versions of Chomsky's model, the phrase structure component of the syntax fully specified the basic constituents of the sentence. Thus, for example:

S → NP + VP [A SENTENCE CONSISTS OF A NOUN PHRASE AND A VERB PHRASE]


In this version of things the expansion of the VP element was:

VP → V (+ NP) [A VERB PHRASE ALWAYS CONTAINS A VERB AND MAY OR MAY NOT INCLUDE A NOUN PHRASE]

On the other hand, the lexical entries for verbs specified whether or not they could be followed by a noun phrase. For example, the entry for a transitive verb such as hit would have contained the information:


The entry for an intransitive verb like snooze, on the other hand, would not have contained the specification of this particular environment.
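A minimal sketch may make the role of such lexical entries concrete. It assumes, as a simplification, that each verb's entry records a single yes/no fact about whether a noun phrase follows it; the names TAKES_NP and build_vp are hypothetical.

```python
# Hypothetical toy lexicon: True means the verb is followed by a noun phrase.
TAKES_NP = {"hit": True, "snooze": False}


def build_vp(verb, np=None):
    """Build a verb phrase, letting the verb's lexical entry license the NP."""
    takes_np = TAKES_NP[verb]
    if takes_np and np is None:
        raise ValueError(f"{verb!r} must be followed by a noun phrase")
    if not takes_np and np is not None:
        raise ValueError(f"{verb!r} cannot be followed by a noun phrase")
    return [verb] if np is None else [verb, np]


print(build_vp("hit", "the ball"))
print(build_vp("snooze"))
```

Once the lexicon carries this information verb by verb, a separate general statement that the noun phrase is optional adds nothing.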



Accordingly, there is duplication between the information provided by the phrase structure rules and that provided by the lexicon. If we take it that, as the Projection Principle states, lexical properties intervene in the shaping of syntax, then the notion of having a general statement at the syntactic level about the optionality of occurrence of a noun phrase in the verb phrase no longer makes sense, since the lexicon supplies the information for each particular verb as to whether or not it may be followed by a noun phrase.

Subsequent developments in Chomskyan linguistics went even further in a lexicalist direction. One of the major distinctive features of Chomsky's view of language is that every human being is born with a language faculty, and that it is this language faculty which enables the child to acquire language. A fundamental corollary of this view is that human languages are essentially structured along the same lines, lines which reflect the structure of the language faculty. If this were not the case, clearly, the notion of a language faculty would be unable to explain the fact that human children will acquire any human language to which they are given adequate exposure. Chomsky labels the structural common core of languages which he posits Universal Grammar.

According to the Chomskyan model of the 1980s, Universal Grammar consists of, on the one hand, a set of principles, applicable to all languages, and, on the other, a set of parameters, that vary from language to language within very specific limits. An example of a principle has already been given, namely the Projection Principle (see above). An example of a parameter is the Head Parameter, which states, basically, that within a particular phrase (prepositional phrase, verb phrase etc.) the head of the phrase (preposition, verb etc.) occurs consistently either to the left or to the right of the other elements (the complement).
Thus, English is said to be a 'head-first' language on the basis of data such as: PREPOSITIONAL PHRASE


(Preposition head to the left of its complement in a prepositional phrase)


[am Japanese]

(Verb to the left of its complement in a verb phrase)


Japanese, on the other hand, is said to be a 'head-last' language on the basis of data such as:

PREPOSITIONAL PHRASE
[Nihon ni] (Preposition head to the right of its complement in a prepositional phrase) [literally, 'Japan in']

VERB PHRASE
[nihonjin desu] (Verb head to the right of its complement in a verb phrase) [literally, 'Japanese am']

However, towards the end of the 1980s it began to be suggested that parameters were not properties of principles, but rather properties of individual lexical items, a view known as the lexical parameterization hypothesis. Let us look briefly at some evidence bearing on the lexical parameterization hypothesis from English and German prepositional phrases. Both English and German would be considered 'head-first' languages on the basis of the positioning of heads in prepositional (and other) phrases. For example:

ENGLISH PREPOSITIONAL PHRASE

[in Germany] (Preposition head - in - to the left of its complement)
[with me] (Preposition head - with - to the left of its complement)

GERMAN PREPOSITIONAL PHRASE
[in Deutschland] (Preposition head - in - to the left of its complement) ['in Germany']
[mit mir] (Preposition head - mit - to the left of its complement) ['with me']

However, in both languages there are, in fact, prepositions which may occur to the right of their complements:

ENGLISH PREPOSITIONAL PHRASE

[your objection notwithstanding] (Preposition head - notwithstanding - to the right of its complement)

GERMAN PREPOSITIONAL PHRASE
[der Schule gegenüber] (Preposition head - gegenüber - to the right of its complement) ['opposite the school'; literally, 'the school opposite']

Examples such as these last two seem to indicate that the positioning of heads of prepositional phrases is not something which is set globally for all cases within a given language, but rather, as the lexical parameterization hypothesis suggests, that such positioning is determined by the lexical properties of each particular preposition.

The lexicalizing tendency in Chomskyan linguistics reaches its logical conclusion in Chomsky's latest version of his model, the 'Minimalist Programme'. In this model the whole process of deriving a syntactic structure is represented as beginning in the lexicon, since Chomsky and his followers now accept, alongside many other schools of linguists (see earlier sections), that the particular lexical elements which are selected in any given sentence will be the principal determinants of both the content and the form of the sentence. The minimalism of the Minimalist Programme refers precisely to the fact that syntactic levels and operations are in this model reduced to an absolute minimum, with many of the most familiar features of earlier models being discarded, while the lexicon is viewed as driving the entire structure-building system. Thus, for example, instead of the syntactic rules beginning with sentence-level and then filling in what the sentence consists of, as, for example, in S → NP + VP, the minimalist model begins by building individual structures around individual lexical items and then merges these individual, lexically based structures into larger structures.
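The lexical parameterization idea discussed in this section, that head position is recorded for each preposition rather than set once for a whole language, can be sketched as follows. The table HEAD_POSITION and the function build_pp are hypothetical names, and the sketch simplifies by giving each preposition a single fixed position, whereas the text says only that certain prepositions may follow their complements.

```python
# Hypothetical per-item settings: where each preposition stands
# relative to its complement.
HEAD_POSITION = {
    "in": "left",
    "with": "left",
    "mit": "left",
    "notwithstanding": "right",
    "gegenüber": "right",
}


def build_pp(preposition, complement):
    """Order head and complement according to the preposition's own lexical entry."""
    if HEAD_POSITION[preposition] == "left":
        return f"{preposition} {complement}"
    return f"{complement} {preposition}"


print(build_pp("in", "Germany"))
print(build_pp("notwithstanding", "your objection"))
print(build_pp("gegenüber", "der Schule"))
```

No language-wide switch appears anywhere in the sketch; the ordering falls out of the individual entries.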

2.7 Summary

This chapter has shown that syntacticians from a very wide range of theoretical traditions view the lexicon as having a vital, determining role in the structuring of sentences. In some instances, for example 'London School' linguistics and Valency Grammar, the interpenetration of lexis and syntax was recognized from the outset; in others, for example in computational linguistics, the acknowledgment of such interpenetration was an inevitable inference arising from working with the 'nitty-gritty' of data; and in still others, for example the later Chomskyan models, increasing importance was attributed to the lexicon in respect of syntactic structure as the models in question developed in response to their perceived inadequacies. The consensus across all of the above schools (and many others) is that the syntactic



shape of any sentence is very largely a function of the properties of the lexical elements out of which it is composed.

Sources and suggestions for further reading See 2.2. The LADL examples were taken from B. Lamiroy, 'Ou en sont les rapports entre les etudes de lexique et la syntaxe?' (Travaux de Linguistique 23, 1991, 133-9). The account of the lexico-syntactic findings of the COBUILD project was based on Chapter 4 of J. Sinclair, Corpus, concordance, collocation (Oxford: Oxford University Press, 1991). Sources for the discussion of Head-Driven Phrase Structure Grammar were: R. D. Borsley, Modern phrase structure grammar (Oxford: Blackwell, 1996); C. Pollard and I. A. Sag, Head-driven phrase structure (Chicago: University of Chicago Press and Stanford: CSLI Publications, 1994); I. A. Sag and T. Wasow, Syntactic theory: a formal introduction (Stanford: CSLI Publications, in press). Also consulted in this connection was the website The bakes example was loosely derived from R.L. Humphreys, 'Lexicon in formal grammar' (in K. Brown and J. Miller (eds), Syntactic theory, Oxford: Elsevier Science, 1996). See 2.3. Discussion of M. Halliday's approach to the lexicogrammar was based on statements and arguments to be found in his very early work, such as 'Categories of the theory of grammar' (Word 1961, 17, 241-92), but also in his recent work - for example, Functional grammar (second edition, London: Edward Arnold, 1994). The counter-argument to the Hallidayan position came from G. Sampson, in Schools of linguistics: competition and evolution (London: Hutchinson, 1980, p. 233). See 2.4. The principal sources for the discussion of Valency Grammar were D. J. Allerton's book Valency and the English verb (London: Academic Press, 1982), and his article 'Valency grammar', in E. F. K. Koerner and R. E. Asher (eds), Concise history of the language sciences (Oxford: Kidlington, 1995). Material on a broad range of Dependency Grammar approaches was also consulted at the website See 2.5. Sources for the section on Lexical-Functional Grammar included: J. Bresnan and R. 
Kaplan, 'Introduction: grammars as mental representations of language', in J. Bresnan (ed.), The mental representation of grammatical relations, Cambridge, MA: MIT Press, 1982); S.C. Dik, Functional Grammar (Dordrecht: Foris, 1981); C. Neidle, 'Lexical-Functional Grammar', in K. Brown and J. Miller (eds), Syntactic theory, Oxford: Elsevier Science, 1996). The websites and http: // were also consulted.



See 2.6. The discussion of Chomskyan models drew on both editions of Chomsky's Universal Grammar: an introduction (first edition - authored by V. J. Cook - Oxford: Blackwell, 1988; second edition - co-authored by V. J. Cook and M. Newson - Oxford: Blackwell, 1996) and on A. Radford's books, Transformational syntax (Cambridge: Cambridge University Press, 1981), Syntax: a Minimalist introduction (Cambridge: Cambridge University Press, 1997), Syntactic theory and the structure of English: a minimalist approach (Cambridge: Cambridge University Press, 1997). The relevant original N. Chomsky sources were: Syntactic structures (The Hague: Mouton, 1957), Lectures on Government and Binding (Dordrecht: Foris, 1981) and The Minimalist Program (Cambridge, MA: MIT Press, 1995). The lexical parameterization hypothesis was the brainchild of P. Wexler and R. Manzini ('Parameters and learnability in Binding Theory', in T. Roeper and E. Williams (eds), Parameter setting, Dordrecht: Foris, 1987). Readers who wish to explore syntax further might profitably begin with one or other of the following: L. Thomas, Beginning syntax (Oxford: Blackwell, 1994); C.L. Baker, English syntax (Cambridge, MA: MIT Press, 1995). Good introductions to specifically Chomskyan syntax are provided by: V. J. Cook and M. Newson, Chomsky's Universal Grammar: an introduction (Oxford: Blackwell, 1996); A. Radford, Syntax: a minimalist introduction (Cambridge: Cambridge University Press, 1997); L. P. Shapiro, 'Tutorial: an introduction to syntax' (journal of Speech, Language and Hearing Research, 40, 1997, 254-72). Further discussion of the interface between the lexicon and syntax can be found in: R. L. Humphreys, 'Lexicon in formal grammar', in K. Brown and J. Miller (eds), Syntactic theory, Oxford: Elsevier Science, 1996); T. Stowell, 'The role of the lexicon in syntactic theory', in T. Stowell and E. Wehrli (eds), Syntax and semantics. Volume 26. Syntax and the lexicon, San Diego: Academic Press, 1992); T. 
Stowell and E. Wehrli, 'Introduction', in T. Stowell and E. Wehrli (eds), Syntax and semantics. Volume 26. Syntax and the lexicon, San Diego: Academic Press, 1992).

Focusing questions/topics for discussion

1. At the beginning of the chapter the following examples were given of different patterns of verbal complementation: will eat [will + VERB]; wish to eat [wish + to + VERB]; intend to eat/intend eating [intend + to + VERB/intend + VERBing]; regret eating [regret + VERBing]; indulge in eating [indulge + in + VERBing]; refrain from eating [refrain + from + VERBing]. For each of the above cases try to find two verbs whose verbal complementation pattern parallels that of the example given (e.g. must follows the same pattern as will).

2. We saw in section 2.3 that lexical distinctions, like certain grammatical distinctions, can be seen in terms of elements with which lexical items may and may not co-occur. It is also true that co-occurrence possibilities are affected by context and by the meaning intended. Thus, with regard to the words man and woman, both of the following sentences would be unremarkable in Anglican contexts (where female as well as male priests are often encountered), but the second would be impossible in Roman Catholic or Greek Orthodox contexts (where female priests are never encountered).

The priest was a fine man.
The priest was a fine woman.

Similarly, in relation to countable and mass nouns, whereas not many waters would be perfectly possible in the context of a supermarket with only a limited range of brands of mineral water, not much water would have to be the expression used in the context of a reservoir whose water level had been affected by drought. Try to think of some other instances of contextual influence on co-occurrence possibilities.

3. In 2.4 we saw that Valency Grammar ascribes valencies to verbs according to the number of arguments they take. Thus intransitive verbs like exist, sleep, vanish are labelled monovalent, having a valency of 1, transitive verbs like annoy, damage, scrutinize are labelled bivalent, having a valency of 2, and ditransitive verbs like give, inform, characterize are labelled trivalent, having a valency of 3. Try to find five further verbs belonging to each of the above three valency categories.
In performing this exercise, note the way in which some verbs have different valencies in different contexts. For instance, a normally monovalent verb like dream may in some contexts be bivalent (e.g. He dreamt a strange dream). 4. On the basis of the discussion of HPSG in 2.1 and of LFG in 2.5, identify the agent and, where applicable, the patient in each of the following sentences, and use the information so obtained to specify the argument structure of the verb in each case.


The folk-dancers slapped their thighs.
John was working.
Every February we ski in the French Alps.
Mr McVeigh sliced the tomatoes very fine.
Christopher was tuning his guitar.

5. In the account of the development of the Chomskyan model in 2.6 we encountered the notion of parameter, which was illustrated by means of the Head Parameter. Another parameter discussed by Chomskyans is the Pro-Drop Parameter, which relates to whether or not subject pronouns may be 'dropped' before verbs. Thus Spanish is said to be a pro-drop language, on the basis that, for example, yo ['I'] in an expression like yo entiendo ['I understand'] may be, and usually is, 'dropped' - so that the usual way of saying 'I understand' in Spanish is simply entiendo. In French, on the other hand, the je ['I'] of je comprends ['I understand'] cannot be dropped, and on the basis of this and myriad similar examples, French is said to be a non-pro-drop language. Where, then, would English stand in relation to the Pro-Drop Parameter? Is there any variation in the 'droppability' of the subject pronoun depending on context and/or on the particular verb selected?

3 Lexis and morphology

3.1 The inner life of words

We saw in Chapter 1 that morphology is derived from a Greek word meaning 'form' or 'shape' and that it denotes the internal structure of words (and the study thereof). A given word is not necessarily just a sequence of sounds or letters with an overall, indivisible meaning and/or grammatical function; a word may be made up of a whole collection of meaningful components, of which some may in other contexts stand alone as words in their own right, and others may be used only as parts of words. Consider, for instance, the words fish, fishnet, fishes and fishy in the following sentences:

The enormous fish looked rather fearsome.
If you are going to dance the cancan you will need some fishnet stockings.
This is a story about three little fishes.
There's a fishy smell in here.

Fish is obviously a word which can stand alone in its own right. However, it can also be conjoined with other elements which can function as independent words - such as net. Furthermore, it can also be combined with elements which have no independent existence as words, but which clearly have meaning and function. Thus the -es ending in fishes signals that more than one fish is involved and the -y ending in fishy turns the noun fish into an adjective. Similarly in other languages: for example, the German translations of fish, fishnet and fishes are respectively Fisch, Fischnetz (Fisch + Netz - both words in their own right) and Fische (Fisch + the non-word plural ending -e).

The basic building blocks of meaning and grammar are not, therefore, words but rather the irreducible components out of which words are composed - that is to say, elements which cannot be further decomposed into anything relevant to their meaning or grammatical function. These irreducible entities are known as morphemes. In this chapter we shall examine how morphemes function, how they are customarily classified and how they relate to the lexicon.
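The decomposition of the fish words discussed in this section can be mimicked with a very small segmenter. The sketch below is a toy under stated assumptions: the inventories FREE and BOUND_SUFFIXES contain only the items mentioned in this section, the function name segment is hypothetical, and a word is assumed to consist of a free morpheme optionally followed by one further morpheme.

```python
# Toy inventories containing only the morphemes discussed in this section.
FREE = {"fish", "net"}
BOUND_SUFFIXES = {"es", "y"}


def segment(word):
    """Split a word into a free morpheme plus, optionally, one further morpheme."""
    if word in FREE:
        return [(word, "free")]
    for i in range(1, len(word)):
        base, rest = word[:i], word[i:]
        if base in FREE and rest in FREE:
            return [(base, "free"), (rest, "free")]
        if base in FREE and rest in BOUND_SUFFIXES:
            return [(base, "free"), ("-" + rest, "bound")]
    raise ValueError(f"cannot segment {word!r} with this toy inventory")


for word in ("fish", "fishnet", "fishes", "fishy"):
    print(word, segment(word))
```

A realistic analysis would of course need far more than string matching, since spelling alone does not determine where morpheme boundaries fall.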



3.2 Morphemes and allomorphs

Morphemes can be defined as the smallest elements of any language which have semantic and/or grammatical significance. As we have seen, some morphemes are also whole words - fish, for example - whereas others are units below the level of the word which nevertheless have their own meaning and/or grammatical function in the context of the words in which they occur, for example -es in fishes, -y in fishy. Morphemes which can stand alone as words are known as free morphemes while morphemes which can only be meaningful or functional as parts of words are known as bound morphemes. Bound morphemes very often manifest themselves as prefixes - elements attached at the beginnings of words (e.g. dis- as in disobey) - or as suffixes - elements attached at the ends of words (e.g. -ize as in idolize). Prefixes and suffixes are known collectively as affixes. Other kinds of affixes to be found in the world's languages include infixes and circumfixes. Infixes are elements attached within the free morpheme bases of words; for example in Bontoc, a language of the Philippines, the infix -um- makes a verb out of an adjective or noun (thus, fikas - 'strong', fumikas - 'to be strong'). Circumfixes are elements which 'surround' the relevant base; for instance, in Chickasaw, a Native American language, the negative is formed by the alteration of the base both fore and aft (thus, lakna - 'it is hot', iklakno - 'it is not hot'). As we shall see later, bound morphemes may also sometimes be represented by nothing at all in the outer forms of words, and they may also be expressed as processes rather than or as well as additions. Some further examples of free and bound morphemes from English and French follow.

ENGLISH
BOUND MORPHEMES
pre- (as in predispose)
-ize (as in idolize)
-ing (as in sailing)
-s (as in considers)

FRENCH
FREE MORPHEMES
pomme ('apple')
jaune ('yellow')
là ('there')
sur ('on')
BOUND MORPHEMES
ré- (as in réinventer - 'to reinvent')
-ment (as in poliment - 'politely')
-ions (as in parlions - '(we) talked')
-s (as in vins - 'wines')

A morpheme may be realized by different forms - its morphemic alternants or allomorphs - according to the particular environment in which it occurs.

A morpheme may be realized by different forms - its morphemic alternants or allomorphs - according to the particular environment in which it occurs.



For instance, the English past tense morpheme may be realized as /ɪd/ (as in wanted), as /d/ (as in stayed), as /t/ (as in jumped), and in other ways besides. An example of morphemic alternation from Italian is the case of the free morpheme a, which means 'to', 'at' or 'in' (depending on context). A is the form used before words beginning with consonants, but where the following word begins with a vowel the form ad is used - as is illustrated by the sentences below.

Andiamo a Milano.

('We are going to Milan')

Andiamo ad Atene.

('We are going to Athens')
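The Italian alternation just described is conditioned entirely by the first sound of the following word, so it can be sketched as a simple selection function. The function name italian_a is hypothetical, and the sketch works on spelling rather than sound, which is only an approximation of the rule as stated.

```python
VOWEL_LETTERS = "aeiouàèéìòù"


def italian_a(next_word):
    """Choose between the allomorphs 'a' and 'ad' from the following word's first letter."""
    return "ad" if next_word[0].lower() in VOWEL_LETTERS else "a"


for city in ("Milano", "Atene"):
    print(f"Andiamo {italian_a(city)} {city}.")
```

The same morpheme is involved in both outputs; only its shape changes with the environment.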

A not dissimilar example from French - in this case involving a 'zero allomorph' - is the way in which the bound plural morpheme attached to nouns behaves: when followed by a word beginning with a consonant it is not pronounced, whereas when followed by a vowel sound it may or may not be realized as /z/ - depending on speech style and tempo. For instance the s in the written form of the French word for 'cars' - voitures - is silent in the first of the sentences below but may be pronounced as /z/ in the second if the speech is fluent rather than halting and if the speech style is relatively formal.

Les voitures ralentissaient.

('The cars were slowing down.')

Les voitures allaient très vite.

('The cars were going very fast.')


3.3 'Lexical' morphology and inflectional morphology

Since, as we have seen, morphology in general relates to the structure of words, it would be not unreasonable to conclude that all morphology is lexical. However, many morphologists are inclined to make a distinction between morphological phenomena that have to do with word formation - to which specifically they attach the term lexical morphology - and aspects of morphology which have rather to do with the grammatical modification of words - which they label inflectional morphology. An example of lexical morphology on this definition would be the addition of the bound morpheme -ness to the adjective kind to form the abstract noun kindness, and an example of inflectional morphology would be the addition of the bound morpheme -s to the verb run when that verb is preceded by he, she or it in its present tense.

'Lexical' morphology, as defined above, can itself be seen as divisible into two further subcategories: composition or compounding on the one hand and derivation on the other. Composition/compounding is customarily applied to instances of word formation where the formation process involves free morphemes. For example, the free morpheme light and the free morpheme house combine to form the compound word lighthouse, which draws on the meanings of its component morphemes but which has a specific meaning all of its



own. Derivation, for its part, is applied to instances of word formation in which bound morphemes play a role. For example, the word unwise is a different word with a different meaning from the word wise, and it is formed simply by the prefixing of the bound morpheme un- to the free morpheme wise. Some further examples (from English and Dutch) of words formed by composition/compounding and words formed by derivation are given below.

ENGLISH
DERIVATION
payment (bound derivational morpheme: -ment)
enrage (bound derivational morpheme: en-)
smallish (bound derivational morpheme: -ish)
swiftly (bound derivational morpheme: -ly)

DUTCH
COMPOSITION/COMPOUNDING
zeeman ('seaman')
lichtbruin ('light brown')
eenmaal ('one time')
welkom ('welcome')
DERIVATION
grootheid ('greatness'; bound derivational morpheme: -heid)
verhuizen ('to move house'; bound derivational morphemes: ver-, -en)
katje ('little cat', 'kitten'; bound derivational morpheme: -je)
onwel ('unwell'; bound derivational morpheme: on-)

As can be seen from these examples, bound morphemes involved in derivation may or may not change the grammatical class of the free morphemes to which they are attached. Thus, the addition of the bound derivational morpheme -ment to the verb pay yields the noun payment (cf. arrange/arrangement, excite/excitement, resent/resentment), whereas, on the other hand, the addition of the bound derivational morpheme -ish to the adjective small yields another, different, adjective smallish (cf. grey/greyish, slow/slowish, warm/warmish). With regard to inflectional morphology, this can be further exemplified - again from English and Dutch - as follows.

INFLECTIONAL MORPHOLOGY

ENGLISH
trees (as in The trees are lovely; bound inflectional morpheme: -s)
recognizes (as in She recognizes me; bound inflectional morpheme: -s)
lifted (as in Jill lifted her head; bound inflectional morpheme: -ed)
vaccinating (as in The doctor was vaccinating the ten-year-olds; bound inflectional morpheme: -ing)

DUTCH
boeken ('books', as in De boeken zijn thuis - 'The books are at home'; bound inflectional morpheme: -en)
koopt ('buys', as in Hij koopt zijn krant in de winkel - 'He buys his newspaper in the shop'; bound inflectional morpheme: -t)
kookte ('boiled', as in Het water kookte - 'The water boiled'; bound inflectional morpheme: -te)
goede ('good', as in Mijn vader was een goede man - 'My father was a good man'; bound inflectional morpheme: -e; compare Mijn vader was goed - 'My father was good')

As the above examples illustrate, inflectional morphemes are not involved in word formation, and they never change the actual grammatical category of the free morphemes to which they are attached. Rather, they make small adjustments to words which have important grammatical consequences - signalling, for instance, tense, person and number in verbs, and number and grammatical case in nouns. The following examples, from various languages, provide further illustration of these various roles.

Present-past distinction in German:
er lebt ('he lives') vs. er lebte ('he lived')

First person-second person distinction in Spanish:
regreso ('I return') vs. regresas ('you (sing.) return')
regresamos ('we return') vs. regresáis ('you (plur.) return')

Singular-plural distinction in French:
elle chantera ('she will sing') vs. elles chanteront ('they (fem.) will sing')

Singular-plural distinction in Swedish:
apelsin ('orange') vs. apelsiner ('oranges')

Nominative (subject case) vs. genitive (possessive case) in Modern Greek:
to nero (το νερό - 'the water') vs. tu neru (του νερού - 'of the water', 'the water's')



Because of the grammatical nature of their contribution to word-structure, and because, at first sight at least, they seem to be assignable by rule rather than dependent on the particularities of lexical items, inflectional morphemes have been considered by some linguists to lie outside the domain of the lexicon and to belong rather with the grammatical rules of a language. We have seen already, with regard to syntax, that making a hard and fast distinction between lexicon and grammar is no easy task. In the case of morphology, as will become clear in the next section, such a distinction makes no sense whatsoever.


3.4 Inflectional morphemes and the lexicon

A first very basic problem about a claim that derivational morphemes are lexicon-based while inflectional morphemes are not is that it is not at all clear in some instances whether a particular morpheme is derivational or inflectional. A commonly cited illustration of this problem is the case of the positive, comparative and superlative forms of adjectives in English, exemplified below: POSITIVE












On the one hand, this seems like a highly rule-governed pattern, and most native English speakers would probably think of, e.g., quicker and quickest as forms of quick rather than as different words - which seems to argue for the morphemes involved (-er and -est) being inflectional. On the other hand, the changes involved do not seem to involve fitting the adjectives to their grammatical environment in the same way that, for example, adding a plural ending to a noun does - which seems to argue for regarding the morphemes in question as derivational and as having the same kind of status as a morpheme like -ish (in, e.g., quickish).

Another problem in relation to making a hard and fast distinction between morphemes that are in the lexicon and morphemes that are supposedly excluded from it is that a particular morpheme may, in some contexts, have an inflectional role while having a derivational role in others. For example, the bound morpheme -ing is the marker of progressive aspect in English verbs. That is to say it marks a verb as referring to ongoing process or activity rather than a stable state or completed process or action, for example:

She is being awkward. (as opposed to: She is awkward.)



I was working. (as opposed to: I worked.)

In this kind of instance -ing has certainly to be seen as having an inflectional role. However, -ing can also be used in the formation of verbal nouns - just like derivational morphemes such as -ion and -ment, for example:

Two judges were responsible for the administering of the oath. (compare: Two judges were responsible for the administration of the oath.)
The deferring of the meeting had some unfortunate consequences. (compare: The deferment of the meeting had some unfortunate consequences.)

In this case -ing is clearly involved in word formation and has to be considered derivational. Are we going to say that -ing is sometimes supplied by the lexicon and sometimes not?

A third problem in this connection is that inflectional morphology does not conform to rules to anything like the extent that it is believed to. For example, the morphology of the pluralization of English nouns has some highly idiosyncratic features, as the examples listed below show. Indeed, there do not seem to be noticeably fewer divergences from the regular plural pattern in nouns than from the normal (derivational) pattern of adverb formation in English (ADJECTIVE + bound derivational morpheme -ly, for example: dark/darkly, delicate/delicately, spiteful/spitefully etc., but fast/fast). One would have thought that in both cases the lexicon would have to contain at least information as to whether a given word was subject to the normal pattern or not, and, if not, what the relevant particularities of its morphology were.

SINGULAR    PLURAL
formula     formulae (or formulas)
plateau     plateaux (or plateaus)
cherub      cherubim (or cherubs)
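The point that the lexicon must record, for each noun, whether it follows the normal plural pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary name IRREGULAR_PLURALS, the function name plural_forms and the simplified spelling rule are hypothetical, the listed nouns are those discussed in this section, and the rule covers written forms only (it ignores, for instance, nouns in -y).

```python
# Hypothetical mini-lexicon: only nouns that depart from the regular
# pattern need an entry; everything else falls through to the rule.
IRREGULAR_PLURALS = {
    "formula": ["formulae", "formulas"],
    "plateau": ["plateaux", "plateaus"],
    "cherub": ["cherubim", "cherubs"],
    "wife": ["wives"],
    "stimulus": ["stimuli"],
    "child": ["children"],
    "man": ["men"],
}

# Written endings after which the regular ending is spelled -es.
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def plural_forms(noun: str) -> list[str]:
    """Return the written plural form(s) of a noun (simplified)."""
    # Step 1: consult the lexicon for item-specific information.
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # Step 2: otherwise apply the regular + -(e)s pattern.
    if noun.endswith(SIBILANT_ENDINGS):
        return [noun + "es"]
    return [noun + "s"]


if __name__ == "__main__":
    for word in ["cat", "fox", "child", "formula"]:
        print(word, "->", " / ".join(plural_forms(word)))
```

The design mirrors the argument in the text: the regular rule alone cannot produce the right output, so even a so-called inflectional process has to look words up in a lexical store before it applies.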

Finally, on the question of whether or not inflectional morphemes are lexicon-based, let us consider the ways in which the addition of inflectional morphemes affects the free morphemes to which they are attached. In very many instances there is no perceptible effect at all, in spoken or written form. Thus, the addition of -ed to the base forms fill, jump, stay or want changes nothing in these base forms. On the other hand, as can be seen from the collection of English noun plurals above, there are plenty of other cases where noticeable alterations to the form of the free morpheme do occur. Sometimes there is just a slight change 'at the join', as it were. The plural of wife is a case in point; here the /f/ of the singular form is replaced by a /v/ in the plural form - wife (/waɪf/) → wives (/waɪvz/). Sometimes - especially in plurals borrowed from Greek and Latin - the last syllable of the singular form is replaced by one or more different syllables in the plural, as in: stimulus (/ˈstɪmjʊləs/) → stimuli (/ˈstɪmjʊlaɪ/). And sometimes pluralization occasions changes in the very core of the pluralized word. This is the case for children, where the plural is signalled not only by the attachment of the anomalous morpheme -ren but also by a change in the quality of the vowel sound (written i) from /aɪ/ in child to /ɪ/ in children. In man - men (/mæn/ → /men/) the change in the vowel sound in the core of the word from /æ/ to /e/ is the only way in which pluralization is signalled.

All of the plurals referred to in the preceding paragraph represent a challenge for linguists. As long as the morphemes they are dealing with are neatly sequential (jump + ed, cat + s, sing + ing etc.), and as long as the allomorphs of such morphemes resemble each other and are predictable from the phonological environment, morphologists can provide analyses which are relatively straightforward and concrete. In the cases of wives, children, men, stimuli and the like, however, the morpheme takes on a more abstract quality, and its allomorphs have to be treated partly or wholly in terms of processes rather than simply in terms of an added element whose variant forms are determined by phonological environment. Thus, the allomorph of the English noun plural morpheme represented in wives involves not only the addition of -s but also a process whereby the last consonant of the word changes from /f/ to /v/, which in turn means that the -s ending is pronounced as /z/. Similarly, in children, the allomorph involved is the anomalous ending -ren (/rən/) plus the process which changes /aɪ/ to /ɪ/ in child. In man - men



and stimulus - stimuli only a process allomorph is involved, the changes from /æ/ to /e/ and from /əs/ to /aɪ/ respectively. It is clear, then, that inflectional morphemes are not necessarily just elements that are tacked on to free morphemes without affecting the forms of those free morphemes.

Let us compare the case of inflectional morphology with that of derivational morphology in this connection. Some derivational morphemes are said to be 'neutral' in their phonological effects, that is, they do not change anything in the word to which they are attached. For example, the above-mentioned adverbializing morpheme -ly has no impact on the forms to which it is attached; thus, the warm element of warmly is constant in both words. Similarly, 'neutral' derivational morphemes include: -ment (as in pave - pavement), -ness (as in white - whiteness) and en- (as in cage - encage). Other derivational morphemes, on the other hand, are recognized as 'non-neutral' in their impact on the free morpheme base, and are accordingly thought of as at a different level of relationship with that base and as more deeply 'lexical'. For instance, the addition of -ic to meteor changes the stress pattern of this latter; thus: ˈmeteor, but meteˈoric. As for the derivational morpheme -ion, it causes all manner of changes in the base form - sometimes occasioning a shift in stress, sometimes bringing about changes in or additions to the sound segments making up the word to which -ion is attached, sometimes causing both, for example: admire - admiration










To return to the case of inflectional morphemes, these can be every bit as 'non-neutral' in the above sense in relation to the words to which they are affixed as can derivational morphemes. We have seen how the English noun plural morpheme may be expressed via significant alterations in the forms of the nouns pluralized. The morpheme which marks the simple past tense (what is sometimes known as the preterite) in English is associated with changes in the base forms of verbs which are no less far-reaching. Thus, alongside play - played, hope - hoped, want - wanted etc. we have present-preterite oppositions, such as: bear - bore












In sum, then, although in principle one can see the point of distinguishing between morphology which is involved in word formation and morphology which is not, it has always to be borne in mind that this distinction is by no means clear-cut. It also needs to be recognized that inflectional morphology is quirky and lexically determined in the same way that 'lexical' morphology is. Finally, it cannot be ignored that inflectional morphemes may have just as large an impact on the forms of words to which they are attached as 'lexical' morphemes. All in all, there seem to be absolutely no good grounds for suggesting that inflectional morphemes lie outside the domain of the lexicon; and to the extent that the term lexical morphology can be interpreted as implying that there is a morphology which is non-lexical, it needs to be treated with caution.



This chapter has explored the internal structure of words. It began by noting that the morphemes of which words are made up may be either free, that is, units that may stand alone as words in their own right, or bound, that is, units that can occur only as parts of words. The phenomenon of morphemic alternation - the way in which morphemes may be realized in varying ways - was also dealt with. The chapter then moved on to discuss the distinction between 'lexical' morphology - morphology involved in word formation - and inflectional morphology - morphology involved in fitting words to their grammatical environment. It was shown that the distinction between these two categories of morphology is not entirely clear-cut, that some morphemes sit astride the two categories, and that inflectional morphemes may have just as great an impact as 'lexical' morphemes on the base forms of words to which they are attached. In the light of these facts it was argued that there are no good grounds for considering one particular category of morphemes, i.e. inflectional morphemes, to lie outside the domain of the lexicon.

Sources and suggestions for further reading

The account of morphemes and allomorphs presented here draws on the broad tradition of received wisdom in morphological studies as represented in works such as: F. Katamba, Morphology (Basingstoke: Macmillan, 1993); J. Lyons, Introduction to theoretical linguistics (Cambridge: Cambridge University Press, 1968, Chapter 5); P. H. Matthews, Morphology: an introduction to the theory of word-structure (Cambridge: Cambridge University Press, 1974). See 3.2. The Bontoc and Chickasaw examples in 3.1 are taken from V. Fromkin and R. Rodman, An introduction to language (sixth edition, New York: Harcourt Brace, 1998, Chapter 3). The distinction between 'lexical'



morphology and inflectional morphology sketched follows that to be found in P. H. Matthews, Morphology: an introduction to the theory of word-structure (Cambridge: Cambridge University Press, 1974). Other works adopting a similar approach include F. Katamba, Morphology (Basingstoke: Macmillan, 1993). The notion that inflectional morphology is grammatical rather than lexical has a long history - dating back to Bloomfield (see L. Bloomfield, Language, New York: Holt, 1933, 274) and beyond. See 3.4. The ambivalence of the -ing morpheme is referred to by A. Akmajian, R. A. Demers, A. K. Farmer and R. M. Harnish in their book Linguistics: an introduction to language and communication (Cambridge, MA: MIT Press, 1990). The general difficulty of distinguishing between derivational and inflectional morphology is very widely acknowledged by linguists - even if at times a little grudgingly - see, for example, A. Spencer, Morphological theory: an introduction to word structure in generative grammar (Oxford: Blackwell, 1991). The concept of 'neutrality/non-neutrality' in the context of morphology is discussed by F. Katamba (Morphology, Basingstoke: Macmillan, 1993) and by J. Harris (English sound structure, Oxford: Blackwell, 1994); see also P. Kiparsky ('Word formation and the lexicon', in F. Ingemann (ed.), Proceedings of the 1982 Mid-America Linguistics Conference, Kansas: University of Kansas Press, 1983). The case in favour of the idea that all morphology is essentially lexically based is put by, among others, J. T. Jensen and M. Strong-Jensen (1984) 'Morphology is in the lexicon' (Linguistic Inquiry 15, 74-98). Arguments tending broadly in the same direction are also to be found in M. Aronoff and F. Anshen, 'Morphology and the lexicon: lexicalization and productivity', in A. Spencer and A. M. Zwicky (eds), The handbook of morphology (Oxford: Blackwell, 1998).
Accessible presentations of basic morphological concepts are to be found in a number of introductions to linguistics. Particularly recommended in this connection are: V. Fromkin and R. Rodman, An introduction to language (sixth edition, New York: Harcourt Brace, 1998); R. H. Robins, General linguistics: an introductory survey (London: Longman, 1989). Among works treating morphology in greater depth and which would be suitable for more advanced reading the following is recommended (in addition to the Matthews and Katamba volumes mentioned above): A. Carstairs-McCarthy, Current morphology (London: Routledge, 1991). For those looking for particular perspectives on morphology the following titles may be of interest: On the recent history of morphology: P. H. Matthews, Grammatical theory in the United States from Bloomfield to Chomsky (Cambridge: Cambridge University Press, 1993).



On cognitive dimensions of morphology: C. J. Hall, Morphology and the mind: a unified approach to explanation in linguistics (London: Routledge, 1992). On the role of morphology in the Chomskyan framework: A. Spencer, Morphological theory: an introduction to word structure in generative grammar (Oxford: Blackwell, 1991). Also to be noted is the very comprehensive work edited by A. Spencer and A. M. Zwicky, The handbook of morphology (Oxford: Blackwell, 1998).

Focusing questions/topics for discussion

1. In the following passage try to find five examples of free morphemes and five examples of bound morphemes.

I wanted to go to Philip's party, but Robert, my boyfriend, hates spending time around that particular crowd, so we decided instead to sample one of the latest delights on offer down at the local multiplex cinema. As it turned out, we both enjoyed the film in question, but we definitely did not appreciate the two guys sitting in front of us, who were very tall and very noisy.

2. Consider the following words. Try to identify which of them respectively exemplify compounding, derivation and inflection.

grapes




















3. In 3.3 we saw some examples of English noun plural forms which departed from the normal (+ -(e)s) pattern. Try to think of five further examples of such irregular noun plurals in English and try also to think of five examples of English past tense forms which do not conform to the normal (+ -ed) pattern. Also, reflect on the inflectional morphology of any language you know other than English and try to identify some examples of inflectional irregularity in this other language too.

4. Group the following words into those which show a 'neutral' impact on the base form on the part of the relevant bound morpheme and those which show a 'non-neutral' impact.

active






















5. The following words are often thought of as posing problems for morphologists. Can you say why this might be? Can you think of some further words that might pose similar kinds of problems? bilberry





















4 Lexical partnerships

4.1 Collocation: the togetherness factor

We saw in Chapter 1 and again in Chapter 2 that a great deal of what was traditionally seen as coming under the heading of grammar is now considered to be essentially lexical in nature. The basic point we noted was that once a particular lexical choice has been made in a given sentence, this choice has a major impact on the determination of what else may or may not occur in the sentence in question. In addition to this strongly determinant aspect of lexical choice there is also an effect in respect of word selection which is rather more probabilistic in nature. This latter effect has to do with the fact that - for a variety of reasons - particular words are frequently to be found in the company of certain other words. In such cases the selection of one or more of the words concerned in a given context is quite likely - or even very likely - to be accompanied by the selection of another word or other words from its habitual entourage. For instance, if a radio or television presenter uses the word breaking, we are anything but surprised if the word news follows. Similarly with:

at this moment in ... (time)
law and ... (order)
the Middle East peace ... (process)


As has already been indicated in earlier chapters, this phenomenon of words 'keeping company' together is referred to as collocation. Collocation comes from two Latin words, the word cum ('with') and the word locus ('place'). Words which form collocations are repeatedly 'placed with' each other; that is to say, they often co-occur within a short distance of each other in speech and in written texts. In this chapter we shall briefly explore the notion of collocational range, look at fixed expressions and compounds, examine the role of collocational information in traditional dictionaries, review some of the recent corpus-based research into collocations, consider the question of the extent to which language use is based on prefabricated multi-word



chunks, and discuss some of the implications of the collocation phenomenon for our understanding of the notion of lexical unit.


4.2 Collocational range

Even the most casual reflection on the way in which we put words together in the languages we know will lead us to an awareness of the fact that some words enter into a great number and variety of lexical partnerships, whereas other words are, as it were, a great deal more 'choosy' about the combinations they become involved with. At the many-partnered end of the scale is, for example, the English word nice. The list of items with which this word frequently co-occurs seems to be almost endless; the following is a tiny sample of the vast array of nice collocations: nice body, nice day, nice food, nice house, nice idea, nice job, nice manners, nice move, nice neighbourhood, nice person, nice time, nice weather. At the other end of the scale is the word addled, which in its literal sense of 'rotten' collocates only with egg(s), and which in its metaphorical sense of 'muddled' collocates only with words such as brain(s) and mind. The term used to refer to these different patterns of combinability is collocational range; thus, nice would be said to have a very wide collocational range, whereas addled would be said to have a very restricted collocational range.

One obvious issue that arises in the context of looking at collocational range - indeed in the context of collocational research generally - is how far away from each other two words can be in a piece of speech or writing and still be regarded as 'keeping company'. For example, taking the word garden as our starting point or node, which other words in the following sentences are to be considered as occurring close enough to garden to qualify as candidates for having a collocational relationship with that word?

They invited me to a garden party.
County Wicklow is sometimes called the Garden of Ireland.
The children were playing in the garden.
None of these houses has a decent garden.
The garden was totally devoid of flowers.
These gardens are famous for their exotic trees.
I planted those tulip bulbs I bought in Amsterdam in the garden this year.

Party in the first sentence obviously counts, since there are no intervening words between it and garden, but just how many intervening words between the node and a potential lexical partner are we prepared to accept? If one is the answer, then Ireland in the second sentence comes into the frame, if up to



five, then so do playing (sentence 3), houses (sentence 4), flowers (sentence 5) and trees (sentence 6). What about tulip and bulbs in the final sentence? Can we accept six or seven intervening words and still talk about 'keeping company'? Different researchers will set the limits differently in this connection, but it is clear that there is no straightforward solution to this problem, and that whatever decision is taken will be open to debate.
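The effect of choosing a particular limit can be made concrete with a small Python sketch. It is an illustration only, under stated assumptions: the function name candidate_collocates and its parameters are hypothetical, the node is matched against a hand-supplied set of word forms, and no attempt is made to filter out grammatical words such as the or of, which a real study would have to deal with.

```python
import re


def candidate_collocates(sentence, node_forms, max_intervening):
    """List the words lying within the chosen distance of the node.

    max_intervening is the largest number of words allowed between
    the node and a candidate partner (0 means strictly adjacent).
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    reach = max_intervening + 1  # distance in word positions
    found = []
    for i, word in enumerate(words):
        if word not in node_forms:
            continue
        lo = max(0, i - reach)
        hi = min(len(words), i + reach + 1)
        for j in range(lo, hi):
            # Skip the node itself; keep each candidate once, in text order.
            if j != i and words[j] not in found:
                found.append(words[j])
    return found


if __name__ == "__main__":
    node = {"garden", "gardens"}
    tulips = ("I planted those tulip bulbs I bought in Amsterdam "
              "in the garden this year.")
    for limit in (0, 5, 6, 7):
        print(limit, candidate_collocates(tulips, node, limit))
```

Run on the final example sentence, the sketch admits bulbs only once six intervening words are allowed and tulip only at seven, which is exactly the kind of boundary decision the paragraph above describes as open to debate.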


4.3 Fixed expressions and compounds

A particular grouping of words may recur so frequently in a language that it comes to be seen as a fixed expression. Some examples of fixed expressions in English are:

once in a blue moon
seeing is believing
the more the merrier
the other side of the coin
to throw in the towel

Obviously, some fixed expressions are more fixed than others. In some of the above instances, for example, almost no change to either the order of the words or the actual words used is possible without the general meaning or the acceptability of the expression being affected. Thus, in seeing is believing, it might just be possible to insert an adverb before is (e.g. seeing really is believing), but otherwise the expression has to be used as it is. Similarly with the more the merrier; here the only admissible change is the placing of an intensifying word (usually a taboo word or a euphemism for a taboo word) before merrier (e.g. the more the bleedin' merrier). In other cases changes in the syntax and in the actual components of the expression can be made without the force of the expression being undermined. Thus, the other side of the coin can be manipulated in various ways while still maintaining its essential identity:

Moving on to the cost of the project, here we see the negative side of the coin.
Of the French economy it has been remarked that this is a coin that has two very different sides.
As for the present political situation, well, which side of the coin shall I begin with?

Fixed expressions vary also in relation to the extent to which their overall meanings can be arrived at by simply adding together the meanings of the words out of which they are composed. For example, seeing is believing is



interpretable simply on the basis of a knowledge of the normal meanings of the individual words involved in this expression. However, in the case of to throw in the towel, it would not be possible to interpret this as 'to give up', 'to surrender' unless one actually knew that this meaning attached to the whole expression - or unless one knew enough about boxing (where a towel thrown into the ring has traditionally been a way of conceding defeat) to be able to decode the metaphor. Expressions such as these which are 'semantically opaque' in this kind of way are generally referred to as idioms.

Lexical items which very frequently co-occur with each other often fuse together into compound words (see above, Chapter 3). Examples of this are blackboard (black + board), keyhole (key + hole) and paintbrush (paint + brush). In such instances the relationship between the meaning of the compound word and the meanings of its individual constituent words is not always a simple one. Thus, for example, blackboard does not denote any old board which is black, but a very specific kind of black board, usually found in classrooms, on which it is possible to write (and make excruciating noises!) with chalk.

The rule of thumb commonly appealed to for distinguishing between compound words and fixed expressions is based on an orthographic criterion. If two words are joined together in written form we tend to label them as a compound word; if not, we tend to treat them as participating in a fixed expression. However, this is a highly arbitrary distinction. Within a particular language a given expression may be transcribed in various ways. For example:

air bag        air-bag        airbag
coffee shop    coffee-shop    coffeeshop
gold mine      gold-mine      goldmine
It is also worth saying that, as we saw in Chapter 1, some languages are written down using systems which do not mark word boundaries, and some languages are not written down at all; clearly, in these cases the orthographic approach to distinguishing between fixed expressions and compounds would be totally irrelevant.

A phonological approach to this conundrum does not get us very far either. As, again, we saw in Chapter 1, whereas, for example, in most English words we can identify one syllable carrying the main stress, in many multi-word expressions that on the orthographic criterion, and according to native speakers' own intuitions, would not be classed as compound words, only one main stress occurs over the whole group. Thus:

barber shop group
feel good factor
skin care ointment



We might also note that phonological usage in this regard varies within language communities. The expression New Year (as in Happy New Year!), for instance, is given just one main stress by some speakers of English (New Year), while other speakers of English place a stress on both words (New Year). Nor does there seem to be a simple way of distinguishing between compound words and fixed expressions in semantic terms. We have seen some examples of compounds whose meanings are not straightforwardly computable from the meanings of the words which compose them. However, as we have also noted, it is equally easy to find examples of collocations with similarly peculiar semantics: heavy smoker is not typically understood as 'overweight nicotine-user'; criminal lawyer is in most contexts taken to mean something other than 'law-breaking attorney'; and artificial florist will not usually be interpreted as 'flower-seller of unnatural origin'! On the other hand, fixed expressions as well as compounds often mean exactly what they look as if they might mean. Thus, heavy vehicle uncomplicatedly denotes a vehicle which is heavy; criminal behaviour denotes behaviour which is criminal; and artificial additive denotes an additive beyond Mother Nature's range. Similarly, coalminer denotes someone who mines coal, sunlight denotes the sun's light, and workplace denotes the place where one works.


4.4 Collocations and the dictionary

Whether or not it is possible to differentiate rigorously between compound words and collocations, and whether or not the meaning generated by the co-occurrence of two or more particular lexical items is a straightforward sum of the individual meanings of the items concerned, it is clear that the combinations into which a given word may enter and the meanings that attach to the various combinations in question are important elements in that word's profile. This is recognized at a practical level by dictionary-makers, as is demonstrated by the fact that (leaving aside the very tiniest pocket dictionaries) dictionary entries have traditionally not only treated the individual words concerned but have also referred to items with which they frequently co-occur. The following entry from the 1940 edition of the Harrap's Shorter French and English Dictionary is fairly representative.

fatigue [fatig], s.f. 1. (a) Fatigue, tiredness, weariness. Tomber de fatigue, to be dropping with weariness. Brisé de fatigue, dog-tired; dead-beat. (b) Souliers de fatigue, strong walking shoes. Habits de fatigue, working clothes. Cheval de fatigue, cart-horse. Mec. E: Pièces de fatigue, parts subject to strains. 2. Wear and tear (of machines, clothes etc.).

As was mentioned in Chapter 2, the suggestion was made many years ago by the British linguist J. R. Firth that investigating the lexicon was essentially a



matter of exhaustively investigating collocations, and, in fact, he specifically referred to lexicography (i.e. dictionary-making) in this context. The idea that dictionary-making needs to be founded on collocational research is a point of view which continues to have its champions today. Indeed, it is an idea which has been gaining ground over the last 10-15 years. Moreover, since Firth's time information technology has developed to the point where it is now possible - through the use of computerized corpora (see above, 2.2) - to undertake the kind of exhaustive investigation of collocations that Firth called for, and such corpora are indeed drawn on in the preparation of dictionaries, as well as being exploited in many other ways.


4.5 Corpora and collocations

The present view of many linguists is that the investigation of collocations is inextricably bound up with the exploitation of computerized corpora, for the simple reason that only through the use of such corpora - with their vast amounts of authentic data and their concordancing software - is it possible to come to any reliable conclusions about which words 'keep company' with which. Collocations were certainly studied before the advent of electronic corpora; the work of J. R. Firth in the 1950s has already been mentioned in this connection, and before him, in the 1930s, another British linguist, H. E. Palmer, was already deep into collocational research; however, there is no doubt that the creation of such corpora has enabled this area of research really to come into its own.

The potentialities of electronic corpora in this regard have been dramatically demonstrated by the COBUILD project. COBUILD (Collins Birmingham University International Language Database) involves a partnership between the Collins (now HarperCollins) publishing house and the School of English of the University of Birmingham. It has assembled a vast and still growing corpus of naturally occurring English data, now known as the Bank of English. Recent reports indicate that the corpus currently runs to more than 320 million words of spoken and written English text. There were, admittedly, corpora in existence before COBUILD, and other corpora were developed alongside and after COBUILD; however, the COBUILD project went further than its predecessors in showing how useful a corpus could be not only to researchers focused on language description but also in very practical domains such as the production of dictionaries and language-teaching materials, and, in so doing, it blazed a trail for the many corpus-based projects that followed and imitated it. It should perhaps also be noted in the present context that the director and leading light of the COBUILD project, John Sinclair, was deeply involved in collocational research long before the project was ever thought of, and that he always saw one of the principal attractions of the project as being its capacity to shed light on collocational issues.



Materials and language descriptions arising out of the COBUILD project base their definitions and illustrations on the combinatorial patterns discernible in the corpus. The following is a typical COBUILD dictionary entry. The meaning it assigns reflects an exhaustive analysis of the environments in which the word in question has been found to occur in the corpus - some of which are cited in the entry.

veritable /verɪtəbəl/ is used to emphasize a description of something and used to suggest that, although the description might seem exaggerated, it is really accurate. EG The water descended like a veritable Niagara ... I'm sure the audience has a veritable host of questions ... ... a veritable passion for the cinema.

We can see the same kind of approach in the Collins COBUILD English Grammar, as the following excerpt demonstrates.

Many nouns can be used after 'make'. . . . There is usually a related verb which can be used followed by a reported clause.

She made a remark about the weather.
Allen remarked that at times he thought he was in America.
Now and then she makes a comment on something.
Henry Cecil commented that the ground was too firm.

Here is a list of nouns which are used after 'make' and have a related reporting verb: arrangement










Other nouns used with 'make' express speech actions other than reports or describe change, results, effort, and so on.

I'll make some enquiries for you.
They agreed to make a few minor changes.
McEnroe was desperate to make one last big effort to win Wimbledon again.


Here is a list of other nouns which are used after 'make': appeal
















Theoretical/descriptive linguists drawing on the COBUILD Bank of English use it as a basis for making statements about how words are combined that go beyond syntactic generalizations. For example, faced with a sentence such as The bushes and trees were blowing in the wind, but the rain had stopped, a syntactician would wish to analyse it in terms of finite clauses, noun phrases and verb phrases; the collocationally oriented corpus linguist, on the other hand, would be inclined to look at the whole range of instances in the databank in which combinations like blow-wind, rain-stop occurred in order to be able to comment on the lexical frame 'SOMETHING blowing in the wind' (which, as it turns out, is a great deal more likely to occur than the lexical frame 'the wind blowing SOMETHING') or to be able to note that rain followed by stop is much more typical than rain followed by end.

Some further electronic English-language corpora which are frequently referred to in the lexicological literature, and which to a greater or lesser extent have been used in collocational research, are mentioned below. We shall be revisiting some of them, as well as the COBUILD corpus, in Chapter 10 when we return to the topic of dictionary-making.
• the Brown Corpus (Brown University Standard Corpus of Present-Day American English), started in 1961, comprising one million words of written American English;
• the LOB Corpus (London, Oslo, Bergen Corpus), compiled between 1970 and 1978, involving the collaboration of the University of Lancaster, the University of Oslo and the Norwegian Computing Centre for the Humanities at Bergen, comprising one million words of written British English;
• the London-Lund Corpus, available since 1987, based mostly on the University of Lund's Survey of Spoken English (1975), which in turn was mostly based on the (non-computerized) Survey of English Usage compiled at University College London (1959), comprising approximately half a million words of spoken English;
• the Longman-Lancaster Corpus, dating from 1996, comprising 30 million words of spoken and written English from British and American sources;
• the BNC (British National Corpus), compiled between 1991 and 1995, involving collaboration between Oxford University Press, Longman, Chambers Harrap, the University of Lancaster, the British Library and Oxford University Computing Service, comprising 90 million words of written British English and 10 million words of spoken British English;
• the CIC (Cambridge International Corpus - formerly known as the Cambridge Language Survey), available since 1996, an initiative of Cambridge University Press, comprising 95 million words of written English (the spoken language annexe of CIC, compiled in collaboration with the University of Nottingham and comprising five million words, is known as CANCODE - Cambridge and Nottingham Corpus of Discourse in English).
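The kind of comparison mentioned earlier in this section (rain followed by stop as against rain followed by end) amounts to counting, across a corpus, how often each candidate turns up shortly after the node. The Python sketch below shows the procedure only. Everything in it is an assumption made for illustration: the function name following_counts is hypothetical, the four sentences are invented and are not drawn from any of the corpora named above, and matching by word beginnings is a crude stand-in for proper lemmatization (it would, for example, wrongly count endless as a form of end).

```python
import re

# Invented toy data, for demonstrating the procedure only.
TOY_TEXTS = [
    "The rain had stopped by noon.",
    "When the rain stops we can go.",
    "The rain ended suddenly.",
    "Rain stopped play at Wimbledon.",
]


def following_counts(texts, node, candidates, max_intervening=3):
    """Count node occurrences followed by each candidate stem.

    A candidate counts if a word beginning with it appears after the
    node with at most max_intervening words in between.
    """
    counts = {stem: 0 for stem in candidates}
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        for i, word in enumerate(words):
            if word != node:
                continue
            # The words to the right of the node that fall inside the span.
            window = words[i + 1 : i + 2 + max_intervening]
            for stem in candidates:
                if any(w.startswith(stem) for w in window):
                    counts[stem] += 1
    return counts


if __name__ == "__main__":
    print(following_counts(TOY_TEXTS, "rain", ["stop", "end"]))
```

With a genuine corpus in place of the toy sentences, the same tally is what would allow a statement about which of two lexical frames is the more typical; the counts produced here say nothing about real usage.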


Creativity and prefabrication in language use

Linguists have put a good deal of emphasis in the last three or four decades on what Noam Chomsky calls the 'creative' dimension of language use - on the fact that knowledge of a language enables one to 'understand an indefinite number of expressions that are new to one's experience . . . and . . . to produce such expressions'. While it is undoubtedly true that we can and do use language innovatively and open-endedly in precisely the way Chomsky suggests, it certainly is not the case that our use of language is exclusively 'creative' in this sense. Large numbers of the sequences of words that we deploy and encounter in everyday speech and writing are clearly combinations that we have available to us as more or less prefabricated chunks - such combinations ranging from fixed idiomatic expressions like cats and dogs (= 'hard' as in It's raining cats and dogs) to 'semi-fixed' combinations such as to know one's onions/stuff and to know/be up to all the tricks. An analysis of authentic data in preparation for the Oxford Dictionary of current idiomatic English, for example, yielded literally thousands of such stable multi-word units. Similarly, it has been estimated that the Oxford Dictionary of phrasal verbs and the Oxford Dictionary of English idioms between them contain some 15,000 multi-word expressions. There is also psycholinguistic evidence to suggest that fixed expressions and formulas have an important economizing role in speech production; that is to say that they enable us to produce speech which is very much more fluent than it would be if we had to start from scratch and build up piece by piece every expression and every structure we use. This notion has been taken a stage further by Sinclair, on the basis of his experience with the COBUILD data, and developed into the so-called 'idiom principle'.
(The term idiom is used here with a much broader application than in 4.3, where mention was made of its more usual use as a label for fixed expressions with meanings that cannot be deduced from the meanings of their component parts.) The idiom principle states that, when we are putting together phrases in a language we know, although it may look as if we are operating on the basis of open choices at every stage (the only constraints being that what we produce has to be broadly grammatical and make sense), what we are doing most of the time is drawing on our knowledge of pre-constructed or semi-preconstructed phrases that constitute single choices, varying lexical content within the chosen patterns to a fairly limited extent. Why we do this, rather than going through the process of constructing new phrases out of individual words every time, may, says Sinclair, have to do with our capitalization on the fact that similar situations recur in life and tend to be referred to in similar ways; it may have to do with the fact that we in any case prefer to economize on effort whenever possible; and/or it may have to do with the fact that the demands made on us by the extreme rapidity of speech production are such that we have to exploit every opportunity to make savings on processing time. Some examples of the kind of thing Sinclair has in mind are:
• the phrase set eyes on, which usually has a pronoun subject and which is usually associated with either never or an expression such as the moment, the first time - as in I've never set eyes on him; The first time he set eyes on her he knew he would always love her etc.;
• the phrasal verb set about, which (in the sense of 'begin') tends to be associated with a following (usually transitive) verb in the -ing form - as in We set about packing our bags; Bill finally set about earning some real money etc.;
• the verb happen, which tends to occur in a particular kind of semantic environment - one where unpleasant occurrences, such as accidents, are being referred to - as in No one knew how the catastrophe had happened; Such appalling events can never be allowed to happen again etc.
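The second of these patterns, set about followed by a verb in the -ing form, lends itself to a very crude surface check, sketched below in Python. This is an illustration under stated assumptions and not Sinclair's own procedure: the regular expression and the function name are hypothetical, the first two test sentences are those quoted above, and the third is an invented contrast case. The pattern matches any word ending in -ing, so it would also accept nouns such as everything; genuine corpus work would need part-of-speech information.

```python
import re

# Hypothetical pattern: a form of "set about" followed directly by a word in -ing.
SET_ABOUT_ING = re.compile(r"\bset(?:s|ting)?\s+about\s+(\w+ing)\b", re.IGNORECASE)

def ing_complements(sentences):
    """Return the -ing words that directly follow 'set about' in the sentences."""
    found = []
    for sentence in sentences:
        found.extend(SET_ABOUT_ING.findall(sentence))
    return found

examples = [
    "We set about packing our bags.",
    "Bill finally set about earning some real money.",
    # Invented contrast case: 'set' and 'about' are not adjacent here.
    "She set the book about gardening on the shelf.",
]
matches = ing_complements(examples)
print(matches)  # ['packing', 'earning']
```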


Collocations, the lexicon and lexical units

What are the implications of collocational patterning for our conception of the lexicon and in particular for our understanding of what constitutes a lexical unit? If the lexicon represents that part of our knowledge of language that revolves around words, then, clearly, collocations have to be seen as included in the lexicon. It is obvious from all that has been said that we need to know about collocational patterns in order to function smoothly in lexical terms in either our mother tongue or any other language we may know. Anyone listening to news reports in English about recent military conflicts, for example, who did not know the terrible meanings that emerge when ethnic 'keeps company' with cleansing, collateral with damage or friendly with fire would be deeply mystified by what they heard. Similarly, and on a lighter note, anyone trying to express great excitement and pleasure in English who used a combination such as I'm on top of the moon (rather than I'm on top of the world or I'm over the moon) would certainly run the risk of incomprehension.

With regard to defining the lexical unit, one approach is to take the word as the typical lexical unit and to say that a group of words can be considered as a lexical unit only if its meaning is associated with the group as a whole rather than with a sum of the individual meanings of the constituent words. According to this view black is a lexical unit; so is blackbird (as opposed to black bird), since blackbird denotes a particular species of bird (Turdus merula) rather than just a bird of a particular colour; and so is in black and white (as in He wants it in black and white), since the meaning of this whole expression ('written', 'in writing') cannot be arrived at simply by combining the normal meanings of the individual items out of which it is formed.

There are at least two possible objections to this approach. On the one hand, the issue of semantic transparency or opacity in relation to multi-word expressions (i.e. whether or not the meaning of an expression can be seen as a straightforward composite of the meanings of its component words) is somewhat problematic. It is not the case that multi-word expressions are either self-evidently transparent or self-evidently opaque. There are degrees of opacity. Thus, blackbird is less opaque than ladybird (which in many varieties of English is the word used for the insect that in American English is called ladybug); and ladybird (given that ladybirds do at least fly like birds!) is less opaque than strictly for the birds (= 'trivial', 'uninteresting'). Even many apparently transparent examples like fish and chips turn out on closer inspection to have opaque aspects; thus, in order to qualify to be described as fish and chips a culinary product has to involve one of a particular range of types of fish (sardine, trout or tuna will not do) and has to have been cooked and presented in a particular way.

Another problem is that using a purely semantic criterion is a rather narrow way of looking at the matter.
It leaves out of account the question of whether in the use of a particular expression - whatever its degree of semantic opacity - the individual words are selected and are perceived to function singly or together. For example, the following expressions are all relatively transparent, but there is little doubt that they are selected and understood as wholes rather than being processed in a word-by-word manner.

midnight
good-natured
diesel engine
bread and butter
say it with flowers

As we have seen, it has been suggested that most of our use of language relies on the exploitation of collections of words that to a greater or lesser extent function together as entire packages. Whether or not this is true, it does seem clear that groups of words which are transparent in their meaning may nevertheless operate as units.

To sum up, even a conservative approach to the question of what counts as a lexical unit based on a criterion of semantic unitariness has to concede that there are lexical units which consist of more than one word. An approach which makes reference to the broader issue of the selection and perception of multi-word expressions as wholes (whatever their degree of semantic transparency/opacity) yields the conclusion that many multi-word expressions which are semantically transparent are none the less to be seen as lexical units.



This chapter looked at the commonly observed fact that certain words habitually 'keep company' with certain other words. It showed that a particular word may have a wider or more restricted collocational range, that is, enter into frequent partnership with a greater or lesser quantity and variety of other words; it explored the relationship between compound words and fixed expressions, concluding that there was no hard and fast way of distinguishing between these two categories of collocation; it touched on collocational description in traditional lexicography; it discussed the way in which collocational research has been enhanced by the advent of electronic corpora; it reported on evidence from corpus-based research that language users incorporate very large numbers of preconstructed and semi-preconstructed multi-word expressions into their speech, and noted a suggestion that most language use relies on sequences of words that are to a greater or lesser extent prefabricated; and, finally, it examined the implications of the results of collocational research for our understanding of the nature of lexical units.

Sources and suggestions for further reading

See 4.2. The treatment of the notion of collocational range, which originates in A. McIntosh, 'Patterns and ranges' in A. McIntosh and M. A. K. Halliday, Patterns of language: papers in general, descriptive and applied linguistics (London: Longman, 1966), owes much to Chapter 3 of R. Carter's book Vocabulary: applied linguistic perspectives (second edition, London: Routledge, 1998). Carter's chapter was in fact also a valuable source for much of the rest of the discussion of collocations.

See 4.3. The heavy smoker, criminal lawyer and artificial florist examples are borrowed from F. Palmer's Grammar (Harmondsworth: Penguin, 1971, 45, 54).

See 4.4. The illustrative dictionary entries in 4.4 are taken from the 1965 reprint of J. E. Mansion (ed.), Harrap's Shorter French and English Dictionary (London: George G. Harrap & Company, 1940, 259). A concise account of Firth's collocational approach to lexicographical issues - in his own words - is to be found on pages 26-7 of his article 'A synopsis of linguistic theory' in Studies in linguistic analysis (Special Volume of the Philological Society, Oxford: Blackwell, 1957).

See 4.5. On the question of the connection between electronic corpus-based studies and collocation research, a typical pronouncement is that of Moon: 'collocation studies are now inevitably associated with corpus studies, since it is difficult and arguably pointless to study such things except through using large amounts of real data' (R. Moon, 'Vocabulary connections: multi-word items in English', in N. Schmitt and M. McCarthy (eds), Vocabulary: description, acquisition and pedagogy, Cambridge: Cambridge University Press, 1997, 41). H. E. Palmer's work, and in particular his Second interim report on English collocations (Tokyo: Institute for Research in English Teaching, 1933), is cited by G. Kennedy in his Introduction to corpus linguistics (London: Longman, 1998, 108).

See 4.5. The two main sources for the description of the COBUILD project in 4.5 are: J. Sinclair (ed.), Looking up: an account of the COBUILD project in lexical computing and the development of the Collins COBUILD English Language Dictionary (London: Collins, 1987) and J. Sinclair, Corpus, concordance, collocation (Oxford: Oxford University Press, 1991). The figure of 320 million words is quoted by R. Carter (Vocabulary, London: Routledge, 1998, 167). The COBUILD dictionary entry is cited and discussed by R. Krishnamurthy in his article 'The process of compilation' (in J. Sinclair (ed.), Looking up, London: Collins, 1987). The extract from the Collins COBUILD English grammar (London: Collins, 1990, 150-1) is taken from the section entitled 'Verbs with little meaning: delexical verbs'. The brief discussion of the sentence The bushes and trees were blowing in the wind, but the rain had stopped is based on R. Moon's comments on p. 41 of her article 'Vocabulary connections: multi-word items in English' (in N. Schmitt and M. McCarthy (eds), Vocabulary: description, acquisition and pedagogy, Cambridge: Cambridge University Press, 1997).

See 4.6. The Chomsky quote is to be found on p. 100 of N. Chomsky, Language and mind (enlarged edition, New York: Harcourt Brace Jovanovich, 1972). The report on the research leading to the Oxford Dictionary of current idiomatic English (eds A. Cowie, R. Mackin and I. A. McCaig, two volumes, Oxford: Oxford University Press, 1975/1983) is that of A. Cowie in his article 'Stable and creative aspects of vocabulary use' (in R. Carter and M. McCarthy (eds), Vocabulary and language teaching, London: Longman, 1988). The Oxford Dictionary of phrasal verbs and the Oxford Dictionary of English idioms are both published in Oxford by Oxford University Press (1993); the quantitative figure put on their content (15,000 multi-word expressions) is cited by R. Moon ('Vocabulary connections: multi-word items in English', in N. Schmitt and M. McCarthy (eds), Vocabulary: description, acquisition and pedagogy, Cambridge: Cambridge University Press, 1997, 48). Psycholinguistic evidence regarding the facilitating role of prefabricated patterns in speech production is referred to by, among others, A. Peters in The units of language acquisition (Cambridge: Cambridge University Press, 1983). The discussion of John Sinclair's idiom principle is based on the section entitled 'The idiom principle' in his book Corpus, concordance, collocation (Oxford: Oxford University Press, 1991, 110-15); the examples used in this context are taken from pp. 111-12 of this book.

See 4.7. The importance of the contribution of collocational knowledge to linguistic competence, referred to in 4.7, is discussed by, among others, M. Benson ('Collocations and idioms', in R. Ilson (ed.), Dictionaries, lexicography and language learning, Oxford: Pergamon/The British Council), G. Kjellmer ('A mint of phrases', in K. Aijmer and B. Altenberg (eds), English corpus linguistics: studies in honour of Jan Svartvik, London: Longman, 1991), and T. Van Der Wouden (Negative contexts: collocation, polarity and multiple negation, London: Routledge, 1997). The semantically based approach to the definition of lexical units summarized in this section is essentially that proposed by D. A. Cruse in his book Lexical semantics (Cambridge: Cambridge University Press, 1986, Chapter 2). The blackbird, ladybird and fish and chips examples are all borrowed from this source.

Good introductions to the collocational aspect of the lexicon are to be found in Chapter 3 of R. Carter's Vocabulary: applied linguistic perspectives (second edition, London: Routledge, 1998), in Chapter 9 of E. Hatch and C. Brown's Vocabulary, semantics and language education (Cambridge: Cambridge University Press, 1995), in R. Moon's above-mentioned article in N. Schmitt and M. McCarthy's edited volume Vocabulary: description, acquisition and pedagogy (Cambridge: Cambridge University Press, 1997), and in the first section of T. Van Der Wouden's Negative contexts: collocation, polarity and multiple negation (Amsterdam: John Benjamin, 1997).

Numerous books on the use of corpora in collocational research (and linguistic research generally) are now available. Some of the relevant titles are: K. Aijmer and B. Altenberg (eds), English corpus linguistics: studies in honour of Jan Svartvik (London: Longman, 1991); G. Kennedy, Introduction to corpus linguistics (London: Longman, 1998); T. McEnery and A. Wilson, Corpus linguistics (Edinburgh: Edinburgh University Press, 1996); J. Sinclair (ed.), Looking up: an account of the COBUILD project in lexical computing and the development of the Collins COBUILD English Language Dictionary (London: Collins, 1987); J. Sinclair, Corpus, concordance, collocation (Oxford: Oxford University Press, 1991); J. Thomas and M. Short (eds), Using corpora for language research: studies in honour of Geoffrey Leech (London: Longman, 1996).

Focusing questions/topics for discussion

1. At the beginning of the chapter some cliches commonly used in journalism were mentioned (breaking news, law and order etc.). Try to think of five more such cliches in English and also try to think of one or two in any other language you know.

2. In 4.2 we looked at the notion of collocational range, comparing the very wide collocational range of nice with the very restricted collocational range of addled. Consider the following words and try to categorize them likewise according to their collocational range - that is to say, into items with very wide collocational ranges and items with much more restricted ranges. In each case give examples of collocations in which they occur.

big












3. In our discussion of fixed expressions and compound words in 4.3 we noted that some compounds and fixed expressions are semantically transparent (i.e. have meanings which are essentially combinations of the meanings of their component parts) and that others are semantically opaque (i.e. have meanings which are not simply sums of the meanings of their component parts). Consider the following compounds and fixed expressions and try to decide which are semantically transparent and which are semantically opaque. In the case of those which are semantically opaque demonstrate their opacity by providing definitions of their meanings.

air circulation system
eye strain
to pop off
to see eye to eye
to pop the question
blue skies
to look on the sunny side
blue language
to go fox-hunting
a weekend in the country
good grief!
to sing out of tune
country music
grievous bodily harm
he who pays the piper calls the tune

4. In 4.4 we saw some examples of the way in which information about collocational patterns has been incorporated into traditional dictionary entries. Imagine you are writing dictionary entries for the following words and decide what kind of collocational information and examples you would include in these cases.

day












5. It seems that some kinds of writing are full of well-worn expressions and phrases, while others are characterized by a relative absence of frequent collocations. Horoscopes tend to fall into the former category and poetry into the latter. Have a look at some horoscopes and some poems and try to decide why the writers of these texts took the approach they did in relation to the use of collocations.

5 Lexis and meaning

5.1 Words making the difference

It is quite obvious to any user of any language that there is an intimate connection between the lexicon and meaning. A glance at the following two brief passages - which are identical but for one word - will persuade anyone who needs convincing just how much difference to the meaning of an entire stretch of language a single word can make.

The interrogating officer moved closer to the prisoner. 'Let's see how you like this', he said. He then hit the prisoner with a truly vicious question.

The interrogating officer moved closer to the prisoner. 'Let's see how you like this', he said. He then hit the prisoner with a truly vicious truncheon.

Of course, the use of different sequences of words does not always yield vastly different overall meanings. Indeed, the English expression in other words normally introduces a phrase or a sentence which is differently formulated from but similar in meaning to what went before it, for example:

I worship the ground you stand on, dearest Patricia. I bless the day that you were born, and I rejoice in every breath you take. In other words, sweet Patty, I love you.

Usually, in such cases, as in the above example, some kind of summary of the preceding material is involved. There is also the fact that individual words may resemble each other semantically to the point where they are synonymous, i.e. can replace each other in some contexts without any noticeable change in meaning being involved, for example:

They stumbled into the sitting room and collapsed on to the couch.
They stumbled into the sitting room and collapsed on to the sofa.

The questions on this paper were too hard for us to answer.
The questions on this paper were too difficult for us to answer.

Josie and I are the best of pals now.
Josie and I are the best of friends now.

However, it is generally true to say that the meaning of what we say or write is carried to a very large extent by the words that we choose, and that changing words more often than not changes meanings, for example:

Sue lives up North, well in the Midlands really, not too far from Leicester.
It says here in the paper that he lived off 'immortal earnings'. I suppose they mean 'immoral'.
I used to jog around the park, but now I just walk!

In what follows we shall explore some of the ways in which linguists have tried to come to grips with the relationship between words and meaning. We shall start by looking at the notion that lexical meaning is essentially about expressions being applied to objects, places, people, attributes, states, actions, processes etc. in the 'real world'. We shall then consider that dimension of meaning which has to do with relations between words. Our next port of call will be the suggestion that the meaning of any given word can be analysed into a set of sense components. Finally, we shall examine some 'cognitive' approaches to word meaning - that is, approaches which are based on the idea that the ways in which linguistic meanings are constructed and organized come out of our experience of the world and our perception and processing of that experience.


Meaning seen as reference or denotation

It is self-evident that language conveys meaning partly by, as it were, pointing to various kinds of phenomena in the 'real world'. In fact, physically pointing to something can often perform the same function as naming it. For example, if I am in the queue for lunch in the university canteen, and, on reaching the servery, I am asked for my order, I may say 'The egg curry please', or I may point to the steaming concoction in question, or I may do both. When a linguistic expression 'points' in this way in a particular context to a specific entity, attribute, state, process etc., linguists talk about an act of reference, the phenomenon thus identified being labelled as the referent.

There is another way in which linguistic expressions can be applied to 'real world' phenomena. Instead of picking out a specific phenomenon in a particular context, an expression may identify a whole class of phenomena. For instance, in the following sentence the expression the wolf does not refer to one particular wolf but to an entire category of mammals.

The wolf is a much misunderstood animal.

Similarly with baked beans and Sunday night in the sentences below.

Even though they taste nice, baked beans are actually quite good for you.
Sunday night is as quiet as the grave around these parts.

Many linguists call this kind of meaning denotation, labelling the class of entities to which an expression is applied as its denotatum (plural: denotata). (However, it should be noted that the terms refer, reference and referent are often used in a broad sense to cover both reference as defined earlier and denotation.)

Traditionally, language has been seen as communicating meanings via concepts constructed out of our experience of the relevant denotata. On this view, each linguistic form is associated with a concept, and each concept is the mental representation of a phenomenon in the 'real world'. This notion is sometimes represented diagrammatically as shown in Figure 5.1.

Figure 5.1 Linguistic forms associated with 'real world' phenomena

One difficulty with this kind of representation is that, in implying that each particular form is uniquely associated with a single particular concept, it fails to provide any account of cases where more than one expression is associated with a single meaning or of cases where a single expression is associated with more than one meaning (see below). There is also the problem that this whole approach leads to an 'atomistic' view of semantics which treats each form and its meaning as isolated and self-contained.

There are other reasons too for taking a wary approach to the notion that meaning is only about expressions being applied to 'real world' phenomena, whether referentially or denotationally. For one thing, there are words whose meaning simply cannot be accounted for in this way - words like if, and, should, nevertheless. All of these items have meaning, but certainly not by virtue of identifying observable phenomena or classes of phenomena in the 'real world'. There are also expressions that relate to phenomena which do not exist - mermaid, tooth-fairy, unicorn etc. Can we say that such expressions have no meaning just because they have no corresponding denotata in the 'real world'? Certainly not.

Also worth noting is that two (or more) expressions may be applied to exactly the same phenomenon and yet have different meanings. The classic example of this is the designation of the planet Venus as both the Morning Star and the Evening Star (because - owing to its brightness - Venus is still visible at dawn and already visible at dusk). The expressions Morning Star and Evening Star clearly have different meanings, and yet they are applied to precisely the same object. Some further illustrations of expressions with different meanings being applied to the same phenomenon follow.

the Lionheart        King Richard I of England
to tell lies         to be economical with the truth


Structuralist perspectives on meaning

Much of the discussion in previous chapters has been concerned with structure of various kinds - sentence-structure, the internal structure of words, sound-structure etc. This is very much the hallmark of the whole approach to language taken by modern linguistics, which is usually held to date from the work of the Swiss linguist Ferdinand de Saussure, the generally recognized 'founding father' of what became known as structuralism. According to the structuralist conception, in the words of the British linguist John Lyons, 'every language is cut to a unique pattern', and the units of a given language 'can be identified only in terms of their relationships with other units in the same language'. What this view implies in respect of lexical meaning is that it has to be seen in the light of relations between expressions in the same language system.

This is not to say that structuralism denies the relationship between linguistic forms and phenomena in the 'real world'. It does, however, insist that this relationship is only part of the story. Saussure draws an analogy in this connection with monetary systems. Just as the value of a given coin (e.g. five francs) is based, he says, both on the kinds of goods it will buy and on its relationship with other coins in the same system (e.g. one franc), so, says Saussure, the 'value' of a linguistic unit derives both from the concepts for which it may be 'exchanged' and from its set of relationships with other words in the language.

The first manifestation of structuralist semantics was lexical field theory. This is an approach based on the idea that it is possible to identify within the vocabulary of a language particular sets of expressions (lexical fields) covering particular areas of meaning (semantic fields) where the lexical organization is such that the relevant lexical units precisely mark out each other's territory, so to speak. One of the early exponents of lexical field theory, Jost Trier, wrote in terms of 'a net of words' cast over meaning 'in order to capture and organize it and have it in demarcated concepts'. A much-cited example of a semantic field is that of colour. Colour is an undifferentiated continuum in nature; it is organized into red, orange, yellow, green etc. by the words which are used to identify particular areas of the spectrum. Moreover, different languages divide up colour differently. For example, Russian recognizes two colours in the blue range where English recognizes only one; the Russian words goluboj and sinij - which are customarily translated as 'light blue' and 'dark blue' respectively - are in fact understood as identifying quite distinct colours, not different varieties of the same colour.

The fact that lexical field theory talked so much about concepts was, however, off-putting for some structuralist linguists, especially North American structuralists who took their inspiration from the work of the American linguist Leonard Bloomfield. Bloomfield was determined to see linguistics recognized as a fully-fledged science and so he and his followers were interested only in those areas of language which were amenable to rigorous objective analysis. Meaning, defined in terms of unobservable concepts, did not come into that category as far as the Bloomfieldians were concerned, and they saw the scientifically accurate definition of meaning in terms of the 'real world' phenomena to which words were applied as being possible for only a minority of expressions.
Bloomfield claimed, for example, that defining the names of minerals was relatively straightforward thanks to the resources made available by chemistry and mineralogy; the problems arose in the cases of words like love or hate, 'which concern situations that have not been accurately classified', these latter being 'in the great majority'. This is a very limited and naive view of meaning.

On the other hand, the dependence of lexical field theory on the notion of conceptually defined semantic fields is undoubtedly a weakness. There are certainly some areas of meaning - like colour, the human body etc. - which have a clearly identifiable objective reality which can be detached from other areas of meaning, but what about a semantic field such as the 'intellectual domain of meaning', on which Jost Trier did his pioneering work? For Trier the 'intellectual domain' covers a whole range of types of knowledge - scholarly, social, mystical, technical, aesthetic - but his definition of the domain is essentially arbitrary; for some researchers the 'intellectual domain of meaning' might be much more narrowly defined, and for others it might be more broadly defined.

It was the above-mentioned British linguist, John Lyons, who found a widely acceptable way forward for a structuralist approach to lexical meaning. He acknowledges that aspect of meaning which derives from some expressions' relationship with the world beyond language - their application in terms of reference and denotation as defined earlier. However, in common with Saussure and the lexical field theorists, Lyons also recognizes that the meaning of an individual expression crucially depends on the network of relations with other expressions into which it enters. This latter aspect of meaning Lyons labels sense, and his approach to the analysis of sense is such that it does not require the prior identification of a conceptual area or semantic field; the lexical field or subsystem in this perspective is defined in terms of the observable relations between lexical expressions within particular contexts.

This last point about context needs emphasizing because of the fact that a given expression may have more than one meaning. For example, the word mouth may refer to a facial feature in some contexts (e.g. He has rather a small mouth) and to a geographical feature in others (e.g. The mouth of this river is difficult to navigate). Where the meanings attached to a given form are clearly connected in this kind of way, linguists are happy to regard them as meanings of the same word and to talk about multiple meaning or polysemy. There are, however, other cases where a particular form is associated with more than one meaning and the meanings in question are totally unrelated (e.g. bank denoting a financial institution and bank denoting the edge of a river, canal etc.). In this sort of instance, linguists consider that two distinct words are involved which simply happen to coincide formally, the term used to signify this situation being homonymy.
Homonyms may be completely identical - as in bank-bank; they may be identical only at the phonological level - as in meet-meat (in which case they are called homophones); or they may be identical only at the orthographic level - as in row /rəʊ/ = 'propel a boat using oars' and row /raʊ/ = 'quarrel' (in which case they are called homographs). Unfortunately, it is not always crystal-clear in specific instances whether polysemy or homonymy is involved.

With regard to the kinds of relations Lyons has in mind, he distinguishes between those which are paradigmatic (or substitutional) in nature and those which are syntagmatic (or combinatorial). Paradigmatic relations are defined as those which hold 'between intersubstitutable members of the same grammatical category', and syntagmatic relations are defined as those which hold 'typically, though not necessarily, between expressions of different grammatical categories (e.g. between nouns and adjectives, between verbs and adverbs etc.), which can be put together in grammatically well-formed combinations (or constructions)'. Syntagmatic sense-relations are clearly one aspect of the colligational and collocational dimensions of the lexicon, which have already been discussed (in chapters 2 and 3, respectively). For example, the fact that the adjective rancid combines with only a limited range of nouns (butter, lard, oil etc.) can be seen as a set of semantic relationships, since the meanings of the nouns in question are clearly the determining factor. As far as paradigmatic relations are concerned, Lyons focuses on synonymy, hyponymy and incompatibility, which he defines and

Lexis and meaning


demonstrates in terms of logical relations between sentences, or meaning postulates. The two important logical notions that Lyons uses in his approach are those of entailment and negation. One sentence entails another sentence in a given context if the one necessarily implies the other (e.g. I am a man entails I am a human being where the two Is refer to the same individual). One sentence negates another in a given context if the one necessarily denies the truth of the other (e.g. I am a man negates I am a centipede where the two Is refer to the same individual).

Synonymy

The relation of synonymy has already been briefly mentioned in 5.1. It is defined by Lyons in terms of minimally different sentences entailing each other. Where two or more sentences entail each other and differ by only one expression, the distinguishing expressions are taken to be synonymous. For example, the following sentences all entail each other.

Ethelred the Unready died in 1016.
Ethelred the Unready expired in 1016.
Ethelred the Unready passed away in 1016.
Ethelred the Unready popped off in 1016.
Ethelred the Unready kicked the bucket in 1016.
Ethelred the Unready snuffed it in 1016.

They differ only in the expression that stands between Ethelred the Unready and in 1016 (died, expired, passed away, popped off, kicked the bucket, snuffed it), and so, according to the terms of the above definition, all of these expressions are synonymous. The above examples illustrate two further points which are relevant to the rest of the discussion of lexical relations. The first is that such relations can hold between individual words (e.g. die, expire), between individual words and multi-word expressions (e.g. die, snuff it) and between multi-word expressions (e.g. pass away, kick the bucket). The second point is that it is not a condition for the establishment of a particular semantic relation that it should hold in all contexts. For example, there are instances where the expression kick the bucket is interpreted literally, as in: The window-cleaner tripped and kicked the bucket which was standing at the bottom of his ladder, spilling water all over the pavement. Obviously, this last sentence does not entail The window-cleaner tripped and expired which was standing at the bottom of his ladder, spilling water all over the pavement; accordingly, in this context kick the bucket is not synonymous with expire, die, pass away etc. Issues of contextual appropriacy also arise: the contexts in which we might use snuff it in the above sense would tend not to be the same as those in which we would use expire.
For these reasons, statements about semantic relations between lexical expressions always have to take context into consideration.



Two further examples of sets of synonyms are set out and illustrated below.

Aid, assistance, help
The crisis cannot be solved without the aid of the international community.
The crisis cannot be solved without the assistance of the international community.
The crisis cannot be solved without the help of the international community.

Fast, quickly, speedily, swiftly
He was travelling so fast that everything around him became a blur.
He was travelling so quickly that everything around him became a blur.
He was travelling so speedily that everything around him became a blur.
He was travelling so swiftly that everything around him became a blur.

Hyponymy

Hyponymy, the relation between more specific (hyponymous) terms (e.g. spaniel) and less specific (superordinate) terms (e.g. dog), is defined in terms of one-way rather than two-way entailment. Thus I own a spaniel entails I own a dog, but I own a dog does not entail I own a spaniel. Hyponymous relations can be represented as inverted tree diagrams in which the lower intersections or nodes represent terms which are hyponymous to the ones above them, and these latter in turn are hyponymous to the ones above them. Thus in the (incomplete) Figure 5.2 below cocker spaniel is hyponymous to spaniel, which is in turn hyponymous to dog, which is in turn hyponymous to mammal, which is in turn hyponymous to animal. Another characteristic of hyponymy is that it is what semanticists call transitive, in the sense that the relation can be seen as 'in transit' all the way along the line, so that if X is hyponymous to Y and Y is hyponymous to Z then X is hyponymous to Z. Thus, cocker spaniel is hyponymous not only to spaniel but also to dog, mammal and animal. Further examples of expressions in hyponymous-superordinate relationships are given below.

Claret, wine, drink
You'll find some claret on the table.
You'll find some wine on the table.
You'll find some drink on the table.



Figure 5.2 Hyponymous relations.

Claret is hyponymous to wine; wine is hyponymous to drink; and claret is also hyponymous to drink.

Hatchback, car, vehicle
The firm bought him a new hatchback.
The firm bought him a new car.
The firm bought him a new vehicle.

Hatchback is hyponymous to car; car is hyponymous to vehicle; and hatchback is also hyponymous to vehicle.
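The transitivity of hyponymy described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the dictionary SUPERORDINATE and the two function names are hypothetical, and the data are limited to the fragments of hierarchy used as examples in this section, each term being given a single immediate superordinate.

```python
# Hypothetical data: the immediate superordinate of each term, taken from
# the examples in this section (each term has one superordinate here).
SUPERORDINATE = {
    "cocker spaniel": "spaniel",
    "spaniel": "dog",
    "dog": "mammal",
    "mammal": "animal",
    "claret": "wine",
    "wine": "drink",
    "hatchback": "car",
    "car": "vehicle",
}


def superordinates(term):
    """Return every term above `term` in the tree, nearest first."""
    chain = []
    while term in SUPERORDINATE:
        term = SUPERORDINATE[term]
        chain.append(term)
    return chain


def is_hyponym_of(x, y):
    """True if x is hyponymous to y, directly or by transitivity."""
    return y in superordinates(x)


print(superordinates("cocker spaniel"))
print(is_hyponym_of("claret", "drink"))   # holds by transitivity
print(is_hyponym_of("dog", "spaniel"))    # the relation is one-way
```

Walking up the tree one step at a time is what makes the relation transitive in this sketch: anything reachable by repeated upward steps counts as a superordinate, whereas nothing is ever reached by moving downward.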

Incompatibility

With regard to incompatibility, this can be defined in general terms, and also more specifically for particular types of incompatibility, namely, complementarity, polar antonymy and converseness. Incompatibility in general is simply defined in terms of negative entailment: Johnny's shirt is pink entails Johnny's shirt is not green; Johnny's shirt is green entails Johnny's shirt is not pink; and so pink and green can be taken to be incompatible. Similarly with:

Metal, wood
The chair is entirely made of metal.
The chair is entirely made of wood.



Plain, striped
The tie I was wearing was plain.
The tie I was wearing was striped.

Complementarity

Turning now to particular subcategories of incompatibility, let us begin with the relation of complementarity (also known as simple antonymy or binary antonymy), which is a sort of 'one or the other' relation. In the case of complementarity not only does the assertion of one lexical item in a complementary pair (such as alive and dead) imply the denial of the other but the denial of the one implies the assertion of the other. Thus Nessie is alive entails Nessie is not dead, and Nessie is not dead entails Nessie is alive. Some further examples follow.

Pass, fail
Janet passed the exam.

Janet failed the exam.

Janet did not pass the exam.

Janet did not fail the exam.

Janet passed the exam entails Janet did not fail the exam; Janet failed the exam entails Janet did not pass the exam; Janet did not pass the exam entails Janet failed the exam; Janet did not fail the exam entails Janet passed the exam.

True, false
What he says is true.

What he says is false.

What he says is not true.

What he says is not false.

What he says is true entails What he says is not false; What he says is false entails What he says is not true; What he says is not true entails What he says is false; What he says is not false entails What he says is true.

Polar antonymy

Polar antonymy (also known as gradable antonymy) differs from complementarity by virtue of the fact that the items in question are not in a 'one or the other' relationship but imply the possibility of gradations between them. The assertion of one of a pair of polar antonyms (e.g. rich and poor) implies the denial of the other, but the denial of the one does not necessarily imply the assertion of the other. Liz is rich entails Liz is not poor, and Liz is poor entails Liz is not rich. However, Liz is not poor does not entail Liz is rich, and Liz is



not rich does not entail Liz is poor, since it is fairly easy to think of expressions identifying states somewhere between being rich and being poor (e.g. comfortably off); rich and poor are therefore said to be polar antonyms with respect to each other. Where polar antonyms are used there is always some kind of implicit or explicit standard or norm involved against which judgments are made and in the light of which qualities are attributed. For instance, the same person, let us say a teacher by the name of Rothschild, may be described as rich when compared with other members of his/her profession but poor when compared with other members of the Rothschild family. Whenever we use the terms rich, poor, comfortably off etc. we always have some kind of yardstick in mind on the basis of which we make the evaluations signalled by the words used. Similarly with the following examples.

Big, small
Tom's house is big.

Tom's house is small.

Tom's house is not big.

Tom's house is not small.

Tom's house is big entails Tom's house is not small, but Tom's house is not small does not entail Tom's house is big. Tom's house is small entails Tom's house is not big, but Tom's house is not big does not entail Tom's house is small. Intermediate terms between big and small exist, e.g. middle-sized.

Hot, cold
The water is hot.

The water is cold.

The water is not hot.

The water is not cold.

The water is hot entails The water is not cold, but The water is not cold does not entail The water is hot. The water is cold entails The water is not hot, but The water is not hot does not entail The water is cold. Intermediate terms between hot and cold exist, e.g. tepid.
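The difference between complementarity and polar antonymy can be made concrete with a small Python sketch that treats entailment as truth-preservation across a set of possible situations. This is an illustrative assumption rather than Lyons's own formalism: the function names are hypothetical, the exam example assumes that every outcome is either a pass or a fail, and the temperature cut-off points are arbitrary stand-ins for the contextual norm discussed above.

```python
def entails(p, q, situations):
    """p entails q if q is true in every situation in which p is true."""
    return all(q(s) for s in situations if p(s))


def negation(p):
    """Return the predicate that is true exactly where p is false."""
    return lambda s: not p(s)


# Complementaries: every candidate who sat the exam either passed or failed.
exam_outcomes = ["pass", "fail"]


def passed(outcome):
    return outcome == "pass"


def failed(outcome):
    return outcome == "fail"


# Polar antonyms: water temperature in degrees Celsius.
# The cut-off points are arbitrary stand-ins for the contextual norm.
temperatures = range(0, 101)


def hot(t):
    return t >= 60


def cold(t):
    return t <= 15


# Complementarity: assertion implies denial, and denial implies assertion.
print(entails(passed, negation(failed), exam_outcomes))
print(entails(negation(failed), passed, exam_outcomes))

# Polar antonymy: assertion implies denial, but not the other way round,
# because intermediate (tepid) temperatures are neither hot nor cold.
print(entails(hot, negation(cold), temperatures))
print(entails(negation(cold), hot, temperatures))
```

Changing the cut-off points changes which temperatures count as hot or cold but not the pattern of entailments, which reflects the point that polar antonyms are always judged against some norm.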

Converseness

Finally under the heading of incompatibility, we come to converseness (otherwise known as relational oppositeness). This is the relation that holds between expressions in sentences (differing only in respect of the converse expressions in question) which imply the denial of each other but which, after particular kinds of syntactic permutation have been effected, actually entail each other: Fred lent the flat to Michael entails the denial of Fred borrowed the flat from Michael (and vice versa), but Fred lent the flat to Michael entails and is entailed by Michael borrowed the flat from Fred, and so lend and borrow are taken to be converses of each other. Converseness is further exemplified below.


Buy, sell
Rick bought the car from Sarah.

Rick sold the car to Sarah.

Sarah bought the car from Rick.

Sarah sold the car to Rick.

Rick bought the car from Sarah entails the denial of Rick sold the car to Sarah (and vice versa). Sarah bought the car from Rick entails the denial of Sarah sold the car to Rick (and vice versa). Rick bought the car from Sarah entails Sarah sold the car to Rick (and vice versa). Rick sold the car to Sarah entails Sarah bought the car from Rick (and vice versa).

Husband, wife
Hilary is Vivian's husband.

Hilary is Vivian's wife.

Vivian is Hilary's husband.

Vivian is Hilary's wife.

Hilary is Vivian's husband entails the denial of Hilary is Vivian's wife (and vice versa). Vivian is Hilary's husband entails the denial of Vivian is Hilary's wife (and vice versa). Hilary is Vivian's husband entails Vivian is Hilary's wife (and vice versa). Hilary is Vivian's wife entails Vivian is Hilary's husband (and vice versa).

Meronymy

A lexical relation not focused on particularly by Lyons but discussed at length by other lexical semanticists is that of meronymy. This relation covers part-whole connections. X is a meronym of Y if it can form the subject of the sentence An X is a part of a Y. Y in such a case is labelled a holonym of X. For example, finger is a meronym of hand, and hand is a holonym of finger on the basis of the way in which the two words feature in the sentence: A finger is a part of a hand. As in the case of hyponymy, it is possible to represent meronym-holonym relations in inverted tree diagrams, where meronymy is represented as the relationship between a lower node and a higher node. Thus, in Figure 5.3, finger is a meronym of hand, which in turn is a meronym of arm, which in turn is a meronym of body. However, meronymy is not consistently transitive in the way that hyponymy is. For example, despite the fact that finger is a meronym of hand and hand is a meronym of arm, we might have some hesitation about the sentence A finger is a part of an arm.



Figure 5.3 Meronymy.

Two further examples of meronym-holonym pairs follow.

Petal, flower
A petal is a part of a flower.
Petal is a meronym of flower. Flower is a holonym of petal.

Roof, house
A roof is a part of a house.
Roof is a meronym of house. House is a holonym of roof.


Componential analysis

Some linguists have tried to take structuralist semantics a stage further by trying to analyse lexical meaning into components, otherwise labelled semantic markers or semantic features, which might underlie sense-relations. For example, in a componential analysis the relations between human being, man, woman, boy, girl and lad might be accounted for in terms of plus or minus values attaching to the components HUMAN, MALE and ADULT. Thus:

               HUMAN   MALE   ADULT
human being      +
man              +      +      +
woman            +      -      +
boy              +      +      -
girl             +      -      -
lad              +      +      -

In this perspective the synonymy between boy and lad would, for example, be seen as explicable in terms of the fact that their features and their feature-values totally match (+ HUMAN, + MALE, - ADULT); the hyponymy between man and human being would be seen as explicable in terms of the fact that man shares a feature and feature-value (+ HUMAN) with human being and, despite being endowed with other features besides, exhibits no feature-values which are at odds with the componential profile of human being; and the incompatibility between man and woman would be seen as explicable in terms of the fact that the two words differ in terms of the respective values attached to the feature MALE. This approach to lexical meaning obviously has strong similarities to the traditional dictionary definition. For example, a typical dictionary definition of girl would be 'female child' (i.e. - MALE, - ADULT, in the above terms). Componential analysis has long been used in anthropological linguistics in, for example, studies of kinship terms, and it has also been associated with broadly Chomskyan perspectives, but it has also been favoured by semanticists without any specific research task preoccupations or theoretical predispositions.

Despite its apparently wide appeal, componential analysis has been subject to a fair amount of criticism. Perhaps most controversial has been the claim made by some linguists that the semantic components on which componential analysis is based are universal - in other words that they underlie the expression of meaning in all languages and cultures. This claim is undermined by the fact that even concepts which in common-sense terms look as if they might be independent of particular cultures turn out on closer inspection not to be.
For example, the feature MALE, which, in view of its association with a clear biological category, looks as if it might well be a candidate for universality, appears distinctly less universal when one considers the fact that - at least as far as human maleness is concerned - the concept of maleness is also a product of socio-cultural traditions and perceptions which diverge widely from society to society. For example, males are involved to vastly differing degrees in nurturing and rearing children from culture to culture; the extent to which and manner in which they 'beautify' themselves is also highly culture-dependent, as is their role in courtship and in the sexual arena generally. Componential analysts insist that the labels are language-neutral and indeed that they could be replaced by arbitrary symbols (+*>,- •!•, + V etc.). However, in practice, real words from natural languages are used (human, male, adult etc.) which inevitably carry the particular cultural baggage of the language communities with which they are associated. Moreover, because in the binary system of values (+ or -) often adopted by componentialists just one term is chosen to carry either value, componential analysis constantly runs the risk of seeming to be sexist, ageist and indeed many other 'ists'. How many women, for instance, are content to see the meaning of the word woman being characterized as including the feature - MALE?



A further frequent charge levelled at componential analysis is that it treats meaning in too 'cut and dried' a manner, and that it cannot therefore deal with contextual and metaphorical effects. For example, we know that there are circumstances where the words boy and lad are frequently used for adult males, in other words as synonyms of man. Thus, in the context of social interaction in the dressing-room in the aftermath of a rugby match between two teams of males of mature years the following sentences are entirely equivalent:

Are we going for a pint, men?
Are we going for a pint, boys?
Are we going for a pint, lads?

How does an analysis of boy and lad as [+ HUMAN, + MALE, - ADULT] sit with this? Similarly, the word girlfriend is applied by males and females alike to female companions of any age from nine to ninety, which casts more than a modicum of doubt on the analysis of girl as simply [+ HUMAN, - MALE, - ADULT]. These and other points have not gone without response from those who advocate a componentialist approach, although at least some componentialists are prepared to admit that componential analysis is not the whole story. On the other hand, non-componentialists like Lyons are perfectly happy to recognize that, because it is based on structural notions of sense, componential analysis is, 'at least in principle, fully compatible with [other approaches to structural semantics]'.
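The componential account of sense-relations described in this section can be sketched in a few lines of Python. The sketch is illustrative only: the function names are hypothetical, and the + ADULT values given to man and woman are an assumption inferred from the discussion rather than values stated explicitly in the text. Synonymy is modelled as identical feature bundles, hyponymy as one bundle containing all the feature-values of another plus more, and incompatibility as a clash on at least one shared feature.

```python
# Feature bundles: True stands for '+', False for '-'; an absent key means
# the feature is unspecified. The ADULT values for man and woman are assumed.
FEATURES = {
    "human being": {"HUMAN": True},
    "man": {"HUMAN": True, "MALE": True, "ADULT": True},
    "woman": {"HUMAN": True, "MALE": False, "ADULT": True},
    "boy": {"HUMAN": True, "MALE": True, "ADULT": False},
    "lad": {"HUMAN": True, "MALE": True, "ADULT": False},
    "girl": {"HUMAN": True, "MALE": False, "ADULT": False},
}


def synonymous(a, b):
    """Features and feature-values match completely."""
    return FEATURES[a] == FEATURES[b]


def hyponymous(a, b):
    """a has every feature-value of b, plus at least one more feature."""
    fa, fb = FEATURES[a], FEATURES[b]
    return fa != fb and all(fa.get(k) == v for k, v in fb.items())


def incompatible(a, b):
    """a and b carry opposite values for at least one shared feature."""
    fa, fb = FEATURES[a], FEATURES[b]
    return any(k in fb and fb[k] != v for k, v in fa.items())


print(synonymous("boy", "lad"))
print(hyponymous("man", "human being"))
print(incompatible("man", "woman"))
```

A fixed table of this kind also makes the 'cut and dried' objection easy to see: nothing in it allows boy or lad to be applied to adult males in the dressing-room context mentioned above.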


Cognitive approaches to meaning

One version of the componential approach which appears to meet some of the above criticisms is that which starts from the notion of prototypical sense (otherwise labelled stereotypical, focal or nuclear). The notion of prototype arises from the work of psycholinguists and cognitive linguists - in other words from research which is interested in how language relates to the mind. According to advocates of prototype theory, on the basis of our experience of the world we construct in our minds 'ideal exemplars' of particular categories of 'real world' phenomena with ideal sets of characteristics. These 'ideal exemplars' are the prototypes postulated by prototype theorists, who suggest that when we come across further candidates for inclusion in the same category, we judge them against the prototypes we have established. However, the matching process is envisaged as flexible. There does not have to be a complete match. Thus, for example, our prototype of a bird would undoubtedly include features such as 'HAS WINGS', 'FLIES', but this would not prevent us from recognizing a penguin as a bird, even though penguins have flippers rather than wings and swim rather than fly.



Similarly, our prototype of chair would probably include the feature 'HAS FOUR LEGS', but that would not lead us to reject as chairs the kinds of seats that have appeared in offices and around tables in modern times - items with single tubular steel stems attached to wide, heavy bases. On the other hand, in some instances it is unclear where a particular item fits in terms of prototypical categories. For example, there are drinking vessels on the market these days which are large and have no saucers - and to that extent resemble the mug prototype, but which on the other hand have an elegantly curved cup-like shape - and to that extent resemble the cup prototype. In other words, prototypes have 'fuzzy' boundaries. The prototypical view of lexical meaning obviously takes us away from what Lyons calls a 'checklist theory of definition' which allows for absolutely no indeterminacy of meaning. Clearly, prototype theory can cope far better than classic 'checklist' componential analysis with the fact that - in particular contexts - terms like boy and girl may be applied to adults and that terms like beast, rat, shark, snake may be applied to human beings. On the other hand, prototype theory is not without its drawbacks either. It appears to relate more to a traditional denotational view of meaning than to recent structuralist perspectives. In consequence, the prototypical approach may not be able to cope equally well with all types of words; words which do not identify concrete 'real world' phenomena with observable characteristics - alas, albeit, become etc. - would seem to pose some problems in this connection. In any case, prototype theory has very little to say about sense, that important dimension of meaning - explored in 5.3 and also (in its collocational aspects) in Chapter 4 - which derives from relations holding between lexical expressions.
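The flexible matching envisaged by prototype theory, as opposed to an all-or-nothing checklist, can be sketched as a similarity score measured against a threshold. Everything in the sketch below is an illustrative assumption: only 'has wings' and 'flies' come from the discussion above, while the other features, the equal weights, the threshold of 0.5 and the function names are hypothetical choices made for the example.

```python
# A hypothetical bird prototype: each feature carries a weight reflecting
# how central it is taken to be (all equal here, for simplicity).
BIRD_PROTOTYPE = {
    "has_wings": 1.0,
    "flies": 1.0,
    "has_feathers": 1.0,
    "lays_eggs": 1.0,
    "has_beak": 1.0,
}


def similarity(candidate, prototype):
    """Weighted share of the prototype's features that the candidate shows."""
    total = sum(prototype.values())
    matched = sum(w for feature, w in prototype.items() if feature in candidate)
    return matched / total


def is_member(candidate, prototype, threshold=0.5):
    """A candidate is accepted if it is similar enough; no full match needed."""
    return similarity(candidate, prototype) >= threshold


robin = {"has_wings", "flies", "has_feathers", "lays_eggs", "has_beak"}
penguin = {"has_feathers", "lays_eggs", "has_beak", "swims"}
bat = {"has_wings", "flies", "has_fur"}

for name, features in [("robin", robin), ("penguin", penguin), ("bat", bat)]:
    print(name, similarity(features, BIRD_PROTOTYPE),
          is_member(features, BIRD_PROTOTYPE))
```

A checklist definition would correspond to a threshold of 1.0, under which the penguin would be rejected. Candidates scoring close to the threshold correspond to the 'fuzzy' boundary cases such as the cup-like mug.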
Another approach to meaning which can be characterized as cognitive in nature is that proposed by linguists working within the 'conceptual semantics' framework. Conceptual semantics, whose best-known proponent is Ray Jackendoff, essentially says that semantic structure exactly coincides with conceptual structure and that, therefore, any semantic analysis is also an analysis of mental representations. Jackendoff claims that we human beings come into the world equipped with (a) some very basic concepts ('primitives' - such as spatial concepts, concepts of time, even some social concepts like possession and dominance) which are applicable to the interpretation and categorization of a whole variety of experiences, and (b) some principles of concept-combination. Lexical meanings, on this view, are constructed on the basis of interaction among: our inborn conceptual primitives, our inborn concept-combining principles, our experience of the world and our experience of language. There are, therefore, according to conceptual semantics, limits on the kinds of lexical meanings that we can generate - limits having to do with 'conceptual well-formedness' in terms of what kinds of combinations of concepts our innate primitives and principles will permit. A further aspect of the conceptual semantics perspective is that, because the process of meaning



formation is combinatorial, lexical meanings so formed can necessarily be analysed into the concepts out of which they were composed. However, the kind of conceptual structure envisaged by Jackendoff goes far beyond the listing of features with plus or minus signs attached; for example, he suggests that lexical entries for physical object words include three-dimensional model representations - basically the prototypical images posited by prototype theorists but with more structure. Conceptual semantics has been criticized on the ground that there is not sufficient hard evidence to support the view that linguistic meaning exactly parallels conceptual structure. It is also claimed that linguistic meanings do not actually reflect the fuzziness of concept structure. For instance, the above-discussed semantic relation of complementarity (as in true/false) operates as if truth and falsehood ruled each other out (as we saw in 5.3, X is true entails and is entailed by X is not false, and X is false entails and is entailed by X is not true). The fact is, though, that 'in real life', as it were, we can quite easily conceive of Xs that partake of both truth and falsehood. Jackendoff's remark that 'People have things to talk about only by virtue of having represented them' is difficult to argue with, but it is as yet unclear precisely how close the relationship is between mental representation and linguistic meaning.

Finally in the context of cognitive approaches to lexical meaning, it is worth noting that a further development in the prototype concept in semantics is the idea that not only individual entities but also entire events may have prototypical features. This is a notion born of script theory, according to which we interpret experience via scripts - general prototypes of or templates for particular types of activity.
For example, the prototypical scenario for going on a train journey will include going to the railway station, buying a ticket, standing on the station platform, boarding the train, finding a seat and presenting one's ticket to the ticket inspector when he passes through the train. According to script theory, event templates such as this allow us to fill in any information gaps from what we know about the typical way in which things happen. Related to and overlapping with the notion of script is the notion of frame; frames are conceived of as mental frameworks or plans relating to specific domains of knowledge which assist us in dealing with relevant situations. A railway station frame, for instance, would include a ticket office, a waiting room, a cafeteria, an arrivals and departures information board, a station master etc. Also connected with script theory and the frame concept are schema-theoretic models of comprehension which are based on the idea that comprehension always taps into one's knowledge of the world as well as one's linguistic knowledge. The relevance of scripts, frames and schemata for lexical semantics has at least two aspects. On the one hand, scripts and frames provide a plausible underpinning for at least some aspects of syntagmatic lexical relations. That is to say, the fact that the same lexical expressions repeatedly recur in each other's company is partly explicable in terms of the fact that the same kinds



of scenarios involving the same kinds of entities recur in the life of a particular culture and in the lives (including the mental lives) of those who participate in that culture. On the other hand, scripts, frames and schemata also relate to paradigmatic aspects of meaning, and, in particular to the contextual dimension of such relations. For example, the noun stump in some contexts denotes the remnant of a cut or fallen tree and in other contexts denotes one of the three uprights of the wicket defended by the batsman in the game of cricket. Now, it so happens that in the contexts where stump has the first of the above meanings the relevant prototypical frame ('in the forest') centrally involves trees and does not involve at all the accoutrements of cricket, while in contexts where stump has its 'wicket' sense the relevant prototypical frame ('a game of cricket') involves a large open pitch where trees have no place (except perhaps as peripheral background).

5.6 Summary

This chapter has been devoted to exploring some of the different ways in which lexical meaning has been approached by linguists. The exploration in question has covered the traditional, referential/denotational account of word-meaning, has talked about Saussure's perspective and the lexical field theory to which it gave rise, has defined and exemplified lexico-semantic relations as they have been understood in recent decades by Lyons and others, and has sketched out the componential analysis approach to explicating such relations. Mention has also been made of a number of 'cognitive' perspectives on lexical meaning. It is clear from discussion in the chapter that lexical meaning is no different from other aspects of language in being in part a function of the network of interrelations between linguistic units. It is also clear that such relations hold not only between words, but also between words and multi-word lexical expressions and within pairs and groups of multi-word expressions. This underlines the fact - already clear from the discussion in earlier chapters - that the lexicon is not just an inventory of individual words but also covers a large variety of combinations of words. Finally, it is noteworthy that a consideration of context is necessary to the very definition of lexical sense-relations and that contextual influence on meaning is a major issue in lexical semantics - which leads to the conclusion that orientation to context is fundamental to the way in which the lexicon operates.

Sources and suggestions for further reading

See 5.2. The diagram in 5.2 is based on the model proposed by C. Ogden and I. Richards in their book The meaning of meaning (fourth edition, London: Routledge and Kegan Paul, 1936, 11). The objection that the



Ogden and Richards model is 'atomistic' is voiced by, among others, S. Ullmann in his Semantics: an introduction to the science of meaning (Oxford: Blackwell, 1962, 63). The problems surrounding a view of meaning which is purely based on reference or denotation have been discussed by philosophers for more than a century. The name usually mentioned in this connection is that of Gottlob Frege, and in particular his article 'Über Sinn und Bedeutung' (Zeitschrift für Philosophie und philosophische Kritik 100, 1892, 25-50). Frege's work is available in English translation in P. Geach and M. Black's Translations from the philosophical writings of Gottlob Frege (Oxford: Blackwell, 1952). The Morning Star/Evening Star example is Frege's. The discussion of referential/denotational meaning in 5.2 was informed mostly by the work of J. Lyons - especially his Structural semantics (Oxford: Blackwell, 1963, Chapter 4) and his Introduction to theoretical linguistics (Cambridge: Cambridge University Press, 1968, Chapter 9). Other sources drawn on were R. Carter, Vocabulary (second edition, London: Routledge, 1998, Chapter 1) and Stephen Ullmann, Semantics (Oxford: Blackwell, 1962, Chapter 3).

See 5.3. The Lyons quotations at the beginning of 5.3 are taken from his article 'Structuralism and linguistics' (in D. Robey (ed.), Structuralism: an introduction, Oxford: Clarendon Press, 1973, 6). Ferdinand de Saussure's Cours de linguistique générale (first published 1916) is available in a modern critical edition prepared by Tullio de Mauro (Paris: Payot, 1973). It is also available in English translation: Course in general linguistics, translated by W. Baskin with an introduction by J. Culler (Glasgow: Fontana/Collins, 1974). The monetary analogy is to be found in Chapter IV of the Second Part of the Cours. The words cited (and translated) from J. Trier are from his book Der deutsche Wortschatz im Sinnbezirk des Verstandes (Heidelberg: Carl Winter, 1931, 2).
The colour example is borrowed from J. Lyons's Introduction to theoretical linguistics (Cambridge: Cambridge University Press, 1968, 56-7). Leonard Bloomfield's comments on meaning are taken from his book Language (New York: Holt, Rinehart and Winston, 1933, 139). The discussion of synonymy, hyponymy and the various types of incompatibility is based very largely on the ideas of Lyons as set out in his books: Structural semantics: an analysis of part of the vocabulary of Plato (Oxford: Blackwell, 1963), Introduction to theoretical linguistics (Cambridge: Cambridge University Press, 1968), Semantics (two volumes, Cambridge: Cambridge University Press, 1977), Language, meaning and context (London: Fontana, 1981), Linguistic semantics: an introduction (Cambridge: Cambridge University Press, 1995). The principal source for the treatment of meronymy is Chapter 7 of Cruse's Lexical semantics (Cambridge: Cambridge University Press, 1986). See 5.4. An example of componential analysis being put at the service of anthropological linguistics is F. Wallace and J. Atkins's (1960) article 'The meaning of kinship terms' (American Anthropologist 62, 58-80).



Examples of componential analysis partaking of a Chomskyan perspective are J. Katz and J. Fodor's (1963) much cited article 'The structure of a semantic theory' (Language 39, 170-210) and R. Jackendoff's Semantic structures (Cambridge, MA: MIT Press, 1990). An example of a componentialist without a particular methodological or theoretical axe to grind is G. Leech (see, for example his book Semantics: the study of meaning, second edition, Harmondsworth: Penguin, 1981). The critique of componential analysis in 5.4 draws on comments by D. Bolinger (1965) ('The atomization of meaning', Language 41, 555-73); J. Lyons (Introduction to theoretical linguistics, Cambridge: Cambridge University Press, 1968, 470 ff.; Linguistic semantics: an introduction, Cambridge: Cambridge University Press, 1995, 114 ff.) and J. Saeed (Semantics, Oxford: Blackwell, 1997, 259 ff.). The Lyons quote which ends 5.4 is taken from p. 117 of his Linguistic semantics: an introduction (Cambridge: Cambridge University Press, 1995). See 5.5. The sources for the discussion of prototype theory in 5.5 include: L. Coleman and P. Kay (1981), 'Prototype semantics: the English word "lie"' (Language 57,26-44); W. Labov, 'The boundaries of words and their meanings' (in J. Fishman (ed.), New ways of analyzing variation in English, Washington, DC: Georgetown University Press, 1973). G. Lakoff, Women, fire and dangerous things: what categories reveal about the mind (Chicago: University of Chicago Press, 1987); S. Pulman, Word, meaning and belief (London: Croom Helm, 1983); E. Rosch, 'Principles of categorization' (in E. Rosch and B. Lloyd (eds), Cognition and categorization, Hillsdale, NJ: Lawrence Erlbaum, 1978). The definition of prototype as 'ideal exemplar' is borrowed from J. Aitchison's book Words in the mind: an introduction to the mental lexicon (second edition, Oxford: Blackwell, 1994, 55). Lyons's description of componential analysis as a 'checklist theory of definition' is to be found on p. 
99 of his book Linguistic semantics: an introduction (Cambridge: Cambridge University Press, 1995). The criticisms of prototype theory sketched here draw on the discussion by A. Lehrer in her article 'Prototype theory and its implications for lexical analysis' (in S. L. Tsohatzidis (ed.), Meanings and prototypes: studies in linguistic categorization, London: Routledge, 1990). The account of conceptual semantics is largely based on the first four chapters of R. Jackendoff's book Languages of the mind (Cambridge, MA: MIT Press, 1992) and on W. Frawley's discussion of the topic in his Linguistic semantics (Hillsdale, NJ: Lawrence Erlbaum, 1992, Chapter 2). The discussion of scripts, frames and schemas draws on: R. Anderson, R. Reynolds, D. Schallert and E. Goetz (1977), 'Frameworks for comprehending discourse' (American Educational Research Journal 14, 367-81); M. Minsky, 'A framework for representing knowledge' (in P. Winston (ed.), The psychology of computer vision, New York: McGraw-Hill, 1975); R. Schank and R. Abelson, Scripts, plans, goals and understanding (Hillsdale, NJ: Lawrence Erlbaum, 1977); R. Schank and A. Kass, 'Knowledge representation in people and machines' (in U. Eco,



M. Santambrogio and P. Violi (eds), Meaning and mental representations, Bloomington, IN: Indiana University Press, 1988). Accessible introductions to lexical semantics are provided by: Chapter 1 of R. Carter's Vocabulary: applied linguistic perspectives (second edition, London: Routledge, 1998), chapters 1-4 of J. Lyons's Language, meaning and context (London: Fontana, 1981) and chapters 2-4 of the same author's Linguistic semantics: an introduction (Cambridge: Cambridge University Press, 1995), chapters 1-7 of G. Leech's Semantics: the study of meaning (second edition, Harmondsworth: Penguin, 1981), and chapters 1-7 of E. Hatch and C. Brown's Vocabulary, semantics and language education (Cambridge: Cambridge University Press, 1995). With regard specifically to 'cognitive' approaches to lexical semantics the interested reader may also care to consult J. Aitchison, Words in the mind: an introduction to the mental lexicon (second edition, Oxford: Blackwell, 1994, especially chapters 4-8), and F. Ungerer and H. J. Schmid, An introduction to cognitive linguistics (London: Longman, 1996).

Focusing questions/topics for discussion

1. In the introductory section of the chapter it was claimed that changing even a single word can make a radical difference to the meaning of a sentence or indeed a longer stretch of speech or writing. Try to think of five pairs of sentences differing by a single word where the effect of the word changes in question is to transform the meanings of the sentences in a fundamental way.

2. We noted in 5.2 that two or more expressions with different senses can identify the same object, person, place, attribute, action etc. in the real world, one example being the way in which the expressions Morning Star and Evening Star both refer to the planet Venus. Try to think of five further examples of words or phrases with different senses being applied to the same 'real world' phenomenon.

3. Section 5.3 defined and illustrated a number of sense-relations. Re-read this section and then - avoiding the examples already given in the section - try to supply the following: two pairs of synonyms, two pairs of expressions linked to each other by the relation of hyponymy-superordinateness, two pairs of complementaries, two pairs of polar antonyms, two pairs of converses, and two pairs of expressions linked to each other by the relation of meronymy-holonymy. In each case illustrate the relations in question, taking the illustrative sentences in the section as your model.

4. In 5.4 we saw some examples of the way in which componential analysts treat lexical meaning. Using these examples as your guide, suggest



a possible componential analysis of the meanings of the following sets of words:

bitch: dog: puppy
duck: drake: duckling
ewe: ram: lamb
goose: gander: gosling
mare: stallion: foal

Indicate any problems you perceive in relation to the analysis you arrive at.

5. In 5.5 we saw some examples of items which were less close to the 'ideal exemplar' of their category than certain other items (e.g. penguin in relation to bird). For each of the following categories specify one member of the category in question which is close to the 'ideal exemplar' and one member which is less close. Give reasons for your proposals.

country (France, Germany, Spain etc.)
(apple, pear, orange etc.)
(blouse, jacket, sweater etc.)
(bear, cow, panda etc.)
(ash, elm, oak etc.)

6 Lexis, phonology and orthography

6.1 Lexis and 'levels of articulation'

In our discussion so far we have been concentrating mostly on what the French linguist Martinet calls the primary level of articulation, the level of language at which meaningful units (morphemes, words etc.) combine into larger meaningful units (phrases, sentences etc.). It is clear that what happens at this level is very largely determined by lexical choice. It would be fairly natural to speculate that things might be rather different at the secondary level of articulation - the level at which meaningless units (in speech, minimal phonemes; in alphabetic writing, individual letters) combine to form meaningful units (inflections, affixes, words etc.). However, as we shall see, there is enough evidence of interaction between lexis and phonology on the one hand and lexis and orthography on the other to rule out any idea that sound-systems and writing systems are partitioned off from the lexical domain. One self-evident sense in which lexis and the secondary level of articulation interact relates to the fact that the choice of any given lexical unit determines the particular combination of phonological or orthographical units that is deployed. A less obvious - and for that reason more interesting - dimension to the issue is the question of whether the phonological or orthographic realization of specific words draws on semantic and grammatical information about the word concerned which the lexicon has to supply and whether individual lexical items or categories of items have specific sounds or symbols associated with them.


6.2 Phonemes, stresses and tones

Let us begin with that aspect of the interaction between the lexicon and phonology which is labelled above as 'self-evident'. Given that knowledge of a lexical expression typically includes knowledge of how that expression is



pronounced, we have to assume that an entry in the lexicon contains information about the sounds out of which the item in question is composed just as entries in dictionaries may contain 'phonetic' transcriptions. The sound components of a lexical unit include: (i) the relevant sequence of individual sound segments, (ii) (in languages such as English) the pattern of stress-distribution in the unit in question, and (iii) (in languages such as Chinese and Thai) the specific pitch or tone characteristic of the expression concerned when used in a particular sense.

With regard to individual sound segments, we saw in Chapter 1 that some differences between sounds were critical in differentiating between words and that some were not. We noted that distinctions that are critical in this way are labelled phonemic, and that the sound units which are, as it were, kept apart by such distinctions are called phonemes. Phonemes can thus be looked upon as collections of distinctive features. Examples of such features are:

• plosiveness: whether air is completely blocked before being released in the production of a sound - as in /p/ - or not - as in /f/;
• labiality: whether the lips are involved in the production of a sound - as in /p/ - or not - as in /k/;
• nasality: whether air passes through the nose in the production of a sound - as in /n/ - or not - as in /d/;
• voice: whether the vocal cords are in vibration in the production of a sound - as in /z/ - or not - as in /s/;
• frontness/backness/centrality: whether the tongue is positioned towards the front of the mouth in the production of a vowel - as in /ɪ/ (the vowel sound in lid), towards the back - as in /uː/ (the vowel sound in boot), or centrally - as in /ʌ/ (the vowel sound in but);
• highness/lowness/midness: whether the tongue is high in the mouth in the production of a vowel - as in /ɪ/, low in the mouth - as in /ɑː/ (the vowel sound in the standard British English pronunciation of bath), or in a mid position - as in /e/ (the vowel sound in bet).

Correspondingly, the phonological dimension of each lexical entry can be conceived of as an array or matrix of distinctive features as well as a sequence of phonemes. Thus, a simplified version of the matrix for pin might be represented as follows:

[Distinctive-feature matrix for pin; first column /p/]
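The idea of a lexical entry as a matrix of distinctive features can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SEGMENTS`, `feature_matrix` and `contrast` are invented here, the feature values are simplified, and treating /n/ as non-plosive is an assumption of this sketch rather than a claim taken from the text.

```python
# Simplified, partly assumed feature values (True = feature present).
# This is a hypothetical illustration, not a reproduction of the book's matrix.
SEGMENTS = {
    "p": {"plosive": True, "labial": True, "nasal": False, "voice": False},
    "b": {"plosive": True, "labial": True, "nasal": False, "voice": True},
    "n": {"plosive": False, "labial": False, "nasal": True, "voice": True},
    # Vowels are described here by tongue position rather than by the consonant features.
    "ɪ": {"voice": True, "frontness": "front", "height": "high"},
}


def feature_matrix(segments):
    """Return one (segment, features) pair for each segment of a word."""
    return [(seg, SEGMENTS[seg]) for seg in segments]


def contrast(a, b):
    """Return the names of the features on which two segments differ."""
    fa, fb = SEGMENTS[a], SEGMENTS[b]
    return {name for name in fa.keys() | fb.keys() if fa.get(name) != fb.get(name)}


pin = feature_matrix(["p", "ɪ", "n"])
for seg, feats in pin:
    print(seg, feats)

# On these assumed values, /p/ and /b/ (pin versus bin) differ in voice alone.
print(contrast("p", "b"))
```

Storing each segment as a bundle of features rather than as an unanalysed symbol makes it possible to say exactly which feature keeps two words apart.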





















Turning now to the question of stress patterns, in many languages, including English, the ways in which stresses are distributed are important in differentiating between words. For example there are a number of pairs of nouns and verbs in English where grammatical category is signalled partly by stress distribution. Thus:

Student numbers are continuing to decrease [VERB].
There has been a continuing decrease [NOUN] in student numbers.
He's going to record [VERB] a new single.
He's going to make a new record [NOUN].
In Europe we no longer implant [VERB] growth-promoting substances in cattle.
The growth-promoting implant [NOUN] is no longer used in Europe.

In some instances, stress distribution is a factor in distinguishing between distinct meanings of similar sequences of sounds. For example, the word process, when used as a verb, means something like 'to treat', 'to work on' as in:

The new information had to be processed very rapidly by the research team.

When the stress in process is shifted to the second syllable, however, the verb means 'to walk in procession' - as in:

The bishop, the priests, the acolytes and the choir processed solemnly up the aisle.

Not dissimilar is the case of the adjective contrary. When stressed on its first syllable, this form means 'opposed', 'opposite' - as in:

This idea is contrary to good sense.

When stressed on its second syllable, on the other hand, it means 'self-willed', 'perverse', 'cantankerous' - as in the nursery rhyme:

Mary, Mary, quite contrary.
How does your garden grow?
With silver bells and cockle shells
And pretty maids all in a row.

With regard to tone, in a number of languages of Asia - such as Chinese, Thai and Burmese - as well as in many African and Native American languages, the pitch at which a particular sequence of sounds is uttered and/or the direction of the pitch (rising or falling) will determine what is understood by the sound-sequence in question.
For instance in Thai, the sequence /kao/ means 'he' or 'she' if spoken with a high or rising tone, 'rice' if spoken with



a falling tone, 'white' if spoken with a low tone, and 'news' if spoken with a mid-tone. Similarly with the sequence /naa/, which means 'young maternal uncle or aunt' if spoken with a high tone, 'thick' if spoken with a rising tone, 'face' if spoken with a falling tone, 'nickname' if spoken with a low tone, and 'rice paddy' if spoken with a mid-tone.
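One way to picture stress and tone as part of what the lexicon stores is a lookup table keyed by a segmental form together with its stress position or tone. The sketch below is a minimal illustration, not a model proposed in the text: the table and function names are invented, the glosses are those given above, and the syllable numbers for record (first syllable for the noun, second for the verb) are supplied here from ordinary English usage.

```python
# Keys pair a segmental form with a suprasegmental value; values are glosses.
# Stress is recorded as the number of the stressed syllable (1 = first).
ENGLISH_STRESS = {
    ("record", 1): "noun",
    ("record", 2): "verb",
    ("process", 1): "verb: 'to treat', 'to work on'",
    ("process", 2): "verb: 'to walk in procession'",
    ("contrary", 1): "adjective: 'opposed', 'opposite'",
    ("contrary", 2): "adjective: 'self-willed', 'perverse'",
}

# Tone labels and glosses as reported in the text for Thai /kao/ and /naa/.
THAI_TONE = {
    ("kao", "high"): "he/she",
    ("kao", "rising"): "he/she",
    ("kao", "falling"): "rice",
    ("kao", "low"): "white",
    ("kao", "mid"): "news",
    ("naa", "high"): "young maternal uncle or aunt",
    ("naa", "rising"): "thick",
    ("naa", "falling"): "face",
    ("naa", "low"): "nickname",
    ("naa", "mid"): "rice paddy",
}


def senses(table, form):
    """Return {stress or tone value: gloss} for every entry sharing `form`."""
    return {key[1]: gloss for key, gloss in table.items() if key[0] == form}


print(senses(ENGLISH_STRESS, "record"))
print(senses(THAI_TONE, "kao"))
```

Because the segmental form alone does not pick out a unique entry, the stress position or tone has to be part of the key.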


6.3 Lexical phonology as a reflection of lexical grammar and lexical meaning

We have already seen that there is a connection between the way in which a particular form is pronounced and its grammatical category. In other words, there is in some cases a relationship between the grammatical category assigned to a given entry in the lexicon and the manner in which it is stressed. On the other hand, this kind of variation in stress placement, according to whether a noun or a verb is involved, is not systematic. In other cases the main stress remains in the same place irrespective of grammatical category, for example: delay [VERB]; delay [NOUN]; offer [VERB]; offer [NOUN]; repeat [VERB]; repeat [NOUN]. What this means is that the lexicon has to specify which nouns and verbs follow the record: record pattern and which do not, and that the pronunciation of a particular word will need to be based on this information as well as information about grammatical category.

As for the question of the relationship between lexical phonology and meaning, one obvious set of circumstances in which this relationship can be seen to exist is any situation where onomatopoeia is involved. In such instances part of the meaning conveyed by the word is the sound made by the entity or activity to which the word is applied - buzz, crackle, cuckoo, plop, tinkle etc. In other words, in cases such as these the particular phonological shapes involved are determined in large measure by the meanings they are intended to convey. It is worth emphasizing, perhaps, that the phrase 'in large measure' is very deliberately chosen here. There is no question of the forms of onomatopoeic words being completely determined by the sounds they imitate; the conventions of the particular language in which an onomatopoeic form occurs also play a role. This is even true of words that are used to represent animal noises. For instance, in English, the sound made by a crowing cock is represented as cock-a-doodle-doo. Not so in other languages, as the following examples indicate.

[Table of cock-crow forms in other languages; first entry: German]












Nevertheless, the relationship between the phonological forms of onomatopoeic words and their meanings is clear and indisputable. Somewhat more subtle demonstrations of a relationship between meaning and specific sounds are instances where particular combinations of sounds are avoided because they are associated with taboo words. For example, in Luganda, the /nj/ combination (which corresponds roughly to the combination of sounds in the middle of the English word onion /ˈʌnjən/) occurs in taboo items like /kunja/ 'defecate' and /kinjo/ 'anus'. Because of its association with such words, it tends to be replaced in other items by /ŋ/ (which corresponds to the ng sound in English sing). Thus /kanja:la/ ('immature banana') and /munjo:ngo/ ('miserable') tend to be pronounced as /kaŋa:la/ and /muŋo:ngo/ respectively.

To return briefly to the question of interaction among grammar, the lexicon and phonology, it is interesting to note that there is a whole theoretical approach to phonology - known as Lexical Phonology - which is based on a recognition of this interaction. In this conception of phonology, phonological processes are seen as operating together with word-formation rules in a cyclic fashion in such a way as to specify the lexical items in a language. Affixes are seen as being divided into different subsets (called levels or strata), to which different word-formation rules apply, these word-formation rules correlating with different phonological rules.

6.4 Association between particular sounds and particular (categories of) lexical items

Let us now consider the notion that particular sounds in a language may be closely, even exclusively, associated with particular words or categories of words. A revealing case to examine in this connection is that of the /ŋ/ sound in Modern Standard French (the sound corresponding to ng in English sing). This sound, which features in the pronunciation of words like vin and pain in many non-standard varieties of French spoken in Southern France, is an innovation in the more prestigious varieties of the language - the French of educated speakers in Paris, Brussels, Geneva etc. It was brought into these latter varieties via loanwords from English - especially words ending in the morpheme -ing. When such words first came into Standard French their -ing ending was pronounced by most people using phonemes from the Standard French repertoire. Thus -ing was pronounced as the nasal vowel /ɛ̃/ (which normally corresponds to the spelling in in Standard French), as /in/ (which normally corresponds to the spelling ine in Standard French) or as /iɲ/ (which normally corresponds to the spelling igne in Standard French). In more recent times, however, -ing words like parking, casting, lifting etc. have increasingly been pronounced using an /ŋ/ sound.

The interesting point about /ŋ/ in the present context is that, although the distinction between this sound and other sounds is phonemic (differentiating, for example, shopping ('shopping') from chopine ('bottle [of wine]')), it occurs in a very limited set of words. Moreover, the words in question are rather difficult to place under a common heading. It certainly is not the case that /ŋ/ is systematically associated with the spelling ing. In many words ing simply indicates the presence of a nasal vowel (coing = /kwɛ̃/ - 'quince'; poing = /pwɛ̃/ - 'fist' etc.). Nor can one even say that the /ŋ/ phoneme is systematically associated with English loanwords ending in -ing; for instance, the -ing in the loan-word shampooing is pronounced not as /ŋ/ but rather as the same nasal vowel as in coing, i.e. /ɛ̃/. In any case, many of the -ing words in French pronounced with final /ŋ/ are not so much loans as new coinages, for example footing meaning 'jogging', lifting in the sense of 'face lift'. To sum up, there is a phoneme in Modern Standard French which is exclusively associated with a small and rather ill-defined assortment of lexical items and whose occurrence is, therefore, entirely dependent on the selection of one of these words.

In the above case the particular sound discussed can be seen as the result of language contact. However, lexically determined aspects of phonology are not necessarily connected to the borrowing of sounds. The process known as lexical diffusion may or may not involve cross-linguistic influence, but what it always does involve is an association between specific sets of lexical items and the sounds that are likely to occur. Lexical diffusion is a phenomenon that has been observed by linguists tracking phonological changes over time in particular languages and dialects. It refers to the fact that such changes develop gradually - affecting different portions of the vocabulary as they progress.
It used to be thought that changes in sound-systems operated simultaneously across the board according to laws that admitted no exceptions, the same sound in the same environment always developing in precisely the same way. It now appears that this view of sound change was fundamentally mistaken. The current indications are that when a sound change gets under way it spreads on a word-by-word basis through the lexicon, so that whether or not the new sound is likely to occur is dependent not on the general phonetic/phonological environment but on specific lexical selection. A good illustration of such lexical diffusion comes from data on Belfast English collected in the 1970s. From these data it emerges that there is a sound shift in progress in Belfast English from [ʉ] (which is fairly close to the French u sound or the German ü sound) to [ʌ] (which is the u sound in Standard British English pronunciations of words like but and cut). However, the [ʌ] innovation is affecting different lexical items to varying degrees. Thus, the word pull, for instance, was pronounced [pʌl] in the data in question in about three-quarters of its occurrences, whereas the word look attracted the pronunciation [lʌk] in only about a quarter of its occurrences. In other words, whether or not [ʌ] occurs is closely related to the selection and deployment of specific lexical items.
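The word-by-word character of lexical diffusion can be made concrete by counting, for each word separately, how often the innovative pronunciation occurs. The sketch below is illustrative only: the token list is hypothetical, with counts chosen simply to echo the proportions reported above for pull and look, and the function name `innovation_rate` is invented for this example.

```python
from collections import Counter

# Hypothetical observations of the form (word, variant). The counts are invented
# so as to echo "about three-quarters" for pull and "about a quarter" for look;
# they are not the Belfast data themselves.
tokens = (
    [("pull", "innovative")] * 30
    + [("pull", "older")] * 10
    + [("look", "innovative")] * 10
    + [("look", "older")] * 30
)


def innovation_rate(observations, innovative="innovative"):
    """Return, for each word, the proportion of its tokens showing the new variant."""
    totals = Counter()
    hits = Counter()
    for word, variant in observations:
        totals[word] += 1
        if variant == innovative:
            hits[word] += 1
    return {word: hits[word] / totals[word] for word in totals}


rates = innovation_rate(tokens)
print(rates)
```

If sound change applied across the board, the rates for all words sharing the relevant phonetic environment would be roughly equal; a spread of rates across words is what lexical diffusion predicts.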




6.5 Lexis and orthography

Much the same kind of situation applies in relation to the lexis/orthography interface as has been described in respect of lexis and phonology. That is to say, it is obvious that lexical selection determines the particular sequence of letters (in an alphabetic system), the character (in a logographic system) etc., that is deployed; it is also true to say that orthographic representations draw on lexicosemantic and lexicogrammatical information; and it is in addition the case that certain aspects of a writing system may be particularly, or even exclusively, associated with a specific set of lexical items. Writing systems vary enormously around the world, and have varied enormously through history. This book is written using an alphabetic system, where there is a clear relationship between written signs and the sounds of the words represented by those signs. For example, in the following written versions of English words, each letter represents a different phoneme occurring in the words in question: den






English, in common with all western European (and numerous other) languages, uses Roman script, which, as its name implies, was developed by the Romans, and was the form in which Latin was written. The Roman alphabet was based on the Greek alphabet, which exhibits the same basic principle of clear correspondence between written signs and individual sounds - as the following examples from Modern Greek demonstrate:

να    'that', 'in order to'
σα    'when', 'as soon as' (= σαν)
τα    definite article (neuter nominative/accusative plural)

Also based on the Greek alphabet, and on the same principle of correspondence between letters and phonemes, is the Cyrillic alphabet, in which many Slavic languages, such as Russian, Bulgarian and Serbian, are written. As is well-known, there is a fair amount of variation in alphabetic systems in relation to the precise degree of consistency of correspondence between letters and sounds. In a language like Spanish or Finnish, the level of consistency is very high indeed. That is to say, in these systems it is more often than not the case that for any given sequence of phonemes there is only one possible spelling and that for any given sequence of letters there is only one possible pronunciation. Compare this with the situation in English, Modern Greek or French, where the relationship between sounds and letters is a good deal more fluid. In English, for example, the vowel sound /i:/ can be



written as e (as in be), ee (as in bee), ea (as in bean), ie (as in brief), ei (as in receive), ey (as in key), i (as in ravine), even ae (as in encyclopaedia). An earlier version of the alphabetic approach was the system used to transcribe Semitic languages, starting with Phoenician. Semitic languages, represented in the modern world by Arabic and Modern Hebrew, have the characteristic of showing morphological contrasts (for verb tense etc.) through the alternation of vowels within the word rather than by the addition of endings. We have this to some extent in English too, for example run-ran, sing-sang, write-wrote; however, in the Semitic languages this type of grammatical patterning operates throughout. What this means is that the basic form of any given word is its 'consonantal shell' - the counterpart of English s-ng in the sing-sang case - and that the vowels are, as it were, supplied by the grammatical context. Probably for this reason, the Phoenician alphabet represented consonants only, the vowel sounds being left for readers to work out for themselves. Hebrew and Arabic were also originally written in the same way, with only consonants being represented, and, indeed, writing Hebrew and Arabic in this way remains an option even today. However, in the case of both languages, the writing systems have with the passage of time developed ways of indicating vowel sounds. The original Phoenician alphabet was the source of the Greek alphabet; what the Greeks did was to 're-cycle' consonantal signs that they did not require as vowel signs. Thus, for example, the Phoenician sign for a glottal stop (which involves holding air by totally closing the vocal cords and then releasing it) was ('mouth', /r/) by shape, and from • ('grill', /x/) by shape and by the absence/presence of shading. 
Because written signs are contrastive units in written language in much the same way as phonemes are in speech, they are sometimes referred to as graphemes (< Greek γραφή graphē - 'writing'). Moreover, just as phonemes have allophones, so graphemes can be thought of as having allographs; that is to say, each written sign may be realized in a variety of ways. Thus, for example, A, a and ɑ are all variant forms of the same letter. We have seen that in alphabetic systems the correspondences between graphemes and phonemes can sometimes be quite variable, even within the same language. Examples have already been given from English, Modern Greek and French. A further - indeed the classic - example from English of variation in grapheme-phoneme correspondence is the case of the combination of the letters o, u, g and h. This may correspond to /ʌf/ (as in rough), /ɒf/ (as in cough), /əʊ/ (as in though), /ɔː/ (as in thought) or /ɒx/ (as in Irish-English lough). Likewise, grapheme-meaning correspondences can vary. For example the Sumerian sign 𒀭 corresponds not only to 'sky'/'heaven' (an), but also to 'god' (dingir). We have also seen that signs may stand both for sounds and for meanings. With regard to alphabetic systems, it has sometimes been claimed that variation in grapheme-phoneme correspondences can be of assistance in distinguishing between homophones. It is noted that, thanks to a certain looseness of fit between graphemes and phonemes in French, for example, it is possible to distinguish orthographically between identically pronounced pairs such as the adjectives sûr ('sure') and sur ('sour'), the plural nouns maux ('evils') and mots ('words') and the verbs pécher ('to sin') and pêcher ('to fish'). Unfortunately for this particular line of argument, French, in common with English, goes only a rather limited distance along this road.
For instance, it does not distinguish between the identically pronounced pairs sur ('sour') and sur ('on') or pêcher ('to fish') and pêcher ('peach-tree'). Moreover, there are cases in French of homographs - i.e. identically spelt items - which are in fact differently pronounced. Thus fils meaning 'threads' is pronounced /fil/, while fils meaning 'son' or 'sons' is pronounced /fis/.

As well as signs standing for phonemes and syllables and signs standing for meanings, writing systems may also contain signs indicating how words are stressed. For example, in Modern Greek, every word containing more than one syllable has a diacritic symbol (´) over the syllable bearing the main word stress. Thus:

[Examples: Modern Greek words with stress accents]
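Because the Greek stress mark is encoded as part of the written word, the position of the main stress can be read off the spelling mechanically. The following sketch assumes Unicode text in which the accent decomposes to the combining acute accent; the function name and the sample word καλημέρα ('good morning') are chosen for this illustration and are not examples taken from the text.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent; the Greek stress mark decomposes to it


def stressed_letter_positions(word):
    """Return 0-based positions of the letters that carry a stress accent."""
    positions = []
    index = -1  # position of the most recent base letter
    for ch in unicodedata.normalize("NFD", word):
        if ch == ACUTE:
            positions.append(index)
        elif not unicodedata.combining(ch):
            index += 1  # a new base letter; other combining marks are skipped
    return positions


# The accent sits on the sixth letter (position 5), i.e. on the third syllable.
print(stressed_letter_positions("καλημέρα"))

# Monosyllables such as να carry no stress mark.
print(stressed_letter_positions("να"))
```

Nothing comparable could be written for English spelling, which carries no mark of stress placement.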









It is, then, perfectly possible for written forms of languages to incorporate information about word stress in what appears on the page. As it happens,



the written conventions of different languages vary in the extent to which they make use of this possibility. The written form of English, for instance, provides absolutely no information about stress distribution. Similarly with the transcription of tone. The writing systems of some tone languages - such as the Thai system - indicate the tone associated with a particular lexical item, whereas others do not.


6.6 Orthography as a reflection of lexical grammar and lexical meaning

Just as the pronunciation of particular words may draw on information about their grammatical characteristics and about their meaning, so too may the way in which they are written. With regard to grammar, one very clear demonstration of the influence of a word's grammatical profile on its spelling is the way in which all nouns are written with initial capital letters in German. Thus, in the following sentence, apart from the first word (Wir = 'we') - capitalized simply because it begins the sentence - all capitalized items are nouns:

Wir finden das Essen und das Bier besonders gut in dieser Kneipe.
('We find the food and the beer especially good in this pub.')

It is worth noting that capitalization is not triggered simply by the form of the word. Thus, for example, the form e-s-s-e-n, which here means 'food', can also be used as a verb, meaning 'to eat' (as in Sollen wir essen? - 'Shall we eat?'). In the latter circumstances, the word is not capitalized.

In English (and in many other languages) capitalization in the spelling of nouns is restricted to the subcategory of proper nouns, that is to nouns which identify very specific persons, places, ideas etc. in any particular context. For example, Beethoven, Paris and Islam are all proper nouns. However, once again, capitalization is not triggered simply by the form of the word. For instance, the word Ulster may be used as a proper noun to refer to the nine northernmost counties of Ireland or, more loosely, to refer to Northern Ireland, which extends over six of the counties in question. However ulster may also be used as a common noun to refer to an entire class of entities, i.e. long loose overcoats of coarse cloth. Thus:

The ancient province of Ulster is the setting for many of Ireland's best-known legends.
The stranger was wearing a dark ulster, which flapped in the wind as he walked.
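The dependence of capitalization on grammatical category rather than on the form itself can be illustrated with a toy spelling function. This is a deliberately simplified sketch: the function `spell` and its category labels are invented here, and the rules cover only the two conventions described above (all nouns in German, proper nouns only in English), ignoring sentence-initial capitals.

```python
def spell(form, category, language):
    """Return the written form of a word, capitalized according to its category."""
    capitalized_categories = {
        "German": {"noun", "proper noun"},
        "English": {"proper noun"},
    }
    if category in capitalized_categories.get(language, set()):
        return form[:1].upper() + form[1:]
    return form


# The same form is written differently depending on the category assigned to it.
print(spell("essen", "noun", "German"))         # Essen
print(spell("essen", "verb", "German"))         # essen
print(spell("ulster", "proper noun", "English"))  # Ulster
print(spell("ulster", "noun", "English"))         # ulster
```

The function has to be given the category as a separate argument; that requirement is the point of the examples above, since the string of letters alone does not settle the spelling.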
As far as the relationship between the lexicon and meaning is concerned, this has already been dealt with in the discussion of different types of writing system in the last section. We saw there that some writing systems (for



example the Sumerian and the Ancient Egyptian systems) actually began as attempts to represent the entities referred to in pictorial form. However, from the way in which these systems subsequently developed - with, for example, particular signs sometimes being used to represent words with sound-shapes similar to those of the words associated with the original meanings represented - it is clear that the meanings on which such pictographic systems were based were essentially word-meanings. It is for this reason that such systems are often described as logographic. An example of a modern logographic system is that used in association with Chinese. As we saw earlier, the Chinese system also began life as a straightforwardly pictographic system, but the characters gradually lost contact with their original pictorial role. It should be noted that the situation in Chinese is actually a little more complicated than one in which an individual word is always represented by an individual character. For instance, the character 木 used alone stands for mu ('tree'); doubled (林) it stands for lin ('wood', 'small forest'); and tripled (森) for sen ('large forest', 'numerous', 'dark'). Also, as in Sumerian and Ancient Egyptian, certain characters may be combined with others in order to indicate phonetic characteristics of the word represented. These and other considerations have led some linguists to question whether the Chinese system is truly logographic - some scholars continuing to assert that it is essentially pictographic, others that it is a system which primarily represents syllables or morphemes rather than word-meanings. However, whichever line one wants to take on the terminology which most succinctly captures the most salient characteristics of the Chinese system, it is clear that at least part of what determines the forms of the characters deployed is the word-meanings to which they relate.
Thus, for example, the sequence /nan/ can mean 'difficult', 'south' or 'male', and each of these meanings is differently represented in the shape of the character used for /nan/.

In alphabetic writing systems, too, there is often a relationship between what a word means and how it is spelt. In the previous section we looked at some examples of French homophones whose spelling varied in accordance with their meaning. Some further examples - from English - of identically pronounced items with different spellings depending on their different meanings are:

    beat:beet
    grate:great
    sole:soul

As we also saw, this kind of differentiation of homophones by spelling is not universal. For instance, the following pairs of English words are identical in both pronunciation and spelling, even though their meanings are completely unrelated:

    cope (applied to priest's vestment) : cope (= manage)
    pen (applied to enclosure for, e.g., sheep) : pen (applied to writing implement)



However, there are enough orthographically differentiated homophones around in languages such as English and French to demonstrate that in at least some alphabetic systems word meaning can play an important role in the determination of orthographic form.

6.7 Association between particular written signs and particular (categories of) lexical items

As in the case of the relationship between sounds and lexis, one can often point to particular written signs which are linked with particular lexical items or categories of lexical items. An obvious demonstration of this phenomenon is provided by logographic systems, such as those discussed earlier, where specific written characters are associated with specific words - or sets of words. However, there are also instances of particular signs being associated with particular words or types of words in syllabaries and alphabetic systems.

With regard to syllabaries, a rather dramatic example of specific signs being associated with specific sets or types of words comes from Japanese. It was mentioned earlier - in 6.5 - that the Japanese kana syllabary script has two forms. On the one hand, there is the 'normal' form - hiragana - used for Japanese particles, verb-inflections etc., but, on the other hand, there is a version of kana - katakana - which is specifically and exclusively used to represent words borrowed from Western languages such as English. That is to say, in the Japanese system a particular category of words - loanwords from Western languages - has an entire script all of its own dedicated to it.

As far as alphabetic systems are concerned, a case of a particular letter being associated with a particular type of word is that of the letter c in German when it is used outside the clusters ck and ch and outside proper names such as Celle and Claus. When it is used alone in common nouns, verbs, adjectives and adverbs, c is exclusively associated with foreign borrowings. Some examples of borrowed expressions in which c is used are: Comeback, Comics, Cornflakes. As foreign words become increasingly integrated into German, both their written form and their pronunciation are Germanized, c being written as k or z depending on whether it is to be pronounced as /k/ or as /ts/.
Thus what was originally written as Copie is now written as Kopie, and what was originally written as Penicillin is now written as Penizillin. Other examples are: Spectrum → Spektrum, Centrum → Zentrum, Accusativ → Akkusativ. Where c is retained or reverted to in the spelling of such words, this is often a deliberate act on the part of advertisers who thus seek to give a product or an event foreign chic or exotic connotations. One notes, for example, that cigarette advertisements sometimes contain the spelling Cigaretten rather than Zigaretten, and circus posters frequently prefer Circus to Zirkus. Clearly, such an advertising ploy would



not be possible were it not the case that c is associated with a particular type of lexical item - namely foreign words.

A further point worth noting is the way in which grapheme-phoneme correspondences are, at least in some languages, highly dependent on the particular lexical item in which particular letter-combinations occur. For example, mention was made earlier of the combination ough in English. Now, it so happens that there is just one word in English in which this sequence of letters is pronounced as /ɒx/, namely lough. Lough is a word used for 'lake', with particular reference to lakes in Ireland. (It is derived from the Gaelic word loch, which exists in both Irish Gaelic and Scots Gaelic.)



This chapter has concerned itself with evidence of interaction between the lexicon and phonological and orthographic systems. With regard to phonology, it pointed to the rather obvious fact that the choice of a lexical item determines the particular sound-shape, the particular combination of phonological units - phonemes, allophones, stressed and unstressed syllables, and (in languages like Thai) tones - that is deployed. It also looked at evidence in favour of the notions that phonological realizations of lexical items are informed by grammatical and semantic considerations and that individual lexical items or groups of items may have particular sounds associated with them. In relation to orthography, the chapter noted that lexical choice determines orthographic shape no less than it determines phonological shape. The chapter also set out evidence showing that, again as in the case of phonology, on the one hand, orthographic realizations draw on grammatical and semantic information, and, on the other, certain features of a writing system, and/or particular grapheme-phoneme correspondences, are often associated with a specific set or category of lexical items.

Sources and suggestions for further reading

See 6.1. The notion of double articulation referred to is discussed in A. Martinet's articles 'La double articulation linguistique' (Travaux du Cercle Linguistique de Copenhague 5, 1949, 30-7) and 'Arbitraire linguistique et double articulation' (Cahiers Ferdinand de Saussure 15, 1957, 105-16).

See 6.2. The discussion in 6.2, 6.3 and 6.4 owes some of its inspiration to F. Katamba's treatment of the topics in question in his book An introduction to phonology (London: Longman, 1989). The first of the Thai examples in 6.2 was provided by Jennifer Pariseau; the second was taken from V. Fromkin and R. Rodman, An introduction to language (sixth edition, New York: Harcourt Brace, 1998, 241).

See 6.3. The examples in 6.3 of the different ways in which the cockcrow is designated in different languages are borrowed from V. Cook's Inside language (London: Arnold, 1997, 53). The examples from Luganda are to be found on p. 256 of F. Katamba's An introduction to phonology. Lexical Phonology is the brainchild of P. Kiparsky - see, e.g. his articles 'Lexical phonology and morphology' (in I. S. Yang (ed.), Linguistics in the morning calm, Seoul: Hanshin, 1982) and 'Some consequences of lexical phonology' (Phonology Yearbook 2, 83-138). Other treatments of the topic include K. P. Mohanan's book The theory of lexical phonology (Dordrecht: D. Reidel Publishing, 1986) and M. Kenstowicz's chapter 'Lexical Phonology' in his volume Phonology in generative grammar (Oxford: Blackwell).

See 6.4. The discussion of the /ŋ/ phoneme in French broadly follows what I had to say on the matter in my little volume French: some historical background (Dublin: Authentik, 1992, 49f.). The notion of lexical diffusion derives from the work of W. Wang - e.g. W. Wang, 'Competing changes as a cause of residue' (Language 45, 1969, 9-25); M. Chen and W. Wang, 'Sound change: actuation and implementation' (Language 51, 1975, 255-81); it is discussed by, among others, J. Aitchison, in her book Language change: progress or decay? (London: Fontana, 1981, 95), R. Hudson, in his book Sociolinguistics (Cambridge: Cambridge University Press, 1980, 168ff.) and S. Romaine, in her book Socio-historical linguistics: its status and methodology (Cambridge: Cambridge University Press, 1982, 254ff.). The Belfast data are discussed in articles by R. Maclaran ('The variable (ʌ): a relic form with social correlates', Belfast Working Papers in Language and Linguistics 1, 1976, 45-68) and J. Milroy ('Lexical alternation and diffusion in vernacular speech', Belfast Working Papers in Language and Linguistics 3, 1978, 101-14).

See 6.5. Section 6.5 draws liberally on the following three sources: chapters 2-6 of L.-J. Calvet's Histoire de l'écriture (Paris: Plon, 1996); Chapter 6 of V. Cook's Inside language (London: Arnold, 1997); and chapters 1-3 of Georges Jean's Writing: the story of alphabets and scripts (London: Thames & Hudson, 1992). The Sumerian, Chinese and Ancient Egyptian examples cited in the section are all borrowed from these authors. The English, French, Modern Greek and Spanish examples are my own.

See 6.6. The brief mention of the controversy about the nature of the Chinese writing system was inspired by articles contributed by W. C. Brice, M. A. French and E. Pulgram to the collection of papers edited by W. Haas under the title Writing without letters (Manchester: Manchester University Press, 1976) and by J. DeFrancis's article 'How efficient is the Chinese writing system?' (Visible Language 30, 1996, 6-44).

See 6.7. The examples of German words spelt with c are taken from D. Berger, G. Drosdowski and O. Käge's Richtiges und gutes Deutsch (Mannheim: Dudenverlag, 1985, 160). The examples of words exchanging their c for a k or a z are taken from G. Drosdowski, W. Müller, W. Scholze-Stubenrecht and M. Wermke's Rechtschreibung der deutschen Sprache (Mannheim: Dudenverlag, 1991, 29).

Good introductions to phonology - all of which refer in varying degrees to lexical matters - are: V. Cook, Inside language (London: Arnold, 1997, Chapter 4); H. Giegerich, English phonology: an introduction (Cambridge: Cambridge University Press, 1992); F. Katamba, An introduction to phonology (London: Longman, 1989).

More theoretical treatments of phonology are to be found in: P. Carr, Phonology (London: Macmillan, 1993); J. Goldsmith, The handbook of phonological theory (Oxford: Blackwell, 1995); M. Kenstowicz, Phonology in generative grammar (Oxford: Blackwell, 1994); A. Spencer, Phonology: theory and description (Oxford: Blackwell, 1996).

Accessible introductory publications on writing systems and orthography include: L.-J. Calvet, Histoire de l'écriture (Paris: Plon, 1996); V. Cook, Inside language (London: Arnold, 1997, Chapter 6); F. Coulmas, The Blackwell encyclopedia of writing systems (Oxford: Blackwell, 1996); Georges Jean, Writing: the story of alphabets and scripts (London: Thames & Hudson, 1992); J. L. Swerdlow, 'The power of writing' (National Geographic 1962, 1999, 110-32).

The reader looking for more in-depth discussion of orthographic and related issues may wish to consult one or more of the following: E. Carney, A survey of English spelling (London: Routledge, 1994); P. T. Daniels and W. Bright (eds), The world's writing systems (Oxford: Blackwell, 1996); G. Sampson, Writing systems: a linguistic introduction (London: Hutchinson, 1985); M. Stubbs, Language and literacy (London: Routledge & Kegan Paul, 1980).

Focusing questions/topics for discussion

1. In 6.2 and 6.3 we saw some examples of different stress distributions in relation to similar sound-sequences being associated with different grammatical categories - e.g. reject [VERB] vs. reject [NOUN]. Try to find further pairs of English words which are differentiated in this way.

2. In 6.4 we looked at the association between particular sounds and particular (categories of) lexical items. Consider the nasalized vowel sound /ã/ as it occurs, for example, in the final syllable of many English-speakers' pronunciation of the word restaurant. In what other English words does this sound occur? What kinds of words are these?

3. In 6.5 the notion of allograph was briefly discussed. It was noted that a particular grapheme - the first letter of the Roman alphabet - has the allographs A, a and ɑ. Taking any writing system(s) with which you are familiar, try to find some further examples of 'families' of allographs.

4. In 6.5 and 6.6 it was shown that orthography is sometimes used to differentiate between homophones (e.g. meat:meet). Illustrate this phenomenon further from any language(s) you know.

5. In 6.7 we observed that some written signs are associated with particular words or categories of words. Try to think of some further instances of this in any language(s) with which you are familiar. Note that sometimes the specific association has to do with where the particular sign occurs as well as with the nature of the sign itself. For example, the letter x rarely occurs at the beginning of words in English, and almost all the words which feature x in this position are of Greek origin.


7 Lexis and language variation

7.1 Variety is the spice of language

So far we have been looking at the lexical aspects of language largely as if the same range of forms and functions of any given language were deployed in all circumstances of language use. A moment's reflection, however, will bring us to the conclusion that this is a simplification and that, in fact, languages are characterized by high degrees of variation. Regional accents immediately spring to mind in this connection, as do the different words that people from different regions use for the same object.

Similar kinds of variation occur across the social and ethnic spectrum, as well as between the genders. With regard to gender, for instance, there are languages in which males and females pronounce the same words differently; thus, in Gros Ventre, a Native American language of the north-eastern United States, words which male speakers pronounce with a /dj/ sound (the sound in the middle of Indian) are pronounced by women with a /kj/ sound (the sound in the middle of Slovakian) - so that the word for 'bread' in this language is either djatsa or kjatsa, depending on whether the speaker is male or female. In other languages certain pronouns and particles are gender-specific; thus, in Japanese, female speakers indicate their gender by using the particles ne or wa at the end of sentences they produce.

Also, it is clear that we vary our use of language from situation to situation - so that, for example, the way we talk or write to a prospective employer is likely to differ significantly from the way in which we talk or write to close personal friends. For example, over a cup of coffee with a friend we might explain that we are too busy to go somewhere or do something using a form of words such as:

    Can't. I'm up to my eyes.

In a formal interaction with an employer or a prospective employer we might be more likely to express the same thought rather differently:

    I am unfortunately unable to make myself available at that particular time because of pressure of work.
The study of language variation falls within the ambit of that branch of linguistics known as sociolinguistics. In this chapter we shall begin with some discussion around a few of the basic concepts and terms developed by



sociolinguists in connection with their study of language variation. We shall then home in on the extremely important lexical dimension of language variation in its different manifestations. Finally, we shall take a brief look at the question of the relationship between lexical variation and cultural variation and the impact, if any, of lexical variation on perceptions and thought patterns.

7.2 Language variation: sociolinguistic perspectives

With all this variation in evidence, a legitimate question to ask is: when are the differences so great as to indicate that we are dealing with distinct languages rather than two or more versions of the same language? Compare, for example, the expressions in the first two columns below:

    auf dem Fahrrad        op de Fiets         'on the bicycle'
    es zieht               et drekkt           'there's a draught'
    wir trinken Schnaps    we drenken Sopis    'we're drinking schnaps'

Can these two sets of expressions possibly be from the same language? In fact the expressions from the first column are from Standard High German (Hochdeutsch), whereas the expressions in the second column are from what is generally regarded as a 'dialect' of German spoken in a part of Germany which lies very close to the Dutch border. The 'dialect' in question is in fact a variety of Plattdeutsch or Low German. Now consider three more sets of expressions:

    et bord til to     et bord til to      ett bord för två    'a table for two'
    hva koster det?    hvad koster det?    vad kostar det?     'what does it cost?'
    jeg er tørst       jeg er tørstig      jag är törstig      'I'm thirsty'

Are these perhaps also from different dialects of the same language? In fact, the first column contains expressions from Norwegian, the second column contains expressions from Danish, and the third column contains expressions from Swedish. In other words, in this case we are talking about sets of expressions from what are regarded as three separate languages. What is interesting is that, whereas a Norwegian, a Dane and a Swede can, each using his/her own language, to a very large extent converse with each other intelligibly, a speaker of Standard German with no knowledge of Plattdeutsch would have great difficulty understanding the German 'dialect' from which the earlier examples are taken, including the examples themselves. Actually, a speaker of Dutch would fare better in this regard. This demonstrates clearly that whether we call something a dialect or a language is really more a matter of politics than of linguistics. If the part of Germany where the above-exemplified type of Plattdeutsch is spoken had happened to



be situated in the Netherlands, the linguistic variety in question would have been designated as a 'dialect' of Dutch. As it turns out, much the same variety is spoken on the Dutch side of the border, where it is indeed regarded as a 'dialect' of Dutch.

The way in which sociolinguists deal with this problem terminologically is to apply the neutral term variety to any set of linguistic items and patterns which coheres into a means for communication - not only in the context of geographic variation but also in the context of social, ethnic, gender-related or context-related variation. For example, one ethnic variety which has been much studied by sociolinguists is Black English, otherwise known as Black Vernacular English or Afro-American Vernacular English. This variety - used, as its various labels indicate, by many Black people in the United States - diverges very noticeably from varieties used by Whites. It has a fairly well-defined set of characteristics, one of which is a tendency for word-final plosive consonants to be voiceless; thus, the Black English consonants corresponding to Standard English /g/ in big, /b/ in cub and /d/ in kid may occur as /k/, /p/ and /t/.

A feature such as a word-final plosive whose differing realization can, as in the case of Black English, contribute towards the identification of a particular variety is labelled a variable in sociolinguistics. Another example of a variable is that of /h/ in British English. Most British English-speakers would consider the dropping of /h/s at the beginnings of words to be a 'working-class' phenomenon. As it turns out, /h/-dropping is not the exclusive preserve of any particular social stratum in Great Britain, although the degree to which it occurs is correlated with class, as is evident from the following figures from a study conducted in the 1970s showing the percentages of word-initial /h/s dropped by samples from different social groupings in Bradford and Norwich:

[Table: percentage of word-initial /h/ dropped in Bradford and in Norwich, by social class: middle middle-class, lower middle-class, upper working-class, middle working-class, lower working-class.]

What this set of figures illustrates is that linguistic varieties are not necessarily characterized by the absolute presence or the absolute absence of a specific realization of a variable - in this case the suppliance or the dropping of the word-initial h sound. Thus, while it seems to be the case that, in both Bradford and Norwich, middle-class speakers pronounce word-initial h more often than they fail to pronounce it, there are nevertheless occasions when they drop it. Varieties, in other words, are often characterized by tendencies or probabilities in terms of the presence or absence of particular



variants of variables rather than by categorical attributes. Another dimension of the way in which variables relate to varieties which is illustrated by the above figures is the fact that different varieties are not necessarily discrete, self-contained systems neatly divided off from each other, but may form a continuum and blur into each other.

A continuum of variation is precisely what one usually finds in social context-related variation, which is sometimes referred to as style-shifting. In all languages people adjust their language style according to the situation in which communication is taking place and according to the relationship that exists between the participants in the interaction. For example, consider the expression going to in English, as in:

    I'm going to leave tomorrow.

This expression can undergo a range of reductions - indicated below - and its most reduced forms are more frequently used in informal styles of speech and less frequently used in formal styles:

    /gəʊɪŋ tuː/    ('going to')
    /gəʊɪŋ tə/     ('going tuh')
    /gəʊɪn tə/     ('goin' tuh')
    /gən tə/       ('guhn tuh')
    /gənə/         ('guhnuh' - the form usually written as 'gonna')

That is not to say that 'one or the other' situations are unknown in language variation. Where two or more varieties have attained the status of standard regional, national or international languages and their patterns have been fixed and prescribed for by grammar books, dictionaries, and language academies, the differences between them are more categorical. For example, in Standard Swedish the form två will always be used for 'two', whereas in Standard Danish the form used will be to. Similarly with the pairs of forms below representing Castilian (Standard Spanish) and Catalan respectively:

    CASTILIAN    CATALAN
    podemos      podem      'we can'

Two important points remain to be made before we move on to examine the lexical dimension of language variation in more detail. The first is that



the various factors in language variation do not operate in isolation from each other but on the contrary constantly interact. For example, there is an interplay between geographical variation and social variation, with non-standard regional accents and expressions being more frequently found in working-class language use than in the language use of the middle classes. There is also interaction, to take another example, between geographical variation and social context-related variation; thus, an individual with a knowledge of both a national standard variety (such as Standard German) and a non-standard regional variety (such as a variety of Plattdeutsch) will tend to use the non-standard variety in situations of intimacy and informality - in the home and with friends and acquaintances from his/her locality - and will use the standard variety in more formal circumstances with strangers and in the world of officialdom.

The second point is that the particular variety or varieties that we use are not deterministically imposed on us but rather reflect the models we ourselves adopt and the attachments and affiliations we enter into; I was born and raised in a working-class home in Dorset, but - because at some stage in my childhood I began to identify with the norms, including the linguistic norms, of my Standard English-speaking educators - I no longer (alas!) speak with a Dorset accent. Sometimes a particular affiliation can take speakers in a less rather than a more standard direction. For instance, a much-cited study conducted in Martha's Vineyard (an island off the coast of Massachusetts) some years ago revealed that the use of a particular non-standard vowel sound was increasing among the islanders - apparently reflecting a heightened sense of local solidarity and a negative reaction to the values and behaviour of the large numbers of mainlanders who holidayed on the island in the summer.


7.3 Lexical aspects of geographical variation

A very frequently cited illustration of lexical variation related to geography is the case of lexical divergence between American and British English, for example:

    AMERICAN ENGLISH     BRITISH ENGLISH
    trunk (of a car)     boot
Also well-known are pronunciation and spelling differences between British English and American English such as:

    AMERICAN ENGLISH            BRITISH ENGLISH
    harass /həˈræs/             harass /ˈhærəs/
    laboratory /ˈlæbrətɔri/     laboratory /ləˈbɒrətri/
    leisure /ˈliːʒər/           leisure /ˈleʒə/
    magazine /ˈmægəzin/         magazine /mægəˈziːn/
    missile /ˈmɪsəl/            missile /ˈmɪsaɪl/

In a number of cases where British and American English have what look like identical words, there are differences in morphological behaviour. For example, the verb to dive, which in British English has dived as its preterite (simple past) form, in at least some varieties of American English has dove as its preterite. Other cases where - at least to judge by the usage of many current American popular writers - British and American preterites diverge include: to fit - British fitted, American fit; to sneak - British sneaked, American snuck; to strive - British strove, American strived. There is also the case of the past participle of the verb to get, which in British English is got and in American English gotten.

Probably more problematic in communicative terms are instances of 'false friends' - words which seem to be identical but which have different meanings. The case of bum is probably too well known to cause misunderstandings; in American English it means 'tramp', whereas in colloquial British English it denotes 'buttocks' (= colloquial American English buns). The metaphorical use of the expression pissed, on the other hand, might just be a source of difficulty. In colloquial British English I'm really pissed means 'I'm really drunk'; in American English, however, it means 'I'm really annoyed', which British English speakers express by adding off: I'm really pissed off. A British English speaker buying a small item - such as a book or a card - in downtown Indianapolis may also be taken aback (as I was!) to be asked 'Do you want a sack for that?'; the word sack, which in American English can be applied to bags of any description, is in British English applied only to very large bags - such as those used for coal or fertilizer.



Much the same kind of situation as one finds in relation to lexical differences between the English of Great Britain and the English of North America applies to the Castilian Spanish of Spain and the Spanish of Latin America. Thus, for example, in America the Spanish for 'bean' is frijol, whereas the Castilian word is alubia or judía; in America the Spanish for 'bus' is bus, whereas the Castilian version is autobús; in America the words used when answering the telephone are aló, hola or bueno, whereas in Castilian the expressions used are dígame or diga. There are a number of 'false friends' in this connection too. For instance, the word carro, which in Castilian means 'cart' or 'wagon', also means 'car' in America (= Castilian coche); the word estampilla, which in Castilian means 'rubber stamp', is also used in Latin America for 'postage stamp' (= Castilian sello); and the word coger, which in Castilian has the innocent enough meaning 'to take hold of', in Latin America is a slang word for 'to have sex with' (= Castilian joder).

Not that the interposition of a large ocean is a necessary prerequisite for lexical divergences. Such divergences are also found from country to country within Europe. For example, the number system in French operates differently in Francophone parts of Belgium and Switzerland from the way it operates in France. In France the words used for 'seventy', 'eighty' and 'ninety' are, respectively, soixante-dix (literally 'sixty-ten'), quatre-vingts (literally 'four-twenties') and quatre-vingt-dix (literally 'four-twenty-ten'). In Belgium and Switzerland, on the other hand, the words used for 'seventy' and 'ninety' are, respectively, septante and nonante; also, in Switzerland the word huitante is frequently used for 'eighty'. There are also differences between the English of Ireland (sometimes called Hiberno-English) and British English.
For instance, most British English speakers would have difficulties with the Irish English expressions: boreen ('narrow track'), garsoon ('boy'), gurrier ('ruffian'), locked (in the sense of 'drunk'), and yoke (in the sense of 'thing'). Even national frontiers are of only limited value as guides to lexical divergence. That is to say, particular lexical forms or usages do not necessarily stop at frontiers - as we saw in the earlier discussion of the Plattdeutsch examples - and lexical differences are to be observed within as well as between varieties spoken in any given country. Thus, for example, although the statement in the last paragraph about the use of soixante-dix for 'seventy' in France - as opposed to septante in Belgium and Switzerland - is generally true, in fact, septante is also used by some speakers in eastern France. A further case of lexical variation within a country is that of the German words for 'Saturday'; in northern Germany the word used is typically Sonnabend, whereas in southern Germany the word Samstag tends to be used.

7.4 Lexical aspects of social variation

Whereas relating the way in which people speak and write to the country or region they come from is relatively uncontroversial, making the same kinds



of connections between language varieties and social background is a somewhat more sensitive matter, since the description of particular variants of linguistic variables as being associated with a particular social class is liable to be interpreted as feeding into snobbery, elitism and/or anti-democratic political philosophies. Indeed, one early attempt to analyse lexical usage in social terms was immediately put at the service of elitist attitudes. This was the work of the English linguist, A.S.C. Ross, which set out - in a rather impressionistic manner - to isolate markers of upper-class ('U') and non-upper-class ('non-U') language use in respect of pronunciation, grammar and most especially vocabulary. Ross's dictates were seized upon and added to by linguistic snobs all over the English-speaking world and led to the establishment of a veritable glossary of 'U' and 'non-U' terms. For example, in the U/non-U scheme of things, the words on the left below are supposedly 'U', and the words on the right their 'non-U' equivalents:

    'U'              'non-U'
    looking glass    mirror
    lunch(eon)       dinner

One rather amusing point in this connection is that the so-called 'upper-class' variants in many cases precisely coincide with the variants used in working-class circles. For example, in my own working-class home in the 1950s we listened to the wireless rather than the radio, looked forward to pudding not sweet, rode bikes not cycles, and occasionally presented my mother with bottles of scent not perfume. Ogden Nash's suitably sceptical comment on the whole U/non-U discussion was that the Wicked Queen in the Snow-White story, by uttering the words 'Mirror mirror on the wall', 'exposed herself as not only wicked but definitely non-U'.

Other early attempts to examine the relationship between language - including lexis - and social class were rather more scientific. As far back as the late 1930s the American linguist Charles Fries compared a number of aspects of the language used in letters on similar topics sent to the same destination (an administrative department of the armed forces) by lower working-class and professional correspondents. Among the lexical differences that emerged from Fries's work were the following: the professional subjects in the study tended to intensify the force of adjectives using forms ending in -ly (as in awfully difficult), whereas
the more common intensifiers used by the working-class subjects were items like awful, mighty, pretty, real, right; the professional subjects used a single form you, whether the reference was singular or plural, whereas the working-class subjects often used forms such as youse, you all, you people to indicate plurality; the working-class subjects often used double prepositions such as off from, whereas the professional subjects tended not to use such forms.

Another early study, this time dating from the 1950s, found that, on being interviewed about a tornado in Arkansas, working-class speakers, unlike middle-class speakers, used we, they and persons' names without further explanation for the benefit of the interviewer, and that they used expressions like and stuff like that instead of going into detail.

A more recent account of language and class is that of the British sociologist Basil Bernstein. Bernstein talks about two 'codes' to which, he claims, lower working-class and middle-class speakers have differential access. The two 'codes' in question are, on the one hand, restricted code (originally labelled public language) and, on the other, elaborated code (originally labelled formal language). Restricted code, according to Bernstein, is the code of intimacy, the code we all use when with people and in circumstances where we can communicate a great deal without saying very much because there is so much shared information and there are so many shared expectations in the situations in question. Elaborated code, for its part, says Bernstein, is the code we use when we need to be explicit in our speech and writing because the person(s) to whom we are addressing ourselves is/are not familiar with the people, places, ideas etc. we are referring to, which means that we need to contextualize everything we are producing in order to be understood.
Bernstein contends that, whereas all users of a language have access to restricted code, lower working-class speakers have little experience of elaborated code and so are likely to be disadvantaged in situations, such as school, where the use of elaborated code is required. The linguistic characteristics of restricted and elaborated code, as described by Bernstein, include the following:

RESTRICTED: short, often unfinished or fragmentary sentences
ELABORATED: well-ordered complete sentences with syntactic norms observed

RESTRICTED: simple and repetitive use of conjunctions (e.g. because, so) and very limited use of subordinate clauses
ELABORATED: use of a wide range of conjunctions and subordinate clauses

RESTRICTED: rigid and limited use of adjectives and adverbs
ELABORATED: appropriate use of a wide range of adjectives and adverbs



With regard to the lexicon, what all of the above amounts to is a claim that lower working-class language users produce fewer conjunctions, adjectives and adverbs than middle-class language users, and in fact, a number of studies appear to show that this is indeed the case. On the other hand, Bernstein's claims and his interpretation of the relevant evidence have been called into question by some linguists on the basis that the quantitative findings he cites do not necessarily indicate two qualitatively different orientations, and that, in any case, a narrower vocabulary in some grammatical categories may perhaps be compensated for by a wider vocabulary in other categories hitherto uninvestigated.

A final point on the question of lexis and social class concerns 'bad' language or 'vulgar' language. It seems to be quite widely assumed that such language is mostly to be found on the lips of people at the lower end of the social scale. Indeed, the very word vulgar comes from a Latin word, vulgus, which means 'the common people', and there has been a longstanding tendency to associate the use of choice language with stigmatized social categories. However, oaths, curses, profanities and obscenities have also been a royal and an aristocratic prerogative. Queen Elizabeth I, for instance, was famous for her foul mouth, and the traditionally choice language of the nobility is reflected in the expression to swear like a lord. In the modern age, at least in the West, there seems to have been an increase in the use and acceptability of words which would once have been regarded as offensive (see Chapter 8) and this phenomenon has apparently affected the entire social range. Serious research into the social distribution of 'swear-words' remains to be done, but it is likely that the extent of the use of such items will depend on factors rather more complex than simply adherence to a particular social class.
For example, among the working-class population of Great Britain there are sizeable numbers of practising Christians, Hindus, Muslims and Sikhs for whom the use of explicitly sexual words or irreverent references to sacred matters would be unthinkable.


Lexical aspects of ethnic variation

We turn now to the issue of the relationship between language variation and ethnicity. Ethnicity is that aspect of culture which signifies 'belongingness' to a community in terms other than socio-economic terms; it has recently been defined as 'the identificational dimension of culture'. Racial factors may or may not be present among the criteria by which an ethnic group defines itself and/or is defined by other groups. For example, the small Vietnamese community in Dublin has characteristics of both a cultural and a racial kind which distinguish it from the majority of the population, whereas most Scots residing in the same city would not be identifiable in racial terms but would nevertheless see themselves as culturally distinct from the Irish people among whom they live.


Obviously, one component of a culture which very often plays an important role in identifying an ethnic group is language. For many members of particular communities there is an absolutely vital connection between their language and their ethnicity; thus, for instance, one of the slogans frequently heard in the context of the revival of the Irish language is 'Gan teanga, gan tír' - 'Without a language, without a country' - and among Jews it has been claimed that Hebrew 'emerges from the same fiery furnace from which the soul of the people emerges'.

In some countries and regions there is a high degree of separation of ethnic groupings defined largely in linguistic terms. For example, in Belgium the longstanding linguistico-cultural conflict between the Dutch-speaking Flemings and the French-speaking Walloons has resolved itself into a division of the country - with the exception of the bilingual territory of Brussels - into two large unilingual regions, Dutch-speaking Flanders to the north, and French-speaking Wallonia to the south. There is in addition an officially recognized small German-speaking area in eastern Belgium (Eupen, St Vith). In other situations, members of different ethnic groupings are living and working side by side, communicating with each other via the standard language of the country and largely reserving their use of ethnic varieties distinct from that standard language for use with family and friends of the same ethnic background. This would be true, for example, of the community of Turkish immigrants in the Netherlands. In still other situations the varieties spoken by particular ethnic groups may have strong resemblances to and connections with the varieties of other ethnic groups, including the standard variety of the country or region in question. Examples of this kind of scenario would include the cases of speakers of American English, Australian English, Hiberno-English, West Indian English etc. living in Great Britain.
Further to this last point, a particularly interesting study of patterns of language use among West Indians in Great Britain was conducted in the 1980s by the British linguist Viv Edwards. According to her account, the variety - or patois - used (especially in informal and intimate contexts) by the Jamaican community is very closely related to Standard English but has a large number of specific features, including lexical features, which set it apart from the latter. Some of the lexical differences between Jamaican Patois and Standard English reported by Edwards are detailed below.

PLURAL MARKING OF NOUNS

JAMAICAN PATOIS: mostly zero marking, e.g. He give me two book.
STANDARD ENGLISH: mostly -s, e.g. He gave me two books.

JAMAICAN PATOIS: -dem, where the context does not suffice to indicate plurality, e.g. Clovis gone up a Elaine fi you record-dem.
STANDARD ENGLISH: Clovis went up Elaine's for your records.






I, me, my




he, him, his, she, her


it, its


we, us, our




they, them, their

INFINITIVES OF VERBS

JAMAICAN PATOIS: fi + base form of verb, e.g. Dem want me fi go up dere go tell dem.
STANDARD ENGLISH: to + base form of verb, e.g. They want me to go up there and tell them.

EXPRESSION OF LOCATION

JAMAICAN PATOIS: deh + expression of place, e.g. When me deh at school, di whole a dem hate me.
STANDARD ENGLISH: to be + expression of place, e.g. When I was at school, they all hated me.

Finally in this connection, it may be worth mentioning that some ethnic groups mark their identity by conversational code-switching - switching to and fro - apparently quite arbitrarily - between the languages at their disposal. For example, some groups within the Hispanic (or Latino) communities in the United States - whether from parts of the country which once belonged to Mexico or immigrants from Cuba, El Salvador, Guatemala, Mexico or Puerto Rico - signal their ethnicity in all its biculturality by inserting English expressions into their discourse when speaking Spanish and inserting Spanish expressions into their discourse when speaking English. In some instances there actually seems to be a convention to the effect that roughly equal amounts of both languages should be used in any given conversation. One oft-cited study of code-switching among Puerto Ricans in New York demonstrates this kind of balanced approach in its very title, quoted from one of the members of the community in question: 'Sometimes I'll start a sentence in English y termino en español' ('Sometimes I'll start a sentence in English and finish it in Spanish.').


Lexical aspects of gender-related variation

Gender-related variation, as we saw from the Gros Ventre and Japanese examples mentioned at the beginning of the chapter, may in some cases be
very clear and noticeable by all. In other cases the differences between male and female speech are more subtle and users of a given language may or may not be conscious of them. With regard to English and many other European languages, for instance, the differences are often said to reside in the tendency of female language use to be closer to the 'prestige variety' than male language use. Thus, for example, as far as accent is concerned, it has been observed that female British English speakers are more likely than their male counterparts to produce pronunciations which resemble those of radio and television announcers. An explanation commonly offered for this kind of difference is that women have traditionally been expected to be more 'correct' and conforming in their behaviour than men and that this expectation and its consequences carry over into the linguistic sphere. With regard to lexis, a test case for 'good behaviour' among women as far as language is concerned is that of 'swear-words'. It is certainly true to say that there is - or at least until recently was - a certain reluctance on the part of many men to utter such words in the presence of women. The expression not in mixed company, which really means 'not in front of the women', was frequently used as an interdiction in respect of jokes and anecdotes which contained sexually explicit references and/or 'four-letter words'. One presumes from this kind of approach on the part of some men that women have traditionally heard less 'bad language' than men, but what about their production of such language? Queen Elizabeth I was mentioned in the last section as a user of choice language. One interesting comment about her in the present context depicts her as having 'sworn like a man'. 
This implies that in Renaissance England - at least in well-to-do circles - swearing was associated more with men than with women, but it also implies, of course, that individual women (including the Supreme Governor of the Church of England) refused to be bound by this particular convention. In seventeenth-century England the association between maleness and swearing was still, apparently, very much in place if the following quotation is anything to go by.

The Grace of Swearing has not obtain'd to be a Mode yet among the Women; God damn ye, does not sit well upon a Female Tongue; it seems to be a Masculine Vice, which the Women are not arrived to yet. . .
Defoe, An essay on projects, 1697

To bring the discussion a little closer to our own times, in an influential study published in 1975 under the title of Language and women's place, the American linguist Robin Lakoff claims that 'If a little girl "talks rough" like a boy, she will be ostracized, scolded or made fun of' (p. 5). Lakoff provides the following example (p. 10):

(a) Oh dear, you've put the peanut butter in the refrigerator again.
(b) Shit, you've put the peanut butter in the refrigerator again.

It is safe to predict that people would classify the first sentence as 'women's language', the second as 'men's language'.



Actually, 25 years on, the above prediction would not be at all safe. In Great Britain and Ireland at any rate many women now say shit no less readily than they drink pints. Whether this means that women have entirely caught up with men in the 'four-letter word' stakes is not clear, but there is little doubt that - to say the very least of the matter - the gap is closing.

Lakoff also claims that some other words are more frequently used by women than by men. Thus, for example, she maintains that certain colour words such as aquamarine, chartreuse, lavender and magenta are more likely to be produced by women than by men, and that much the same applies to adjectives such as adorable, divine and precious. Among the many aspects of British upper middle-class behaviour parodied in the television series Absolutely Fabulous is the vocabulary used by women of that background - darling, gorgeous, sweetie etc. Vivian Cook found in an informal survey conducted in association with his book Inside language that 90% of his 48 respondents identified Absolutely gorgeous and It's nice, isn't it? as coming from female speakers.

Just how far lexical divergences genuinely differentiate between speakers of different genders in a language like English is, as can be seen from the above discussion, a matter of some debate - whatever may be the situation in languages like Gros Ventre and Japanese. It is worth saying, however, that in the major European languages, including English, and presumably in all languages there are certain words which, when used literally and self-referentially, will very clearly designate the speaker or writer as male or female. The particular items will vary from language to language but their denotation will typically have to do with biological attributes and/or with roles or positions assigned to one gender or the other in a given society. Here are some examples from English:

MALE-IDENTIFYING          FEMALE-IDENTIFYING
I'm extremely virile.     I'm extremely pregnant.
I'm a monk.               I'm a nun.
I'm a widower.            I'm a widow.

Moreover, in languages with grammatical gender the particular morphological shape of certain words will have much the same effect, as the following examples from French demonstrate:

MALE-IDENTIFYING          FEMALE-IDENTIFYING
Je suis étudiant.         Je suis étudiante.
'I'm a student'           'I'm a student'
Je suis heureux.          Je suis heureuse.
'I'm happy'               'I'm happy'
Cela m'a surpris.         Cela m'a surprise.
'That surprised me'       'That surprised me'




Lexical aspects of context-related variation

As has already been indicated, and as a moment's reflection on our own use of language will confirm, language varies not only in accordance with speakers'/writers' geographical, social, ethnic and gender profiles but also in accordance with the context in which the speaking or writing takes place. The examples given earlier were of people using a very different speech style with their friends from that used with employers or prospective employers, and of people who speak Plattdeutsch at home and with friends switching to Standard German when in the presence of strangers or bureaucrats. This second example illustrates a phenomenon which the American linguist Charles Ferguson called diglossia - in an article bearing that name published in 1959. In the cases described by Ferguson diglossia refers to situations where two related but very different varieties are in use within a given community, one of which - labelled High (H) - is used for formal, high-status functions, and the other - labelled Low (L) - is used in more intimate, informal circumstances. The cases in question are Classical Arabic and Egyptian Arabic in Egypt, Standard German and Swiss German in Switzerland, French and Haitian Creole in Haiti, and Katharevousa and Demotic Greek in Greece. A word or two of explanation about each of these cases follows.

• Classical Arabic is the language of the Koran; in its modern form it is nowadays more usually called Modern Standard Arabic (MSA). MSA is used as a means of communication throughout the Arab world, but each Arabic-speaking country and region has its own local variety of Demotic Arabic, these different varieties being unintelligible to speakers of other local varieties.

• The case of Standard German (Hochdeutsch) and Swiss German (Schweizerdeutsch, Schwyzertüütsch) in Switzerland is comparable to the case of Hochdeutsch and Plattdeutsch in northern Germany. That is to say, Swiss German is very different from Standard German - to the point of being largely unintelligible to Standard German speakers who have not learnt Swiss German.

• The official language of Haiti is French, the language of the colonists who populated it with African slaves and ruled it until 1804. However, the native variety of most of its population is Haitian Creole. A Creole develops when a simplified system of communication between two groups speaking mutually unintelligible languages (a pidgin) is adopted as a mother tongue (by, for example, children born of sexual relationships between members of the two groups). Haitian Creole, like most Creoles, took most of its vocabulary from the language of the economically dominant group, i.e. French in this instance, but has some grammatical elements derived from the languages - in this case African languages - of the economically subordinate group.



• Katharevousa is a supposedly 'pure' (= Greek καθαρός - katharos) form of Modern Greek which is much closer to Ancient Greek than is Demotic Greek (=