Research and Practice in Applied Linguistics

General Editors: Christopher N. Candlin and David R. Hall, Linguistics Department, Macquarie University, Australia.

All books in this series are written by leading researchers and teachers in Applied Linguistics, with broad international experience. They are designed for the MA or PhD student in Applied Linguistics, TESOL or similar subject areas and for the language professional keen to extend their research experience.

Titles include:
Dick Allwright and Judith Hanks
THE DEVELOPING LANGUAGE LEARNER
An Introduction to Exploratory Practice
Francesca Bargiela-Chiappini, Catherine Nickerson and Brigitte Planken
BUSINESS DISCOURSE
Alison Ferguson and Elizabeth Armstrong
RESEARCHING COMMUNICATION DISORDERS
Sandra Beatriz Hale
COMMUNITY INTERPRETING
Geoff Hall
LITERATURE IN LANGUAGE EDUCATION
Richard Kiely and Pauline Rea-Dickins
PROGRAM EVALUATION IN LANGUAGE EDUCATION
Marie-Noëlle Lamy and Regine Hampel
ONLINE COMMUNICATION IN LANGUAGE LEARNING AND TEACHING
Virginia Samuda and Martin Bygate
TASKS IN SECOND LANGUAGE LEARNING
Norbert Schmitt
RESEARCHING VOCABULARY
A Vocabulary Research Manual
Helen Spencer-Oatey and Peter Franklin
INTERCULTURAL INTERACTION
A Multidisciplinary Approach to Intercultural Communication
Cyril J. Weir
LANGUAGE TESTING AND VALIDATION
Tony Wright
CLASSROOM MANAGEMENT IN LANGUAGE EDUCATION

Forthcoming titles:
Anne Burns and Helen da Silva Joyce
LITERACY
Lynn Flowerdew
CORPORA AND LANGUAGE EDUCATION
9781403_985354_01_prexviii.indd i
6/11/2010 1:15:09 PM
Sandra Gollin and David R. Hall
LANGUAGE FOR SPECIFIC PURPOSES
Numa Markee and Susan Gonzo
MANAGING INNOVATION IN LANGUAGE TEACHING
Marilyn Martin-Jones
BILINGUALISM
Martha Pennington
PRONUNCIATION
Annamaria Pinter
TEACHING ENGLISH TO YOUNG LEARNERS
Devon Woods and Emese Bukor
INSTRUCTIONAL STRATEGIES AND PROCESSES IN LANGUAGE EDUCATION

Research and Practice in Applied Linguistics
Series Standing Order ISBN 978–1–4039–1184–1 hardcover
Series Standing Order ISBN 978–1–4039–1185–8 paperback
(outside North America only)

You can receive future titles in this series as they are published by placing a standing order. Please contact your bookseller or, in case of difficulty, write to us at the address below with your name and address, the title of the series and one of the ISBNs quoted above.

Customer Services Department, Macmillan Distribution Ltd, Houndmills, Basingstoke, Hampshire RG21 6XS, England

Also by Norbert Schmitt
WHY IS ENGLISH LIKE THAT? (with R. Marsden, 2006)
FOCUS ON VOCABULARY (with D. Schmitt, 2005)
FORMULAIC SEQUENCES: ACQUISITION, PROCESSING, AND USE (editor, 2004)
AN INTRODUCTION TO APPLIED LINGUISTICS 2nd edition (editor, 2010)
VOCABULARY IN LANGUAGE TEACHING (2000)
VOCABULARY: DESCRIPTION, ACQUISITION, AND PEDAGOGY (co-editor with M. McCarthy, 1997)
Researching Vocabulary A Vocabulary Research Manual
Norbert Schmitt University of Nottingham, UK
© Norbert Schmitt 2010 All rights reserved. No reproduction, copy or transmission of this publication may be made without written permission. No portion of this publication may be reproduced, copied or transmitted save with written permission or in accordance with the provisions of the Copyright, Designs and Patents Act 1988, or under the terms of any licence permitting limited copying issued by the Copyright Licensing Agency, Saffron House, 6-10 Kirby Street, London EC1N 8TS. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages. The author has asserted his right to be identified as the author of this work in accordance with the Copyright, Designs and Patents Act 1988. First published 2010 by PALGRAVE MACMILLAN Palgrave Macmillan in the UK is an imprint of Macmillan Publishers Limited, registered in England, company number 785998, of Houndmills, Basingstoke, Hampshire RG21 6XS. Palgrave Macmillan in the US is a division of St Martin’s Press LLC, 175 Fifth Avenue, New York, NY 10010. Palgrave Macmillan is the global academic imprint of the above companies and has companies and representatives throughout the world. Palgrave® and Macmillan® are registered trademarks in the United States, the United Kingdom, Europe and other countries. ISBN: 978–1–4039–8535–4 hardback ISBN: 978–1–4039–8536–1 paperback This book is printed on paper suitable for recycling and made from fully managed and sustained forest sources. Logging, pulping and manufacturing processes are expected to conform to the environmental regulations of the country of origin. A catalogue record for this book is available from the British Library. Library of Congress Cataloging-in-Publication Data Schmitt, Norbert. Researching vocabulary : a vocabulary research manual / Norbert Schmitt. p. cm.—(Research and practice in applied linguistics) Includes bibliographical references and index. ISBN 978–1–4039–8536–1 (pbk. : alk. 
paper) – ISBN 978–1–4039–8535–4 (alk. paper) 1. Language and languages--Study and teaching. 2. Vocabulary--Study and teaching. 3. Second language acquisition. I. Title. P53.9.S365 2010 418.007′2--dc22 2009046796
10 9 8 7 6 5 4 3 2 1 19 18 17 16 15 14 13 12 11 10 Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne
Improve the World Start with Knowledge
Contents

Quick Checklist
General Editors’ Preface
Preface
Acknowledgements

Part 1 Overview of Vocabulary Issues

1 Vocabulary Use and Acquisition
1.1 Ten key issues
1.1.1 Vocabulary is an important component of language use
1.1.2 A large vocabulary is required for language use
1.1.3 Formulaic language is as important as individual words
1.1.4 Corpus analysis is an important research tool
1.1.5 Vocabulary knowledge is a rich and complex construct
1.1.6 Vocabulary learning is incremental in nature
1.1.7 Vocabulary attrition and long-term retention
1.1.8 Vocabulary form is important
1.1.9 Recognizing the importance of the L1 in vocabulary studies
1.1.10 Engagement is a critical factor in vocabulary acquisition
1.2 Vocabulary and reading
1.3 A sample of prominent knowledge gaps in the field of vocabulary studies

Part 2 Foundations of Vocabulary Research

2 Issues of Vocabulary Acquisition and Use
2.1 Form-meaning relationships
2.1.1 Single orthographic words and multi-word items
2.1.2 Formal similarity
2.1.3 Synonymy and homonymy
2.1.4 Learning new form and meaning versus ‘relabelling’
2.2 Meaning
2.2.1 Imageability and concreteness
2.2.2 Literal and idiomatic meaning
2.2.3 Multiple meaning senses
2.2.4 Content versus function words
2.3 Intrinsic difficulty
2.4 Network connections (associations)
2.5 Frequency
2.5.1 The importance of frequency in lexical studies
2.5.2 Frequency and other word knowledge aspects
2.5.3 L1/L2 frequency
2.5.4 Subjective and objective estimates of frequency
2.5.5 Frequency levels
2.5.6 Obtaining frequency information
2.6 L1 influence on vocabulary learning
2.7 Describing different types of vocabulary
2.8 Receptive and productive mastery
2.9 Vocabulary learning strategies/self-regulating behavior
2.10 Computer simulations of vocabulary
2.11 Psycholinguistic/neurolinguistic research

3 Formulaic Language
3.1 Identification
3.2 Strength of association – hypothesis tests
3.3 Strength of association – mutual information
3.4 A directional measure of collocation
3.5 Formulaic language with open slots
3.6 Processing formulaic language
3.7 Acquisition of formulaic language
3.8 The psycholinguistic reality of corpus-extracted formulaic sequences
3.9 Nonnative use of formulaic language

Part 3 Researching Vocabulary

4 Issues in Research Methodology
4.1 Qualitative research
4.2 Participants
4.3 The need for multiple measures of vocabulary
4.4 The need for longitudinal studies and delayed posttests
4.5 Selection of target lexical items
4.6 Sample size of lexical items
4.7 Interpreting and reporting results

5 Measuring Vocabulary
5.1 Global measurement issues
5.1.1 Issues in writing vocabulary items
5.1.2 Determining pre-existing vocabulary knowledge
5.1.3 Validity and reliability of lexical measurement
5.1.4 Placing cut-points in study
5.2 Measuring vocabulary size
5.2.1 Units of counting vocabulary
5.2.2 Sampling from dictionaries or other references
5.2.3 Recognition/receptive vocabulary size measures
5.2.4 Recall/productive vocabulary size measures
5.3 Measuring the quality (depth) of vocabulary knowledge
5.3.1 Developmental approach
5.3.2 Dimensions (components) approach
5.4 Measuring automaticity/speed of processing
5.5 Measuring organization
5.6 Measuring attrition and degrees of residual lexical retention

6 Example Research Projects

Part 4 Resources

7 Vocabulary resources
7.1 Instruments
7.1.1 Vocabulary levels test
7.1.2 Vocabulary size test
7.1.3 Meara’s _lognostics measurement instruments
7.2 Corpora
7.2.1 Corpora representing general English (mainly written)
7.2.2 Corpora representing spoken English
7.2.3 Corpora representing national varieties of English
7.2.4 Corpora representing academic/business English
7.2.5 Corpora representing young native English
7.2.6 Corpora representing learner English
7.2.7 Corpora representing languages other than English
7.2.7.1 Parallel corpora
7.2.7.2 Monolingual corpora
7.2.8 Corpus compilations
7.2.9 Web-based sources of corpora
7.2.10 Bibliographies concerning corpora
7.3 Concordancers/tools
7.4 Vocabulary lists
7.5 Websites
7.6 Bibliographies
7.7 Important personalities in the field of vocabulary studies

Notes
References
Index
Quick Checklist
(Principal sections which discuss these issues)

Target lexical items
● Do any lexical characteristics potentially confound your results? (2.1–2.4, 4.5)
● Have you taken frequency into account? (2.5)
● Does L1 influence potentially confound your results? (2.6)
● Is your sampling rate sufficient to make your results meaningful? (4.6)
● Have you considered including formulaic sequences as well as individual words? (3)

Measurement instruments
● Are they valid, reliable, and appropriate for your participants? (5)
● Are they suitable for answering your research questions? (whole book)
● Are you measuring receptive or productive mastery, or both? (2.8)
● Have you considered measuring word knowledge aspects besides meaning and form? (1.1.5, 4.3, 5.3)
● Have you considered measuring depth of lexical knowledge? (5.3)
● Have you considered measuring lexical organization and speed of processing? (2.4, 2.11, 5.4, 5.5)
● If the study is focused on acquisition, is previous lexical knowledge determined or controlled for? (5.1.2)
● If the study is focused on acquisition, are there delayed posttests? (4.4)

Participants
● Are there enough participants to make the study viable? (4.2)

Corpus issues
● Is the corpus you use appropriate for your research questions? (1.1.4, 3.8, 6.2)

Reporting
● Were the units of counting clearly described? (5.2.1)
● Did you discuss the absolute size of any gain/attrition? (4.7)
● Did you report effect sizes? (4.7)
● Are your interpretations and conclusions warranted based on your results? (4.7)

Bottom line
● Is your study interesting?
● Is your study useful to anyone?
General Editors’ Preface

Research and Practice in Applied Linguistics is an international book series from Palgrave Macmillan which brings together leading researchers and teachers in Applied Linguistics to provide readers with the knowledge and tools they need to undertake their own practice related research. Books in the series are designed for students and researchers in Applied Linguistics, TESOL, Language Education and related subject areas, and for language professionals keen to extend their research experience.

Every book in this innovative series is designed to be user-friendly, with clear illustrations and accessible style. The quotations and definitions of key concepts that punctuate the main text are intended to ensure that many, often competing, voices are heard. Each book presents a concise historical and conceptual overview of its chosen field, identifying many lines of enquiry and findings, but also gaps and disagreements. It provides readers with an overall framework for further examination of how research and practice inform each other, and how practitioners can develop their own problem-based research.

The focus throughout is on exploring the relationship between research and practice in Applied Linguistics. How far can research provide answers to the questions and issues that arise in practice? Can research questions that arise and are examined in very specific circumstances be informed by, and inform, the global body of research and practice? What different kinds of information can be obtained from different research methodologies? How should we make a selection between the options available, and how far are different methods compatible with each other? How can the results of research be turned into practical action?

The books in this series identify some of the key researchable areas in the field and provide workable examples of research projects, backed up by details of appropriate research tools and resources. Case studies and exemplars of research and practice are drawn on throughout the books. References to key institutions, individual research lists, journals and professional organizations provide starting points for gathering information and embarking on research. The books also include annotated lists of key works in the field for further study.

The overall objective of the series is to illustrate the message that in Applied Linguistics there can be no good professional practice that isn’t based on good research, and there can be no good research that isn’t informed by practice.

Christopher N. Candlin and David R. Hall
Macquarie University, Sydney
Preface

This is a vocabulary research manual. It aims to give you the background knowledge necessary to design rigorous and effective research studies into the behavior of L1 and L2 vocabulary. It can also help you better understand other people’s research and interpret it more accurately.

In order to keep the manual to a reasonable length, I assume that you already have an understanding of basic research methodology for language research in general, and also have a basic understanding of statistics. I also assume you have a general understanding of vocabulary issues. The manual will build on this knowledge and discuss the issues which have particular importance for vocabulary research. The exception to these assumptions of previous knowledge is statistical knowledge about corpus linguistics (e.g. t-score and MI), which is more specific to vocabulary research, and so the calculations behind these statistical procedures are spelled out in Chapter 3. In addition, I have almost always built descriptions of terminology and concepts into the text, but in a few cases have added Concept Boxes to supplement the text.

I did not want this book to be just my personal take on vocabulary research, but rather wished it to be a consensus state-of-the-art research manual. While it inevitably reflects my own interests and biases (and uses many of the studies I have been involved with for illustration), I have been extremely fortunate that many of my friends in the field of vocabulary studies have been willing to read all or parts of the book and provide comments. I often incorporated their insightful critiques more-or-less directly into the text, and the final version of the book is greatly improved by the process. As a result, I feel that the book does reflect a (somewhat personalized) consensus view of good vocabulary research practice. While many of my colleagues might do certain things differently than indicated in this book, it does indicate the major issues which need to be considered to carry out worthwhile vocabulary research, and hopefully will help you to avoid many of the pitfalls that exist.

Although most of the issues discussed in this handbook pertain to vocabulary research in any language, the majority of research to date has been on English, including my own personal research. Almost inevitably, this has led to the majority of examples and citations referring to the English language. There is no value judgement intended in this, and I hope you are able to take the ideas and techniques and apply them to the languages you are researching.

This handbook can’t tell you the exact research methodologies to use, as every lexical study is different, entailing unique goals and difficulties. However, I have tried to provide enough background information about the nature of vocabulary and discussion of possible research methodologies to help guide you in thinking about the issues necessary in selecting and developing sound methodologies for the lexical research you wish to do.

I love vocabulary research, and with so many questions still unanswered, I want to encourage as much of it as I can. I hope this book stimulates you to begin researching vocabulary yourself, or to keep researching if you are already at it. It is a fascinating area, and I hope to hear your results at a future conference and/or read them in a future journal.

Norbert Schmitt
Nottingham
June 1, 2009
Acknowledgements

I would like to thank the University of Michigan for giving my wife a Morley scholarship to study in Ann Arbor in July 2008. This allowed me to write a large portion of this book in the wonderful environs of the Rackham Building on their campus. All that was missing from the atmosphere was Indiana Jones sliding through the library on his motorcycle.

Colleagues who have graciously commented on the entire manuscript include Paul Nation, Birgit Henriksen, Averil Coxhead, and Ronald Carter. Their many perceptive comments have improved the final version, and helped to make it more complete. I also owe a debt of thanks to numerous colleagues who commented on the parts of the book where their particular specialisms were covered, or who contributed material. Their input has added much to the rigor of the book: Frank Boers, Tom Cobb, Kathy Conklin, Zoltán Dörnyei, Philip Durrant, Catherine Elder, Nick Ellis, Glen Fulcher, Tess Fitzpatrick, Lynne Flowerdew, Gareth Gaskell, Sylviane Granger, Kirsten Haastrup, Marlise Horst, Jan Hulstijn, Kon Kuiper, Batia Laufer, Phoebe Lin, Ron Martinez, Paul Meara, Imma Miralpeix, Anne O’Keeffe, Spiros Papageorgiou, Sima Paribakht, Aneta Pavlenko, Pam Peters, Diana Pulido, Ana Maria Pellicer Sánchez, Paul Rayson, John Read, Ute Römer, Diane Schmitt, Rob Schoonen, Barbara Seidlhofer, Anna Siyanova, Suhad Sonbul, Pavel Trofimovich, Mari Wesche, Cristina Whitecross, and David Wood.

Comments from my editors Chris Candlin and David Hall did much to sharpen both the thinking and presentation of the material. Of course, everyone had slightly different views on the best research methodologies and other content of the book, and so the final distillation of the various points of view is my personal interpretation for which I alone am responsible.

Finally, to my wife Diane, for commenting on the manuscript, but more importantly, for taking me to places like Carcassonne, Ann Arbor, Auckland, and Copenhagen where writing various parts of the book was a pleasure. I love you more than ever.

The author and publishers wish to thank Wiley-Blackwell, Elsevier and Lee Osterhout for permission to reproduce copyright material:

Figure 2.1 The Relationship between Historical Origin and Register, G. Hughes A History of English Words, 2000, Malden, MA: Blackwell p. 15

Figure 2.12 ERP plots showing N400 and P600 phenomena, Osterhout, L., McLaughlin, J., Pitkänen, I., Frenck-Mestre, and Molinaro, N. (2006). Novice learners, longitudinal designs, and event-related potentials: A means for exploring the neurocognition of second language processing. Language Learning 56, Supplement 1: p. 204.

Figure 2.13 fMRI brain location results; Hauk, O., Johnsrude, I. & Pulvermüller, F. Somatotopic representation of action words in the motor and premotor cortex. Neuron 41, 301–307 (2004), Elsevier Science
Part 1 Overview of Vocabulary Issues
1 Vocabulary Use and Acquisition
This is a vocabulary research manual whose primary goal is to provide readers with a solid foundation of vocabulary research methodology, both in terms of good research practice, and in terms of the common pitfalls to avoid. But in doing research, we must always make methodology serve the research issues we are interested in exploring. The issues which attract the most attention (and thus research) in the field of vocabulary concern the nature of lexis, its employment in language use, and the best ways of facilitating its acquisition. In order to design good vocabulary research on these issues, one must be on good terms with what the field already knows about these issues. There are a number of good overviews/collections which should be reviewed to gain a general understanding of vocabulary and its behavior (e.g. Bogaards and Laufer, 2004; Carter, 1998; Coady and Huckin, 1997; Daller, Milton, and Treffers-Daller, 2007; Hunt and Beglar, 1998, 2005; McCarthy, 1990; Meara, 2009; Nation, 1990, 2001; Read, 2000, 2004; Schmitt, 2000, 2008; Schmitt and McCarthy, 1997; Singleton, 1999). This chapter of the book will follow up on the information in these publications and highlight ten key issues which must be taken into account when designing vocabulary research. They are outlined below and have direct implications for the discussion of methodology in the following chapters of the book. I will then identify a number of important vocabulary issues about which we do not yet have much knowledge, and how these gaps affect lexical research.1
1.1 Ten key issues

1.1.1 Vocabulary is an important component of language use

Quote 1.1 Wilkins on the importance of vocabulary for communication
Without grammar very little can be conveyed, without vocabulary nothing can be conveyed. (1972: 111)
One thing that all of the partners involved in the learning process (students, teachers, materials writers, and researchers) can agree upon is that learning vocabulary is an essential part of mastering a second language. The importance of vocabulary is highlighted by the oft-repeated observation that learners carry around dictionaries and not grammar books. However, it is important to provide empirical evidence to back up this type of anecdotal observation (after all, this is a research manual!). This is easily done, as there is plenty of evidence pointing to the importance of vocabulary in language use. One strand of this evidence is the typically high correlations between vocabulary (usually measures of vocabulary size) and various measures of language proficiency. For example, a close relationship has been shown between vocabulary size and reading (e.g. correlations of .50–.75, Laufer, 1992).2 Furthermore, Laufer and Goldstein (2004) found that knowing the form-meaning link of words accounted for 42.6% of the total variance in participants’ class grades according to a regression analysis. Given that the language class grade reflected performance on reading, listening, speaking and writing, grammatical accuracy, sociolinguistic appropriateness, and language fluency, the above figure indicates that vocabulary knowledge contributes a very great deal to overall language success. Albrechtsen, Haastrup, and Henriksen (2008) compared measures of vocabulary size and depth (association tests) of Danish ESL learners with several measures of the ability to use English. In the L1, lexical size correlated with lexical inferencing success (guessing the meaning of unknown words in written text/written discourse) at .69–.82, and in the L2 at .48–.66. L2 vocabulary size correlated with L2 reading ability at .73–.80. 
One of the most systematic explorations of the relationship between vocabulary knowledge and language proficiency occurred as part of the development of the DIALANG3 tests (Alderson, 2005). His research team, with Paul Meara heading the vocabulary section, compared scores on various vocabulary tests with the scores from the other language components of the DIALANG test. The results are illustrated in Table 1.1. As is clear from the table, vocabulary has strong relationships with the language skills. The checklist test and the vocabulary test battery correlate with reading at .64, listening from .61–.65, writing from .70–.79, and grammar at .64. Thus the r² (i.e. correlation values squared) values indicate that vocabulary accounts for 37–62% of the variance in the various language proficiency scores. Considering the multitude of the factors which could affect these scores (e.g. learner motivation, background knowledge, familiarity with test task), it is striking that a single factor, vocabulary knowledge, can account for such a large percentage of the variation. The relationship between vocabulary and writing is particularly strong, but even the individual skill subcomponents (e.g. inferencing) have strong relationships with vocabulary knowledge. Moreover, this strong relationship is not a ‘one-off’; rather it is consistent across the board. The lowest correlation reported was between the checklist test and understanding specific detail in listening (.44), which still accounts for a very respectable 19% of variance. In short, the DIALANG data clearly support the intuitive notion that vocabulary is important for language use.

Table 1.1 Correlations between vocabulary and other language proficiencies (a)
Columns: Vocabulary checklist test; Vocabulary test battery (Meaning (b), Collocation (b), Gap-fill (c), Word formation (c), Total)
Rows: Reading (Identifying main idea, Understanding specific detail, Lexical inferencing); Listening (Identifying main idea, Understanding specific detail, Lexical inferencing); Writing (Accuracy, Register, Textual organization); Grammar
Values (cell alignment not preserved in this copy): .64 .50 / .61 .60 / .47 .58 .44 / .43 .62 .63 .56 .50 .65 / .44 .56 .70 .70 .57 .51 / .66 .71 .79 .64
(a) This table is compiled from Alderson (2005: 87, 89, 205). (b) Receptive test. (c) Productive test.
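The step from a correlation to a percentage of variance explained is simple arithmetic, and it can be checked directly. The sketch below squares three of the correlations quoted above; the function name `variance_explained` and the dictionary labels are illustrative choices, not part of the DIALANG analysis.

```python
# Correlations quoted in the text from Alderson (2005).
# The labels are descriptive only; they are not Alderson's own terms.
correlations = {
    "listening (lower end of range)": 0.61,
    "writing (upper end of range)": 0.79,
    "checklist vs. listening for specific detail": 0.44,
}


def variance_explained(r: float) -> float:
    """Return the shared variance as a percentage, i.e. r squared times 100."""
    return r * r * 100


for label, r in correlations.items():
    print(f"{label}: r = {r:.2f}, variance explained = {variance_explained(r):.0f}%")
```

Rounded to whole percentages, these values give the 37% and 62% bounds and the 19% figure reported in the text.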
Quote 1.2 Alderson on the importance of vocabulary for language use
What [the DIALANG analysis] would appear to show is that the size of one’s vocabulary is relevant to one’s performance on any language test, in other words, that language ability is to quite a large extent a function of vocabulary size. (2005: 88)

1.1.2 A large vocabulary is required for language use
People use language to communicate, and so naturally one key issue in vocabulary studies is how much vocabulary is necessary to enable this communication. The short answer is a lot, but it depends on one’s learning goals. If one wishes to achieve native-like proficiency, then presumably it is necessary to have a vocabulary size similar to native speakers. Because most of the research on vocabulary size has been done on English, my discussion will focus on that language, although there are reasons to believe that the figures for other languages may be lower (Nation and Meara, 2002). Unfortunately, much of the research into native speaker vocabulary size has been methodologically flawed, leading to wildly varying estimates (Nation, 1993). In fact, the estimates are sometimes an order of magnitude apart. However, there have been a few well-designed studies which provide reliable estimates. Goulden, Nation, and Read (1990) found that their New Zealand university undergraduates had a vocabulary size of about 17,000 word families. (See Section 5.2.1 for a description of the various units for counting vocabulary.) D’Anna, Zechmeister, and Hall (1991) found that their university students knew a little under 17,000 of the headwords in the 1980 Oxford American Dictionary. Using the same methodology as D’Anna et al., Zechmeister, Chronis, Cull, D’Anna, and Healy (1995) found similar results for university students (around 16,000 headwords), while junior high school students knew 11,836 headwords on average, and retired adults 21,252. When I use the Goulden et al. (1990) checklist test with my educated friends and university students, I normally come up with estimates in line with the above, ranging between 15,000 and 18,000 word families. Native speakers will always vary in their vocabulary size to some extent, depending on the amount and the manner in which they use their language. 
We would expect highly educated persons to have a larger vocabulary than less educated persons, but this may not always be true. For example, a crossword enthusiast may well have a wider vocabulary than a holder of a PhD. Nevertheless, a range of 16,000–20,000 word families seems a fair estimate of the vocabulary size for educated native speakers.
Quote 1.3 Nation and Waring on native-speaker vocabulary size
The best conservative rule of thumb that we have is that up to a vocabulary size of around 20,000 word families, we should expect that [English] native speakers will add roughly 1,000 word families a year to their vocabulary size. This means that a [L1] five year old beginning school will have a vocabulary of around 4,000 to 5,000 word families. A university graduate will have a vocabulary of around 20,000 word families. These figures are very rough and there is likely to be a large variation between individuals. These figures exclude proper names, compound words, abbreviations, and foreign words. (1997: 7–8)
Luckily, second language learners do not need to achieve native-like vocabulary sizes in order to use English well. A more reasonable vocabulary goal for these learners is the amount of lexis necessary to enable the various forms of communication in English. One of the most basic things a person might want to do is to communicate orally on an everyday basis (e.g. asking directions to the train station, describing one’s holiday). If we assume that 98% of the vocabulary needs to be known (Hu and Nation, 2000; Schmitt, Jiang, and Grabe, in press), and also assume that the proper nouns in the discourse are known, we can estimate the number of word families it takes to be able to engage in informal daily conversation. Nation (2006), using word lists based on the Wellington Corpus of Spoken English, calculated that 6,000–7,000 word families are required to reach this goal. An analysis of the spoken CANCODE corpus (Adolphs and Schmitt, 2003) found coverage figures congruent with Nation’s at the 3,000 word family level (the upper limit of their analysis), supporting Nation’s calculations. Kon Kuiper (2009) argues it is useful to make a distinction between ‘communicatively competent’ in a genre and ‘native-like’ in a genre. He makes the case for genre-specific native-like competence and performance, because no native speaker has native-like competence and communicative performance in all genres. For L2 speakers, communicative competence is possible in a number of genres, but native-like competence is much more difficult. However, it is not yet clear that the 98% coverage figure (derived from research on written discourse) is the most appropriate figure for spoken discourse. Nation (2006), using the Wellington Corpus, calculated that 95% coverage would require knowledge of about 3,000 word families, plus proper nouns. 
In addition, Staehr (2009) found that advanced Danish listeners who knew the 5,000 most frequent word families in English were also able to demonstrate adequate listening ability on the Cambridge-ESOL Certificate of Proficiency in English (CPE) listening exam. Overall, the current evidence suggests that it requires between 2,000 and 3,000 word families to be conversant in English (if 95% coverage is adequate) or between 6,000 and 7,000 word families if 98% coverage is needed. However, there is simply not enough evidence to confidently establish a coverage requirement for listening at the moment. For estimates of written vocabulary, we are on firmer ground. Nation (2006) went on to calculate that 8,000–9,000 word families are necessary to read a range of authentic texts (e.g. novels or newspapers), based on British National Corpus (BNC) data and 98% coverage. Similarly, both the highest level (C2) of the Common European Framework and the CPE require between about 4,500 and 5,000 word families on a 5,000 level test (i.e. knowing most of these frequent families) (Milton and Hopkins, 2006).4 Because learners are also likely to need some families beyond the 5,000 level, it seems that 8,000–9,000 word families is the realistic target if they wish to read a wide variety of texts without unknown vocabulary being a problem.
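The coverage reasoning above (how many word families must be known before a given percentage of running words is covered) can be sketched as a short calculation. The following is only a minimal illustration in Python, not Nation's (2006) procedure: the function name `families_needed` and the frequency counts are invented for the example, and real estimates depend on corpus-based word-family lists.

```python
def families_needed(family_counts, target):
    """Smallest number of top-ranked families whose tokens reach `target` coverage.

    family_counts: token counts per word family (any order).
    target: required coverage as a proportion, e.g. 0.95 or 0.98.
    """
    ranked = sorted(family_counts, reverse=True)  # most frequent family first
    total = sum(ranked)
    if total == 0:
        return 0
    running = 0
    for k, count in enumerate(ranked, start=1):
        running += count
        if running / total >= target:
            return k
    return len(ranked)


# Hypothetical token counts for ten word families (invented, 1,000 tokens in all).
counts = [500, 250, 120, 60, 30, 15, 10, 8, 5, 2]

for target in (0.95, 0.98):
    print(f"{target:.0%} coverage needs {families_needed(counts, target)} families")
```

With these toy counts, 95% coverage is reached with five families and 98% with seven. The real figures quoted above show the same pattern on a much larger scale: the last few percentage points of coverage are the most expensive in terms of vocabulary size.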
Overview of Vocabulary Issues
These figures may seem daunting to both teachers and learners, but even so, they probably underestimate the learning challenge. Each word family includes several individual word forms, including the root form (stimulate), its inflections (stimulated, stimulating, stimulates), and regular derivations (stimulation, stimulative). Nation’s (2006) BNC lists show that the most frequent 1,000 word families average about six members (types per family), decreasing to about three members per family at the 9,000 frequency level. According to his calculations, a vocabulary of 6,000 word families (enabling listening) entails knowing 28,015 individual word forms, while the 8,000 families (enabling wide reading) entails 34,660 words. Sometimes these word family members are transparently related (nation–national) and relatively guessable if unknown. However, this is not always the case (involve–involvedness), and learners may have trouble with these less transparent members, especially in terms of production. While Horst and Collins (2006) found a growing morphological productive ability in their French learners of English over 100, 200, 300, and 400 hours of instruction, Schmitt and Zimmerman’s (2002) advanced learners of English (preparing to enter English-medium universities) typically knew only some, but not all, of the noun/verb/adjective/adverb members of word families taken from the Academic Word List (Coxhead, 2000). Thus, it cannot be assumed that knowing one word family member implies knowing (or being able to guess) other related members. The upshot is that learners must learn a very large number of lexical items to be able to operate in English, especially considering that the above figures do not take into account the multitude of phrasal lexical items (see Chapter 3) which have been shown to be extremely widespread in language use (e.g. Schmitt, 2004; Wray, 2002).
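The arithmetic behind the family and word-form figures quoted above can be checked in a few lines. This is a simple sketch using only the two totals cited from Nation (2006); the helper name `forms_per_family` is an assumption made for the example.

```python
def forms_per_family(word_forms, families):
    """Average number of individual word forms per word family."""
    return word_forms / families


# Totals quoted in the text: families -> individual word forms.
quoted = {6000: 28015, 8000: 34660}

for families, forms in quoted.items():
    print(families, forms, round(forms_per_family(forms, families), 2))

# Average family size in the band between 6,000 and 8,000 families.
extra_forms = quoted[8000] - quoted[6000]
print(extra_forms, round(forms_per_family(extra_forms, 2000), 2))
```

The averages work out at roughly 4.7 forms per family for the first 6,000 families and roughly 3.3 for the next 2,000, which is consistent with the statement that family size shrinks from about six members to about three as frequency decreases.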
Learning such a large number of lexical items is one of the greatest hurdles facing learners in acquiring English. Moreover, it is one which a great many learners fail to cross successfully, as the vocabulary sizes of learners reported in research studies typically fall well short of these size requirements (Table 1.2). The scope of the vocabulary learning task, and the fact that many learners fail to achieve even moderate vocabulary learning goals, indicates that it can no longer be assumed that an adequate lexis will simply be ‘picked up’ from exposure to language tasks focusing either on other linguistic aspects (e.g. grammatical constructions) or on communication alone (e.g. communicative language teaching). Rather, a more proactive, principled approach needs to be taken in promoting vocabulary learning, which includes both explicit teaching and exposure to large amounts of language input, especially through extensive reading (Laufer, 2005a; Schmitt, 2008).
1.1.3
Formulaic language is as important as individual words
Vocabulary instruction has tended to focus on individual words because they have been considered the basic lexical unit, but also because they
Table 1.2 English vocabulary size of foreign learnersᵃ

Country                         Vocab. size  Hours of instructionᵇ  Reference (re: size)
Japan, EFL University           2,000        800–1,200              Shillaw, 1995
                                2,300                               Barrow et al., 1999
China, English majors           4,000        1,800–2,400            Laufer, 2001
Indonesia, EFL University       1,220        900                    Nurweni and Read, 1999
Oman, EFL University            2,000        1,350+                 Horst, Cobb, and Meara, 1998
Israel, High school graduates   3,500        1,500                  Laufer, 1998
France, High school             1,000        400                    Arnaud et al., 1985
Greece, Age 15, high school     1,680        660                    Milton and Meara, 1998
Germany, Age 15, high school    1,200        400                    Milton and Meara, 1998

ᵃ Table is taken from Laufer (2000a: 48, slightly adapted).
ᵇ The data on hours of instruction were largely obtained by Laufer’s personal communication with colleagues from the respective countries.
are easier to work with than formulaic language. Languages like English indicate individual words in text by placing spaces around them, while formulaic language is seldom rendered as single forms (e.g. with hyphens: state-of-the-art). Individual words are convenient units to teach and incorporate into materials. The main vocabulary reference sources, dictionaries, are set up around individual headwords. Word processors give counts of individual words in documents. It is therefore no wonder that most teachers and students tend to think of vocabulary in terms of individual words. For similar reasons, most vocabulary research has studied individual words. However, it is becoming increasingly clear that formulaic language5 is an important element of language learning and use, in ways outlined over the years by Pawley and Syder (1983), Nattinger and DeCarrico (1992), Moon (1997), Wray (2002), Schmitt and Carter (2004), Fellbaum (2007), and Granger and Meunier (2008), among others. There are a number of reasons why we should give formulaic language a prominent place in vocabulary research:
● Normal discourse, both written and spoken, contains large (but not yet fully determined) percentages of formulaic language. Erman and Warren (2000) calculated that 52–58% of the L1 English language they analyzed was formulaic, and Foster (2001) came up with a figure of 32% using different procedures and criteria.
● If much discourse is made up of formulaic language, then this implies that proficient language users know a large number of formulaic expressions. Pawley and Syder (1983: 213) suggest that the number of ‘sentence-length expressions familiar to the ordinary, mature English speaker probably amounts, at least, to several hundreds of thousands’. Jackendoff (1995) concludes from a small corpus study of spoken language in a TV quiz show that people may know at least as many formulaic sequences as single words. Mel’cuk (1995: 169) believes that phrasemes are more numerous than words by a ratio of at least 10 to 1. It must be said, however, that there is little hard research yet to either support or refute these assertions.
● Formulaic language is not a homogeneous phenomenon, but is, on the contrary, rather varied. Formulaic sequences can be long (You can lead a horse to water, but you can’t make him drink) or short (Oh no!), or anything in between. They are commonly used for different purposes. They can be used to express a message or idea (The early bird gets the worm = do not procrastinate), functions ([I’m] just looking [thanks] = declining an offer of assistance from a shopkeeper), social solidarity, and to transact specific information in a precise and understandable way. They realize many other purposes as well, as formulaic sequences can be used for most things society requires of communication through language. These sequences can be totally fixed (Ladies and Gentlemen) or have a number of ‘slots’ which can be filled with appropriate words or strings of words ([someone/thing, usually with authority] made it plain that [something as yet unrealized was intended or desired]).
Formulaic language also includes the multitude of collocations which exist in language (blue sky, hard work).6
● Similarly, formulaic language is used to realize a number of different communicative purposes in language use, including:
Functional use
There are recurring situations in the social world that require language to deal with them. These are often described as functions, and include such speech acts as apologizing, making requests, giving directions, and complaining. These functions typically have conventionalized language attached to them, such as I’m (very) sorry to hear about ——— to express sympathy and I’d be happy/glad to ——— to comply with a request (Nattinger and DeCarrico, 1992). Because members of a speech community know these expressions, they serve as a quick and reliable way to achieve the related speech act.
Social interaction (phatic communion)
People commonly engage in ‘light’ conversation for pleasure or to pass the time of day, where the purpose is not really information exchange or to get someone to do something. Rather, the purpose is social solidarity, and people rely on non-threatening phrases to keep the conversation flowing, including
comments about the weather (Nice weather today; Cold isn’t it?), agreeing with your interlocutor (Oh, I see what you mean; OK, I’ve got it), providing backchannels and positive feedback to another speaker (Did you really?; How interesting). Research has shown that such phrases are a key element of informal spoken discourse (McCarthy and Carter, 1997).
Discourse organization
Formulaic phrases are a common way to signpost the organization of both written (in other words, in conclusion) and spoken discourse (on the other hand, as I was saying).
Precise information transfer
Technical vocabulary consists of words which have a single and precise meaning in a particular field (scalpel is a specific type of knife used in medicine). But this phenomenon is not restricted to individual words. Indeed, fields often have phraseology to transact information in a way which minimizes any possible misunderstanding. For example, in aviation language, the phrase Taxi into position and hold clearly and concisely conveys the instructions to move onto the runway and prepare for departure, but to wait for final clearance for takeoff.
● The use of formulaic language helps speakers be fluent. Pawley and Syder (1983) suggest native speakers have cognitive limitations in how quickly they can process language, but they are also able to produce language seemingly beyond these limitations. They present evidence that the largest unit of novel discourse that native speakers are able to process is a single clause of eight to ten words. When speaking, they will speed up and become fluent during these clauses, but will then slow down or even pause at the end of these clauses (Dechert, 1983). Presumably these pauses permit the speaker to formulate the next clause. Speakers seldom pause in the middle of a clause, or at least not for long. Together, this evidence suggests that speakers are unable to compose more than about eight to ten words at a time.
On the other hand, native speakers can fluently say multi-clause utterances. Consider the following examples:
1. You shouldn’t believe everything you hear.
2. It just goes to show, you can’t be too careful.
3. You can lead a horse to water, but you can’t make him drink.
They have increasingly more words, and Example 3 is clearly beyond the limit of eight to ten words. Yet native speakers can say them all without hesitation. Pawley and Syder suggest that these examples can be fluently produced because they are actually already memorized, i.e. as prefabricated phrases which are stored as single wholes and are, as such, instantly available for use without the cognitive load of having to assemble them on-line as one speaks. Pawley and Syder suggest that the mind uses its vast memory to store these prefabricated phrases in order to compensate for a limited working memory (and the capacity to compose novel language on-line). Indeed, research by Kuiper (2004) shows that speakers who operate under severe time constraints (play-by-play sports announcers, auctioneers) use a great deal of formulaic language in their speech. In addition, there is now converging evidence that collocations and other formulaic language are indeed processed more quickly than non-formulaic language (Ellis, 2006a; Conklin and Schmitt, 2008; Underwood, Schmitt, and Galpin, 2004).
Overall, these points illustrate that formulaic language is intrinsically connected with functional, fluent, communicative language use. As such, it is just as important as individual words. Thus, vocabulary researchers always need to be aware of both single- and multi-word lexical items, and whenever practical, include both types in their research and discussions.
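The word-count argument about the three example utterances can be verified directly. The short sketch below simply counts whitespace-separated words and compares each count with the upper end of the eight-to-ten-word clause limit discussed above; the names `word_count` and `CLAUSE_LIMIT` are chosen for this illustration only.

```python
CLAUSE_LIMIT = 10  # upper end of the eight-to-ten-word limit discussed in the text

examples = [
    "You shouldn't believe everything you hear.",
    "It just goes to show, you can't be too careful.",
    "You can lead a horse to water, but you can't make him drink.",
]


def word_count(utterance):
    """Count whitespace-separated words (contractions count as one word)."""
    return len(utterance.split())


for text in examples:
    n = word_count(text)
    status = "beyond the limit" if n > CLAUSE_LIMIT else "within the limit"
    print(f"{n:2d} words, {status}: {text}")
```

Counted this way, the examples contain 6, 10, and 13 words respectively, so only the third exceeds the limit, as the text states.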
Quote 1.4
Wray on the development of formulaic language in L1 and L2
Wray suggests that the development of good collocation intuitions comes down to how language is learned. Natives appear to learn formulaic language throughout the language acquisition process, while nonnatives focus more on individual words than sequences because they are more manageable and give a feeling of control over the language: The consequence [of focusing on word-sized units in L2 learning] is a failure to value the one property of nativelike input which is most characteristic of the idiomaticity to which the learner ultimately aspires: words do not go together, having first been apart, but, rather, belong together, and do not necessarily need separating. (2002: 212)
1.1.4
Corpus analysis is an important research tool
One of the most significant developments in vocabulary studies in recent years has been the use of corpus evidence to provide an empirical basis for determining vocabulary behavior, instead of relying on appeals to intuition or tradition. The first major corpus study I could find any reference to was carried out by F.W. Kaeding in 1898 (Howatt, 2004: 290). He supervised a massive analysis of German, where hundreds of workers manually counted nearly 11 million words. This is an amazing feat for such an early time, without even typewriters, let alone computers! By the first half of the 1900s, corpus analysis was already making an impact on pedagogy. Several scholars (Harold Palmer, Michael West, Edward Thorndike, Lawrence Faucett, Irving Lorge) were concerned with ways to systematize the selection of vocabulary for learners. They also tried to make vocabulary easier by limiting it to some degree, and so their attempts came to be collectively
known as the vocabulary control movement (see Howatt, 2004; Schmitt, 2000; and Zimmerman, 1997, for overviews). The work of several of these scholars merged into what came to be referred to as ‘The Carnegie Report’ (Palmer, West, and Faucett, 1936). The report recommended the development of a list of vocabulary which would be useful in the production of simple reading materials. The list ended up having about 2,000 words, and was finally published as the General Service List of English Words (GSL) (West, 1953). A key feature of the GSL is that each word’s different parts of speech and different meaning senses are listed, which makes the list much more useful than a simple frequency count. It has been immensely influential in lexical research and materials design, but is now dated (as it is based on word counts from the first part of the last century) and requires a complete revision based on current corpora. (See Sections 2.5, 2.7, and 6.4 for more details.)
However, corpus analysis really took off when it became computerized. Early computerized corpora of 1 million words were considered large, e.g. the Brown Corpus (Kučera and Francis, 1967) focusing on American English, and its counterpart in Europe, the Lancaster-Oslo/Bergen Corpus (LOB) (Hofland and Johansson, 1982; Johansson and Hofland, 1989) focusing on British English. Nowadays, much larger corpora are the norm. Perhaps the best-known corpus of general English is the 100 million word British National Corpus (including 10 million words of unscripted spoken discourse). American English is also well catered for with the 385 million word Corpus of Contemporary American English. The Bank of English Corpus is larger (524 million words),7 but contains almost exclusively written text. The TOEFL 2000 Spoken and Written Academic Language Corpus consists of 2.7 million words sampled at four US universities, including almost 1.7 million spoken (1.2 million from class sessions) and 1 million written.
There are also several corpora based on unscripted spoken English, including the Cambridge and Nottingham Corpus of Discourse in English (CANCODE – 5 million words), the Michigan Corpus of Academic Spoken English (MICASE – 1.7 million words), and the British Academic Spoken English corpus (BASE – 1.6 million words). (See Section 6.2 for detailed descriptions of these and many other corpora.)
Frequency is one of the most important characteristics of vocabulary, affecting most or all aspects of lexical processing and acquisition. Corpus data is the best source of frequency information, and several findings invariably appear. One key characteristic is that a relatively small number of the most frequent words cover an inordinate percentage of word occurrences in language. For example, the is the most frequent word in written and spoken English, making up approximately 6.2% of all word occurrences. The top three words (the, of, and) make up about 12.8%, the top ten words (the, of, and, a, in, to, it, is, was, I) 22.2% of all tokens. (These figures are based on unlemmatized BNC data (Leech, Rayson, and Wilson, 2001).) Nation and Waring (1997) report that 2,000 lemmas cover about 80% of the occurrences in the Brown Corpus. Thus, a relative handful of
words cover the vast majority of language, while the rest occur much less frequently.
Another finding is that the most frequent words in English tend to be grammatical words. This stems from the commonsense fact that such grammatical words are necessary to the structure of English regardless of the topic. Articles, prepositions, pronouns, conjunctions, forms of the verb be etc. are equally necessary whether we are talking about cowboys, space exploration, botany, or music.
A third insight is that the frequencies of lexical items differ considerably between spoken and written discourse. For example, a number of content words, such as know, well, got, think, and right, are more frequent in spoken discourse than written discourse. On closer inspection, it turns out that these words are not content words at all but actually elements of interpersonal phrases (you know, I think), single-word organisational markers (well, right), smooth-overs (never mind), hedges (kind of/sort of), and other kinds of discourse items which are characteristic of the spoken mode (McCarthy and Carter, 1997). This shows that spoken language makes frequent use of these types of discourse markers, while they rarely occur in written language.
A related difference is that the same word may take different meanings in the two modes. McCarthy and Carter (1997) show that got is used mainly in the construction have got in the CANCODE as the basic verb of possession or personal association with something. However, they highlight the following two sentences from the corpus which are indicative of other meanings:
1. I’ve got so many birthdays in July.
2. I’ve got you.
In Example 1, the speaker means something like ‘I have to deal with’, because he/she is referring to the obligation of sending numerous birthday cards. In Example 2, I’ve got seems to mean ‘I understand you’. Neither of these meaning senses would be common in the formal written mode.
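The coverage figures reported above for the most frequent words (the share of all tokens accounted for by the top one, three, or ten word types) are cumulative frequency counts. A minimal sketch of that calculation in Python follows; the function name `top_n_coverage` and the toy sentence are invented for illustration and are not BNC data, so the percentages it prints say nothing about real English.

```python
from collections import Counter


def top_n_coverage(tokens, n):
    """Share of all tokens accounted for by the n most frequent word types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / len(tokens)


# Toy text, invented for illustration only.
text = "the cat sat on the mat and the dog sat on the rug"
tokens = text.lower().split()

for n in (1, 3, 5):
    print(f"top {n}: {top_n_coverage(tokens, n):.1%} of {len(tokens)} tokens")
```

Even in this 13-token toy text, the single most frequent type (the) covers 4 of the 13 tokens, a small-scale version of the pattern in which a handful of frequent words accounts for a disproportionate share of running text.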
These insights about vocabulary frequency have immediate ramifications for research. The most important is that frequency must be considered in the selection of target words for vocabulary studies. Language learners typically acquire higher frequency vocabulary before lower frequency vocabulary, so matching the vocabulary frequency to the level of the participants in a study is important. For example, if your participants are beginners, higher frequency words will need to be selected. Another implication is that we need to be aware of the differences in spoken and written frequency and not assume that they are interchangeable. Thus, a study concerning spoken vocabulary will usually be best designed based on data from spoken, rather than written, corpora. It should also be noted that, in some sense, genre is relevant: some lexical items that are infrequent in general English (e.g. collocation) become much higher in frequency in specific genres, like
applied linguistics. There will be further detailed discussion of vocabulary frequency effects in Section 2.5. Another major vocabulary characteristic for which corpus data supplies information is the kind of lexical patterning described in the above section on formulaic language. It is also worth mentioning that corpus data has been influential in lexicography, as all of the major modern learner dictionaries have been based on corpus data.
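The advice above about matching target-word frequency to participant level is often put into practice by grouping words into 1,000-item frequency bands. The following is a minimal sketch, assuming a ranked frequency list is available; the function names, the example ranks, and the two-band cut-off for beginners are all hypothetical choices made for illustration.

```python
def frequency_band(rank, band_size=1000):
    """Map a 1-based frequency rank to its band: ranks 1-1000 -> 1, 1001-2000 -> 2, ..."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return (rank - 1) // band_size + 1


def select_targets(ranks, max_band):
    """Keep only the words whose frequency band is at or below max_band."""
    return sorted(word for word, rank in ranks.items()
                  if frequency_band(rank) <= max_band)


# Hypothetical ranks, invented for illustration (not taken from any real list).
ranks = {"know": 60, "holiday": 1850, "stimulate": 4200, "collocation": 14500}

# Candidate target words for a (hypothetical) beginner study: first two bands only.
print(select_targets(ranks, max_band=2))
```

In a real study the ranks would come from a corpus appropriate to the mode and genre under investigation, for the reasons given above.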
Quote 1.5
Reppen and Simpson on the value of corpora
Corpus linguistics provides an extremely powerful tool for the analysis of natural language and can provide tremendous insights as to how language use varies in different situations, such as spoken versus written, or formal interactions versus casual conversation. (2002: 92)
1.1.5
Vocabulary knowledge is a rich and complex construct
In addition to needing a large vocabulary size to function in a language, a person must also know a great deal about each individual lexical item in order to use it well. This is often referred to as the quality or ‘depth’ of vocabulary knowledge, and is as important as vocabulary size. Most laymen (including many teachers and learners) might consider a lexical item ‘learned’ if the spoken/written form and meaning are known. Furthermore, Brown (in press) found that the nine general English textbooks he analysed focused mainly on meaning and form, with some attention to grammatical function, to the exclusion of other types of word knowledge (see below). While it is true that the form-meaning link is the first and most essential lexical aspect which must be acquired, and may be adequate to allow recognition, much more must be known about lexical items, particularly if they are to be used productively.
Quote 1.6
Anderson and Freebody on breadth and depth of vocabulary knowledge
The first [type of vocabulary knowledge] may be called ‘breadth’ of knowledge, by which we mean the number of words for which the person knows at least some of the significant aspects of meaning ... . [There] is a second dimension of vocabulary knowledge, namely the quality or ‘depth’ of understanding. We shall assume that, for most purposes, a person has a sufficiently deep understanding of a word if it conveys to him or her all of the distinctions that would be understood by an ordinary adult under normal circumstances. (1981: 92–93)
There are a number of ways ‘depth of knowledge’ can be conceptualized. One is overall proficiency with a word, ranging from no knowledge at all to complete mastery. This ‘developmental’ conceptualization (Read, 2000) is typically measured along a scale. Examples of such scales include the Vocabulary Knowledge Scale (Paribakht and Wesche, 1997) and a four-stage scale used by Schmitt and Zimmerman (2002). (See Section 5.3.1 for a fuller description and evaluation of developmental scales.) A second way of conceptualizing vocabulary knowledge is by breaking it down into its separate elements, which could be described as a ‘component’ or ‘dimensions’ approach. The genesis of this approach is usually traced back to an article in 1976 by Jack Richards in TESOL Quarterly, where he discussed several assumptions about knowing vocabulary. His article attracted notice, and led Paul Nation to specify the kinds of knowledge one must have about a word in order to use it well. The original list included eight types of word knowledge:
● spoken form
● written form
● grammatical patterns
● collocations
● frequency
● appropriateness (register)
● meaning
● associations
(Nation, 1990: 31)
He presented a revised and expanded version in 2001, which is the best specification of the range of ‘word knowledge’ aspects to date (Table 1.3). These various types of word knowledge become important when teaching a language or developing research for a number of reasons. First, some of these word knowledge aspects are relatively amenable to intentional learning, such as word meaning and word form, while the more contextualized aspects, such as collocation and intuitions of frequency, are much more difficult to teach explicitly. They probably have to be acquired instead through massive exposure to the L2. Likewise, some aspects are relatively easy to measure in research (e.g. written form, meaning), while some are extremely difficult to capture (register, collocation). In addition, although all of the word knowledge types are learned concurrently, some are mastered sooner than others (Schmitt, 1998a). This has implications for research, as different vocabulary measures might be appropriate at the different stages of acquisition of an item. At the beginning of the incremental learning process, measuring the meaning-form link is probably most appropriate, but as the word becomes more established, it might be better to measure some of the contextual types of word knowledge (e.g. collocation) to determine the degree of higher-level mastery of a lexical item.
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_02_cha01.indd 16
Table 1.3 What is involved in knowing a word

Form     Spoken                                        R  What does the word sound like?
                                                       P  How is the word pronounced?
         Written                                       R  What does the word look like?
                                                       P  How is the word written and spelled?
         Word parts                                    R  What parts are recognizable in this word?
                                                       P  What word parts are needed to express this meaning?
Meaning  Form and meaning                              R  What meaning does this word form signal?
                                                       P  What word form can be used to express this meaning?
         Concept and referents                         R  What is included in the concept?
                                                       P  What items can the concept refer to?
         Associations                                  R  What other words does this make us think of?
                                                       P  What other words could we use instead of this one?
Use      Grammatical functions                         R  In what patterns does the word occur?
                                                       P  In what patterns must we use this word?
         Collocations                                  R  What words or types of words occur with this one?
                                                       P  What words or types of words must we use with this one?
         Constraints on use (register, frequency ...)  R  Where, when, and how often would we expect to meet this word?
                                                       P  Where, when, and how often can we use this word?

(Nation, 2001: 27).

Another way lexical items are mastered lies in the automaticity with which they can be recognized and produced. This approach has often been used in psycholinguistic experiments (Section 2.11), although often in studies where the focus was on some other aspect (e.g. the influence of the L1 on L2 processing) and where vocabulary was just an expedient linguistic element to measure. Measures of automaticity have just begun to be used in general vocabulary research (e.g. Siyanova and Schmitt, 2008). However, automaticity measures can be highly important if we wish to move away from declarative knowledge and explore procedural knowledge. As the essence of vocabulary mastery is the ability to use it fluently in communication (not the ability to talk about it metalinguistically), measures which tap into fluent
and accurate usage are crucial. Barcroft (2002) suggests that the mind has limited cognitive resources, and so if they are focused on one aspect (e.g. form), there will be less available to apply to other aspects (e.g. meaning). Thus, the more automatic some word knowledge aspects are, the more resources can be given to other aspects. Furthermore, these cognitive constraints are not limited only to vocabulary. The ability to use lexical items without thinking frees up resources for other language processes, e.g. planning, and so automatic lexical processing has benefits for language use as a whole. The above perspectives highlight the ways people gain better mastery over lexical items, i.e. knowing more lexical items, knowing more about each item, and being able to utilize the items more automatically. However, vocabulary mastery can also be considered in terms of the overall mental lexicon rather than individual items. The main method of exploring this lexical organization has been with word associations, a methodology probably best known for its application in the field of psychology. A stimulus word is given to participants and they are asked to respond with the first word or words which come to mind. For example, the stimulus word needle typically elicits the responses thread, pin(s), sharp, and sew(s). The assumption is that automatic responses which have not been thought out will consist of words which have the strongest connections with the stimulus word in the subjects’ mental lexicon. By analyzing associations, we can gain clues about the mental relationships between words and thus the organization of the mental lexicon. In general, we find that association responses exhibit a great deal of systematicity, i.e. many of the same responses are produced by a wide variety of participants, signalling similar lexical organization. We also find that nonnatives produce a wider variety of responses than natives, suggesting less-well-organized lexicons. 
These results will be discussed in more detail in Section 2.4. All of these dimensions of word mastery are interrelated, and are holistically connected. However, it is not possible to measure all of these dimensions individually in any kind of test that can be envisaged. Even if it were possible, the test battery would probably be too long and complex for research purposes, and would certainly be too extended to be of any kind of pedagogical use. Therefore, vocabulary researchers need to carefully consider which limited aspects they are going to measure in their vocabulary studies, and carefully consider the limitations and implications of their choices.
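The word association analysis described above amounts to tallying responses to a stimulus word and comparing how concentrated or varied they are across groups. The sketch below assumes a simple list of single-word responses per group; the function `response_profile` and both response lists are invented for illustration and do not reproduce data from any study.

```python
from collections import Counter


def response_profile(responses):
    """Summarize association responses: distinct types and share of the top response."""
    counts = Counter(r.lower() for r in responses)
    top_word, top_freq = counts.most_common(1)[0]
    return {
        "distinct": len(counts),                    # variety of responses
        "top": top_word,                            # most common response
        "top_share": top_freq / len(responses),     # how dominant it is
    }


# Hypothetical responses to the stimulus word "needle" (invented data).
group_a = ["thread", "thread", "pin", "sew", "thread", "sharp", "pin", "thread"]
group_b = ["thread", "doctor", "pain", "pin", "small", "sew", "hospital", "metal"]

print("group A:", response_profile(group_a))
print("group B:", response_profile(group_b))
```

In this invented data, group A gives four distinct responses with one of them accounting for half of all answers, while group B gives eight distinct responses. A more varied profile of the second kind is what the text associates with less systematic lexical organization.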
Quote 1.7
Read on depth of vocabulary knowledge
... learners need to have more than just a superficial understanding of the meaning [of a word]; they should develop a rich and specific meaning representation as well as knowledge of the word’s format features, syntactic functioning, collocational possibilities, register characteristics, and so on. (2004: 155)
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_02_cha01.indd 18
6/9/2010 1:58:07 PM
1.1.6
Vocabulary learning is incremental in nature
Vocabulary acquisition is incremental both in terms of acquiring an adequate vocabulary size, and in terms of mastering individual lexical items. The gradual acquisition of increasingly larger lexicons is well-illustrated in a study by Henriksen (2008). She measured the L2 vocabulary size of Danish EFL students, but almost uniquely, she also measured their L1 size. She found that there was consistent improvement in vocabulary size across the increasing grades in both the L1 and L2, although this growth was achieved over an extended period of time. Unsurprisingly, she also found the L1 scores were larger than the L2 scores, even though the L1 test included very low-frequency items compared to the L2 test.
Test | Grade 7 M | Grade 7 SD | Grade 10 M | Grade 10 SD | Grade 13 M | Grade 13 SD
L1 | 50.2 | 18.4 | 83.5 | 18.7 | 102.1 | 10.1
L2 | 33.8 | 22.4 | 71.9 | 20.6 | 94.8 | 14.7
Each grade had 29 Danish informants. Max. score = 120.
The above are total vocabulary size estimates, but the Vocabulary Levels Test (Section 5.2.3) which Henriksen used also provides a profile of how much learners know at various frequency levels. These profiles similarly show the gradual growth of vocabulary through the various frequency levels.
Number of participants mastering [a] each frequency level (N = 29 in each grade)
Level | Grade 7 | Grade 10 | Grade 13
< 2,000 [b] | 24 | 11 | 0
2,000 | 5 | 12 | 8
3,000 | 0 | 2 | 11
5,000 | 0 | 4 | 5
10,000 | 0 | 0 | 5
[a] Mastery was set at scoring 26 out of 30 items.
[b] These students failed to meet the criterion of 26 correct items on the 2,000 level.
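The mastery criterion in the table above (26 correct out of 30 items per level) can be expressed as a simple rule. This is a hedged sketch rather than Henriksen's actual scoring procedure: the learner scores are invented, the function names are hypothetical, and assigning each learner to the highest level at which the criterion is met is an assumption about how such a table might be tallied.

```python
# Only the 26/30 criterion and the level labels come from the text;
# everything else here is a hypothetical illustration.
MASTERY_CRITERION = 26
LEVELS = ["2,000", "3,000", "5,000", "10,000"]  # ordered from high to low frequency

def mastered_levels(scores):
    """Return the levels whose section score meets the mastery criterion."""
    return [level for level in LEVELS if scores.get(level, 0) >= MASTERY_CRITERION]

def highest_mastered_level(scores):
    """Return the highest level mastered, or '< 2,000' if none is mastered."""
    mastered = mastered_levels(scores)
    return mastered[-1] if mastered else "< 2,000"

# Invented section scores (out of 30) for one learner.
learner = {"2,000": 29, "3,000": 27, "5,000": 18, "10,000": 6}
profile = mastered_levels(learner)        # levels at or above the criterion
placement = highest_mastered_level(learner)
```

Reporting the full profile alongside the single placement keeps the incremental picture visible, since a learner just below the criterion at a level still has substantial partial knowledge there.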
Considering the incremental acquisition of individual lexical items, it is well-established that individual lexical items need to be met many times in
order to be learned (Nation, 2001). Thus, it is obvious that lexical items cannot be fully learned from only a single exposure, yet much research and testing seems to revolve around the assumption that they can. Much vocabulary research discusses items as being either ‘unknown’ or ‘learned’ depending on whether a test item was answered correctly or not. I think this is unfortunate, as the true underlying vocabulary learning process is clearly incremental in nature. Moreover, it is incremental in a variety of ways. Let us examine this in more detail.
We have seen that complete mastery of an item entails a number of types of word knowledge, as shown in Table 1.3, not all of which can be learned completely from a few exposures. Experience has shown that some are mastered before others. For example, learners will surely know a word’s basic meaning sense before they have full collocational competence. However, at the moment it is difficult to confidently say much about how the different word knowledge types develop in relation to each other, simply because there is a shortage of studies which look at the acquisition of multiple types of word knowledge concurrently. Those that do exist (Pigada and Schmitt, 2006; Schmitt, 1998a; Schmitt and Meara, 1997; Webb, 2005, 2007a, 2007b) seem to confirm that some word knowledge types do develop before others, but it is difficult to come to any conclusion about an overall pattern.
However, based on such studies and my own understanding of vocabulary, I would suggest the following scenario. On the first exposure to a new word, all that is likely to be picked up is some sense of word form and meaning. If the exposure was oral, the person might remember the pronunciation of the whole word, but might only remember what other words it rhymes with or how many syllables it has. If the exposure came from a written text, the person may only remember the first few letters of the word, or its broad structural outline.
Since it was only a single exposure, it is only possible to gain the single meaning sense which was used in that context. There is also the possibility that the word class was noticed, but not much else. As the person gains a few more exposures, these features will start to be consolidated, and perhaps some other meaning senses will be encountered. But it will probably be relatively late in the acquisition process before a person develops intuitions about the word’s frequency, register constraints, and collocational behavior, simply because these features require a large number of examples to determine the appropriate values. This account allows for a great deal of variability in how individual lexical items are learned, but the key point is that some word knowledge aspects develop before others. Thus, vocabulary learning is incremental because some types of word knowledge are established before others. However, I would also argue that each individual type of word knowledge is learned incrementally as well. As part of Henriksen’s (1999) description of the incremental development of vocabulary knowledge, she proposes that learners have knowledge of any
lexical aspect which ranges from zero to partial to precise. This would mean that all word knowledge ranges on a continuum, rather than being known versus unknown. Even knowledge as seemingly basic as spelling can behave in this manner, ranging on a cline something like this:
can’t spell word at all → knows some letters → phonologically correct
Other word knowledge aspects would follow a similar zero→partial→precise development. I found evidence for partial/precise degrees of knowledge in a study I made of advanced L2 learners at university level (Schmitt, 1998a). I followed their mastery of a number of word knowledge aspects for 11 words over the majority of an academic year. The students rarely knew all of the words’ derivational forms or meaning senses. They normally knew the word class of the stimulus word and one derivation, but rarely all of the four main forms (noun, verb, adjective, adverb). Likewise, they normally knew the core meaning sense, but almost never all of the possible senses. The association scores for my students generally became more native-like over time, indicating the words were gradually becoming better integrated into the students’ mental lexicons. All of this shows that learner knowledge of the various word knowledge aspects is often partially mastered, and that it takes time to develop each of these word knowledge aspects towards more precision.
One word knowledge component which is well-researched is meaning. From this body of research, it is clear that receptive mastery generally develops before productive mastery, although this may not be the case for every item. This is illustrated by studies which have compared the number of words known productively versus receptively. For example, Laufer (2005a) compared learners’ productive test scores on L1-L2 recall tests as a percentage of their receptive test scores on L2-L1 translation tests. She found productive/receptive ratios ranging from 16% at the 5,000 frequency level to 35% at the 2,000 level, while Fan (2000) found a range from 53% to 81% (mean 69.2%) for words taken from the 2,000, 3,000, and UWL levels. Laufer and Paribakht (1998) found an average ratio of 77% for Israeli EFL students and 62% for Canadian ESL students. While the ratios are highly dependent on the types of receptive/productive tests used (Laufer and Goldstein, 2004), it seems clear that a learner’s receptive lexicon is likely to be larger than his/her productive lexicon. See Sections 2.8, 5.2, and 5.3 for a detailed discussion of receptive versus productive mastery of vocabulary and tests thereof.
We also know that learners vary in their ability to use lexical items in written and spoken discourse, i.e. their orthographic and phonological mastery of items. Milton and Hopkins (2006) compared the written and
spoken English vocabulary sizes of Greek- and Arabic-speaking learners, and found that the written was generally larger (mean written size: 2,655 words; spoken: 2,260). The correlation between the two sizes was moderate (.68), but varied according to L1 (Greek .81; Arabic .65). The relationship between orthographic and phonological knowledge also varied according to proficiency: for both language groups, low scores tended to be associated with a greater tendency for phonological vocabulary knowledge to exceed orthographic vocabulary knowledge, but for high scorers, the reverse was true. Thus we cannot assume a straightforward relationship between the written and spoken knowledge of words in a learner’s lexicon, although these do seem generally to increase in parallel manner.
In sum, not only is vocabulary acquisition incremental, but it is incremental in a variety of ways. First, lexical knowledge is made up of different kinds of word knowledge and not all can be mastered simultaneously. Second, each word knowledge aspect may develop along a cline, which means not only is word learning incremental in general, but learning of the individual word knowledge aspects is as well. Third, each word knowledge type varies in the degree of receptive/productive mastery. Taken together, this indicates that word learning is a complicated, but gradual process.
The implication for research is that simple knows/doesn’t know descriptions of vocabulary knowledge (usually based only on the initial form-meaning link) are wholly inadequate for describing vocabulary knowledge. If, for practical reasons, only a single word knowledge aspect like meaning can be measured in a study, at a minimum, the results need to be interpreted in terms of an incremental learning perspective.
For example, instead of reporting that words in such studies are ‘learned’ because a form-meaning test item was correctly answered, it is better to interpret this result as showing that the word’s form-meaning link has been established at either the receptive or productive level, and acknowledge that this does not imply that other word knowledge aspects have been mastered. Thus, such a test item only indicates initial learning of a word. However, it is better to establish a norm in which multiple measures of vocabulary are used in studies to paint a more complete picture of vocabulary knowledge and acquisition. This could be in terms of receptive/productive mastery, different types of word knowledge, degree of mastery of an individual word knowledge aspect, contexts of use, etc., or some combination of these.
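The productive/receptive ratios cited earlier in this section are simply a productive score expressed as a percentage of the corresponding receptive score. A minimal sketch of that calculation follows, assuming one productive and one receptive score per frequency level; the scores and the function name are invented for illustration and do not reproduce any of the cited studies' data.

```python
def productive_receptive_ratio(productive_score, receptive_score):
    """Return the productive score as a percentage of the receptive score."""
    if receptive_score <= 0:
        raise ValueError("receptive_score must be positive")
    return 100 * productive_score / receptive_score

# Invented (productive, receptive) scores for one learner at two levels.
scores = {"2,000": (7, 20), "5,000": (4, 25)}

ratios = {
    level: round(productive_receptive_ratio(productive, receptive), 1)
    for level, (productive, receptive) in scores.items()
}
```

Reporting such a ratio per frequency level, rather than a single 'known' count, is one way of keeping the receptive and productive degrees of mastery distinct in a study's results. As the section notes, the value obtained depends heavily on the test formats used.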
Quote 1.8 Newton on research implications of the incremental nature of vocabulary acquisition
There is a need to develop instruments which are more sensitive to degrees of acquisition and to both receptive and productive vocabulary knowledge. (1995: 171)
1.1.7
Vocabulary attrition and long-term retention
Vocabulary acquisition is not a tidy linear affair, with only incremental advancement and no backsliding. All teachers recognize that learners forget material as well. This forgetting (attrition) is a natural fact of learning. We should view partial vocabulary knowledge as being in a state of flux, with both learning and forgetting occurring until the word is mastered and ‘fixed’ in memory. In Schmitt (1998a), I found that advanced L2 university students improved their knowledge of the meaning senses of target words about 2.5 times more than that knowledge was forgotten (over the course of one year), but this means there was some backsliding as well. Of course attrition can also occur even if vocabulary is relatively well known, such as when one does not use a second language for a long time, or one stops a course of language study.
Studies into attrition have produced mixed results, largely due to the use of different methods of measuring vocabulary retention (e.g. Bahrick, 1984; Hansen and McKinney, 2002; Weltens and Grendel, 1993). In general though, lexical knowledge seems to be more prone to attrition than other linguistic aspects, such as phonology or grammar. This is logical because vocabulary is made up of individual units rather than a series of rules, although we have seen that lexis is much more patterned than previously thought. It appears that receptive knowledge does not attrite dramatically, and when it does, it is usually peripheral words, such as low-frequency noncognates, which are affected (Weltens and Grendel, 1993). On the other hand, productive mastery is more likely to be lost (Cohen, 1989; Olshtain, 1989), although see Schmitt (1998a) for contrary results. There is some evidence that the rate of attrition is connected to proficiency level, with learners with larger vocabularies retaining more residual knowledge of their vocabulary (Hansen et al., 2002). Weltens, Van Els, and Schils (1989) found that most of the attrition for the participants in their study occurred within the first two years and then levelled off. Overall, once vocabulary is learned, it does not seem to ever completely disappear, as Bahrick (1984) found residual vocabulary knowledge in his informants even after 50 years of language disuse. It therefore is probably best to think of attrition in terms of loss of lexical access, rather than in terms of a complete elimination of lexical knowledge. See Section 5.6 for more on attrition and its measurement.
Quote 1.9 Hansen, Umeda, and McKinney on the absence of complete attrition even after long periods of language disuse
... even though access to lexical knowledge is lost, attriters may retain a substantial advantage in regaining that knowledge, in comparison with others who are learning the same words for the first time. (2002: 669)
1.1.8
Vocabulary form is important
As mentioned above, learning a lexical item is often conceptualized as learning its meaning. While learning meaning is undoubtedly an essential initial step, more precisely this involves developing a link between form and meaning. If one thinks about it, this linkage is the minimum specification for knowing a word. If a lexical form is familiar, but its meaning is not known, then this item is of no communicative use. Likewise, if a meaning is known, but not its corresponding form, then the item cannot be either recognized or produced. It is thus little wonder that most vocabulary materials attempt to teach this form-meaning link, and that most tests measure it in one way or another. However, another common assumption seems to be that meaning is the key component of this link, while the form element is often downplayed or disregarded. In fact, there is a large body of research indicating that L2 learners often have trouble with the word form. For example, Laufer (1988) studied words with similar forms and found that some similarities were particularly confusing for students, especially words which were similar except for suffixes (comprehensive/comprehensible) and for vowels (adopt/adapt). Similarly, Bensoussan and Laufer (1984) found that a mis-analysis of word forms, which looked transparent but were not, sometimes leads to misinterpretation. Their learners interpreted outline (which looks like a transparent compound) as ‘out of line’, and discourse (which looks as if it has a prefix) as ‘without direction’. Moreover, it is not only the forms of the words themselves which can lead to problems. Regardless of the word itself, if there are many other words which have a similar form in the L2 (i.e. large orthographic neighborhoods (Grainger and Dijkstra, 1992)), it makes confusion more likely. 
For example, the word poll may not be difficult in itself, but the fact that there are many other similar forms in English can lead to potential confusion (pool, polo, pollen, pole, pall, pill). One reason people can learn their L1 so easily is that the mind becomes attuned to the features and regularities in the L1 input (Doughty, 2003; Ellis, 2006b). This developmental sharpening applies to the word form as well, as people become attuned to the particular set of phonemes and graphemes in their L1, and the ways in which they combine. This specialization makes L1 processing efficient, but can cause problems when there is an attempt to process an L2 in the same way, even though this may be counterproductive because the languages have different characteristics. For example, English speakers use mainly stress to parse words in the speech stream, while French speakers rely more on syllable cues. Cutler and her colleagues have found that both French and English speakers used their L1 cue processing strategies when learning the other language as an L2, causing problems for both groups (e.g. Cutler, Mehler, Norris, and Segui, 1986; Cutler and Norris, 1988). The same type of mismatch has been found in the processing of written language, for example, between Chinese and
English (e.g. Koda, 1997, 1998). What this means is that learners not only have to learn new oral and written forms in the L2, but they may also have to develop a completely new way of processing those forms, one which is in opposition to the automatic processes in their L1. The effect of this shows up in laboratory experiments, where de Groot (2006) found that L2 words which match L1 orthographical and phonological patterns are easier to learn and are less susceptible to forgetting than L2 words which do not match the L1 patterns. Thus, while Ellis (1997) argues that form is mainly acquired through exposure, it is clear that this may not occur without problems in an L2. Consequently, vocabulary researchers must not take orthographic/phonological mastery of word form for granted, but must consider it an essential (and often problematic) component part of lexical learning. Another implication is that formal aspects of target vocabulary need to be carefully considered when designing vocabulary research, in order to control for the difficulty that accrues from lexical form.
Quote 1.10 Koda on the problems related to poor recognition of word form [I]nefficient orthographic processing can lead not only to inaccurate lexical retrieval, but to poor [reading] comprehension as well. (1997: 35)
1.1.9
Recognizing the importance of the L1 in vocabulary studies
There is no doubt from research that the L1 exerts a considerable influence on the learning and use of L2 vocabulary in a number of ways (Ringbom, 2007; Swan, 1997). To start with, for learners studying an L2 through junior high school, senior high school, and university, the sizes of the L1 and L2 lexicons correlate strongly (.61–.75), showing parallel growth to a large extent (Henriksen, 2008, see Section 1.1.6). In terms of learner output, Hemchua and Schmitt (2006) studied the lexical errors in Thai university EFL compositions, and found that nearly one-quarter were judged to be attributable to L1 influence. But for verb-noun collocation errors in particular, the percentage may be over 50% (Nesselhauf, 2003). Learners also typically employ their L1 in learning an L2, most noticeably in the consistently high usage of bilingual dictionaries (Schmitt, 1997). They also strongly believe that translating helps them acquire English language skills such as reading, writing, and particularly vocabulary words, idioms, and phrases (Liao, 2006). But perhaps the best evidence for L1 influence comes from psycholinguistic studies, which demonstrate that the L1 is active during L2 lexical processing
in both beginning and more advanced learners (e.g. Hall, 2002; Jiang, 2002; Sunderman and Kroll, 2006). The importance of the L1 is highlighted by a number of second-language vocabulary acquisition studies. Prince (1996) found that more newly learned words could be recalled using L1 translations than L2 context, particularly for less proficient learners. With secondary school Malaysian learners, using L1 translations was tremendously more effective than providing L2-based meanings (Ramachandran and Rahim, 2004). Laufer and Shmueli (1997) found the same trend with Hebrew students. Lotto and de Groot (1998) found that L2-L1 word pairs lead to better learning than L2-picture pairs, at least for relatively experienced foreign-language learners. The ubiquitous influence of the L1 on L2 vocabulary learning and use needs to be taken into consideration in vocabulary research. It has important implications for the selection of target vocabulary, as words which are cognates are typically easier than non-cognates. Also, words which follow the orthographic/phonologic regularities of the L1 will normally be easier than those which do not. In addition, the L1 needs to be considered in selecting measurement formats. For example, it has been hypothesized that the initial form-meaning link consists of the new L2 word form being attached to a representation of the corresponding L1 word which already exists in memory (Hall, 2002), and so L1 translation tasks would be a natural task for measuring this. See Section 2.6 for more discussion of L1 lexical influence.
Quote 1.11 Swan on L1 influence on second-language vocabulary The mother tongue can influence the way second-language vocabulary is learnt, the way it is recalled for use, and the way learners compensate for lack of knowledge by attempting to construct complex lexical items. (1997: 179)
1.1.10
Engagement is a critical factor in vocabulary acquisition
It is a commonsense notion that the more a learner engages with a new word, the more likely he/she is to learn it. A number of attempts have tried to define this notion of engagement more precisely. Craik and Lockhart’s (1972) Depth/Levels of Processing Hypothesis laid the basic groundwork by stating that the more attention given to an item, and the more manipulation involved with the item, the greater the chances it will be remembered. Laufer and Hulstijn (2001; also Hulstijn and Laufer, 2001) refined the notion further and suggested that the total involvement for vocabulary learning consists of three components: need, search, and evaluation. Need is the requirement
for a linguistic feature in order to achieve some desired task, such as needing to know a particular word in order to understand a passage. Search is the attempt to find the required information, e.g. looking up the meaning of that word in a dictionary. Evaluation refers to the comparison of the word, or information about a word, with the context of use to see if it fits or is the best choice. The authors found some support for their hypothesis: learners writing compositions remembered a set of target words better than those who saw the words in a reading comprehension task, and learners who supplied missing target words in gaps in the reading text remembered more of those words than learners who read marginal glosses. In both comparisons, the ‘better learning’ case had higher involvement according to the Laufer and Hulstijn scheme. They also reviewed a number of studies and found that the tasks with relatively more need, search, and evaluation elements were more effective (Table 1.4).
Table 1.4 Relative effectiveness of vocabulary learning methods
The more effective task | The less effective task | Study
Meaning selected from several options | Meaning explained by synonym | Hulstijn, 1992
Meaning looked up in a dictionary | Reading with/without guessing | Knight, 1994; Luppescu and Day, 1993
Meaning looked up in a dictionary | Meaning provided in a marginal gloss | Hulstijn, Hollander, and Greidanus, 1996
Reading and a series of vocabulary exercises | Reading only (and inferring meaning) | Paribakht and Wesche, 1997
Meaning negotiated | Meaning not negotiated | Newton, 1995
Negotiated input | Premodified input | Ellis, Tanaka, and Yamazaki, 1994
Used in original sentences (oral task) | Used in non-original sentences | Joe, 1995, 1998
Interactionally modified output | Interactionally modified input | Ellis and He, 1999
Used in a composition (L1-L2 look up) | Encountered in a reading task (L2-L1 look up) | Hulstijn and Trompetter, 1998
Reading, words looked up in a dictionary | Reading only, words not looked up | Cho and Krashen, 1994
(Laufer and Hulstijn, 2001: 13).
While this is almost certainly true, research also shows that many other factors make a difference as well. For example, while Laufer and Hulstijn’s Involvement Load Hypothesis is useful for materials writers to set up good materials which can facilitate incidental vocabulary learning, it does not fully take the student into account. The ‘need’ component does have a
motivational aspect, because strong need is when learners decide that a lexical item is necessary for some language task they wish to do. However, Joe (2006) points out that students can scan, engage, and interpret in many different ways regardless of material design, and there is little way to know in advance exactly how. Moreover, students’ strategic behavior has an effect as well. It appears that vocabulary learning is part of a cyclical process where one’s self-regulation of learning leads to more involvement with and use of vocabulary learning strategies, which in turn leads to better mastery of their use. This better mastery enhances vocabulary learning, the effectiveness of which can then be self-appraised, leading to a fine-tuning of self-regulation and the start of a new cycle (Tseng and Schmitt, 2008). (See Section 2.9 for more on vocabulary learning strategies and self-regulation.) Furthermore, Folse (2006) suggests that the number of exposures to the target items may be at least as important as the type of learning activity.
There are a range of other factors which recur throughout the literature as facilitating vocabulary learning, including the following:
– increased frequency of exposure
– increased attention focused on lexical item
– increased noticing of lexical item
– increased intention to learn lexical item
– a requirement to learn lexical item (by teacher, test, syllabus)
– a need to learn/use lexical item (for task or for a personal goal)
– increased manipulation of lexical item and its properties
– increased amount of time spent engaging with lexical item
– amount of interaction spent on lexical item
Overall, it seems that virtually anything that leads to more exposure, attention, manipulation, or time spent on lexical items adds to their learning.
In fact, even the process of being tested on lexical items appears to facilitate better retention, as research designs which include multiple posttests usually lead to better results on the final delayed posttest compared to similar designs with fewer or no intermediate posttests (e.g. Mason and Krashen, 2004). Previously, there was no one cover term that encompassed all of these involvement possibilities, and so in an overview of instructed second-language vocabulary instruction, I proposed the term engagement (Schmitt, 2008). Because it has such an important influence on vocabulary learning, it needs to be especially carefully controlled in vocabulary research. Comparison conditions need to be as equivalent as possible in terms of the number of exposures, total time of exposure, type of manipulation, and even the number of interim tests before the final assessment.
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
28
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_02_cha01.indd 28
6/9/2010 1:58:08 PM
Vocabulary Use and Acquisition
Quote 1.12
29
Schmitt on the importance of engagement
In essence, anything that leads to more and better engagement should improve vocabulary learning, and thus promoting engagement is the most fundamental task for teachers and materials writers, and indeed, learners themselves.
1.2
Vocabulary and reading
It is beyond the scope of this book to provide a detailed overview of vocabulary in all four skills. However, the vast majority of skills/vocabulary research has focused on reading. I will briefly survey the vocabulary/reading research as a means of giving a flavor of the kind of lexical research which can be done in relation to the skills. Readers interested in vocabulary and the other skills are directed to Chapters 4 and 5 in Nation (2001).
Concept 1.1
Incidental learning
Incidental learning is learning which accrues as a by-product of language usage, without the intended purpose of learning a particular linguistic feature. An example is any vocabulary learned while reading a novel simply for pleasure, with no stated goal of learning new lexical items.
The effectiveness of incidental vocabulary learning from reading Early research on vocabulary acquisition from incidental exposure in reading found a discouragingly low pickup rate, with about 1 word being correctly identified out of every 12 words tested (Horst, Cobb, and Meara, 1998). However, the early studies typically had a number of methodological weaknesses, including very small amounts of reading, insensitive measurement instruments, inadequate control of text difficulty, small numbers of target words, and no delayed posttests. More recent studies which have addressed some or all of these problems have found more gains from reading than previous studies indicated. Horst et al. (1998) found learning of about 1 new word out every 5, and that this learning persisted over a period of at least ten days. Horst (2005) found that her participants learned well over half of the unfamiliar words they encountered in their extensive reading. Pigada and Schmitt (2006) studied the learning of spelling, meaning, and grammatical characteristics during a one-month extensive reading case study. They found that 65% of the target words were enhanced on at least one of these word knowledge types, for a pickup rate of about 1 of every 1.5
(2008: 339–340)
words tested. Spelling was strongly enhanced, even from a small number of exposures, while meaning and grammatical knowledge were enhanced to a lesser degree. Brown, Waring, and Donkaewbua (2008) found encouraging amounts of durable incidental vocabulary learning in terms of recognition of word form and recognition of meaning in a multiple-choice test, but far less in terms of being able to produce the meaning in a translation task. Waring and Takaki (2003) also found stronger gains and retention for recognition than recall knowledge. Their Japanese participants recognized the meaning of 10.6 out of 25 words on an immediate multiple-choice test, but were only able to provide a translation for 4.6 out of 25. However, after three months, while the recognition of meaning score dropped to 6.1, the translation score dropped much more sharply to 0.9. This indicates that incidental vocabulary learning from reading is more likely to push words to a partial rather than full level of mastery, and that any recall learning is more prone to forgetting than recognition learning.
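The contrast between recognition and recall in the Waring and Takaki (2003) figures is easier to see as proportions. The following is a minimal sketch that only re-expresses the four scores quoted above; the variable and function names are illustrative, and "retained" here simply means the delayed score divided by the immediate score, which is an assumption of this sketch rather than a measure used in the original study.

```python
# Scores quoted in the text from Waring and Takaki (2003), out of 25 target words.
TOTAL_WORDS = 25
scores = {
    "recognition (multiple choice)": {"immediate": 10.6, "three_months": 6.1},
    "recall (translation)": {"immediate": 4.6, "three_months": 0.9},
}


def retention(immediate, delayed):
    """Share of the immediate score still present at the delayed test."""
    return delayed / immediate


summary = {}
for measure, s in scores.items():
    summary[measure] = {
        "immediate_pct": 100 * s["immediate"] / TOTAL_WORDS,
        "delayed_pct": 100 * s["three_months"] / TOTAL_WORDS,
        "retained_pct": 100 * retention(s["immediate"], s["three_months"]),
    }

for measure, row in summary.items():
    print(
        f"{measure}: {row['immediate_pct']:.1f}% of target words at once, "
        f"{row['delayed_pct']:.1f}% after three months "
        f"(retained {row['retained_pct']:.1f}% of the initial gain)"
    )
```

On these figures, somewhat more than half of the initial recognition gain survives three months, against roughly a fifth of the initial recall gain, which is the asymmetry the paragraph above describes.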
Quote 1.13 Nation on the relationship between vocabulary and reading
Research on L1 reading shows that vocabulary knowledge and reading comprehension are very closely related to each other ... This relationship is not one directional. Vocabulary knowledge can help reading, and reading can contribute to vocabulary growth. (2001: 144)
Number of exposures necessary to promote incidental learning from reading
An important issue related to lexical acquisition from reading is the number of exposures which are necessary to push the incremental learning of a word forward, especially in a way that is durable. Webb (2007a) compared the learning of words from the study of L2-L1 word pairs, both with and without the addition of a single example sentence. The results for the two conditions were the same, indicating that a single context had little effect on gaining vocabulary knowledge. Beyond a single exposure, learning increases, but there does not appear to be any firm threshold when it is certain. At the lower end of the frequency spectrum, Rott (1999) found that 6 exposures led to better learning than 2 or 4 exposures. Pigada and Schmitt (2006) found that there was no frequency point where the acquisition of meaning was assured, but by about 10+ exposures, there was a discernable rise in the learning rate. However, even after 20+ exposures, the meaning of some words eluded their participant. Waring and Takaki (2003) found it took at least 8 repetitions in order for learners to have about a 50% chance of recognizing
a word’s form, or its meaning on a multiple-choice test, three months later. However, even if a new word was met 15–18 times, there was less than a 10% chance that a learner would be able to give a translation for it after three months, and no words met fewer than 5 times were successfully translated. Horst et al. (1998) also found that words appearing 8 or more times in their study had a reasonable chance of being learned, while Webb (2007b) found that 10 encounters led to sizeable learning gains across a number of word knowledge types. Of course, learning a word depends on more than just the frequency of exposure. The quality of engagement will obviously also play a part. Furthermore, Zahar, Cobb, and Spada (2001) suggest that the number of encounters needed to learn a word might depend on the proficiency level of the learners, because more advanced learners who know more words seem to be able to acquire new words in fewer encounters. Nevertheless, the research seems to suggest that 8–10 reading exposures may give learners a reasonable chance of acquiring an initial receptive knowledge of words. Taken together, the research confirms that worthwhile vocabulary learning does occur from reading. However, the pickup rate is relatively low, and it seems to be difficult to gain a productive level of mastery from just exposure. Hill and Laufer (2003) estimate that, at the rates of incidental learning reported in many studies, an L2 learner would have to read over 8 million words of text, or about 420 novels, to increase his/her vocabulary size by 2,000 words. This is clearly a daunting prospect, and thus it is probably best not to rely upon incidental learning as the primary source of the learning for new words. Rather, incidental learning from reading seems to be better at enhancing knowledge of words which have already been met.
This conclusion is congruent with Waring and Takaki’s (2003) findings that reading graded readers does not lead to the learning of many new words, but that it is very useful in developing and enriching partially-known vocabulary. Studies with a variety of test types have shown that exposure leads to improvements in multiple types of word knowledge. Also, given that repetition is key to learning words, the benefits of repeated exposures in different contexts for consolidating fragile initial learning and moving it along the path of incremental development should not be underestimated.
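The scale of the Hill and Laufer (2003) estimate can be unpacked with some simple arithmetic. The sketch below takes the three figures quoted above at face value (treating "over 8 million words" as exactly 8 million, so the results are lower bounds); the daily reading rate passed to `years_needed` is a purely hypothetical value chosen for illustration and does not come from the study.

```python
# Figures quoted in the text from Hill and Laufer (2003).
WORDS_OF_TEXT = 8_000_000   # "over 8 million words", treated here as a lower bound
NOVELS = 420
VOCAB_GAIN = 2_000          # new words added to vocabulary size

words_per_novel = WORDS_OF_TEXT / NOVELS        # implied average novel length
text_per_new_word = WORDS_OF_TEXT / VOCAB_GAIN  # running words read per word learned
new_words_per_novel = VOCAB_GAIN / NOVELS       # words learned per novel


def years_needed(words_read_per_day, total_words=WORDS_OF_TEXT):
    """Years of daily reading needed to cover total_words at a given rate."""
    return total_words / (words_read_per_day * 365)


print(f"Implied novel length: about {words_per_novel:,.0f} words")
print(f"Text read per new word: {text_per_new_word:,.0f} running words")
print(f"New words per novel: about {new_words_per_novel:.1f}")
# Hypothetical learner reading 5,000 words every single day:
print(f"Years at 5,000 words/day: about {years_needed(5_000):.1f}")
```

Put this way, the estimate amounts to fewer than five new words per novel, which makes clear why the text recommends against relying on incidental learning as the primary source of new vocabulary.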
Quote 1.14 Webb on the value of repetition in reading for vocabulary learning
Repetition affects incidental vocabulary learning from reading. Learners who encounter an unknown word more times in informative contexts are able to demonstrate significantly larger gains in [various] vocabulary knowledge types than learners who have fewer encounters with an unknown word. (2007b: 64)
Extensive reading
One way of maximizing the vocabulary learning benefits from reading is to organize an extensive reading component (Day and Bamford, 1998). Although readers need to know 98–99% of the words in a text, many authentic texts will still be suitable for more advanced learners, especially if teachers provide support for the more difficult vocabulary (see below). However, for developing learners, the vocabulary load will probably be too high in authentic texts, and so the use of graded readers is recommended, as the vocabulary load is both fine-tuned for the learner’s level, and systematically recycled (Nation and Wang, 1999). Graded readers used to have a bad reputation for being boring and poorly written, but that is no longer the case, with several major publishers providing a series of interesting and well-presented readers. Most importantly, research shows that substantial vocabulary learning can be derived from graded readers. For example, Horst (2005) found that her participants learned over half of the unfamiliar words they encountered in the graded readers they read. Likewise, Al-Homoud and Schmitt (2009) found that Saudi learners in a short ten-week course incorporating extensive reading and graded readers increased their vocabulary at the 2,000, 3,000, and 5,000 frequency levels, as well as improving their reading speed and attitudes towards reading. Unsurprisingly, the amount of reading is key: of ten variables entered into a regression analysis, only the amount of extensive reading done during a two-month course came up as a significant predictor of gain scores in overall language proficiency (Renandya, Rajan, and Jacobs, 1999).
Lexical inferencing
While extensive reading programs can maximize the amount of exposure, it is possible to help learners utilize that exposure more effectively. One way is to train them in lexical inferencing (Haastrup, 1991: 13):
The process of lexical inferencing involves making informed guesses as to the meaning of a word in the light of all available linguistic cues in combination with the learner’s general knowledge of the world, her awareness of the co-text and her relevant linguistic knowledge.
In Haastrup’s view, it is clear that lexical inferencing is much more than merely ‘guessing from context’, as learners use both their existing knowledge and the textual context to guess the meaning of unknown lexical items. It is probably best to think of lexical inferencing as qualified guessing of the meaning of lexical items in context, rather than guessing from context, as contextual cues are only one of several knowledge sources. Learners typically rate lexical inferencing as a useful strategy (Schmitt, 1997; Zechmeister, D’Anna, Hall, Paus, and Smith, 1993) and research has shown that it is one of the most frequent and preferred strategies for learners when dealing with unknown words in reading. In one study, Paribakht and Wesche (1999) found that their university ESL students used inferencing in about 78% of all cases where they actively tried to identify the meanings of unknown words, while Fraser (1999) found that her students used inferencing in 58% of the cases where they encountered an unfamiliar word. It also seems to be a major strategy when learners attempt to guess the meaning of phrasal vocabulary, at least for idioms (Cooper, 1999). Unfortunately, this does not mean that it is necessarily effective. Nassaji (2003) found that of 199 inferences, learners only made 51 (25.6%) that were successful, and another 37 (18.6%) that were partially successful. This low success rate is similar to the 24% rate that Bensoussan and Laufer’s (1984) learners achieved. In an extensive cross-sectional study, Haastrup (2008) studied the lexical inferencing success of young Danish learners of English, in both their L1 and L2, in Grades 7, 10, and 13.
        Grade 7     Grade 10    Grade 13
L2      16.83%      37.27%      48.10%
L1      28.93%      50.07%      58.80%
Unsurprisingly, she found that her participants’ L1 lexical inferencing was better than their L2 inferencing, but she also found increasing success as the learners matured, both in the L1 and the L2. However, by Grade 13, the lexical inferencing success rate had still only improved to the region of 50%. One of the reasons for this relatively poor rate is that learners often confuse unknown words with words of a similar form which they already know (Nassaji, 2003), again highlighting the importance of form in learning vocabulary. Other factors include the percentage of unknown words in the text, word class of the unknown words, and learner proficiency. Liu and Nation (1985) found that unknown words embedded in a text where 96% of the other words were known were guessed more successfully than unknown words in a text with only 90% known. They also found that verbs were easier to infer than nouns, and nouns easier than adjectives or adverbs. Finally, they found that higher proficiency learners successfully inferred 85–100% of the unknown words, while the lowest proficiency learners only inferred 30–40% successfully. This uneven success in lexical inferencing suggests that these skills need to be taught. Two meta-analyses (Fukkink and De Glopper, 1998; Kuhn and Stahl, 1998) and an overview (Walters, 2004) have found a positive effect for instruction in the use of context. Both meta-analyses found that context clue instruction was as effective as, or more effective than, other forms of instruction (e.g. cloze exercises, general strategy instruction), but the inferencing improvement may be mostly about attention given to the inferencing process, as Kuhn and Stahl concluded that there was little difference between teaching
learners inferencing techniques and just giving them opportunities to practise guessing from context. Walters (2006) found that learners of different proficiencies seemed to benefit from different approaches, with beginning learners benefiting most from instruction in a general inferencing procedure (Clarke and Nation, 1980), and more advanced learners benefiting more from instruction in the recognition and interpretation of context clues. She also found that instruction in inferencing may do more to improve reading comprehension than the ability to infer word meaning from context.
Glossing
Another way to help learners utilize reading exposure better is to give them information about unknown words in the text. One way this can be done in teacher-prepared texts is with glossing. Nation (2001) believes there are several reasons why glossing can be useful: more difficult texts can be read, glossing provides accurate meanings for words that might not be guessed correctly, it has minimal interruption to reading – especially compared to dictionary use, and it draws attention to words which should aid the acquisition process. Research tends to support these views. Hulstijn (1992) found that glosses helped prevent learners from making erroneous guesses about unknown words, which is important because learners seem reluctant to change their guesses once made (Haynes, 1993). Moreover, Hulstijn et al. (1996) found that L2 readers with marginal glosses learned more vocabulary than dictionary-using readers, or readers with no gloss/dictionary support. (This is mainly because the L2 readers used the glosses more than their dictionaries. However, when the readers did use their dictionaries, the results were better than for using glosses, which is why Laufer and Hulstijn (2001) judged that dictionary use is more effective; see Table 1.4, this volume.) But how and where to gloss?
Research indicates that it does not matter much whether the gloss is an L2 description or an L1 translation, as long as the learner can understand the meaning (Jacobs, Dufon, and Fong, 1994; Yoshii, 2006), which suggests that there is no reason not to use L1 glosses with less proficient learners. Glosses just after the target word do not seem to be very effective (Watanabe, 1997), but glosses in the margin, bottom of the page, or end of the text have similar effectiveness (Holley and King, 1971). As learners seem to prefer marginal glosses, this is probably the best place for them (Jacobs et al., 1994). If phrasal vocabulary is being glossed, it helps to make the phrases more salient by highlighting their form (e.g. by printing them in color, and/or underlining them), so that the learner can recognize them as chunks (Bishop, 2004).
Supplementing incidental vocabulary acquisition with explicit activities
Glossing is one way of focusing explicit attention on lexical items during reading where otherwise only incidental learning would occur.
Furthermore, it is effective: reading with marginal glosses or referral to a dictionary leads to better receptive knowledge of words than reading alone (Hulstijn et al., 1996). But there are many other possibilities for adding an explicit learning component to reading, based on the general principle that intentional and incidental learning are complementary approaches which can be usefully integrated. Perhaps the most effective way of improving incidental learning is by reinforcing it afterwards with intentional learning tasks. Hill and Laufer (2003) found that post-reading tasks explicitly focusing on target words led to better vocabulary learning than comprehension questions which required knowledge of the target words’ meaning. Atay and Kurt (2006) found that young Turkish EFL learners who carried out reading comprehension and interactive tasks as post-reading activities outperformed students who did written vocabulary tasks, and that the interactive tasks were much more appealing for the young learners. Mondria (2003) gives a particularly good illustration of the value of post-reading exercises. Dutch students who inferred the meaning of French words from sentence contexts, and then verified the meaning with the aid of a word list before memorization, learned just as much vocabulary (about 50% of the target words on a two-week delayed receptive test) as students who were given a translation before memorization. This shows that incidental learning plus explicit follow-up (particularly the memorization element) can be just as effective as a purely explicit approach. However, it is not as time-effective, as the ‘translation + memorization’ method used 26% less time than the ‘incidental + follow-up’ method to achieve the same results. Moreover, although the greater engagement of reading + explicit attention leads to greater learning, that learning is still fragile and needs to be followed up.
Rott, Williams, and Cameron (2002) found that while reading + multiple-choice glosses led to better immediate scores than reading-only incidental learning, after five weeks the scores had decayed to the same level as the incidental learning condition. Thus, the improved learning gained from incidental exposure + supplementary tasks can be useful if subsequently consolidated and maintained, but if not followed up, the advantage may well be lost.
1.3 A sample of prominent knowledge gaps in the field of vocabulary studies
We have learned much about vocabulary since the blossoming of lexical research which Meara first noted in 1987, with the points noted above being just a sample of some of the important findings from the last 20 years. Moreover, we must not forget that a great deal was learned about vocabulary in the early and mid-1900s, much of which has been forgotten, and which is often presented as ‘new’ in more recent studies which are unaware of the earlier research. However, there are still a large number of important
lexical issues about which we have little or no understanding. This section will highlight some of what I feel are the biggest gaps. All of these would make excellent (and challenging) research topics for the ambitious vocabulary researcher.
● No overall theory of vocabulary acquisition
This has been often commented upon, and represents the Holy Grail of vocabulary studies. While we are gaining an increasing understanding of the development of some isolated aspects of vocabulary, the overall acquisition system is far too complex and variable for us to comprehend in its entirety, and so it still eludes description. It is difficult to visualize any particular study which could unlock the mystery; rather it will probably take a large number of studies using a combination of methodologies before the key developmental patterns become obvious. Some of the studies will need to be done in the actual environments in which learners learn (e.g. classrooms, private reading). There will also need to be experiments done in the laboratory, where the large number of potential learning variables can be better controlled. Computer simulations of learning models can be very useful, both because they make us think carefully about the assumptions we make about acquisition (in order to write the programming rules), and because a vast number of trials can be run (without the need for finding vast numbers of participants) (e.g. Meara, 2004, 2005, 2006; see Section 2.10). In addition, neurolinguistics is now beginning to shed light on the physiological underpinnings of language acquisition and use. Combining the strengths of these diverse research paradigms offers the best chance of understanding vocabulary acquisition well enough to formulate an explanation of its mechanisms.
● The relationship between receptive and productive mastery of vocabulary
An important part of the overall development of vocabulary is the movement from no knowledge to receptive mastery to productive mastery. Although we know that receptive mastery usually precedes productive mastery, it is unclear how the process proceeds, or exactly what input/practice is required to initiate it. The relationship between the two has been seen by some as a continuum (e.g. Melka, 1997), where gradually increasing knowledge helps one move from receptive to productive mastery. In contrast, Meara (1990) has argued that the two might be quantifiably different, perhaps depending on an item’s status within the lexical network. One of the problems in describing receptive and productive mastery lies in the difficulty of measurement. Waring (1999) found that the relationship between the two largely depends on the measurement instruments used. For example, if a researcher uses a relatively difficult receptive measurement and a relatively easy productive measure, it might even be found that productive mastery precedes receptive mastery! Thus research into receptive and productive mastery requires careful selection of the measurement instruments, and a careful interpretation of the results. Only by carrying out a number of studies with both receptive and productive measurements of the same lexical items can the true relationship between the two levels of mastery be clarified. Only after this descriptive stage will it then be possible to hypothesize about the acquisition mechanisms which can explain the results. See Section 2.8 for much more on the issue of receptive and productive mastery.
● Measuring the various word knowledge aspects
In order to better understand the nature and acquisition of vocabulary knowledge, it is necessary to develop measures for the different word knowledge aspects. While it is not practically possible, and probably not desirable, to measure all word knowledge aspects in any particular study, it is important to have a better understanding of how each of the various word knowledge aspects develops. For some aspects, like form-meaning, there are numerous measurement instruments which are commonly used. However, some aspects have hardly been researched at all and so no measurement instruments have been developed. Register is a good example of this, and we therefore have virtually no idea of how it develops. The acquisition of other aspects (e.g. collocation, intuitions of frequency, association) has received some attention, but there are still no accepted measurement instruments. Until valid and reliable measurements can be taken of more word knowledge aspects, it will be impossible to chart the incremental acquisition of overall vocabulary knowledge.
● Understanding implicit/procedural as well as explicit/declarative vocabulary knowledge
The vast majority of research into vocabulary involves measurement and discussion of explicit/declarative knowledge. For example, most vocabulary tests measure form and meaning, both lexical aspects which can be described by the learner (e.g. the French word merci means thank you in English). There has been much less research on how well the lexical items can be used in language use, that is, implicit/procedural knowledge. For example, this would include how well vocabulary is utilized when giving a speech. This research bias largely results from the relative ease of measuring explicit/declarative knowledge, compared to the difficulty of measuring vocabulary in use and the underlying implicit/procedural knowledge which makes this possible. It also stems from the belief that lexical knowledge is mainly declarative, ignoring the complex nature of word knowledge. A few researchers are now beginning to discuss the nature of the declarative/procedural and explicit/implicit distinctions in language learning in general (e.g. DeKeyser, 2003; Hulstijn, 2007) and it is time to begin applying this type of thinking to exploring vocabulary knowledge, especially as research methodologies from the fields of psycholinguistics and neurolinguistics now hold the promise of meaningful measurement of implicit/procedural knowledge. It is early days yet, but research into implicit/procedural knowledge can only help to clarify the full nature of what it means to know vocabulary, and the factors which allow this knowledge to be employed accurately, appropriately, and fluently by both L1 and L2 speakers in their language communication.
● Vocabulary in spoken discourse
There has been a great deal of research on reading’s relationship with vocabulary, and as a result, we know a great deal about the interaction between the two. However, there has been much less research on spoken discourse, again probably because it is harder to research. Therefore, there is a big gap in the field’s understanding of spoken discourse and vocabulary. We have little idea of how vocabulary is learned from listening, how many repetitions it requires, or what makes a word salient for learning in spoken discourse. Another big gap is the percentage of lexical coverage which is necessary for listening comprehension, that is, what percentage of lexical items in spoken discourse need to be known in order to comprehend it? This is critical, because as we have seen above, a few percentage points difference makes a huge difference in the amount of vocabulary required. If only 95% coverage were required, then something like 2,000–3,000 word families would be sufficient (Adolphs and Schmitt, 2003), but if 98% were necessary, this would entail a vocabulary size of 6,000–7,000 word families (Nation, 2006). Clearly, these are vastly different vocabulary size targets, which have huge implications for pedagogy and syllabus design.
● Measurement of vocabulary in free language production
Vocabulary studies have traditionally focused on receptive vocabulary. There could be several reasons for this. One could be that, because the receptive form-meaning link is the first step in learning vocabulary, it makes good sense to measure it. It must be said, however, that receptive measurement of this aspect has probably had more to do with the ease of measurement than with any theoretical consideration of the nature of vocabulary acquisition. Another reason could be that both participants and researchers know and are comfortable with the type of multiple-choice items which typically appear in receptive formats. But probably a more important reason is that receptive test formats usually offer researchers more control than productive ones. Read (2000: 9) describes one of the dimensions of vocabulary assessment as a continuum between selective measures (where specific vocabulary items are the focus of the assessment) and comprehensive measures (where the measure takes account of the whole vocabulary content of the input
material, e.g. a composition). Receptive test writers typically select target vocabulary, and there are numerous advantages to being able to do this, perhaps the greatest being a limiting of the scope of the enquiry. In an analysis of free written production, however, there is little control of the output vocabulary (other than limiting an examinee to all of the possible lexical items which could be used to discuss a particular topic), and one may therefore find a relatively unpredictable variety of lexical items. Because one cannot predict which items will appear with any precision (other than some necessary technical vocabulary particular to the topic area), it is not possible to define the behavior of specific lexical items in advance as part of a scoring scheme. This limitation has led to measures of free lexical output which focus on statistical analyses of the overall lexical output, e.g. the number of types and tokens produced, or the number of items produced from particular frequency bands. An example of this is Morris and Cobb’s (2004) analysis of entrance exam essays with the 2002 version of VocabProfiler. They found that resultant vocabulary profiles correlated with scores in a pedagogical grammar course, but only at between .34 and .37, or around 13% of the variance. The profiles could distinguish between the writing of successful native and nonnative TESL trainees, but this was limited to comparisons of the first 1,000 words, words from the AWL, and function/content words. Overall, the authors felt that the profiles could add useful information to other measurements, but could not stand on their own. This seems a fair evaluation, as my personal experience has shown the various kinds of profiling to be quite limited in the insights they can produce. For example, I find that compositions which appear clearly different in lexical terms often do not show much of a difference when analyzed according to such profile statistics.
It seems it doesn’t matter so much what particular lexical items are used, but rather if they are used appropriately for the particular context in terms of register, collocation, etc. Thus I feel that global measures of lexis (e.g. type-token ratios) will generally be less informative than measures of how appropriately the individual items are used. At the moment, there is no recognized measure of the appropriacy of written lexis in compositions, and this is a major gap in the field. Until we develop one, it will be very difficult to distinguish relatively better lexical performance in free composition writing from relatively weaker performance, at least in quantifiable and replicable terms. The same argument can also be made for the vocabulary in spoken output, and perhaps even more so. It must be said, however, that more detailed lexical profiling tools are continuously being developed, and they may well prove more informative than earlier tools. An especially good example of this continuing development is the Compleat Lexical Tutor by Tom Cobb, which now gives a frequency analysis of each 1,000 frequency band to the 20,000 level (available at ).
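The limitation of global profile statistics described above can be illustrated with a toy profiler. The following is a minimal sketch, not a reimplementation of VocabProfiler or the Compleat Lexical Tutor: the function names, the two tiny "frequency bands", and the example sentences are all invented for illustration, and the sketch counts raw word forms, whereas real profilers typically work with word families and full frequency lists.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word forms."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_profile(text, bands):
    """Return token count, type count, type-token ratio and band shares."""
    tokens = tokenize(text)
    types = set(tokens)
    counts = Counter()
    for tok in tokens:
        for band_name, band_words in bands.items():
            if tok in band_words:
                counts[band_name] += 1
                break
        else:
            counts["off-list"] += 1  # word form found in none of the bands
    n = len(tokens)
    return {
        "tokens": n,
        "types": len(types),
        "ttr": len(types) / n if n else 0.0,
        "band_share": {name: counts[name] / n for name in counts} if n else {},
    }


# Hypothetical miniature frequency bands, for illustration only.
toy_bands = {
    "band_1": {"she", "made", "the", "during", "tea", "rain"},
    "band_2": {"strong", "heavy"},
}

conventional = "She made strong tea during the heavy rain"
odd_collocations = "She made heavy tea during the strong rain"

profile_a = lexical_profile(conventional, toy_bands)
profile_b = lexical_profile(odd_collocations, toy_bands)

print(profile_a)
print("Identical profiles:", profile_a == profile_b)
```

Because both sentences contain exactly the same word forms, their profiles are identical, even though the second pairs *heavy* and *strong* with the wrong nouns. Statistics of this kind are blind to collocational appropriacy, which is the point made in the paragraph above.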
● The amount of formulaic language in language use
Research has clearly shown that there is a substantial amount of formulaic language in language (Biber, Johansson, Leech, Conrad, and Finegan, 1999; Erman and Warren, 2000; Foster, 2001), and that this is true in both written and spoken discourse. However, it is not yet clear what the proportion of formulaic language typically is compared to that creatively generated through grammar + vocabulary, and how this proportion varies according to the mode, genre, topic, or speaker. There must be a huge number of formulaic sequences in most languages, but we have no principled estimate of the size of the phrasal lexicon in English, or in any other language for that matter. It has been established that formulaic language has transactional and functional uses, but how much is required to operate in the four skills remains a mystery. Part of the problem is that there is no accepted methodology of how to identify and count formulaic items, which has led to a range of size estimates. Until an accepted categorization system appears, we will continue to have studies which are difficult to compare due to different counting methodologies. Another problem is that studies into formulaic language have almost always been corpus- and computer-based. Computers are amazingly effective at finding and counting any linguistic feature which can be unambiguously described. However, they are largely incapable of working with any feature that cannot be so described. Researchers of formulaic language usually ask a computer to find instances of contiguous words, because the computer has a hard time identifying probabilistic patterns where collocations can appear in any of a number of slots. Thus, most formulaic language research has researched only contiguous sequences (e.g. Biber et al., 1999). This is a problem, as many (most? – we simply do not know) formulaic sequences have slots into which one or more words can be inserted. (See Fellbaum, 2007, for one approach to this.)
a ——— ago (hour, day, year ...)
Would you please ———? (open the door, shut up ...)
——— (someone or something) thinks nothing of ——— (some unusual or unexpected activity)
Patterns with open slots have a lot of flexibility to match different language use situations, and so are likely to be very common. In fact, flexible formulaic sequences may well outnumber the fixed kind, and may have greater importance in language use. The mind doesn’t seem to have a problem working with these flexible slots, but until very recently, computer programs were not able to handle this variability. Fortunately, there are now several programs (e.g. kfNgrams, ConcGrams) which can work with open-slot phraseology, and over time, should provide a much fuller description of the
scope and nature of formulaic language. (See Section 6.3 for more information on these concordancing tools.)
● Not enough vocabulary research is replicated
As with most research in Applied Linguistics, vocabulary research suffers from the lack of replication to confirm and refine results. This often results in key information in the field being based on the findings from single studies. A good example of this was the finding that learners require knowledge of 95% of the words in a text in order to comprehend it adequately. Batia Laufer came to this conclusion after carrying out a small study with her students. She published it in a relatively obscure edited volume (Laufer, 1989), and never made great claims for it, other than as an initial finding based on her particular participants, instruments, and criteria. Yet as so often happens, the 95% ‘number’ was picked out and widely cited. The chaining of citations eventually led to the figure having an authority which was never intended, and a great deal of research was based upon it, at least until newer research came along. In a study published in 2000, Hu and Nation carried out a more carefully controlled analysis, and came up with a coverage figure closer to 98–99% for adequate comprehension. However, even here, these new results are based on very small participant numbers (66 in total, split into four groups of only 16–17). What is still needed in this case is two types of replication. The first is a ‘pure replication’ with the same type of participants with the same instruments, to discover whether the results of the original studies are sound and reliable. If not, then either the overall research designs and instrumentation are suspect, or there are hidden factors which are affecting the results. If the replications confirm the original results, then additional replications in different contexts are then warranted. These could include participants with different L1s or L2 proficiencies, learner contexts, etc. 
Since the generally correct answer to most SLA questions is ‘It depends’, it is necessary for researchers to explore the various factors which can affect results and describe their effect. Given the importance of replication, one may wonder why more are not done. The main reason is probably that journals are not inclined to publish replication studies, which has a chilling effect on their production, especially by established researchers who have a high-stakes interest in getting their work published. The exception is the journal Language Teaching, which now explicitly encourages submission of well-done replications. Overall, replication is an important part of the research cycle, and our area needs to be more creative in developing ways to encourage it. Perhaps a start would be not to use the term replication, which has for some a negative image, but speak rather of cycles of research, which more people might view favourably. More established researchers could build replication into their research studies, either in follow-up studies, or by doing several concurrent
studies incorporating different participants/contexts. This could be done in collaboration with international colleagues, perhaps even providing them with the design and instrumentation, with the offer to potentially co-publish the combined findings. Another obvious solution to the problem is to involve emerging researchers. Replications are a very good way to break into research, as the research design and analysis methodology (and in many cases the instruments as well) are already provided. Beginning researchers (e.g. BA and MA students) may well benefit from carrying out replications, and have a much better chance of being able to do good research, as it will be based on designs which have already been tried, and vetted by journals’ editors and reviewers. The field could benefit by having a steady stream of replications either confirming or disconfirming existing results. The replication findings could be disseminated in several ways: (a) in journals which allow replications, (b) by ‘packaging’ and reporting several similar replications together, which may make them more attractive to journals which have traditionally liked only ‘original’ research, (c) by publishing them on-line on their university’s research web pages, or (d) the students’ supervisors could refer to the replications in the supervisors’ own work.
● Vocabulary specialists in different fields do not talk to one another
There are many researchers who focus on vocabulary, but often in separate fields. For example, vocabulary specialists work with disabled patients in speech pathology. Bilingual specialists look at knowledge of vocabulary in two or more languages, often through laboratory research. Lexicographers decide which words belong in dictionaries and how they can best be defined. All of these fields have rich insights which could prove beneficial to lexical scholars and practitioners working in the other areas. However, it seems that researchers working in one field often do not search out and read the relevant lexical literature from another field. While this is inevitable at the practitioner level for reasons of time, vocabulary researchers owe it to themselves and their readers to cast their nets more widely, and take advantage of the wider lexical insights available.
In addition to these research gaps, Paul Nation lists numerous research topics on his website (http://www.victoria.ac.nz/lals/staff/paul-nation/vocrefs/researchlval.aspx). Some examples include:
● Make a replacement for the General Service List.
● Investigate the qualitative differences between receptive and productive vocabulary knowledge.
● What unique information do different techniques add to word knowledge? What common information do they add?
● Determine the factors influencing incidental vocabulary learning by using a message-focused computer game.
● How does learners’ focus of attention change as a text is listened to several times? Where does vocabulary fit in this range of focuses of attention?
● How can vocabulary learning from graded readers be optimized?
● What aspects of word knowledge are learned by guessing from context?
● Develop a list of frequent collocations using well-defined and carefully described criteria.
● Devise a well-based measure of total vocabulary size for non-native speakers.
● Measure the pattern of native-speaker and ESL non-native-speaker vocabulary growth.
Quote 1.15 Meara on the necessity of having vocabulary research mirror ‘real world’ conditions
One of the main shortcomings of ... [some vocabulary research] is that it has focused attention on the acquisition of vocabulary divorced from use or from real context. Many of the subjects tested in the methodological comparisons were not real language learners, the time-scale studied was short compared to the time it takes to learn a language, and the vocabularies learned were actually quite small in comparison to what a real language learner has to acquire to become fluent. There is a serious shortage of good research that has looked at the behavior of real language learners acquiring vocabularies over a long time-scale.
(1999: 565)
Part 2
Foundations of Vocabulary Research
2 Issues of Vocabulary Acquisition and Use
What’s a word? This seems like a simple question and almost any layman could come up with a description of word. It might be something like ‘a group of letters with an empty space on either side, which has a meaning’. For the layman, such a definition is probably perfectly adequate, as most people only need to conceptualise word well enough to look up ‘words’ in a dictionary, or to count how many ‘words’ are in a document they are writing. But if we want to push beyond this basic level of understanding, we must understand the intricacies of lexis and control for them in our vocabulary research designs. This section will discuss some of the interrelated characteristics of vocabulary you will need to address when dealing with words in your research.
Concept 2.1
Psycholinguistics
Psycholinguistics is the study of language acquisition, processing, and use through the use of theories and research tools drawn from the field of psychology.
Psycholinguistic research has taken the lead in describing the various characteristics of vocabulary, because these characteristics need to be carefully controlled in order to isolate the effects of whatever linguistic variable is being studied. It has identified a large number of lexical characteristics which affect the way vocabulary is acquired and used. This is well-illustrated by a website that provides sets of target words in which users can manipulate over 30 characteristics (http://www.psy.uwa.edu.au/mrcdatabase/uwa_mrc.htm). Below is a partial listing of the lexical characteristics on this website:
● number of letters
● number of phonemes
● number of syllables
● written frequency
● familiarity rating
● concreteness rating
● imagability rating
● meaningfulness rating
● age of acquisition rating
● common part of speech
● morphemic status (prefix/suffix/abbrev/hyph, etc.)
● contextual status (colloquial/dialect/archaic, etc.)
● pronunciation variability
● capitalization
● irregular plural
● stress pattern (reduced, unstressed, stressed).
As impressive as this list seems, it is far from comprehensive. It does not include spoken frequency, as the website relies on early corpus evidence (Kučera and Francis, 1967; Thorndike and Lorge, 1944, L count), which was based on written texts. It also does not capture issues based around the form-meaning link, such as synonymy (different forms with the same meaning). The list also tends to focus on lexis as individual words, and so does not include information based on those words’ connections to other words. Corpus research has been at the forefront of highlighting how the behavior of individual words is both constrained and enriched by their contextual environment (e.g. collocation and other phrasal patterning). This is particularly essential if vocabulary is conceptualized (as it should be) as multiword units as well as single words. Likewise, research into word associations has shown that words have connections with many others in the mental lexicon, through formal, paradigmatic, and syntagmatic links. Thus, in many kinds of vocabulary research, it is important to consider how a word’s behavior is affected by these contextual and mental network connections. This brings up a number of other characteristics which may be relevant:
● a word’s collocations
● whether a word’s meaning is largely driven by its phrasal patterning (semantic prosody)
● whether a word’s frequency differs according to mode (written, spoken, sign, etc.)
● whether a word’s meaning and usage are connected to particular extralingual cues (e.g. some spoken words can be tightly connected with specific gestures or body language)
● context availability (how easy it is to think of a sentence or phrase which a word can appear in)
● a word’s associations (formal, paradigmatic, syntagmatic).
All of the above characteristics involve either the lexical item itself, or its connections to its discourse context or mental associations. However, if one is studying L2 vocabulary acquisition, it is also important to consider the match between L1 and L2 lexical characteristics. This is because research has shown that typicality (the degree to which the form structures (phonological and orthographic) and the semantic classifications of the L2 lexis resemble those of the learner’s L1 lexis) has a strong impact on the learnability of L2 vocabulary, with more similarity leading to better learning (de Groot, 2006; Ellis and Beaton, 1993). The formal similarities between words are also related to cognateness, and this characteristic has been shown to affect L2-L1 translation.
It is probably impossible to fully control for all of these characteristics when doing research. Furthermore, different sets of characteristics will be relevant for different research designs and goals. Psycholinguistic studies will need to control for a relatively comprehensive range of characteristics, as the precise measures used in this type of research (often measured in milliseconds) can be confounded by even small differences in processing which can be caused by lexical factors. On the other hand, in lexical acquisition studies, it is important to focus on the characteristics which can make lexical items relatively more or less difficult to learn. In L2 vocabulary acquisition, this largely relates to how similar or dissimilar L2 word knowledge aspects are compared to their counterparts in the L1. Corpus research is not usually burdened by processing issues, as the focus is on the linguistic description of language output, where syntagmatic patterning comes to the fore.
The important point from this discussion is that vocabulary researchers need to be aware of the various lexical characteristics, and so be able to make conscious and principled decisions about which characteristics to control for in their studies. Careful consideration at the initial stages of research design can be the best insurance against a study being later contaminated by unwanted lexical behavior which confounds interpretation of the results. Below are short discussions of some of the factors which may be worth considering in your studies. See Ellis and Beaton (1993) and de Groot (2006) for more discussion of these factors.
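One way of making such decisions checkable at the design stage is to compare the sets of target items on the characteristics chosen for control before any data are collected. The following Python sketch is a minimal illustration under stated assumptions: the function names, the tolerance values, and the ratings are all invented for the example (the ratings are not taken from any published norms), and a real study would draw its values from an appropriate database.

```python
from statistics import mean


def summarise(items, ratings):
    """Mean number of letters and mean rating for one set of target items."""
    return {
        "letters": mean(len(word) for word in items),
        "rating": mean(ratings[word] for word in items),
    }


def unmatched(set_a, set_b, ratings, tolerance):
    """Names of characteristics whose set means differ by more than the tolerance."""
    a, b = summarise(set_a, ratings), summarise(set_b, ratings)
    return [name for name in a if abs(a[name] - b[name]) > tolerance[name]]


# Invented ratings on a 1-7 scale, for illustration only.
ratings = {"table": 6.8, "river": 6.5, "truth": 2.0, "fault": 2.5}
# Hypothetical tolerances: how far apart two set means may be and still count as matched.
tolerance = {"letters": 0.5, "rating": 0.5}

balanced = unmatched(["table", "truth"], ["river", "fault"], ratings, tolerance)
unbalanced = unmatched(["table", "river"], ["truth", "fault"], ratings, tolerance)

print(balanced)    # [] : the two sets are matched on both characteristics
print(unbalanced)  # ['rating'] : the sets differ on the rated characteristic
```

The same pattern extends to any characteristic that can be expressed as a number per item; which characteristics to include, and how strict the tolerances should be, remain decisions for the researcher.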
2.1 Form-meaning relationships
2.1.1 Single orthographic words and multi-word items
A basic characteristic of vocabulary is that meaning and form do not always have a one-to-one correspondence. Consider the following items:
● die
● expire
● pass away
● bite the dust
● kick the bucket
● give up the ghost
(Schmitt, 2000: 1)
The six examples are synonymous, all with the meaning ‘to die’. However, several of the items contain more than one word. In some languages, and especially in English, meanings can be represented by multiple words operating as single units. To accommodate the fact that both single and multiword units can realize meaning, we use the terms lexeme, lexical unit, and lexical item. These interchangeable terms are all defined as ‘an item that functions as a single meaning unit, regardless of the number of words it contains’. Thus, all of the above examples are lexemes with the same meaning. I will generally use the term lexical item in this book to emphasize the point that most vocabulary research issues apply to both single orthographic words and multi-word lexemes, but will use the term word when words were the unit of counting in particular studies. (See Section 5.2.1 for more on units of counting in vocabulary studies.)
2.1.2 Formal similarity
We should also note that the forms of the above items vary considerably. Although the first two items are synonymous single words, they bear no formal similarity to one another. The same thing is true for the multi-word items. This is common in English: consider the English terms for stealing cattle (rustling), appropriating writing and ideas (plagiarism), and commandeering an aircraft in flight (hijacking). These words have no formal similarities whatsoever, even though they are all theft in one form or another (Nation and Meara, 2002). However, in other languages, these concepts would be expressed by words or expressions that literally translate as stealing cows or stealing writing or stealing aircraft. In these languages, the meaning of these expressions is relatively transparent, and they could easily be understood by people who knew the basic words of which these expressions are composed. Of course, English often does give formal cues to words of related meaning. For example, swim is the action of propelling oneself through the water, swimmer is a person who swims, and swimming pool is a place where this can be done. However, as we have seen above, words with related meaning are not always this obvious. As a comparison, Arabic is an example of a language which gives more reliable formal cues to meaning. As opposed to Indo-European languages, which tend to have relatively stable roots to which affixes are attached, Arabic is based on roots that normally consist of three consonants, which can be combined with various vowels to form families of words that share a common meaning (Ryan, 1997: 188):
k-t-b    maktaba (library)     ketaab (book)      kataba (he wrote)
d-r-s    mudarris (teacher)    madrasa (school)   darrasa (to learn)
One reason for English’s inconsistent meaning-form relationships lies in its historical development. English was originally a Germanic language (Old English), though only around 15% of the original 24,000 or so lexical items still exist in Modern English (Schmitt and Marsden, 2006: 82). However, these are the most basic and frequent words in the language: man, wife, live, good, eat, strong. From the beginning, English absorbed loanwords from many different languages, particularly French after the Norman conquest and Latin/Greek after the 1600s. In many cases, English retained several words with no formal similarities as synonyms, but each with different register marking, e.g. kingly (Old English), royal (French), and regal (Latin). The relationship between source language/time of absorption and register marking in these cases is illustrated in Figure 2.1 from Hughes (2000). The lack of formal similarity among semantically-related lexical items is a factor that makes vocabulary in languages like English relatively more difficult to learn than vocabulary from languages with more transparent formal relationships. This may be important to consider when working with semantically-related items in your research. Because all L2 vocabulary research is affected by the L1, it might be useful to consider whether your participants’ L1(s) handle semantically-related lexical items in a more transparent way than your target language.
[Figure 2.1 The relationship between historical origin and register. Axes: TIME (Anglo-Saxon, Middle English, Early Modern English) and REGISTER (general, formal, specific/technical). Words plotted: hearty, cordial, cardiac; ask, question, interrogate; word, term, lexeme.]
2.1.3 Synonymy and homonymy
Synonymy is common in languages (several forms → one meaning), but so is the converse, where a single form has several meanings. This can be called either polysemy or homonymy. The distinction usually revolves around whether the different meaning senses are related or not. Chip is usually considered polysemous, in that a chip of wood, a computer chip, a potato chip, and a poker chip all have the same underlying concept of being small, thin, and flat(ish). A financial bank, a river bank, and the banking of an airplane when it turns are usually thought of as homonyms, as the meaning senses are totally unrelated. The distinction between polysemy and homonymy is important for lexicographers, as they have to decide the best way to list words in a dictionary. For other vocabulary researchers, it probably does not make much difference whether words are considered polysemous or homonymous; the important issue is the complexity of the form-meaning relationship, and the difficulties this leads to in vocabulary acquisition and use. In other words, it is probably the degree of variation between form and meaning that is the important factor to consider in most cases.
2.1.4 Learning new form and meaning versus ‘relabelling’
For adolescent and adult learners, most of the concepts connected with L2 words are likely to be already known. (This is obviously not always true for younger learners.) In this case, the learning task mainly consists of attaching an L2 label to a known concept, and then perhaps later fine-tuning the concept to match the exact L2 semantic representation. However, sometimes learners acquire new concepts in the L2, especially the technical vocabulary of the particular fields they are studying (e.g. legal language, business terminology). In these cases, the student is simultaneously learning both the concept and the L2 label. This is considered more difficult than the simpler ‘relabelling’ mentioned above.
This difference in cognitive difficulty may be important when comparing the acquisition of different sets of lexical items. Those items where the participants must learn both concept and L2 label will presumably be more difficult than items where they already know the underlying concept. This makes it important to control for the ‘concept + label’ versus ‘relabelling only’ items in target vocabulary.
2.2 Meaning
In addition to the issues concerning the form-meaning link, there are several aspects of meaning itself which may warrant attention.
2.2.1 Imageability and concreteness
Imageability refers to how easy it is to imagine a concept. Concreteness is ‘a variable that expresses the degree to which a word (or, rather, the entity the word refers to) can be experienced by the senses’ (de Groot, 2006: 473). These two characteristics are strongly associated, because in practice, lexical items that refer to concrete entities are usually easier to imagine, and items referring to abstract entities more difficult to imagine. Thus psychological research often conflates them, and reports them as a single variable. For example, de Groot (2006) gave her participants an imageability rating task and reported the results as ‘concreteness’. However, it is important not to assume that abstract entities are always more difficult to imagine: ‘Typical exceptions are words with strong emotional or evaluative connotations, such as anxiety and jealousy; words for fictitious creatures, such as demon and devil; and some concrete but extremely rare words, such as armadillo and encephalon’ (de Groot, 2006: 473). The degree of imageability/concreteness is important because it has been shown that more concrete/imageable words are learned far better than less concrete/imageable words, with the effect being both large and robust (de Groot, 2006; Ellis and Beaton, 1993). Therefore, in studies which compare the acquisition of groups of words, it is necessary to ensure that each group has equivalent concreteness, or any advantage in learning for a group may be due to higher concreteness alone, and not whatever acquisition variable is being studied.
2.2.2 Literal and idiomatic meaning
Most lexical items have meanings that are literal. For example, die and expire literally mean ‘to stop living’. Others can only be interpreted idiomatically: put your nose to the grindstone = work hard and diligently. (I would be very surprised if anyone has ever heard of someone literally pressing his nose against a grindstone!) Some can be interpreted both literally and figuratively (a breath of fresh air). For some research purposes, it may be important to determine the difficulty/knowledge of both literal and idiomatic meaning senses. Although it can be difficult to determine absolute learning difficulty, frequency can be a good guide to the chances of literal versus idiomatic meaning senses being known. One might assume that the literal meaning would usually be the most frequently used, but research with formulaic language has shown that it is often the idiomatic meaning that is far more frequent (e.g. Conklin and Schmitt, 2008). Because frequency tends to predict acquisition, this indicates that idiomatic meaning senses may often be more likely to be known than literal meaning senses. When using target items that have the possibility of both literal and idiomatic meaning senses, a researcher should determine the relative frequencies of each, to better understand the relationship between the various senses.
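Once a sample of concordance lines for a target item has been read and coded by hand as literal or idiomatic, the relative frequency of each sense is a simple proportion. The Python sketch below shows one way of tallying this; the function name is hypothetical and the four coded lines are invented for illustration, not drawn from any corpus.

```python
from collections import Counter


def sense_proportions(coded_lines):
    """Proportion of coded concordance lines assigned to each meaning sense."""
    counts = Counter(sense for _, sense in coded_lines)
    total = sum(counts.values())
    return {sense: count / total for sense, count in counts.items()}


# Invented concordance lines with hand-assigned sense codes, for illustration only.
coded_lines = [
    ("the new manager is a breath of fresh air", "idiomatic"),
    ("he stepped outside for a breath of fresh air", "literal"),
    ("her ideas were a breath of fresh air", "idiomatic"),
    ("the show was a breath of fresh air", "idiomatic"),
]

proportions = sense_proportions(coded_lines)
print(proportions)  # {'idiomatic': 0.75, 'literal': 0.25}
```

The coding itself is the difficult part and cannot be automated by a sketch like this; the code only summarizes decisions a human analyst has already made.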
2.2.3 Multiple meaning senses
In languages like English, many lexical items have multiple meaning senses (see Section 2.1.3). This is an important feature to consider in vocabulary research. Learners will typically acquire the most frequent meaning senses before less frequent ones, so it often makes sense to do a frequency analysis not only on target words themselves, but also on their various meaning senses. This can be particularly true in acquisition studies where the researcher is interested in the depth of vocabulary knowledge as indicated by knowledge of the various meaning senses. While knowledge of the most frequent meaning sense is certainly important, knowledge of rarer meaning senses can indicate more comprehensive knowledge of a lexical item.
2.2.4 Content versus function words
A useful distinction is between lexical items that carry propositional content (i.e. meaning) and those which carry out grammatical functions. The former are commonly called content vocabulary (e.g. cow, fly, excruciating), while the latter are referred to as grammatical or function vocabulary (e.g. is, he, the). Corpus word counts consistently show that function words are among the most frequent in language, which is not surprising because they are necessary for communicating about any topic, from daily life to astrophysics. This holds true regardless of whether the discourse is general in nature, technical, or academic. This is illustrated in Table 2.1, which lists the most frequent word forms in the BNC (Leech, Rayson, and Wilson, 2001: 120). The first 50 word forms in English are made up entirely of function word forms (depending on whether you consider I and you as content or function words). In fact, you must go beyond the first 100 word forms before content words are consistently found.
Table 2.1 The most frequent 50 word forms in English [a]
 1 the     11 I       21 have      31 she       41 do
 2 of      12 for     22 are       32 that      42 been
 3 and     13 that    23 not       33 which     43 their
 4 a       14 you     24 this      34 or        44 has
 5 in      15 he      25 ’s [b]    35 we        45 would
 6 to      16 be      26 but       36 ’s [c]    46 there
 7 it      17 with    27 had       37 an        47 what
 8 is      18 on      28 they      38 ~n’t      48 will
 9 to      19 by      29 his       39 were      49 all
10 was     20 at      30 from      40 as        50 if
[a] Some word forms occur more than once in different word classes, e.g. to occurs as both infinitive marker and as preposition. [b] Genitive marker. [c] Verb.
Corpus research by Johansson and Hofland (1989) and Francis and Kučera (1982) found that the approximately 270 function word types in English (176 word families) accounted for 43–44% of the running words in most texts. This has important implications for vocabulary research. A very large proportion of discourse is made up of function words, yet these lexical items are extremely difficult to test. Likewise, although function words are among the most frequent in a language, learners often find them the most difficult to learn, e.g. articles are notoriously difficult for learners of English. Therefore, measures and discussions of vocabulary knowledge often address only content words, completely ignoring the extremely frequent category of function words. While this might often make sense, it can also be misleading if not handled properly. For example, in discussions about vocabulary coverage in texts, it has been found that the first 1,000 words in English make up about 70–85% of discourse (e.g. Nation, 2001: 17; Table 2.7 in this volume), but it is usually not pointed out how important the first 100 (mainly function) words are in achieving this high percentage of coverage.
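Coverage figures of this kind are simply the proportion of running words (tokens) in a text that belong to a given word list. The Python sketch below illustrates the calculation under simple assumptions: the function name and the sample sentence are invented, the tokenization is deliberately crude, and the short word list is only a handful of the function word forms from Table 2.1 rather than a complete list.

```python
import re


def coverage(text, word_list):
    """Proportion of running words (tokens) in text that appear in word_list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return known / len(tokens)


# A small subset of the function word forms in Table 2.1, for illustration only.
function_words = {"the", "of", "and", "a", "in", "to", "it", "is", "was", "on"}
# Invented sample sentence.
sample = "The cat sat on the mat and the dog was in the garden"

share = coverage(sample, function_words)
print(f"{share:.0%} of the running words are on the list")  # 8 of 13 tokens, about 62%
```

Running the same calculation first with only the function words and then with the full high-frequency list is one way to show how much of a reported coverage figure is contributed by function words alone.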
2.3 Intrinsic difficulty
An interesting question is whether some words are relatively easier or harder to learn than others. Laufer (1997) lists a number of factors which affect the difficulty of learning a lexical item (Table 2.2). Some of these factors have to do with the intrinsic difficulty of words, e.g. a word’s length and a word’s grammatical class. Other factors relate to the language system, e.g. whether an affixation rule is regular in a language and whether the particular lexical item conforms to it. The relationship between a word and others in the language also makes a difference: if several words have a similar written or orthographic form (synformy), it can make learning more difficult. For many of these factors, it is the relative similarity/dissimilarity between L2 and L1 which makes the difference. For instance, whether a word is difficult to pronounce depends largely on the phonological features learners already have in their inventory from previous languages. If those features match the features of the new word, then it is comparatively easy; if not, it is comparatively difficult. This means that the difficulty of a lexical item’s phonological requirements is not absolute, but depends on the learner and, to a large extent, their L1. For example, an English word like rapid will be relatively difficult to pronounce for Japanese learners who do not have /r/ in their native repertoire, but relatively easy for French learners who do. Thus whether words are easy or difficult depends on intrinsic difficulty, the regularity of the systematic elements of the language being learned, and similarity with languages already known.
Table 2.2 Factors which affect vocabulary learning
Facilitating factors | Difficulty-inducing factors | Factors with no clear effect
familiar phonemes | presence of foreign phonemes |
phonotactic regularity | phonotactic irregularity |
fixed stress | variable stress and vowel change |
consistency of sound-script relationship | incongruency in sound-script relationship |
 | | word length
inflexional regularity | inflexional complexity |
derivational regularity | derivational complexity |
morphological transparency | deceptive morphological transparency |
 | synformy |
 | | part of speech
 | | concreteness/abstractness
generality | specificity |
register neutrality | register restrictions |
 | idiomaticity |
one form for one meaning | one form with several meanings |
(Laufer, 1997: 154)
Nation (1990) notes that the manner in which lexical items are taught can also affect the learning burden, with inappropriate techniques actually making items harder to learn. For example, teaching two new items together initially which have formal or meaning similarities can lead to cross-association (e.g. teaching left and right together in the first instance). In addition, teaching exceptions before the underlying rule is learned can make learning that rule more difficult. For example, teaching the words reply and release before the prefix re- is mastered can make learning that prefix more difficult. (Of course, this might often be in tension with the need to teach such words relatively early to learners because their meanings are required.)
Examining the intrinsic factors in more detail, we unsurprisingly find that regularity is a facilitating characteristic. We see this in phonotactics, which refers to the phoneme/grapheme clusters which occur in a language. For instance, str occurs in English at the beginning of words but not in a word-final position. Words which contain clusters which follow the norms of a language (phonotactic regularity) will be easier to learn than those which do not. Likewise, regularity of stress placement (e.g. fixed initial stress in Finnish) will aid the recognition and production of spoken vocabulary compared to languages where the stress is variable (e.g. English: phótograph, photógraphy, photográphic). It is not difficult to see how a consistent relationship between sounds and their written correlates can make learning easier, as well as an affixation system which is limited and regular. Lexical items which can be used in a wide variety of contexts (generality) are less prone to error than items which can only be used appropriately in particular contexts (specificity), precisely because they have a wider range of appropriate usage: famous can be used to describe almost any person who is well-known, while notorious can be used to describe only people famous for unsavoury reasons. This is linked to register – general items usually have fewer register constraints than specific items.
Quote 2.1 Blum and Levenston on the attractiveness of general words
... learners will prefer words which can be generalized to use in a large number of contexts. In fact they will over-generalize such words, ignoring register restrictions and collocational restraints, falsifying relationships of hyponymy, synonymy and antonymy. (1978: 152)
Laufer (1997) suggests that idiomatic expressions (make up one’s mind) are more difficult than their non-idiomatic meaning equivalents (decide). This is undoubtedly true for idioms, which are very numerous in language as a category, but which occur relatively infrequently as individual items (Moon, 1997). The infrequency of all but a few idioms means learners have trouble obtaining enough exposure to them for much acquisition to occur. Learners also have trouble with more frequent types of idiomatic expressions, such as phrasal verbs (put off), preferring instead their one-word equivalents (postpone) (Dagut and Laufer, 1985), even in informal spoken contexts where the multi-word verbs would be more appropriate (Siyanova and Schmitt, 2007). However, idioms and phrasal verbs are only two types of idiomatic expressions. Many lexical items can have both literal and figurative meaning senses (dog = animal/very poor example of something; let off steam = release steam pressure/release mental stress), and in these cases, the idiomatic meaning may well be the most frequent (Section 2.2.2), leading to a tension between the facilitating effects of frequency and the inhibiting effects of idiomaticity.
It is interesting to note that Laufer categorizes word length, part of speech, and concreteness/abstractness as factors which have no clear effect. While this may be true of the earlier studies she reviewed (many relying on paper-and-pencil measurement), these factors do affect the results of more sensitive psycholinguistic experiments. For example, Ellis and Beaton (1993) argue that nouns are much easier to learn than verbs, possibly because they are more imageable, and thus more memorable, and perhaps also because they do not have the complex argument structures and associated thematic roles that verbs do. Likewise, de Groot (2006) argues that L2 translations of concrete L1 words are learned far better than those of abstract L1 words. The three factors are therefore included among the basic lexical characteristics which need to be controlled for in psycholinguistic experiments (see Section 2 for a fuller list):
● number of letters
● number of phonemes
● number of syllables
● common part of speech
● familiarity rating
● concreteness rating
● imageability rating
● meaningfulness rating.
Quote 2.2 Scrivener on enhancing knowledge of ‘old’ words
... much of the difficulty of lexis isn’t to do with learning endless new words, it’s learning how to successfully use words one already knows, i.e. learning how ‘old’ words are used in ‘new’ ways. (2005: 246)
2.4 Network connections (associations)
Lexical items have numerous formal and semantic connections with other items in every person’s mental lexicon. These connections can lead not only to appropriate lexical usage (e.g. being able to think of words which rhyme; being able to retrieve an appropriate synonym or collocation), but also to more automaticity in using this knowledge, as a well-organized mental lexicon is thought to improve accessibility. Lexical connections are apparent in a number of types of language output. In slips-of-the-tongue (when you mean one thing but say another), the misspoken word usually has some close connection to the intended word, e.g. saying left for right, or Tuesday for Wednesday. Likewise, similar words are sometimes blended together (Aitchison, 2003: 88):
I went to Noshville (Nashville + Knoxville, Tennessee towns).
When a person has a concept in mind, but cannot remember the word for it, they often produce related words in their attempt to retrieve the word they are searching for: participants in a ‘tip of the tongue’ experiment who could not remember sampan, recalled words with formal similarities like Saipan, Siam, and sarong or words which had a similar meaning like barge, houseboat, and junk (Brown and McNeill, 1966).
Quote 2.3 Gleason on the development of L1 semantic networks
The difficulty of assessing children’s semantic knowledge arises, of course, because children’s semantic systems themselves are becoming more complex. Not only do children learn new words and new concepts, they also enrich and solidify their knowledge of known words by establishing multiple links among words and concepts. For example, children learn that the words cat and cats refer to the same category of animate object, but differ in number, while the words cats and books share the feature number even though they refer to quite different objects. The words walk, walks, walking, and walked refer to similar actions that differ in tense or duration, while eat and devour refer to actions that differ in manner. Compete, win, and lose share some semantic components, but differ in the outcome each conveys. Pain and pane are linked phonologically, as are pane, mane, and lane, though each has a different referent. Oak, spruce, and birch are linked by virtue of their co-membership in the superordinate category tree. These types of connections among words and concepts form what are called semantic networks. (2005: 131)
However, the clearest indication of lexical connection derives from word association data. In this methodology, participants are given a stimulus word, and then asked to produce the first word(s) which come to their mind. For example, the prompt black very often produces the response white. The assumption is that automatic responses which have not been thought out will consist of words which have the strongest connections with the stimulus word in a person’s lexicon. Table 2.3 illustrates the most frequent word association responses for four common words. They are drawn from the Edinburgh Association Site (http://www.eat.rl.ac.uk), a database of associations produced by 100 British university students.
Table 2.3 Frequent word associations
Stimulus | black | eat | house | slowly
Responses (%) | white (58) | food (45) | home (28) | quickly (24)
 | brown (3) | drink (16) | garden (8) | fast (22)
 | color (3) | sleep (5) | door (6) | walk (3)
 | magic (3) | fat (4) | boat (4) | walking (3)
 | night (3) | hunger (2) | chimney (4) | amble (2)
 | belt (2) | hungry (2) | roof (4) | car (2)
 | blue (2) | now (2) | flat (3) | crawling (2)
 | cat (2) | quickly (2) | brick (2) | plod (2)
 | death (2) | a lot (1) | building (2) | stop (2)
 | bag (1) | bite (1) | bungalow (2) | surely (2)
Others (1–2%) | 21 | 19 | 32 | 34
From even this small sample of data, some typical association behavior is demonstrated. The first observation is that associations from groups of respondents exhibit a great deal of systematicity. The responses are not random, otherwise one would expect nearly 100 different responses from the 100 British university students who responded. Rather we find that a large percentage of the participants gave the same responses. There is usually a single very strong response which is given far more than any other. Moreover, the top three responses often account for half or more of the total number. Clearly there is a great deal of agreement among the members of this group. On the other hand, there were a large number of participants who gave idiosyncratic responses. In fact, this pattern describes very well the distribution of responses for almost any stimulus word for almost any group: a small number of responses being relatively frequent, with a larger number of responses being relatively infrequent. This pattern of communality has been demonstrated across numerous studies. For example, for Lambert and Moore’s (1966) English-speaking high school and university subjects, the primary response covered about one-third of the total responses and the primary, secondary, and tertiary responses together accounted for 50–60%. This is congruent with the 57% figure reported by Johnston (1974) when she studied the three most popular responses of 10–11-year-olds.
Associations can be analyzed according to what category they belong to. The three traditional categories are clang associations, syntagmatic associations, and paradigmatic associations. In clang associations, the response is similar in form to the stimulus word, but is not related semantically. An example is reflect-effect. The other two categories take into account the associations’ word class. Responses which have a sequential relationship to the stimulus word are called syntagmatic, and usually, but not always, have differing word classes. Examples from Table 2.3 would be adjective-noun pairs like black-magic, verb-noun pairs like eat-food, and verb-adverb pairs like walk-slowly. Responses of the same word class as the stimulus are labelled paradigmatic. Examples are verb-verb pairs like eat-drink, noun-noun pairs like house and home, and adjective-adjective pairs like black and white. While syntagmatic relationships involve the contiguity (occurring in close proximity) of words in language, paradigmatic relationships are more semantic in nature. Sometimes paradigmatic pairs are roughly synonymous (blossom-flower) and sometimes they exhibit other kinds of sense relation (deep-shallow, table-furniture).
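The observation that a single primary response dominates, and that the top three responses often cover about half of all respondents, can be checked directly against Table 2.3. The sketch below is only an illustration: the dictionary name `TABLE_2_3` and the helper `top_share` are assumptions, and only the first five listed responses per stimulus are copied from the table to keep the example short.

```python
# First five listed responses (%) per stimulus, copied from Table 2.3.
TABLE_2_3 = {
    "black": {"white": 58, "brown": 3, "color": 3, "magic": 3, "night": 3},
    "eat": {"food": 45, "drink": 16, "sleep": 5, "fat": 4, "hunger": 2},
    "house": {"home": 28, "garden": 8, "door": 6, "boat": 4, "chimney": 4},
    "slowly": {"quickly": 24, "fast": 22, "walk": 3, "walking": 3, "amble": 2},
}


def top_share(responses, n=3):
    """Sum the percentages of the n most frequent responses."""
    return sum(sorted(responses.values(), reverse=True)[:n])


top_three = {stimulus: top_share(r) for stimulus, r in TABLE_2_3.items()}
for stimulus, share in top_three.items():
    primary = max(TABLE_2_3[stimulus].values())
    print(f"{stimulus}: primary response {primary}%, top three {share}%")
```

On these figures the top three responses cover between 42% and 66% of respondents, depending on the stimulus word.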
Analyzing the associations gives us clues about the process by which words are acquired. Aitchison (2003) lists three basic recurrent findings. First, the responses are almost always items from the semantic field of the stimulus word: in response to needle, people usually gave words related to sewing, and not nail or poker, even though these are sharp pointed objects. Second, if a stimulus word was part of an obvious pair (husband/wife) or had a clear antonym (tall/short), the partner word was usually given as the response. Third, adults usually give a response which is the same word class as the prompt word, e.g. nouns elicited noun responses.
Another recurrent finding in association studies is that responses tend to shift from being predominantly syntagmatic to being predominantly paradigmatic as a person’s language matures. There is also a decrease with age in clang associations. Quite early on, it was demonstrated that L1 children have different associations from adults (Woodrow and Lowell, 1916). Later, Ervin (1961) elicited associations from kindergarten, first-grade, third-grade, and sixth-grade students and found that as the students’ age increased, their proportion of paradigmatic responses also increased. This syntagmatic → paradigmatic shift is not exclusive to English. Sharp and Cole (1972) studied subjects who spoke Kpelle, an African language structurally different from most European languages, and found the same shift. The shift occurs at different times for different word classes, however. Research by Entwisle and her colleagues (1964, 1966) suggests that nouns are the first to shift, with adjectives next. The shift begins later for verbs and is more gradual. On the other hand, it is interesting to note that more recent research (Nissen and Henriksen, 2006) found that native informants demonstrated an overall preference for syntagmatic responses, leading those researchers to question the validity of the syntagmatic → paradigmatic shift.
What can we infer about the organization of the lexicon from such association research? The large degree of agreement in native responses suggests that the lexicons of different native speakers are organized along similar lines. If natives have a ‘normal’ or ‘preferred’ organizational pattern, then it seems reasonable that nonnatives would benefit if their lexicons were organized similarly. We do not really know how to facilitate this yet, but the fact that responses usually have either syntagmatic or paradigmatic relationships with the stimulus words suggests that these relationships might be important in vocabulary teaching and learning. As for how lexical organization changes over time, the presence of clang associations indicates that word form similarity may initially play some role in the early lexical organization of L1 children. But formal similarity is obviously a less preferred way of organizing the lexicon, as evidenced by the rapid disappearance of clang associations as learners mature. Syntagmatic relationships are next to be focused upon by the young learner, suggesting a salient aspect of language at this point is contiguity. Later, as learners sort out the word class and sense relations of the word, their associations become more meaning-based and
paradigmatic. It must be stressed that not every word passes through this progression, and as the child becomes more proficient, there will probably be no clang associations at all. Rather, the progression indicates the general evolution of lexical organization patterns as a learner’s language matures. The linking of words based on similar concepts, senses, and relations presumably aids the acquisition of new words. The interrelations between words form categorical clusters, and once these develop, they begin to connect with other clusters creating lexicons based on shared connections (Haastrup and Henriksen, 2000). These connections facilitate the quick growth of lexicons as new words can be assimilated with known words. As more words become associated, the lexical network strengthens (Haastrup and Henriksen, 2000; see this paper also for an interesting card-sorting methodology for tapping association knowledge). Although most of the association research has dealt with young native speakers, it has also been applied to second-language acquisition research. Meara (1980, 1983) surveyed the research available at that time and detected several traits of L2 associations. First, although L2 learners typically have smaller vocabularies than native speakers, their association responses are much less regular and often not of the type which would be given by native speakers. This is partly because L2 responses often include clang associations. It is also presumably because the organization of L2 learners’ mental lexicons is usually less advanced. Second, L2 subjects frequently misunderstand the stimulus words, leading to totally unrelated associations. Third, nonnative speakers, like L1 children, tend to produce more syntagmatic responses, while native-speaking adults tend towards paradigmatic responses. Fourth, L2 responses are relatively unstable. 
A recent association study, using an enhanced methodology, found that the association profiles of natives and nonnatives were remarkably similar, although natives produced somewhat more synonyms and collocation-based associations, and nonnatives more form-based associations and those with only a loose conceptual relationship (Fitzpatrick, 2006; see also Zareva, 2007). Some previous studies attempted to show that L2 acquisition mirrors first-language acquisition in that association preferences systematically shift from syntagmatic to paradigmatic (Politzer, 1978; Söderman, 1993), but Fitzpatrick concludes that, although the nonnative response behavior did change with proficiency, there was no evidence that it became more native-like.
Quote 2.4 Fitzpatrick on word association research
It is important that future studies investigate the similarities as well as the differences between L1 and L2 response patterns, and the differences as well as the similarities within each subject group. (2006: 144)
2.5 Frequency
2.5.1 The importance of frequency in lexical studies
As briefly introduced in Section 1.1.4, the frequency with which a word occurs in language permeates all aspects of vocabulary behavior. It is arguably the single most important characteristic of lexis that researchers must address. It is not difficult to see why, as it affects the acquisition, processing, and use of vocabulary. In terms of acquisition, frequent vocabulary items are, by definition, the words most likely to be met in discourse. As a result, learners generally acquire more frequent vocabulary before less frequent lexis (e.g. Read, 1988; Schmitt, Schmitt, and Clapham, 2001). This effect is robust for both L1 and L2. Tremblay, Baayen, Derwing, and Libben (2008), using ERP (event-related potentials) methodology (Section 2.11), found that every experience of a lexical item leaves a memory trace, and that this effect holds for formulaic sequences as well as individual words: the higher the frequency of lexical bundles, the better people remembered them.
Frequent vocabulary is also processed better, and this has been demonstrated in a number of ways. For example, more frequent lexical items are correctly translated more often, faster, and with fewer errors than lower frequency items (de Groot, 1992). Ellis (2002: 152) summarizes a range of research and concludes that ‘for written language, high-frequency words are named more rapidly than low frequency ones ... , they are more rapidly judged to be words in lexical decisions tasks ..., and they are spelled more accurately ... Auditory word recognition is better for high-frequency than low frequency words ... there are strong effects of word frequency on the speed and accuracy of lexical recognition processes (speech perception, reading, object naming, and sign perception) and lexical production processes (speaking, typing, writing, and signing), in children and adults as well as in L1 and L2’. (See SSLA 24, 2 for frequency effects on other aspects of language.)
In terms of use, frequency plays a prominent part in how lexical items are employed in discourse. For instance, more frequent words tend to have less register marking (connotation, formality), which allows them to be used in a wide variety of contexts (and thus be more frequent). Lower-frequency lexical items often have semantic and collocational constraints which limit their usage to certain contexts. For example, odd (6,162 occurrences in the 179 million-word New Longman Corpus) can be used to describe virtually anything that is not quite the usual or normal case. On the other hand, eccentric (1,014) is usually used to describe a person, or his/her behavior, and has the additional connotation of being odd in a somehow endearing manner. It also collocates with a more coherent set of nouns (behavior, habits, millionaire, father, lady, inventor, genius). It is useful to note that even a relatively modest disparity in frequency (only 6×) is enough to highlight the differences in contextual constraints between these two words.
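The size of the odd/eccentric disparity can be worked out from the raw counts given above. The sketch below normalizes each count to occurrences per million words, which is the usual way of comparing frequencies across corpora of different sizes; the helper name `per_million` is an assumption introduced here, while the counts and the corpus size are those quoted in the text.

```python
def per_million(occurrences, corpus_size):
    """Normalize a raw count to occurrences per million running words."""
    return occurrences / corpus_size * 1_000_000


CORPUS_SIZE = 179_000_000  # New Longman Corpus size as quoted in the text

odd = per_million(6_162, CORPUS_SIZE)
eccentric = per_million(1_014, CORPUS_SIZE)
ratio = odd / eccentric  # how many times more frequent odd is

print(f"odd: {odd:.1f} per million")
print(f"eccentric: {eccentric:.1f} per million")
print(f"odd is about {ratio:.1f} times as frequent as eccentric")
```

Because both counts come from the same corpus, the ratio is the same whether it is computed on raw or normalized figures; the per-million values only matter when comparing against other corpora.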
2.5.2 Frequency and other word knowledge aspects
In fact, frequency interrelates with most other lexical characteristics. We have seen an example of frequency/meaning/collocation interaction, but the interrelationships extend beyond this. In the realm of word form, more frequent words are shorter, and less frequent words are longer. Moreover, this relationship is systematic, following Zipf’s law. Zipf’s law states there is a relatively constant relationship between the rank of a lexical item on a frequency list and its frequency of occurrence. We can see an example of this in the word-length/frequency relationship outlined in Table 2.4, where Crystal (1987: 87) illustrates data from one of the earliest corpus studies in 1898 by Kaeding. The table shows that for every additional syllable of word length, the number of words of that length roughly halves. Another way of looking at this is that for every increase in syllable ranking (e.g. 1→2, 2→3), the frequency of occurrence in the corpus systematically decreases by about half. A more modern word count of the BNC confirms this pattern (Leech et al., 2001: 121).
Table 2.4 The relationship between word length and frequency of occurrence for German words
Number of syllables in word | Number of word occurrences | Percentage of whole
1 | 5,426,326 | 49.76
2 | 3,156,448 | 28.94
3 | 1,410,494 | 12.93
4 | 646,971 | 5.93
5 | 187,738 | 1.72
6 | 54,436 | 0.50
7 | 16,993 | all remaining (7–15 syllables) = 0.22
8 | 5,038 |
9 | 1,225 |
10 | 461 |
11 | 59 |
12 | 35 |
13 | 8 |
14 | 2 |
15 | 1 |
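The ‘roughly halves’ description of Table 2.4 can be examined by dividing each occurrence count by the count for words one syllable shorter. The sketch below does this with the figures from the table; the names `KAEDING_COUNTS`, `successive_ratios`, and `shares` are assumptions chosen for the example, and the percentages it computes may differ slightly from the printed column because of rounding or a different total in the original source.

```python
# Occurrence counts for words of 1 to 15 syllables, copied from Table 2.4.
KAEDING_COUNTS = [
    5_426_326, 3_156_448, 1_410_494, 646_971, 187_738,
    54_436, 16_993, 5_038, 1_225, 461, 59, 35, 8, 2, 1,
]


def successive_ratios(counts):
    """Ratio of each count to the count for words one syllable shorter."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


def shares(counts):
    """Percentage of all occurrences contributed by each word length."""
    total = sum(counts)
    return [100 * count / total for count in counts]


ratios = successive_ratios(KAEDING_COUNTS)
percentages = shares(KAEDING_COUNTS)

for syllables, ratio in enumerate(ratios[:5], start=2):
    print(f"{syllables - 1} -> {syllables} syllables: ratio {ratio:.2f}")
```

A ratio near 0.5 corresponds to halving; printing the ratios makes it easy to see where the table follows that description closely and where the drop is steeper.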
Research has shown that frequency also varies according to mode. The most frequent words in both spoken and written discourse largely consist of function words, but there can be quite a difference in the relative frequencies of some content words. The Leech et al. (2001) word count lists the items that have the greatest frequency contrasts between speech and writing. Twenty of the most ‘distinctive’ spoken/written items are illustrated in Table 2.5. Some of these words very much have a spoken or written flavor, and it is not surprising to find that yeah and okay are used more in speech, or that thus and political are used more in writing. Indeed, the frequency disparity between the modes is largely what gives these words their register distinctiveness. However, it might have been hard to predict the disparity of some of the other words, e.g. that know and mean are more frequent in speech, or that most and new are more frequent in writing. This highlights the limitations of intuition, which is not always a reliable indicator of frequency (see below for more on this).
Table 2.5 Distinctiveness list contrasting speech and writing
Word | Spoken frequency (a) | Written frequency (a)
yeah | 7,890 | 17
no | 4,388 | 230
know | 5,550 | 734
think | 3,977 | 562
mean | 2,250 | 198
just | 3,820 | 982
okay | 950 | 7
like | 784 | 7
really | 1,727 | 337
say | 2,116 | 512
want | 1,776 | 432
mind | 246 | 51
however | 90 | 664
thus | 8 | 228
while | 156 | 543
most | 199 | 607
new | 603 | 1,208
political | 71 | 333
international | 38 | 242
former | 21 | 187
(a) Per million words. Adapted from Leech et al. (2001, List 2.4).
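One simple way to express the spoken/written contrasts in Table 2.5 on a single scale is a log ratio of the two per-million frequencies. This is offered only as a rough sketch: it is not the distinctiveness measure used by Leech et al. (2001), and the names `TABLE_2_5_SAMPLE` and `log_ratio` are assumptions. Six rows from the table are used for brevity.

```python
import math

# (spoken, written) frequencies per million words, copied from Table 2.5.
TABLE_2_5_SAMPLE = {
    "yeah": (7_890, 17),
    "know": (5_550, 734),
    "mean": (2_250, 198),
    "thus": (8, 228),
    "most": (199, 607),
    "new": (603, 1_208),
}


def log_ratio(spoken, written):
    """Base-2 log of spoken/written: positive leans spoken, negative written."""
    return math.log2(spoken / written)


scores = {word: log_ratio(*freqs) for word, freqs in TABLE_2_5_SAMPLE.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for word in ranked:
    print(f"{word}: {scores[word]:+.2f}")
```

A score of +1 means a word is twice as frequent in speech as in writing, and -1 the reverse, so new (about half as frequent in speech) sits close to -1 while yeah and thus fall at the two extremes.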
Table 2.6 Comparison of written and spoken frequency
Word | Written (a) | Spoken (b)
start (c) | 232 | 260
begin (c) | 119 | 27
too (excluding too + adjective) | 119 | 132
also | 289 | 107
(a) Occurrences in a 330,000 word written corpus.
(b) Occurrences in a 330,000 word spoken corpus.
(c) Lemmas.
Adapted from McCarthy and Carter (1997: 27).
McCarthy and Carter (1997) explored the differences between spoken and written discourse, and made a number of interesting observations. First, many of the frequency discrepancies were caused by the words being part of highly frequent spoken interpersonal markers, such as you know, I think, and never mind. This emphasizes one of the weaknesses of most corpus counts: they are usually based on orthographic words and seldom capture the frequencies of formulaic language. As it is becoming increasingly clear that vocabulary is largely phrasal in nature, this bias towards single orthographic words can sometimes lead to misleading results. They also found that synonyms can have different distributions in speech and writing. It can be argued that there are no true interchangeable synonyms in language, as it would be redundant to have two words that are used exactly the same way in exactly the same contexts (McCarthy, 1990: 16–17). Rather, they will have some differences in their context of use (different collocations, syntactic behavior, or register), and here we see that one possible difference is modal preference. To illustrate this, McCarthy and Carter (1997) compare start and begin, and too and also. In terms of meaning, it is difficult to discern a difference between the members of each pair, but frequency information shows differences in their mode of use (Table 2.6). While start and too appear with similar frequency in both spoken and written contexts, begin and also are used more in written contexts. McCarthy and Carter conclude that findings like these argue for the utility of separating spoken and written corpora when examining the distribution and usage patterns of lexis.
2.5.3 L1/L2 frequency In L2 learning situations, there are, of course, two frequencies to consider: the frequency of the L2 word and of its L1 counterpart. If these were very different, it might be a confounding factor in vocabulary research. Luckily, there is some evidence that frequency in L1 and L2 can be fairly parallel.
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_03_cha02.indd 66
6/9/2010 1:59:16 PM
Issues of Vocabulary Acquisition and Use
67
2.5.4 Subjective and objective estimates of frequency Most work on frequency has been corpus-based for obvious reasons. Corpus counts are objective and quantifiable, and computers are well suited to fast and accurate counting. Furthermore, the above discussion shows how corpus evidence can uncover lexical behavior that would be difficult to intuit. However, it is useful to not become complacent and too trusting of automatized computer counts. A frequency count is only as good as the corpus it is based upon, and every corpus has limitations. First, no corpus can truly mirror the experience of an individual person; rather it is hopefully representative of either the language across a range of contexts (e.g. general English corpus – BNC), or of a particular segment of language (e.g. a corpus of automotive repair manuals – AUTOHALL (Milton and Hales, 1997)). Similarly, the frequency of a lexical item in a language and a person’s psychological ‘impression’ of that frequency will not necessarily always tally. Every person will have his/her own unique experience of language exposure, and will have relatively more or less exposure to particular lexical items depending on his/her environment and interests. (Pilots will be exposed to more aviation vocabulary both because they are interested in it, and because they move in circles where that vocabulary is more often used.) This means that intuitions of frequency differ from person to person, and so is an idiolectal feature, at least for all but the highest frequency items. Second, every corpus is necessarily a compromise limited by the amount and the types of language extracts which can be collected. The amount of words is usually less than the compilers would like, and some kinds of language (e.g. closed-door boardroom discussions, intimate bedtime chat, secret intelligence documents) are difficult to collect. Thus, to some extent, corpora are usually biased towards language types that are easy to collect. 
Third, some features (e.g. very low-frequency words, longer phrasal strings) appear so infrequently that even extremely large corpora have difficulty providing a good picture of their usage. Given the limitations of corpus counts, in some cases it may make sense to consider the other main way of determining frequency – user intuitions. The main case where one can envisage this being useful is in reflecting the amount of exposure that particular learners have received. The main language corpora represent the usage of language in native contexts. Unless there is a corpus based on the L2 language available in a learner’s native environment (I am not aware of any such corpus), the learner’s intuition of frequency may better mirror their exposure than a corpus which was compiled in countries which the learners have probably never been to, and may have little interest in. Another possibility is using ESL teachers’ frequency intuitions, as Wang and Koda (2005) note that ESL instructors’ familiarity ratings may be more appropriate for studies with ESL learners as participants than frequency counts made from texts and discourse for native English speakers. But this raises the question of the accuracy of frequency intuitions. Earlier research tended to show that people do have reasonable frequency intuitions (e.g. Shapiro, 1969), but more recent studies show rather disappointing correlations between corpus frequency figures and figures derived from intuition elicitation (e.g. around .67, Alderson, 2007; .53–.65, Schmitt and Dunham, 1999). Furthermore, both Alderson and Schmitt and Dunham found great variability between their raters, and even Alderson’s very linguistically-aware corpus linguist judges failed to do very well. McCrostie (2007) found that native speaker intuitions were limited to differentiating between very frequent and very infrequent words, with teachers performing no better than first-year university undergraduates. However, McGee (2008) notes that it is not surprising that corpus data and intuitive data sometimes diverge, just as different corpora will often differ on relative word frequencies. His suggestion to consider both corpus- and intuition-based information as useful seems reasonable. Overall, it seems only prudent to consult corpus-based evidence if it is available, albeit while recognizing its inherent limitations (e.g. carefully considering the content of the corpus, and realizing corpus counts do not usually capture formulaic language).
De Groot (1992) compared the log frequencies of target words in a 42.5 million word Dutch written corpus with those from their equivalents in an 18.8 million word written English corpus (both part of the CELEX corpora) and found they correlated at .78. However, we need to be cautious about this finding, because the corpora used were relatively small, and because it is not clear whether such a close correlation would hold for other L1/L2 combinations which are more dissimilar than Dutch and English.
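The kind of comparison reported in these studies, a correlation between corpus frequency figures and elicited intuitions, can be sketched in a few lines. The following is only an illustrative sketch: the words, per-million figures, the 1–7 rating scale, and the rater labels are all invented for the example and are not taken from any of the studies cited. It log-transforms the corpus counts, averages the ratings over a small group of raters, and reports a Pearson correlation for the group mean and for each rater separately.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mean_ratings(ratings_by_rater):
    """Average each word's rating over all raters."""
    words = next(iter(ratings_by_rater.values())).keys()
    return {w: mean(r[w] for r in ratings_by_rater.values()) for w in words}


# Hypothetical data for illustration only (not taken from any study).
corpus_per_million = {"day": 900.0, "damage": 60.0, "damp": 8.0, "scalpel": 0.5}
ratings_by_rater = {  # invented 1-7 frequency ratings
    "rater_1": {"day": 7, "damage": 5, "damp": 3, "scalpel": 2},
    "rater_2": {"day": 6, "damage": 4, "damp": 4, "scalpel": 1},
    "rater_3": {"day": 7, "damage": 6, "damp": 2, "scalpel": 3},
}

words = sorted(corpus_per_million)
# Log-transform the corpus counts before correlating.
log_freq = [math.log10(corpus_per_million[w]) for w in words]

group_mean = mean_ratings(ratings_by_rater)
r_group = pearson(log_freq, [group_mean[w] for w in words])
r_individual = {
    rater: pearson(log_freq, [scores[w] for w in words])
    for rater, scores in ratings_by_rater.items()
}

print(f"group mean: r = {r_group:.2f}")
for rater, r in r_individual.items():
    print(f"{rater}: r = {r:.2f}")
```

With real data, a researcher would want far more items and raters than this toy example uses, and would need to decide whether Pearson or a rank-based coefficient better suits the rating scale.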
If one considers it appropriate to use intuition-based frequency information, or if no corpus-based frequency counts are available, Alderson suggests that the best introspection-based indicator is to average the frequency judgements of a group of raters, because of the variability of individual ratings.
2.5.5 Frequency levels
As we have seen, frequency is an absolutely crucial factor to consider in vocabulary research. But how is the broad range of frequency to be classified? After all, the frequency range in English extends from the most frequent word the, which occurs 61,847 times per million words (Leech et al., 2001), to a word like persnickety, which might occur only once in many millions. The most common distinction is high- versus low-frequency words. High-frequency words are the most basic and essential words in a language. Although it is obvious that these words will be necessary for almost any communicative purpose, there is no fixed upper limit to this category, as the general rule ‘The more vocabulary, the better’ applies. However,
by convention, the first 2,000 items (words? lemmas? word families? See Section 5.2.1) have generally been accepted as high-frequency vocabulary. This figure partly comes from the GSL, which includes about this many headwords, and research by Schonell, Meddleton, and Shaw (1956) which showed that 2,000 families covered around 99% of the spoken language they studied. As a rule of thumb, the most frequent 2,000 items will make up about 80% of an average written text. However, the first 1,000 and second 1,000 do a disproportionate amount of the work, with the first 1,000 typically covering about 70–75%, and the second 1,000 adding only around another 5–8% (Nation, 2001). Low-frequency vocabulary has been conceptualized in widely varying ways. Sometimes it has been defined as all words beyond the ‘2,000+ academic vocabulary’ level, especially in studies which have used Paul Nation’s Vocabulary Profiler, which classifies vocabulary in four categories: first 1,000, second 1,000, academic vocabulary, and all other words. Other studies consider words beyond the suspiciously round number of 10,000 as low frequency, based partly on the fact that this is the Vocabulary Levels Test’s highest level, and on Hazenberg and Hulstijn’s (1996) finding that around 10,000 word families would provide the lexical resources for university study in Dutch. However, these traditional frequency levels have been called into question by recent research by Nation (2006). His analysis estimates that it actually requires some 6,000–7,000 word families to operate in a spoken English environment, and about 8,000–9,000 families for a written one. If these figures hold up, they will force a reappraisal of vocabulary levels. Low-frequency vocabulary should probably be thought of as vocabulary beyond the 8,000–9,000 word families needed for wide reading in English.
For if a person knows enough vocabulary to read widely, and can also use this vocabulary productively, he or she should have the lexical resources to be successful in most language use situations. Any vocabulary beyond this is a luxury and clearly not essential. However, if 6,000 word families or more are necessary to speak English, then it is difficult to maintain that high-frequency vocabulary stops at 2,000 families. On the other hand, words much beyond this level drop off in frequency quite rapidly.
Quote 2.5 Nation and Hwang on low-frequency vocabulary
Low frequency vocabulary consists of words that occur with low frequency over a range of texts, that are so rare that low frequency is inevitably related to narrow range, or that are the technical vocabulary of other subjects (one person’s technical vocabulary is another person’s low frequency vocabulary!). (1995: 37)
It seems that we need a new category which can bridge the gap between the highest frequency vocabulary and the amount that is required for language use. This mid-frequency vocabulary category (2,000 to 8,000–9,000 level) is important, especially in terms of pedagogy. The classical advice has been to explicitly teach and learn the first 2,000 items, and to more-or-less disregard low-frequency vocabulary, because it does not appear often enough to justify the time and effort to learn it. The vocabulary in between could be acquired through exposure, especially extensive reading, and the skilled use of learner strategies (e.g. Nation, 1990). This advice made sense when the field believed that 2,000 items allowed verbal usage, and 5,000 items written usage, as there were not such a great number to be independently acquired through exposure + strategies. However, if the learning target is 6,000–9,000 word families, it is clearly not realistic for learners to acquire the lexis beyond the 2,000 level without a great deal of help. Thus, Nation’s newer figures would suggest that all of the partners involved in the learning process (learners, teachers, materials writers, and researchers) will have to focus attention on mid-frequency vocabulary in order to help learners acquire a large enough vocabulary to be able to use language without a lack of lexis being a problem. Beyond the broad frequency bands of high-, mid-, and low-frequency vocabulary, lexis is also often described by 1,000 band levels, especially as newer technology has made this finer-grained analysis easier. To illustrate such an analysis, Table 2.7 gives frequency figures from the Lextutor website () for extracts from four different types of text. The first is a Level 1 graded reader called Inspector Logan (MacAndrew, 2002). The second is an award-winning novel (Spies, Frayn, 2002), while the third contains news stories from the British Observer newspaper (September 7, 2008).
The fourth type of text is an academic journal article published in Applied Linguistics (Conklin and Schmitt, 2008). The detailed K1-K20 analysis shows clear differences between the graded reader meant for L2 learners and the other three texts meant for native speakers. As might be expected, a very high percentage (87.68%) of the text in the beginning level graded reader is made up of K1 and K2 words. Moreover, almost all of the off-list words are proper names from the story. The novel, newspaper, and article have similar frequency distributions to each other, with the K1 vocabulary making up 73–78% of the text, and the cumulative K1-K5 vocabulary making up very close to 92% for all three texts. This shows that although the vocabulary levels for different texts will vary to some extent, the typical overall frequency distribution is always likely to be evident.
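The band-by-band analysis described above can be approximated with a short script. This is a minimal sketch under stated assumptions, not a description of how Lextutor itself works: the band lists here are tiny invented sets of word forms (real profilers use large lists, typically organized by word family), and the function names `tokenize` and `frequency_profile` are hypothetical. Each token is assigned to the first band that contains it, or to Off-List, and the sketch reports tokens, coverage, and cumulative coverage per band.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_profile(tokens, bands):
    """Count tokens per frequency band.

    bands is an ordered list of (label, set_of_word_forms) pairs.
    Returns rows of (label, tokens, coverage %, cumulative coverage %).
    """
    counts = Counter()
    for tok in tokens:
        for label, words in bands:
            if tok in words:
                counts[label] += 1
                break
        else:
            # No band contained the token (e.g. a proper name).
            counts["Off-List"] += 1

    total = len(tokens)
    rows, cumulative = [], 0.0
    for label in [lbl for lbl, _ in bands] + ["Off-List"]:
        n = counts[label]
        pct = 100 * n / total if total else 0.0
        cumulative += pct
        rows.append((label, n, round(pct, 2), round(cumulative, 2)))
    return rows


# Hypothetical miniature band lists, for illustration only.
demo_bands = [
    ("K1", {"the", "cat", "sat", "on"}),
    ("K2", {"mat"}),
]
demo_rows = frequency_profile(tokenize("The cat sat on the mat, Logan."), demo_bands)
for label, n, pct, cum in demo_rows:
    print(f"{label:<9}{n:>4} ({pct:.2f})  cumulative {cum:.2f}")
```

Matching on word forms rather than word families is a deliberate simplification; with family-based lists, each inflected or derived form would first be mapped to its headword before the band lookup.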
Table 2.7 Lextutor 1,000–20,000 frequency profile of four text types
Tokens (coverage %)

Frequency level   Graded reader   Novel          Newspaper      Journal
K1                1679 (84.12)    2822 (78.04)   2486 (75.93)   4971 (73.24)
K2                71 (3.56)       269 (7.44)     331 (10.11)    548 (8.07)
K3                34 (1.70)       133 (3.68)     109 (3.33)     187 (2.76)
K4                4 (0.20)        63 (1.74)      55 (1.68)      502 (7.40)
K5                3 (0.15)        37 (1.02)      33 (1.01)      51 (0.75)
K6                                35 (0.97)      26 (0.79)      58 (0.85)
K7                                27 (0.75)      17 (0.52)      42 (0.62)
K8                                16 (0.44)      9 (0.27)       63 (0.93)
K9                                16 (0.44)      11 (0.34)      15 (0.22)
K10                               12 (0.33)      9 (0.27)       11 (0.16)
K11                               14 (0.39)      10 (0.31)      40 (0.59)
K12                               4 (0.11)       2 (0.06)       13 (0.19)
K13                               11 (0.30)      3 (0.09)       4 (0.06)
K14                               4 (0.11)       1 (0.03)       12 (0.18)
K15–K20           see note        see note       see note       see note
Off-List          198 (9.92)      143 (3.95)     165 (5.04)     259 (3.82)
Total             1996 (100)      3616 (100)     3274 (100)     6787 (100)

Note: each text has a few further entries at higher frequency levels whose exact level cannot be recovered from the source layout. Graded reader (above K5): 5 (0.25), 2 (0.10). Novel (above K14): 4 (0.11), 1 (0.03), 3 (0.08), 1 (0.03), 1 (0.03). Newspaper (above K14): 3 (0.09), 3 (0.09), 1 (0.03). Journal (above K14): 5 (0.07), 1 (0.01), 5 (0.07).

2.5.6 Obtaining frequency information
The discussions in this section highlight the importance of considering the effects of frequency in vocabulary research. There are a number of sources of frequency information available. These corpora, concordancing tools, and vocabulary lists are described in detail in the Resources Sections 6.2, 6.3, and 6.4.
2.6 L1 influence on vocabulary learning
As noted in the section above, many of the factors that affect learning difficulty involve the relative similarity/dissimilarity between a learner’s L1 and the target L2. This includes formal aspects like phonemes, graphemes, the suprasegmental system of pitch, stress, and juncture, and the degree of sound/symbol correspondence. Morphological aspects are also important, including inflexional and derivational regularity and complexity, as well as how transparent morphological transformations are. In addition, languages can conceptualize real-world phenomena in different ways, and how similarly two languages parse a concept’s ‘semantic space’ has an effect on learning burden.
Concept 2.2 Crosslinguistic influence
Under the influence of the behaviorist school, most L1 influence was once considered negative, making the learning of an L2 more difficult. The L1 influence was often referred to as negative transfer. However, it was eventually realized that L1 influence can aid, as well as hinder, L2 learning. A new term, crosslinguistic influence, was developed to denote this neutral view of L1 influence on L2 learning. In terms of L2 lexical acquisition, the nature of the L1 influence (whether positive or negative) largely depends on whether there are congruent cognate items in the L1 and L2.
A number of researchers have studied the role of formal similarity in second language learning. Ellis and Beaton (1993) found that the match of L1 and L2 phonological features (i.e. the ease of pronunciation of the L2 words) had a major influence on L2 vocabulary learnability. Likewise, de Groot (2006: 466) concludes that ‘words with a “cognate” translation in the FL [foreign language] (where the FL word to be learned is orthographically and phonologically similar to its L1 equivalent) were learned far better than those with a noncognate translation’. This effect stems not only from the relative similarity/dissimilarity of the forms in the L1 and L2, but also from the way those forms are processed. In her research, Koda (e.g. 1997) finds that learners often transfer their L1 processing routines over to the L2 in their attempt to process the L2 forms, whether those routines are appropriate for the L2 form system or not. This means that L1-L2 formal dissimilarity has a potential double negative effect: both the forms themselves and the underlying routines for processing those forms can be different to varying degrees. The congruence of the morphological systems in the L1 and L2 can also make lexical items easier or more difficult to learn. L2 lexical items with features such as irregularity of plural, gender of inanimate nouns, and noun cases are intrinsically more difficult to learn than items with no such complexity, but this difficulty is exacerbated if there are no corresponding features in the L1 from which the learner can draw analogies. While most second-language vocabulary learning involves attaching a L2 lexical item onto an already known concept (relabelling), there are often times where the L2 conceptualizes that concept differently than in the L1. Cases of this are illustrated in Table 2.8, where the semantic boundaries of tree, wood, and forest are parsed quite differently in three languages. 
Table 2.8 Parsing the semantic space of tree, wood, and forest

English               French   Danish   Swedish
tree                  arbre    træ      träd
wood (material)       bois     træ      trä
wood (small forest)   bois     skov     skog
forest                forêt    skov     skog

(Swan, 1997: 158)

There are four concepts to be realized (tree, the building material derived from trees, a small group of trees standing together, and a larger group of trees). None of the three languages has a different word for each of these concepts, and Danish uses only two words. From this comparison, all other things being equal (which they never are), one would expect that the congruency of the French and English categorizations should facilitate the learning of the words by speakers of the other. Conversely, Danish speakers would find it more difficult learning the French or English words, because in addition to learning the word forms, they also have to learn to make semantic distinctions which do not exist in their own language. Even where the semantic space is parsed in basically the same way in two languages, there are often other differences. Translation ‘equivalents’ may have different types of register marking, such as formality/informality, being technical/non-technical vocabulary, or being more frequent in speech or writing. There is also a good chance they will have different collocations (German opfer bringen [bring sacrifice]; English: make sacrifice). Thus, just as it can be argued that there are no exact synonyms in a language (McCarthy, 1990), there may be few or no translation equivalents that are truly the same. It is impossible to discuss L1 lexical influence without mentioning cognates. Although de Groot (2006) uses the term cognate above to mean words with orthographical and phonological similarity between L1 and L2, the term more commonly refers to lexical items in different languages which are similar because they have descended from a single lexical parent in a common ancestor language, e.g. English five, Latin quinque, and Greek pénte all evolved from the common Indo-European *penkwe (McArthur, 1992: 229).
Research has shown that cognates can be helpful for second-language learners, but learners are also tripped up by false friends (unrelated words which look as though they are cognate but are not in fact: English actual = real or true; Spanish actual = current; French actuel = current). Moreover, cognates can range from having virtually the same meaning (French somptueux = English sumptuous) to being partially deceptive (French expérience = English experience and experiment) to being totally deceptive (French actuel ≠ English actual) (Granger, 1993). Moreover, even if the meaning of cognates is the same across two languages, other characteristics may not be, following the ‘no translation equivalent’ argument above. Therefore researchers must be cautious when using cognates in their research, taking care to establish the degree of relationship between the cognate items.
Quote 2.6 Granger on the usefulness of cognates in L2 language learning
Cognates are both an aid and a barrier to successful L2 vocabulary development. Teachers should therefore seek to find a happy medium between over-reliance on cognates and near-pathological mistrust of them, two attitudes which are equally detrimental to learners’ vocabulary development. (1993: 55)
Jarvis (2000) surveys the sometimes confusing research into lexical transfer from the L1, and concludes that many of the discrepancies have to do with fuzzy definitions and methodology. He proposes a methodological framework for the study of L1 influence which begins with a theory-neutral definition of L1 influence that can be empirically tested in a consistent manner: ‘L1 influence refers to any instance of learner data where a statistically significant correlation (or probability-based relation) is shown to exist between some feature of learners’ IL [interlanguage] performance and their L1 background’ (p. 252). He then lists three potential L1 effects that must be considered in a rigorous investigation of transfer. The first is intra-L1-group homogeneity in learners’ interlanguage performance, i.e. when learners who speak the same L1 behave in a uniform manner when using the L2. The second is inter-L1-group heterogeneity in learners’ interlanguage performance (i.e. when comparable learners of a common L2 who speak different L1s diverge in their L2 performance). The third is the congruence between learners’ L1 and interlanguage performance, where learners’ use of some L2 feature can be shown to parallel their use of a corresponding L1 feature. As part of the methodology, Jarvis also provides a list of outside variables that should ideally be controlled. Consideration of the elements in Jarvis’ methodological framework would go some way towards making L1 transfer research more rigorous and comparable. In sum, it is worthwhile considering the congruence of the L2 target lexical items with the L1s of the participants in a study. Of course, if the participants are from a single L1, then it is more feasible to consider the various crosslinguistic factors discussed above, and these factors may be an important part of the interpretation and discussion of results. If the participant pool consists of mixed L1s, then the above factors may help to explain any differential results between the various L1 groups. With the L1 wielding such a strong influence on second-language vocabulary acquisition (and potentially processing), it is important to control for it as much as possible in a research design and to consider its effects in the interpretation of study results.
2.7 Describing different types of vocabulary
Each lexical item is uniquely suited to its purpose. For example, nice is a very common modifier that can describe a wide variety of nouns, and no other word can replace it in all of those contexts with exactly the same meaning. Likewise, scalpel is a very particular type of medical knife, and its restricted meaning helps to ensure there is no confusion about which kind of cutting utensil a surgeon is asking for. Have a nice day is a frequent greeting, and its use identifies the speaker as a friendly North American. While each of these items has equal intrinsic merit within its own context of use, it is still obvious that they are different kinds of vocabulary. Nice is a high-frequency word, scalpel is a technical medical term, and Have a nice day is a formulaic sequence which is used only in informal spoken discourse. In other words, it is possible to classify these diverse lexical items into different categories. It is often useful to narrow the broad notion of vocabulary down into smaller, more manageable (and identifiable) classifications in vocabulary research, with the following being some of the more common distinctions:
● word class (e.g. nouns, verbs)
● content and function words
● frequency (e.g. high-frequency vocabulary)
● written and spoken vocabulary
● formulaic sequences
● general vocabulary
● technical vocabulary
● academic vocabulary.
The first four distinctions have been covered already in previous sections, and formulaic language is such a huge topic that it merits its own upcoming chapter (Chapter 3), so this section will focus on the last three distinctions.
General vocabulary
Although general vocabulary is not really a technical term, it is sometimes used to describe the higher-frequency vocabulary necessary to achieve a basic functionality with a language. There are no agreed limits for which vocabulary this might include, as the notion itself is rather vague. The term is most often used in connection with discussions of the roughly 2,000 words in the General Service List (GSL), so named because West (1953) and colleagues2 wished to create a general service list of words, rather than one for any specific set of purposes. The criteria used to compile the GSL include the following:
1. Word frequency
2. Structural value (all structural words included)
3. Universality (words likely to cause offence locally excluded)
4. Subject range (no specialist items)
5. Definition words (for dictionary-making etc.)
6. Word-building capacity
7. Style (‘colloquial’ or slang words excluded)
(Howatt, 2004: 289)
The list makes it clear that the GSL is not a frequency list, although frequency was a key factor. However, it was moderated by a number of other factors designed to ensure that the words would be useful in a pedagogical context, such as including words which could be used to define a large number of other words. West also wanted to exclude words whose function could be covered by other words in the list, which resulted in the inclusion of some low-frequency words and the exclusion of some high-frequency words. In fact, the selection criteria largely revolved around the purpose of creating simple reading materials, which was an interest for many of the scholars working on the list. This is one reason why the GSL is limited as a list of general English words. For example, it tends to neglect vocabulary from colloquial spoken English, and lacks many of the words necessary for everyday situations (Howatt, 2004). For these reasons, it is necessary to be cautious in using the GSL as a representative of general English vocabulary. It is now very old: it was published in its final form in 1953, but the vocabulary selection began in 1935. There are some obvious examples in the GSL of ageing words which have lost much of their importance over the years (plough, crown), while some important newer words are not included (the GSL includes telegraph, but not television or computer). On the other hand, the most frequent and important words in English tend not to change much over time, and the majority of GSL words are still essential in English (e.g. a random dip into the first ten ‘D’ words shows they all have retained their usefulness (damage, damp, dance, danger, dare, dark, date, daughter, day, dead)). The coverage of the GSL is still around 75% of the running words in non-fiction texts, and around 90% of the running words in fiction (Nation and Hwang, 1995). There have been several attempts to revise the GSL (see Section 6.4), and these may prove more useful to many researchers than the original. In the end, a researcher must look at the age and selection criteria of the GSL (or its revisions) and decide if it is an appropriate indicator of general vocabulary for their own purposes. If the research purpose is pedagogical, the GSL may still be of value, but if the researcher needs frequency information, a modern word count (e.g. Leech et al., 2001) will almost certainly prove more suitable.
Technical vocabulary
Technical words or phrases are those which are recognizably specific to a particular field. They range from items which are unique to the field and do not occur elsewhere (computing: pixel) to items that have the same form as high-frequency items but specialized meanings within a field (computing: memory). Technical items are reasonably common within a field, but not so common elsewhere, and differ from subject area to subject area. Technical vocabulary is essential to understanding discourse in a field, and can cover 10% or more of the running words in a text from that field (Sutarsyah, Nation, and Kennedy, 1994). There may even be formal technical terms and informal technical equivalents. There are two main ways of identifying the technical vocabulary in a field. The first is through the intuitions of experts in the field. The results of this method will depend on the knowledge of the experts, how systematic they are in producing their technical lists, and how obvious the technical vocabulary in the field is to identify. The resulting lists from this method can be highly variable, as different experts may have quite different ideas of what the key vocabulary of a field is. However, if a large number of experts are consulted, and their consensus taken, the resulting list can be a useful indication of the technical vocabulary in a field. I would venture that this method is used in the compilation of most technical dictionaries. The other approach is analyzing a corpus of technical discourse, and extracting the technical vocabulary. This usually involves first creating a frequency list from the corpus and then eliminating the high-frequency words which will be common to all subject areas. Formerly, the GSL was used for this deletion, but now it is probably better to use current frequency counts. Next, the researcher looks for items which have a wide range (occur across many texts) and reasonable frequency of occurrence. Range is important to ensure that items that are very common only to a single author or small subfield are not included on the final list. The AWL is a good example of this methodology (Coxhead, 2000). Typically, the technical vocabulary of a field is fairly restricted, with Nation (2001) reporting that technical dictionaries will usually contain about 1,000 entries. However, not all technical vocabulary is the same. There are degrees of ‘technicalness’ depending on how restricted a word is to a particular area. Nation (2001: 198–199) gives four categories of technical vocabulary, along with examples from the field of Applied Linguistics for each:
● Category 1. The word form appears rarely, if at all, outside this particular field (morpheme, hapax legomena, lemma)
● Category 2. The word form is used both inside and outside this particular field, but not with the same meaning (sense, reference, type, token)
● Category 3. The word form is used both inside and outside this particular field, but the majority of its uses with a particular meaning, though not all, are in this field. The specialized meaning it has in this field is readily accessible through its meaning outside the field (range, frequency)
● Category 4. The word form is more common in this field than elsewhere. There is little or no specialization of meaning, though someone knowledgeable in the field would have a more precise idea of its meaning (word, meaning).
Technical vocabulary is usually learned in the course of the study of a particular field, and may be easier than non-technical vocabulary. Cohen, Glasman, Rosenbaum-Cohen, Ferrara, and Fine (1988) found that nontechnical vocabulary poses more difficulty to EFL learners than technical vocabulary, since technical vocabulary has fixed meanings which can be more easily learned. Moreover, these terms are often defined in the content classroom. For example, Flowerdew (1992) found that a term was defined about every two minutes on average in a foundation science course given to Omani students attending an English-medium university. For native speakers of Romance languages, English technical vocabulary may be relatively easy to learn simply because it often derives from Latin origins.
Academic vocabulary
Academic texts contain high-frequency vocabulary, and technical vocabulary pertinent to the field in question. However, they also contain a considerable amount of non-high-frequency vocabulary which is common across academic disciplines. This vocabulary is necessary to express ideas in various disciplines, such as insert, orient, ratio, and technique. This ‘support’ vocabulary is usually termed academic vocabulary. Typically these words make up about 9–10% of the running words in an academic text, and so are very important for people learning or working in academic areas. Early lists of academic vocabulary were compiled by either manually extracting words from small academic corpora, or by noting which words students annotated in their textbooks. Results from four early studies into academic vocabulary were combined into the University Word List (UWL) (Xue and Nation, 1984), which contained over 800 words and covered 8.5% of words in academic texts. This was a big improvement over the component studies, but suffered from the fact that it was an amalgam of existing lists, and so lacked consistent selection principles.
Coxhead (2000) took advantage of computing power to create a new word list from scratch. She first collected a large and diverse corpus of academic texts which totalled 3.5 million words (70,377 types) from 414 academic texts written by more than 400 authors, balanced across four broad areas of about 875,000 words each: arts, commerce, law, and science. Each of these areas was further broken down into seven subject areas, e.g. Arts = Education, History, Linguistics, Philosophy, Politics, Psychology, and Sociology. Coxhead then created a frequency list from the corpus and
eliminated the GSL words. Words were selected from the remaining list on the basis of range (occurred at least 10 times in each of the four broad areas, and in 15 or more of the 28 subject areas) and frequency (occurred at least 100 times in the academic corpus). The resulting Academic Word List (AWL) contains 570 word families and covers 10% of the academic corpus. Thus, the AWL has fewer words than the UWL, but has greater coverage. The AWL is the best list of academic vocabulary currently available, and is widely used in vocabulary research. It has also inspired a wide range of pedagogic materials, including textbooks from several publishers. However, it is also important to recognize the limits of the AWL. Hyland and Tse (2007) found that although the AWL covered 10.6% of their 3.3 million word academic corpus (different from Coxhead’s), the individual items on the list occurred and behaved in different ways across the various subject areas in terms of range, frequency, collocation, and meaning. They argue that this varied usage undermines the notion of an academic vocabulary which is general in nature, and suggest focusing on the more restricted academic vocabulary which occurs in more contextualized, discipline-specific environments.
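The range and frequency criteria described for the AWL can be read as a simple filter over per-subject counts. The Python sketch below is only an illustration of that reading, not Coxhead's actual procedure. The data layout (`counts` keyed by area and subject), the function name, the subject labels, and the toy figures are all assumptions; only the thresholds (10 per broad area, 15 of 28 subject areas, 100 overall) come from the description above.

```python
AREAS = ("arts", "commerce", "law", "science")

def select_academic_words(counts, gsl_words, areas=AREAS,
                          min_per_area=10, min_subjects=15, min_total=100):
    """Filter candidate words using range and frequency thresholds.

    `counts` maps word -> {(area, subject): occurrences}. This layout is an
    assumption made for the sketch; the AWL itself is built on word families.
    """
    selected = []
    for word, per_subject in counts.items():
        if word in gsl_words:  # high-frequency GSL words are excluded first
            continue
        per_area = {area: 0 for area in areas}
        for (area, _subject), n in per_subject.items():
            per_area[area] = per_area.get(area, 0) + n
        in_every_area = all(per_area[a] >= min_per_area for a in areas)
        subjects_used = sum(1 for n in per_subject.values() if n > 0)
        total = sum(per_subject.values())
        if in_every_area and subjects_used >= min_subjects and total >= min_total:
            selected.append(word)
    return sorted(selected)

# Hypothetical toy corpus: 4 areas x 7 subjects, with made-up counts.
cells = [(area, f"subject{i}") for area in AREAS for i in range(1, 8)]
toy_counts = {
    "analyse": {cell: 5 for cell in cells},                      # wide range
    "tort": {("law", f"subject{i}"): 20 for i in range(1, 8)},  # law only
    "the": {cell: 500 for cell in cells},                        # GSL word
}
print(select_academic_words(toy_counts, gsl_words={"the"}))
```

With these invented counts only "analyse" survives: "tort" is frequent enough overall but fails the range criterion, and "the" is removed as a GSL word.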
Quote 2.7
Coxhead on the uses of the AWL
An academic word list should play a crucial role in setting vocabulary goals for language courses, guiding learners in their independent study, and informing course and material designers in selecting texts and developing learning activities. (2000: 214)
The full AWL can be accessed from Coxhead (2000), as well as a number of websites (Section 6.5), many of which include pedagogic tools and material based on the list.
2.8 Receptive and productive mastery
From the discussion so far, we see that vocabulary knowledge is multifaceted, and contains a number of interrelated, though separable, aspects. The word-knowledge framework helps to illustrate the range of these vocabulary knowledge aspects, but as Meara and Wolter (2004) point out, its comprehensiveness is also its weakness. It is virtually impossible to measure all of the word-knowledge aspects for words for at least three reasons. The first is that many of the word knowledge aspects do not have accepted methods of measurement. While there are numerous formats for measuring a word’s meaning, anyone attempting to measure a person’s intuitions about that word’s register characteristics, for example, will have to develop their own new methodology. A second reason has to do
with time. A test battery that measured all of the word-knowledge aspects for words would be extremely unwieldy and time consuming. Although such a battery might be practical in a research context, where one has the luxury of focusing on target words for an extended period of time, it would be totally impractical for any kind of pedagogical purpose. It would simply take too long, and only a very limited number of words could ever be covered. A third reason is related to the difficulty of controlling for cross-test effects. The various types of word knowledge are interrelated, and so organizing the different word-knowledge tests in a test battery so that answers to one do not affect the others is not always straightforward. For these reasons, many pedagogic and research purposes require a much simpler conceptualization of vocabulary depth of knowledge. One of the most common is the distinction between receptive and productive knowledge (sometimes referred to as passive and active mastery). This dichotomy has great ecological validity, as virtually every language teacher will have experience of learners understanding lexical items when listening or reading, but not being able to produce those items in their speech or writing. Unsurprisingly, studies have generally shown that learners are able to demonstrate more receptive than productive knowledge, but the exact relationship between the two is less than clear. Melka (1997) surveyed several studies which claim the difference is rather small; one estimates that 92% of receptive vocabulary is known productively. Takala (1984) suggests the figure may be even higher. Other studies suggest that there is a major gap between the two: Laufer (2005a) found that only 16% of receptive vocabulary was known productively at the 5,000 frequency level, and 35% at the 2,000 level. Other studies conclude that around one-half to three-quarters of receptive vocabulary is known productively (Fan, 2000; Laufer and Paribakht, 1998).
The inconsistency of these figures highlights the difficulties and confusion involved in dealing with the receptive/productive issue. One problem is the lack of an accepted conceptualization of what receptive and productive mastery of vocabulary entails. The second problem concerns measurement issues, where the productive/receptive results are highly dependent on the types of tests used (Laufer and Goldstein, 2004). Because measurement problems often stem from an unclear conceptualization of the construct to be measured, let us look at the theoretical issue first. The distinction between receptive and productive mastery has been incorporated into theoretical accounts of vocabulary knowledge in various ways. For example, Henriksen (1999) lists three components of vocabulary knowledge, of which receptive/productive mastery is one:
1. partial → precise knowledge of word meaning
2. depth of knowledge of the different word knowledge aspects
3. receptive knowledge → productive knowledge.
While descriptions like this are useful in pointing out that receptive/productive mastery is an important component of overall vocabulary knowledge, they do not actually tell us much about the receptive/productive relationship itself (although Henriksen does hypothesize about this relationship later in her article). Melka (1997) suggests that receptive and productive mastery lie on a continuum, and that knowledge gradually shifts from receptive mastery towards productive mastery as more is learned about the lexical item. If this is true, Read (2000) notes that the problem lies in determining the threshold where receptive mastery turns into productive mastery in this incremental process. He poses the essential question: ‘Is there a certain minimum amount of word knowledge that is required before productive use is possible?’ (p. 154). To date there has been little research to inform on this key issue. Most research has compared the ratios between receptive and productive vocabulary, but very little has explored the type and amount of lexical knowledge necessary to enable productive use of lexical items. From a word-knowledge perspective, the minimum would appear to be a form-meaning link, with the learner being capable of producing either the verbal or written form. Meara (1997) proposes a different possibility, that the move from receptive to productive mastery is the result of a fundamental change in the way a lexical item is integrated into the mental lexicon. Rather than being the ends of a continuum of mastery, he takes a lexical organization perspective and suggests that receptive and productive vocabulary may reflect differing types of connection between lexical items. He wonders whether productively-known lexical items are those which can be activated by their links to other items in a lexical network. Thus when lexical items connected to a ‘productive’ item become active, it somehow ‘lights up’ the item, and it becomes accessible for the person to use.
Conversely, receptively-known items have no ‘incoming’ links from the lexicon, and so cannot be recalled unless activated by some outside stimulus. This is shown in Figure 2.2, where the word (W) is at a receptive state. When it is read or heard and understood, it can be linked to the rest of the mental lexicon (L) and used. However, the lexicon itself cannot activate the word, because there are no links in that direction, and so the word is not at a productive state, as it cannot be activated by other words in the lexicon. As there is no natural progression from a receptive to a productive state in this view, it has the potential to explain how students can learn some words productively with very little input over a short period of time (e.g. as in studies using word lists). It can also explain why words sometimes seem to be known productively and at other times do not: if the words in the lexicon connected to the item are activated, the item will be accessible. However, if the particular words connected to the item are not activated, even though other parts of the lexicon are, the item will not be recallable. On the assumption that not all lexical items in the lexicon are connected
Figure 2.2 Receptive knowledge in a lexical organization framework (Meara, 1997: 120).
to each other, then productive mastery is largely a matter of the number of other words the item is connected to, since a greater number of total connections provide a greater chance of a connection to an active word in the lexicon. This view has parallels with connectionism, and measurement of the strength of productive mastery would presumably require determining the relative number of links to other members of the lexicon. (See Section 2.10 for matrix models of acquisition based on this lexical organization perspective.) Thinking of vocabulary knowledge from a word-knowledge perspective makes receptive/productive knowledge even more complex, as it is obvious that learners do not acquire all of the word knowledge components in a uniform manner. Rather, if one follows the ‘continuum’ metaphor, then each of the word-knowledge aspects will lie at various points upon a receptive-productive cline at any point in time. This was certainly the case for me in Japan, where I had a pretty good productive mastery of the spoken form and meaning of many Japanese words, but only a tenuous receptive mastery of a very few of the Chinese written characters. Extrapolating from a range of vocabulary studies, it appears that some word-knowledge aspects will reach a productive level of mastery sooner than others. For example, I found that my advanced English learners generally could produce the spelling of the base form of target words, but often could not produce some of that word’s derivative forms and meaning senses (Schmitt, 1998a). In general, one would expect that the ‘contextual’ word-knowledge aspects, like collocation and register, are especially likely to lag behind in reaching a productive state, as this type of knowledge requires a great deal of exposure to acquire. However, little is known about the relative progression along the receptive/productive continuum for the various word-knowledge aspects, as research into this topic requires both a multi-component approach and
receptive and productive measurement, and few studies have studied vocabulary acquisition in this level of detail. (For an exception, see a discussion of Webb (2005) below.) Highlighting the problem of measurement, Read (2000) points out how receptive and productive vocabulary has been measured in different ways. Discrete/selective/context-independent test formats tend to focus on recognition and recall. Recognition is when ‘test-takers are presented with the target word and are asked to show that they understand its meaning, whereas in the case of recall they are provided with some stimulus designed to elicit the target word from memory’ (2000: 155). This is often tested with L1-L2 translations. For recognition, the L2 form would be presented and its meaning shown with an L1 translation. Conversely, recall involves an L1 stimulus prompting the meaning, and then the participant remembering and producing the L2 form. In these translations, meaning is related to the L1 item, and so assumed to be fully and automatically known. Thus these translations do not really test meaning; it is already in place through the L1. What is being measured is the form-meaning link of the L2 item. To measure this, the translations focus on the L2 form (the meaning is already known). Thus in recognition, the key is recognizing the form of the L2 item, while in recall, it is the ability to recall and produce the L2 form. Read contrasts recall and recognition with comprehension and use, which are more typically used for embedded/comprehensive/context-dependent vocabulary. For example, comprehension could entail learners reading a passage and then being tested on how well they understood the words in the text. Use could be measured by analyzing the vocabulary produced in a task designed to elicit target lexical items (e.g. describing a picture). These four ways of measuring the mastery of vocabulary have often been confounded and used interchangeably. 
The problem is that different measures lead to different scores, which has resulted in inconsistent research findings, which then do less than they should to clarify receptive/productive issues. Waring (1999) found that scores of receptive and productive vocabulary varied considerably depending on the type of measurement used. In extreme cases, a difficult receptive test format could even lead to lower scores than a relatively easy productive one. It obviously would be extremely useful for the field to develop more consistent ways of measuring and reporting receptive and productive mastery.
Quote 2.8
Waring on receptive and productive mastery of vocabulary
The notions of Receptive and Productive vocabulary are part of the folklore surrounding vocabulary acquisition. The distinction between them is rarely questioned. One major hurdle that the researcher interested in Receptive and
Productive vocabulary must overcome and tiptoe through is the definition, description and categorization of these notions we have come to blithely accept as a ‘given’ ... Rarely do we see researchers or theorists working within pedagogy or language acquisition get down the nitty gritty of what is actually meant by Receptive vocabulary and by Productive vocabulary or even the relationship between the two. These notions on closer examination are extremely difficult to pin down, despite the average teacher and language researcher being able to come up with a ‘good enough’ definition or description. (1999: Section 1.1)
One step in this direction for measuring the form-meaning link is Laufer and Goldstein’s (2004) work in developing a computer-adaptive test of vocabulary knowledge (CATSS). They developed a categorization of vocabulary knowledge, based on the relationships between supplying the form for a given meaning versus supplying the meaning for a given form, and being able to recall versus only being able to recognize (whether form or meaning). This results in four possible degrees of form-meaning knowledge (note that the authors prefer the terms active and passive to productive and receptive):
                                Recall                  Recognition
Active (retrieval of form)      1. Supply the L2 word   3. Select the L2 word
Passive (retrieval of meaning)  2. Supply the L1 word   4. Select the L1 word
(Laufer and Goldstein, 2004: 407)
The item format involves various permutations of L1/L2 translations. They are illustrated below in tests for German-speaking learners of English (German hund = English dog).
1. Active recall: d ——— hund
Participants have to provide the word form. The first letter is given to minimize the suppliance of other English words with the same meaning.
2. Passive recall: dog h ———
Participants are required to provide the L1 equivalent, which demonstrates understanding of the meaning of the L2 word.
3. Active recognition: hund a. cat b. dog c. mouse d. bird
Laufer and Goldstein argue that recognizing word form once meaning is known can be considered as active. This is debatable, and using that term
for this test format is probably something of a misnomer. However, the particular terminology used is not so important. What matters is that Laufer and Goldstein have given this particular meaning-form test format a standard descriptor which can be consistently used.
4. Passive recognition: dog a. katze b. hund c. maus d. vogel
Here the L2 word is given, and the participant must recognize its form and select the L1 word with the same meaning. Laufer and Goldstein tested 435 high school and university L2 learners with all four formats. They found evidence that the four categories form a very reliable hierarchy of difficulty, at least for the higher-level students they studied (> = more difficult than):
active recall > passive recall > active recognition > passive recognition
However, the results of Laufer, Elder, Hill, and Congdon (2004) distinguished three rather than four different modalities, as the results for the two recognition categories were not significantly different from one another. The authors suggested that picking the correct definition of a word may not necessarily be easier than choosing the word form that matches a given set of definitions. Also, it must be noted that most of the students studied were relatively high-level, and it remains to be demonstrated that the implicational scaling of the categories also works with lower-level students. However, despite these limitations, Laufer and colleagues have provided four categories of meaning-form link, and have gone some way towards empirically demonstrating their relative placement within a hierarchy. Their research studied Hebrew and Arabic learners of English in Israel (Laufer and Goldstein, 2004), and intermediate to advanced nonnatives in New Zealand and Australia (Laufer et al., 2004), so the hierarchy seems to work with a variety of learners. As expected, the production of the L2 word from an L1 meaning prompt (active recall) is the most difficult test format, and presumably represents the highest degree of form-meaning knowledge strength. Likewise the multiple-choice recognition of meaning (passive recognition) appears to be the easiest, and represents the minimum form-meaning strength.
The advantage of using these categories is the avoidance of the type of confusion Read (2000) points out in his discussion of receptive/productive mastery. The categories are worth using when possible in order to clearly state what aspects of receptive/productive knowledge are being tapped into, and to make form-meaning research more comparable across studies. Even
if a decision is made not to use such categories, the underlying distinctions between recognition and recall, and meaning and form, need to be taken on board. However, Laufer and Goldstein’s categories do have a limitation. The translation tasks would not work well in mixed L1 participant groups, where it would be infeasible to make a separate test version for each L1, and even then the different test versions may not be equivalent. Of course, there are other methods for measuring the form-meaning link besides using the L1, and these could be substituted for translation tasks. Laufer and Goldstein’s distinctions are important, but I find that their terminology tends to confuse both my students and myself. I think the distinctions could be more useful if they carried more transparent descriptions. To do this, let us look at the four distinctions in detail, to see what is given, and what is being elicited in tests. Active recall refers to the case when the meaning is given and the L2 form must be produced. Active recognition is when meaning is given and the form must be recognized (i.e. usually selected from a number of options). In passive recall, the form is given, and the meaning must be produced. Finally, passive recognition refers to when the form is given, and the meaning must be recognized (again, usually from options). It seems to me that what is being addressed here is not so much an active/passive distinction, but rather which elements of word-knowledge are given and which are being elicited. Focusing on form and meaning, we come up with a relabelled version of Laufer and Goldstein’s table:
Word knowledge given   Word-knowledge tested
                       Recall                                                    Recognition
Meaning                Form recall (supply the L2 item)                          Form recognition (select the L2 item)
Form                   Meaning recall (supply definition/L1 translation, etc.)   Meaning recognition (select definition/L1 translation, etc.)
Of course, this is the same table as before, but with terms which I feel are much more transparent. Because of the transparency, I advocate the use of form recall, form recognition, meaning recall, and meaning recognition to cover Laufer and Goldstein’s categories. If we match these labels with the relevant test formats, the construct being measured is much more obvious, both in terms of what aspect is required, and the degree of mastery (recall versus recognition):
1. Form recall: d ——— hund
2. Meaning recall: dog h ———
3. Form recognition: hund a. cat b. dog c. mouse d. bird
4. Meaning recognition: dog a. katze b. hund c. maus d. vogel
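The relabelling above follows from two questions: which element is given (meaning or form), and how the learner must respond (recall or recognition). The short Python sketch below expresses that logic as a small data structure and maps Laufer and Goldstein's terms onto the more transparent labels. The class name `ItemFormat` and the dictionary name are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ItemFormat:
    given: str     # what the test-taker is shown: "meaning" or "form"
    response: str  # how they answer: "recall" (supply) or "recognition" (select)

    @property
    def elicited(self):
        # The element that is not given is the one being tested.
        return "form" if self.given == "meaning" else "meaning"

    @property
    def label(self):
        return f"{self.elicited} {self.response}"

# Laufer and Goldstein's (2004) terms mapped onto what is given and required.
LAUFER_GOLDSTEIN = {
    "active recall": ItemFormat(given="meaning", response="recall"),
    "passive recall": ItemFormat(given="form", response="recall"),
    "active recognition": ItemFormat(given="meaning", response="recognition"),
    "passive recognition": ItemFormat(given="form", response="recognition"),
}

for old_term, fmt in LAUFER_GOLDSTEIN.items():
    print(f"{old_term:20} -> {fmt.label}")
```

Deriving the label from the two underlying choices, rather than storing it, keeps the construct being measured explicit for each test format.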
It is also interesting to consider how these terms fit with the more general notions of receptive and productive knowledge. The above terms cover permutations of the form-meaning link, but of course there is much potential vocabulary knowledge beyond this link. A person will presumably use all of the various vocabulary knowledge they possess for a lexical item when engaging with it in real life, i.e. recognizing and understanding vocabulary when listening or reading, or recalling and producing vocabulary when speaking or writing. These skills-based usages of vocabulary are what is commonly thought of when using the terms receptive and productive, and it makes sense to reserve them for this purpose. Thus, receptive knowledge entails knowing a lexical item well enough to extract communicative value from speech or writing. Productive knowledge involves knowing a lexical item well enough to produce it when it is needed to encode communicative content in speech or writing. Following this reasoning, receptive/productive knowledge of vocabulary would be usage-based definitions of mastery, and would presumably need to be measured with skill-based instruments. Conversely, the form-meaning level of mastery discussed above can be measured in isolation, as it is in Laufer and Goldstein’s test formats. Even so, form recall and meaning recall can be related to skills-based productive knowledge and receptive knowledge, respectively, and presumably are a necessary component of this knowledge. We might consider meaning recall as the first step along the road to receptive mastery, and form recall as the first step in productive mastery. We do not yet understand the process, but one could speculate that incremental vocabulary acquisition might proceed something like this: a learner would first establish meaning recall, then perhaps begin building up other aspects of vocabulary knowledge (e.g. grammatical or morphological) which would facilitate receptive recognition of the word when listening or reading. Eventually, the learner might achieve a form recall level of knowledge, but would require more time to fill in the ‘contextualized’ elements of word knowledge (e.g. collocation, register) to a point where the lexical item could be confidently used in an appropriate manner in a variety of spoken and written contexts. On the other hand, form recognition and meaning recognition probably only come into play in reference look-up situations in the real world. If a learner looks up a concept in a thesaurus or reference book like the Longman Language Activator (1993), this might be considered form recognition. Likewise, when
a learner looks up a polysemous word in a dictionary, and considers the various meaning senses, this resembles meaning recognition. However, this does not happen in interpersonal communication, as people are not given a choice of options of form or meaning. Rather they are expected to have the form-meaning link established at meaning recall at least. Thus, form recognition and meaning recognition levels of knowledge are useful in measuring the initial stages of vocabulary acquisition, but have limited utility in describing usage-based receptive and productive mastery. Although not using terminology in the way I advocate above, Webb (2005) provides an exceptional example of a test of vocabulary depth, which provided a very rich description of the amount and type of receptive and productive vocabulary acquisition which took place. Sixty-six Japanese EFL learners tried to learn ten target nonwords.3 Two tasks were contrasted: (1) reading three example sentences with an attached L1 translation for the target words, and (2) receiving the L1 translation and writing a sentence using the words. There was an immediate test to measure learning, but the point of interest here is its comprehensiveness. It contained ten components measuring both receptive/productive mastery and a number of word knowledge aspects:4
Test 1: Receptive knowledge of orthography
Test 2: Productive knowledge of orthography
Test 3: Receptive knowledge of meaning and form
Test 4: Productive knowledge of meaning and form
Test 5: Receptive knowledge of grammatical functions
Test 6: Productive knowledge of grammatical functions
Test 7: Receptive knowledge of syntax
Test 8: Productive knowledge of syntax
Test 9: Receptive knowledge of association
Test 10: Productive knowledge of association.
Both tasks led to considerable learning in a short time.
When the same amount of time was spent on both tasks, the reading task was superior, but when the allotted time on tasks depended on the amount of time needed for completion (with the writing task requiring more time), the writing task was more effective. Moreover, it seems that productive learning is superior to receptive learning, not only in developing productive knowledge, but also in producing larger gains in receptive knowledge. In terms of classroom practice, it appears that writing a sentence (productive task) might be better than reading three sentences (receptive task) for facilitating both receptive and productive knowledge, as long as adequate time is available.
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
88
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_03_cha02.indd 88
6/9/2010 1:59:18 PM
Issues of Vocabulary Acquisition and Use
89
The experiments [in his 2005 study] also highlight the importance of using multiple tests to measure vocabulary gains. Many vocabulary acquisition studies have measured only one aspect of knowledge – meaning – with only one test. Experiment 1 showed that no significant differences would have been found between the groups if only a receptive measure of meaning had been used. However, there were significant differences on four of the five productive tests and one of the receptive tests; this indicates that using only receptive or productive tests to measure learning might provide misleading results. Using receptive and productive tests to measure an aspect of knowledge and testing multiple aspects of knowledge may give a much more accurate assessment of the degree and type of learning that has occurred. (2005: 504)
In sum, future research needs to be clearer about what facets of recognition/recall and receptive/productive mastery are being addressed in a study. There is a good case for using both receptive and productive measures of vocabulary in lexical research when possible. If, for practical or other reasons, a researcher chooses to use only a receptive or productive measure, then it should be made plain in the interpretation and discussion what this does (and does not) tell us about overall vocabulary knowledge. A great many vocabulary studies have used a meaning recognition test format. With this being the case, the strength of the vocabulary acquisition and knowledge they report lies at the very beginning stages of learning in terms of depth of knowledge, and probably in terms of automaticity and lexical integration as well. However, this is seldom discussed, with studies typically referring to the target items as being ‘learned’. While this is true in the sense of the form-meaning link being established in an initial manner, it is necessary not only to discuss the learning that has occurred, but also to discuss the strength of that learning.
2.9
Vocabulary learning strategies/self-regulating behavior
Research into the area of language strategies began in earnest in the 1970s as part of the movement away from a predominantly teaching-oriented perspective, to one which included interest in how the actions of learners might affect their acquisition of language. By the 1990s, a number of studies on vocabulary learning strategies had been carried out. These studies showed that many learners do use strategies for learning vocabulary, probably because learning individual lexical items is more manageable than strategically tackling larger, more holistic elements of language proficiency like
Quote 2.9 Webb on measuring vocabulary knowledge more comprehensively
the four skills or grammatical knowledge. Chamot (1987) found that high school ESL learners reported more strategy use for vocabulary learning than for any other language learning activity, including listening comprehension, oral presentation, and social communication. A number of large scale studies identified the most frequently used vocabulary learning strategies, although the individual strategies they included in their surveys tended to vary considerably. For example, Table 2.9 shows the top 10 most frequently used strategies from two of the larger studies (Gu and Johnson, 1996, N = 850; Schmitt, 1997, N = 600). Other studies focused on the category of strategies which were used. An example of this approach is Fan (2003), illustrated in Table 2.10. Most studies (like the three illustrated above) tended to look at vocabulary learning strategies as discrete phenomena, but this approach fell out of favor for reasons discussed by Tseng, Dörnyei, and Schmitt (2006). The first relates to the diverse conceptualizations of ‘learning strategies’. There has
never been coherent agreement on the defining criteria for language learning strategies, including whether they should be regarded as observable behaviors, inner mental operations, or both. This is evident in Gu and Johnson’s list in Table 2.9, where beliefs are mixed together with learning strategies. With this definitional confusion, it was difficult to confidently distinguish strategic learning from ‘ordinary’ learning. There is also the argument that an activity becomes strategic when it is particularly appropriate for the individual learner, in contrast to general learning activities which a student may find less helpful. Accordingly, learners engage in strategic learning if they exert purposeful effort to select, and then pursue, learning procedures that they believe will increase their individual learning effectiveness. This, however, means that learning strategies conceptualized in this vein can only be defined relative to a particular person, because a specific learning activity may be strategic for one learner and non-strategic for another. In other words, it is not what learners do that makes them strategic learners, but rather the fact that they put creative effort into trying to improve their own learning. This is an important shift from focusing on the product – the actual techniques employed – to the self-regulatory process itself and the specific learner capacity underlying it.

Table 2.9 Top 10 vocabulary strategies of L2 English learners

Gu and Johnson (1996)                                        M (a)
 1. Beliefs: Learn vocabulary and put it to use              5.74
 2. Dictionaries: Use for comprehension                      4.97
 3. Beliefs: Acquire vocabulary in context                   4.94
 4. Dictionaries: Use extended dictionary strategies         4.82
 5. Guessing strategies: Use wider context                   4.60
 6. Metacognitive: Self initiation                           4.58
 7. Dictionaries: Looking up strategies                      4.55
 8. Guessing strategies: Use immediate context               4.47
 9. Note taking strategies: Usage-oriented note taking       4.27
10. Metacognitive: Selective attention                       4.23

Schmitt (1997)                                               % (b)
 1. Bilingual dictionary                                     85
 2. Verbal repetition                                        76
 3. Written repetition                                       76
 4. Study the spelling                                       74
 5. Guess from textual context                               74
 6. Ask classmates for meaning                               73
 7. Say new word aloud when studying                         69
 8. Take notes in class                                      64
 9. Study the sound of a word                                60
10. Word lists                                               54

(a) 1 = the strategy/belief was extremely unlikely to be used/believed. 7 = the strategy/belief was extremely likely to be used/believed.
(b) Percentage of respondents reporting that they used the strategy.

Table 2.10 Mean scores in frequency of use by nine categories

Category               M (a)
Guessing               3.54
Known words (b)        3.51
Analysis (c)           3.25
Dictionary             3.22
Sources (d)            3.07
Repetition (c)         3.04
Grouping (c)           2.54
Association (c)        2.51
Management (e)         2.51

(a) 1 = never use, 5 = very often use.
(b) Using known words as part of learning, e.g. revisiting words recently learned.
(c) Different strategies for establishing meaning.
(d) Replaces the social/affective category.
(e) Metacognitive strategies.

The second problem noted by Tseng et al. concerns the measurement of strategies. Strategy use has typically been measured by self-report questionnaires in the past, since strategic learning is driven by mental processes that do not often lend themselves to direct observation and, therefore, for an accurate assessment of the extent of their functioning, we need to draw on the learners’ own accounts. The self-report questionnaires were based on the
assumption that strategy use and strategic learning are related to an underlying trait because items ask respondents to generalize their actions across situations rather than referencing singular and specific learning events (Winne and Perry, 2000). However, in practice most questionnaire scales do not load on a single trait. To illustrate this, let us consider the ‘Motivated Strategies for Learning Questionnaire’ (MSLQ), developed at the University of Michigan by Paul Pintrich and his colleagues (Pintrich, Smith, Garcia, and McKeachie, 1991). The MSLQ is aimed at college students and, as the name of the instrument indicates, the items cover two broad areas, motivation and learning strategies. The Learning Strategies category includes 50 items, each using a seven-point Likert scale anchored by ‘Not at all true of me’ (1) and ‘Very true of me’ (7), and is divided into two sections: (a) ‘Cognitive and metacognitive strategies’, which includes subscales labelled rehearsal, elaboration, organization, critical thinking, and metacognitive self-regulation; and (b) ‘Resource management strategies’, which includes the subscales of time and study environment, effort regulation, peer learning, and help seeking. All these subscales are cumulative in the sense that composite subscale scores are formed by computing the means of the individual item scores in a subscale. Let us compare this to the ‘Strategy Inventory for Language Learning’ (SILL) (Oxford, 1990), a frequently used instrument for assessing general language learning strategy use, but the points made will hold true for many vocabulary strategy studies as well. The SILL consists of six scales, including ‘Remembering more effectively’ (memory strategies) and ‘Using your mental processes’ (cognitive strategies). Scale scores are obtained, similarly to the MSLQ, by computing the average of the item scores within a scale. 
The SILL items all involve five-point rating scales ranging from ‘Never or almost never true of me’ to ‘Always or almost always true of me’. At first sight, these scales are similar to the scales used in the MSLQ discussed above, but a closer look reveals two fundamental differences. First, although both scale types use the term ‘true of me’, the MSLQ scales range from ‘not at all’ to ‘very’ whereas the SILL scales range from ‘never or almost never’ to ‘always or almost always’. Second, the items themselves are of a different nature. The items in the MSLQ are general declarations or conditional relations focusing on general and prominent facets of the learning process (i.e. when doing this ... I try to ...). On the other hand, the SILL items are more specific, each one more or less corresponding to a language learning strategy. Together, these two features result in a major difference in the psychometric character of the two inventories. The items in the MSLQ scale tap into general trends and inclinations and can therefore be assumed to be in a linear relationship with some corresponding underlying learner trait. This is further enhanced by the rating scales asking about the extent of the correspondence between the item and the learner, answered by marking a point on a continuum between ‘not at all’ and ‘very’. Thus, every attempt
has been made to make the items cumulative, which is why the scale scores can be formulated by pooling all the scale items. The SILL, on the other hand, focuses on specific strategic behaviors and the scale descriptors indicate frequencies of strategy use (ranging from ‘never’ to ‘always’). These items are, therefore, behavioral items, which means that we cannot assume a linear relationship between the individual item scores and the total scale scores; for example, one can be a good memory strategy user in general while scoring low on some of the items in the memory scale (e.g. acting out a new word or using flashcards). Thus, the scales in the SILL are not cumulative and computing mean scale scores is not justifiable psychometrically. To illustrate the problem in broad terms, a high score on the SILL is achieved by a learner using as many different strategies as possible. Therefore, it is largely the quantity that matters. This contradicts more recent learning strategy theory, which has indicated clearly that in strategy use it is not the quantity but the quality of the strategies that is important (e.g. the point above about ‘appropriateness’ as a critical feature of learning strategies). At one extreme, one can go a long way by using only one strategy that perfectly suits the learner’s personality and learning style; and even if someone uses several strategies, it does not necessarily mean that the person is an able strategy user because, as Ehrman, Leaver, and Oxford (2003: 315) have found, ‘less able learners often use strategies in a random, unconnected, and uncontrolled manner’. Such qualitative aspects, however, are not addressed by the SILL, or by vocabulary questionnaires which focus on frequency of use.
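The arithmetic behind this point can be made concrete with a small sketch. The item names and responses below are invented for illustration and are not actual SILL items or data; the only assumption carried over from the discussion above is that a scale score is the mean of the item scores on a five-point frequency scale.

```python
from statistics import mean

# Hypothetical responses on a five-point frequency scale
# (1 = never or almost never, 5 = always or almost always).
# Item names and values are invented for illustration only.
focused_learner = {
    "use flashcards": 1,
    "act out the new word": 1,
    "use rhymes": 1,
    "make a mental image": 5,  # the one strategy that suits this learner
    "review on a schedule": 1,
}
# A learner who reports using every strategy 'sometimes'.
scattered_learner = dict.fromkeys(focused_learner, 3)


def scale_score(responses):
    """Mean of the item scores, i.e. the usual way a scale score is formed."""
    return mean(responses.values())


if __name__ == "__main__":
    print("focused learner:", scale_score(focused_learner))
    print("scattered learner:", scale_score(scattered_learner))
```

The focused learner's mean is 1.8 and the scattered learner's is 3, so a frequency-based scale score ranks the second learner higher even though nothing in the responses shows that this learner uses strategies more appropriately.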
The conceptual fuzziness and the inadequacy of the psychometric instruments that have been developed to measure the capacity of strategic learning have driven a conceptual shift towards a notion of self-regulation (Tseng et al., 2006) drawn from the field of educational psychology. Rather than focusing on the outcomes of strategic learning (i.e. the actual strategies and techniques the learners apply to enhance their own learning), this conceptual approach highlights the importance of the learners’ innate self-regulatory capacity that fuels their efforts to search for and then apply personalized strategic learning mechanisms. That is, in line with contemporary theories of self-regulation in educational psychology (e.g. Zeidner, Boekaerts, and Pintrich, 2000), the approach targets the core learner difference that distinguishes self-regulated learners from their peers who do not engage in strategic learning.
Concept 2.3 Structural equation modelling (SEM)
Structural equation modelling is a modern multivariate statistical technique that allows a set of relationships to be examined simultaneously. It is useful for determining the relationships between a large number of variables. The researcher first hypothesizes a model of these relationships, and SEM can then either support or refute this model, allowing for further refinement, until the most parsimonious explanation of the data is arrived at.
Tseng et al. (2006) used a structural equation modelling (SEM) approach in an attempt to describe what self-regulation entails in terms of vocabulary learning. They developed a model (illustrated in Figure 2.3) which suggests that this self-regulation consists of five facets:

● Commitment control, which helps to preserve or increase the learners’ original goal commitment (e.g. keeping in mind favorable expectations or positive incentives and rewards; focusing on what would happen if the original intention failed).
● Metacognitive control, which involves the monitoring and controlling of concentration, and the curtailing of any unnecessary procrastination (e.g. identifying recurring distractions and developing defensive routines; focusing on the first steps to take when getting down to an activity).
● Satiation control, which helps to eliminate boredom and to add extra attraction or interest to the task (e.g. adding a twist to the task; using one’s fantasy to liven up the task).
● Emotion control, which concerns the management of disruptive emotional states or moods, and the generation of emotions that will be conducive to implementing one’s intentions (e.g. self-encouragement; using relaxation and meditation techniques).
● Environmental control, which helps to eliminate negative environmental influences and to exploit positive environmental influences by making the environment an ally in the pursuit of a difficult goal (e.g. eliminating distractions; asking friends to help and not to allow one to do something).

Figure 2.3 A structural equation model of self-regulatory capacity in vocabulary learning (Tseng et al., 2006: 93). The latent variable, self-regulatory capacity in vocabulary learning, has five indicators (commitment control, metacognitive control, satiation control, emotion control, and environment control), with the coefficients on the five paths ranging from 0.69 to 0.88.

The SEM approach was taken a step further by Tseng and Schmitt (2008), who, building upon the self-regulation model, developed an enhanced model for the vocabulary learning process as a whole. It takes a process-oriented perspective, operationalized as the process whereby strategic behaviors are instigated, sustained, and evaluated, drawing on work by Dörnyei (2001a, 2001b, 2005) on the stages of motivation. The model is given in Figure 2.4, with the six latent variables labelled. Interested readers are referred to the article for a complete explanation of the model, including a description of the various facets (or more technically, indicators) making up each latent variable. For example, the variable Self-Regulating Capacity in Vocabulary Learning (SRCvoc) is made up of the five facets illustrated in Figure 2.3. The model indicates that the vocabulary learning process is cyclical in nature. It starts with an Initial Appraisal of Vocabulary Learning Experience, which is conceptualized as the initial motivational level of vocabulary learning, which can be indicated by value, interest, effort, or desire. This affects a learner’s Self-Regulating Capacity in Vocabulary Learning. The current view of the nature of self-regulating capacity is that it is an aptitude which is developable, i.e. it can change incrementally with experience and
instruction, and the model indicates it is dependent on the instigation of the initial appraisal of vocabulary learning experience, with its related motivational state. Self-regulating capacity in turn drives the use of vocabulary strategies. However, in this model, strategic behavior is divided into two components: Strategic Vocabulary Learning Involvement and Mastery of Vocabulary Learning Tactics. The former refers to a ‘quantity’ dimension of strategy use, which concerns effortful covert or overt acts to discover or improve the effectiveness of particular tactics. (We used the term tactic to avoid the baggage associated with the term strategy, and the two are essentially interchangeable.) It entails the overall involvement with vocabulary learning and the attempts made to pursue it. This includes several elements: how frequently a learner is involved in vocabulary learning behaviors, the range of vocabulary learning behaviors a learner is involved with, and having a metacognitive awareness of how to best enhance the effectiveness of vocabulary learning tactics. One might think of SVLI as a learner’s general experience with, and understanding of, their vocabulary learning behaviors. The latter refers to the ‘quality’ dimension of strategy use, which concerns mastering specific or special covert or overt learning methods to acquire vocabulary knowledge. This mastery dimension is about using specific vocabulary learning behaviors effectively. Reaching the mastery level entails developing an awareness of what learning tactics to use and when
Figure 2.4 A structural equation model of motivated vocabulary learning. IAVLE = initial appraisal of vocabulary learning experience; SRCvoc = self-regulating capacity in vocabulary learning; SVLI = strategic vocabulary learning involvement; MVLT = mastery of vocabulary learning tactics; VOCkno = vocabulary knowledge; PAVLT = postappraisal of vocabulary learning tactics.
and how to use them effectively. The model indicates that having a wide range of vocabulary learning involvement and experience helps organize a learner’s strategic options and helps learners gain mastery over the learning tactics that prove useful, i.e. the repeated appropriate usage of tactics (as governed by SVLI) eventually also leads to mastery over those tactics. The skilled and appropriate use of strategies/tactics directly leads to increased Vocabulary Knowledge, which is indicated by both size and depth components. After a learning experience, it is only natural for a learner to think about how well they have done. This period of self-reflection of task processes when the task is completed is represented in the model by Postappraisal of Vocabulary Learning Tactics. Dörnyei (2001b: 91, emphasis in original) argues that this phase is very important in that such a ‘critical retrospection contributes
significantly to accumulated experience, and allows the learner to elaborate his or her internal standards and the repertoire of action specific strategies’. In particular, it has been found that learners’ causal attributions as a result of task retrospection exert a critical influence on subsequent expectancy for success, self-efficacy belief, achievement behaviors, and emotional responses (Dörnyei, 2001b). Hence, it seems that not only does initial motivational state influence the processes of task performance, but also a retrospection of task performance is likely to in turn influence this state. Thus in the model, the term ‘initial motivational state’ should be understood as the current motivational state in the subsequent recursive stages of the evaluation process. This model is certainly not comprehensive, but it does take into account some of the recent thinking on the dynamic role of motivation on language learning (Dörnyei, 2001a, 2001b). Motivation appears to be involved in all stages of learning (instigating, sustaining, and evaluating), thus permeating the whole process. Another aspect taken into consideration is the necessity for the learners to self-regulate their learning. Learners need to understand the way they learn best and be proactive in pursuing methods of learning that are effective for themselves. Much of the value of the model is that it begins to show the relationship between a number of learning-based variables, and a number of implications seem supportable. First, the vocabulary learning process is systematic and cyclic in nature. Second, initial motivation and self-regulation both have important parts to play in the vocabulary learning process. Third, metacognitive control of vocabulary learning tactics is necessary for efficient learning. Finally, postlearning evaluation is important to the learning process. 
Overall, such a dynamic, integrated perspective of lexical strategic behavior is a step forward from viewing strategies as independent learner behaviors, just as thinking of vocabulary in terms of an integrated lexicon is a step closer to reality than thinking of it as a bunch of independent lexical units.
2.10 Computer simulations of vocabulary
It is a common observation that there is currently no overall theory of vocabulary acquisition (e.g. Nation, 1995; Read, 2000; Schmitt, 2000). This is perhaps not surprising given the complexity of vocabulary knowledge, the large number of lexical items that have to be learned, and the diversity of those items. Of course there have been many theories limited to how specific aspects of lexical knowledge are acquired or used. Below are just a few:

● the fast mapping of initial meaning (Carey, 1978)
● the initial establishment of a form-meaning link by attaching the L2 form to an existing L1 meaning (‘parasitic model’) (Barcroft, 2002; Hall, 2002; Jiang, 2002)
● prototype theories of concept categorization (Aitchison, 2003)
● Levelt’s (1989) view of the lexicon in listening comprehension and speech production
● the dynamic model of the multilingual mental lexicon (de Bot, Lowrie, and Verspoor, 2005)
● frequency/exposure and pattern-extraction-based models of vocabulary acquisition (Ellis, 2002)
● the DEVLEX model of the growth of L1 lexicons (Li, Farkas, and MacWhinney, 2004).
It is beyond the scope of a methodology book such as this to discuss and evaluate the various theories. What is of interest are the tools which allow researchers to explore the lexical features and properties which these theories are trying to capture. One of these tools is the computerized simulation of vocabulary learning and processing. While there are some quite complex simulations of vocabulary learning (e.g. the DEVLEX model), Meara (2006) notes that a truly comprehensive model of the mental lexicon is extremely difficult. Even a small lexicon will have 2,000 or more words, and each of these will have multiple links to each other (orthographic, phonologic, morphological, semantic, grammatical, collocational, etc.). Modelling this complexity in any realistic way is verging on the impossible. However, it is possible to use much simpler models, which are easier to both understand and to manipulate when exploring the nature of small lexicons. Although relatively basic, they can be useful in suggesting how real, more complex, mental lexicons behave. Meara (2004, 2005, 2006) has been in the forefront of developing these basic models of lexicon behaviour. He conceptualizes the lexicon as a Random Autonomous Boolean Network, where each word’s varying levels of formal, grammatical, and semantic activation are reduced to a simple binary distinction. In the following truncated and slightly edited extract, he explains the basics of his Boolean models (2006: 625–630). In these simplified models, each lexical network consists of a set of WORDS. Each word has two states: ACTIVATED or UNACTIVATED, depending on the way it interacts with other words in the lexicon (corresponding roughly to productive and receptive vocabulary respectively). Each word is randomly connected to only two other words in the network, and each word receives an INPUT from each of these link words. Words react to inputs in different ways. 
Some words (conventionally called AND words) become activated only when both the words they are linked to are activated, while others (conventionally called OR words) become activated if only one of their input words is activated. (That is, some words have low activation thresholds, while for other words activation is more difficult.) A simple example will illustrate these features.
Figure 2.5 A simple random autonomous Boolean network

Figure 2.6 How a random autonomous Boolean network responds to an external stimulus. Activated units are shaded grey. (a) Word B has been activated by an external stimulus ... (b) ... causing Word C and Word D to become activated. (c) This activates both Word E and Word D ... (d) ... resulting in the reactivation of Word C.
AND units are labelled with uppercase letters, OR units are labelled with lower case letters. Arrows show the direction of activation. In Figure 2.5, we have a simple five word lexicon. Word A and Word B are AND words: they have a high activation threshold, and only become activated if BOTH of their inputs are already activated. The other words are OR words: they have a low activation threshold, and will become activated if either one of their inputs is already activated. Word A gets its inputs from Word C and Word E; Word B gets its inputs from Word A and Word C, etc. Initially, all the words are unactivated. However, let us suppose that an external stimulus temporarily activates Word B. This causes a ripple of spreading activation to percolate through the entire system, as shown in Figure 2.6.
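The update rule for this five-word lexicon can be sketched in a few lines of Python. This is only an illustrative reading of the description, not Meara's own program. The inputs for Word A and Word B are those stated above; the inputs for the three OR words are not given in the text and are an assumption, chosen so that the sketch reproduces the sequence of states in the Figure 2.6 walk-through. It also assumes that all words update at the same time on each iteration.

```python
# Each word maps to (kind, input_1, input_2).
# Inputs for A and B are stated in the text; the inputs for c, d and e
# are an assumption inferred from the Figure 2.6 walk-through.
NETWORK = {
    "A": ("AND", "c", "e"),
    "B": ("AND", "A", "c"),
    "c": ("OR", "B", "e"),
    "d": ("OR", "B", "c"),
    "e": ("OR", "c", "d"),
}


def step(network, active):
    """Return the set of words that are activated after one update.

    All words are updated together, each looking only at whether its
    two input words were activated in the previous state.
    """
    nxt = set()
    for word, (kind, in1, in2) in network.items():
        a, b = in1 in active, in2 in active
        fires = (a and b) if kind == "AND" else (a or b)
        if fires:
            nxt.add(word)
    return nxt


def run(network, active, n_steps):
    """Return the list of states, starting with the initial one."""
    history = [set(active)]
    for _ in range(n_steps):
        active = step(network, active)
        history.append(active)
    return history


if __name__ == "__main__":
    # An external stimulus temporarily activates Word B.
    for i, state in enumerate(run(NETWORK, {"B"}, 6)):
        print(i, sorted(state))
```

Under this assumed wiring, the first three updates give the states {c, d}, {d, e}, and {c, e}; two further updates leave all five words activated, and that state then no longer changes.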
Activating Word B (Figure 2.6a) causes Words C and D to become activated. Both are OR words, which activate easily. The activation of Word B was temporary, so it reverts to the deactivated state (Figure 2.6b). The activation of Words C and D spreads to Word E. Word D is still receiving input from Word C, so it remains in the activated state. Word C is no longer receiving any activation, so it reverts to the deactivated state (Figure 2.6c). Activation in Words D and E causes Word E to remain activated, and causes Word C to be reactivated. Word D is not receiving any input, so it reverts to the deactivated state (Figure 2.6d). This process continues until the network settles down into a stable state. First impressions might lead us to expect that large Boolean networks would be extremely unstable, but surprisingly, this is not the case. Irrespective of size, Boolean networks with the characteristics that we have described above quickly settle down into an attractor state, a stable configuration in which some words are permanently activated, while other words are permanently deactivated, and the overall pattern of activation in the network remains stable as long as it receives no external stimulation. Sometimes, a small number of words form an oscillating pattern, where individual words move between the two states, but it is unusual for these oscillations to be very large. Figure 2.7 shows an example of a Boolean network moving from an initial random configuration into a stable attractor state. Here we have a lexical network consisting of 1,000 words, where the initial connections and activation values of the words are set at random. When the simulation starts, it iterates through the implications of these activation patterns, in much the same way as we did in Figure 2.6, but on a much larger scale. The illustration shows that the number of activated words in the initial state of the model is about half of the total. 
In each subsequent iteration of the model, this figure changes, until the model eventually settles down into an attractor state where just under two-thirds of the words are activated, and the remaining words are inactive. The wobble in the number of activated words indicates that a small number of words are oscillating between the activated and deactivated states. Once a network reaches one of its attractor states, it will tend to resist any arbitrary changes that are inflicted on it by external agencies. For instance, let us define a kick event, as a temporary activation of a handful of words (e.g. 50) which are deactivated in the attractor state. (In real life, a kick event would correspond to some kind of interaction which activates a number of deactivated words. Reading a text, for example, or interacting with another speaker might have this effect.) Events of this type generally result in only a small flurry of activity which spreads through the network, and rapidly dies away. Typically, the effects of the additional activation last for only a few iterations of the model, and the
Figure 2.7 A Boolean network reaching one of its attractor states
activation level associated with the attractor state is quickly restored. It is possible to nudge a network into a new attractor state by using kick events, but usually only very large kicks can bring about a change of this kind. It is also possible to force a network rather than kick it. In forcing, we activate a small number of different words repeatedly. This repetition sometimes prevents the network from returning to its attractor state. Rather surprisingly, constant dripping of this sort produces very different effects from what we get with a single kick event, even a relatively large one. Typically, forcing produces a massive increase in the number of activated words in the network. For example, if we use repeated individual forcing events of only five words, even this very small input forces the overall level of activation in the network to rise rapidly, and very large numbers of words are affected by these small events: in one simulation, 200 forcing events were sufficient to raise the overall activation level of the network by more than a third, although it eventually returned to its stable attractor state.
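A rough sketch of how such a simulation might be set up is given below. It is not Meara's software: the function names, the even split between AND and OR words (`p_and`), the random choice of which deactivated words to switch on, and the numbers of iterations are all assumptions made for illustration, so the activation levels it produces need not match those in Figure 2.7.

```python
import random


def make_network(n_words, rng, p_and=0.5):
    """Wire each word to two other randomly chosen words.

    Returns a list of (is_and, input_1, input_2) tuples. p_and is the
    assumed share of AND words; the text does not specify it.
    """
    net = []
    for i in range(n_words):
        picks = rng.sample(range(n_words - 1), 2)
        # Shift indices so that a word is never its own input.
        in1, in2 = (p if p < i else p + 1 for p in picks)
        net.append((rng.random() < p_and, in1, in2))
    return net


def step(net, state):
    """One synchronous update of every word in the network."""
    return [
        (state[a] and state[b]) if is_and else (state[a] or state[b])
        for is_and, a, b in net
    ]


def activate(state, k, rng):
    """Switch on up to k currently deactivated words, chosen at random."""
    off = [i for i, s in enumerate(state) if not s]
    new = list(state)
    for i in rng.sample(off, min(k, len(off))):
        new[i] = True
    return new


def trace(net, state, n_steps, rng=None, force_k=0):
    """Iterate n_steps times; return the final state and activation counts.

    If force_k > 0, force_k words are switched on before every update,
    which is one possible reading of the 'forcing' regime.
    """
    counts = []
    for _ in range(n_steps):
        if force_k:
            state = activate(state, force_k, rng)
        state = step(net, state)
        counts.append(sum(state))
    return state, counts


if __name__ == "__main__":
    rng = random.Random(1)
    net = make_network(1000, rng)
    start = [rng.random() < 0.5 for _ in net]       # random initial state
    settled, settling = trace(net, start, 50)
    print("settling:", settling[::10])
    _, kicked = trace(net, activate(settled, 50, rng), 20)
    print("after a single kick of 50 words:", kicked[::5])
    _, forced = trace(net, settled, 200, rng=rng, force_k=5)
    print("forcing with 5 words per event:", forced[::40])
```

Printing the activation counts after settling, after a single kick, and under repeated forcing allows the three regimes described above to be compared for any particular random network.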
These basic tools, as Meara describes above, consist of only a very simple network structure, and a small repertoire of simple operations. However, computerized simulations can be used in a variety of ways to make interesting observations about real lexicons, including the issues of attrition, the nature of multilingual lexicons, and measuring productive lexicons.
Figure 2.8 Ten examples of attrition in a network (Meara, 2004: 143).
Figure 2.9 Pooled data abstracted from the ten cases in Figure 2.8 (Meara, 2004: 145). Axes: activated words (0–2,500) plotted against attrition events (0–225).
Meara (2004) notes that typical attrition studies look at whether individual lexical items are retained or forgotten, but do not explore how attrition of individual items affects the overall structured network lexicon. In fact, there is no way of doing this with real human subjects at present, but computer simulations can allow exploration of attrition at a network level. Meara defined attrition as changing OR words into AND words (i.e. words which are easily activated become words which are less easily activated). He then ran ten simulations where such a switch was made on a random word in a 2,500-word network, and the network was allowed to iterate five times so the impact of the change could be absorbed. This was repeated 255 times, and any resulting degradation noted. The networks were allowed to reach a stable attractor state before the attrition events began. The ten trials are illustrated in Figure 2.8. A number of interesting observations about attrition can be made from the graphs in Figure 2.8. The overall trend is similar across the cases, as all trials start at a very high level of activation, and they all end up with relatively low levels of activated words after a period of very heavy loss. However, the individual trials are quite different in the progress of this decline. Sometimes it sets in quite early, but in other trials it is significantly delayed. Although the average of these ten trials suggests a steadily declining pattern of vocabulary loss (Figure 2.9), this ‘average’ pattern is very misleading. This result suggests we need to be cautious when interpreting averaged results from ‘real’ attrition studies, as the variability of individual results may well be more important than the averaged trend (see Section 4.7). Meara also notes that not all attrition events result in an immediately observable loss of activity in the network, which indicates that it may be important to differentiate between the attrition events (i.e.
changes in characteristics of individual lexical items) and the resulting vocabulary loss (when items become no longer activatable). In these simulations, vocabulary loss is always triggered by attrition events, but attrition events do not always trigger vocabulary loss events. Finally, large vocabulary loss events can result from a relatively small number of attrition events. In a 2,500-word lexical network, the loss of fewer than 200 items typically leads to catastrophic vocabulary loss.
Quote 2.10 Meara on the nature of vocabulary attrition in computerized simulations Although attrition events [in computer simulations] do not necessarily result in immediate vocabulary loss, they do weaken the structure of the vocabulary. Each attrition event that does not result in immediate loss produces a cluster of words which are more dependent on each other than they were before, and more likely to be affected by a change in one of their immediate neighbours ... It seems then, that the attrition process has the effect of re-structuring the lexical network so that it is very vulnerable to small changes in activation. (2004: 147)
Meara (2006) also used Boolean networks to explore the relationships between the languages in multilingual lexicons. The patterns which spontaneously emerged from the simulations closely mirrored some of the patterns reported in real lexical behavior. He found that his bilingual simulations exhibited something reminiscent of a lexical switching mechanism, which bilinguals need to switch between the vocabularies of two languages (i.e. allowing rapid activation of one language and the simultaneous deactivation of the other language). His trilingual models showed that, under certain conditions, activity in an L3 can sometimes generate spontaneous reactivation of words in an L2, a phenomenon often reported by trilingual speakers. Research along these lines is useful in that it can suggest lexical behaviors which are emergent properties of a systematically organized lexicon.
Another application of computer simulations is to explore the validity of vocabulary measurement instruments. Meara (2005) used a different type of computer simulation – a Monte Carlo analysis – to explore the behavior of the Lexical Frequency Profile (LFP – see Section 5.2.4) (Laufer and Nation, 1995). His analysis suggests that the LFP can reliably distinguish between learners with relatively large vocabulary size differentials (1,000–2,000+ words), but that it is not particularly sensitive in distinguishing between learners who have similar vocabulary sizes. However, Laufer (2005b) questions many of the assumptions underlying Meara’s analysis. For our purpose of discussing the methodology of computer simulations, the most important criticisms are (1) that computer simulations do not always reflect real-world phenomena, potentially resulting in misleading results because key characteristics are not accounted for in the simulations, and (2) that
simulated data is not necessarily as valid as real data. These are legitimate points, as Meara himself acknowledges, and serve to reinforce the need to be very careful when formulating the underlying assumptions behind simulations. Despite being a relatively new research tool, computer simulations appear to have a great deal of potential. The number of trials that can be run far exceeds the number of human participants who could be gathered for a study. Similarly, the parameters of research design can be easily adjusted to explore different scenarios, something which is not practical when new participant groups are needed for each research design variant. Perhaps most importantly, computer simulations challenge researchers to explicitly and specifically define the linguistic behavior they are exploring. For more discussion of lexical computer simulations, see Meara (2006). Some tools for carrying out this type of research are available on the _lognostics website (http://www.lognostics.co.uk/).
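The attrition procedure described earlier in this section (switching a random OR word to an AND word, iterating five times, and repeating) is simple enough to sketch in a few lines. The code below is an illustrative approximation rather than Meara's own software. It assumes each word has two randomly chosen inputs and that the starting state is random before the network settles; the network size, number of events, and number of runs are scaled down so the sketch runs quickly, and all names are hypothetical. Meara's reported parameters can be substituted for the three constants.

```python
import random

def step(inputs, rules, state):
    """One synchronous update of every word from its two input words."""
    return [
        (state[a] and state[b]) if rule == "AND" else (state[a] or state[b])
        for (a, b), rule in zip(inputs, rules)
    ]

def attrition_run(n_words, n_events, rng, settle_iterations=50):
    """Simulate one lexicon; return the activated-word count after each event."""
    inputs = [(rng.randrange(n_words), rng.randrange(n_words)) for _ in range(n_words)]
    rules = [rng.choice(("AND", "OR")) for _ in range(n_words)]
    state = [rng.random() < 0.5 for _ in range(n_words)]
    for _ in range(settle_iterations):       # settle before attrition begins
        state = step(inputs, rules, state)

    or_words = [i for i, rule in enumerate(rules) if rule == "OR"]
    rng.shuffle(or_words)                    # random order of attrition targets
    counts = [sum(state)]
    for _ in range(n_events):
        if or_words:
            rules[or_words.pop()] = "AND"    # attrition event: OR word becomes AND word
        for _ in range(5):                   # let the change be absorbed
            state = step(inputs, rules, state)
        counts.append(sum(state))
    return counts

# Scaled-down, assumed values chosen only to keep the sketch fast.
N_WORDS, N_EVENTS, N_RUNS = 500, 100, 10
runs = [attrition_run(N_WORDS, N_EVENTS, random.Random(seed)) for seed in range(N_RUNS)]

# Pooling the runs gives one averaged curve, as in Figure 2.9.
mean_curve = [sum(column) / N_RUNS for column in zip(*runs)]

for i, counts in enumerate(runs):
    print(f"run {i}: start {counts[0]}, end {counts[-1]}")
print("pooled mean: start", mean_curve[0], "end", mean_curve[-1])
```

Plotting each list in `runs` separately and then `mean_curve` on its own is one way to see for oneself whether the pooled curve hides differences between individual runs, which is the methodological caution drawn from Figures 2.8 and 2.9.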
2.11 Psycholinguistic/neurolinguistic research Vocabulary research has tended to focus on issues of vocabulary size, and consequently there are a number of techniques and measures available to tap into this aspect of vocabulary knowledge. With some exceptions (e.g. word associations), vocabulary researchers have only recently turned their attention to depth of knowledge, and so techniques and measures in this area are much less well-developed. However, there has been comparatively little research into lexical processing in the applied linguistics arena, or how lexical knowledge is automatized. Conversely, the fields of psycholinguistics and neurolinguistics have given a great deal of attention to lexical processing and the physiological mechanisms underlying it. As a consequence, they have developed a number of measurement techniques for dealing with these issues. These have great potential to be informative in a broad range of vocabulary research, as they often tap types of lexical knowledge and learner behavior which traditional size/depth methodologies cannot address. Furthermore, they often give more precise and nuanced measurements of knowledge, allowing quantification of smaller amounts of learning, especially at the beginning stages of the incremental learning process. As such, they are an important component of the vocabulary researcher’s toolkit. Below I outline some of the major techniques categorized according to whether they primarily address issues of speed of processing, connections between lexical items in input or in the mind, or processing mechanisms. (Note that all of these categories are of course interrelated, and this division is not intended to suggest that they are discrete processes.) I will illustrate some of the techniques with studies, many of them drawn from research into formulaic language carried out at the University of Nottingham’s Centre for
Research in Applied Linguistics (CRAL). Many of these techniques are relatively new to mainstream vocabulary research, but have very great potential in the hands of innovative researchers. (See Dörnyei, 2009, and Harley, 2008, for more on psycholinguistic/neurolinguistic research techniques and their possibilities and limitations.)
Speed of processing
Part of mastering lexical items means that processing becomes more automatized. This is evidenced in faster recognition/comprehension speeds when listening or reading, and fast retrieval/production speeds when speaking or writing. In other words, learners need to develop what most language specialists would call fluency. Adequate speed of processing is essential for efficient language use, and this is perhaps most clearly illustrated in reading. If lexical recognition is not fast enough (natives read at 200–300 words per minute (Grabe and Stoller, 2002)), then reading slows down to a frustrating word-by-word (or even letter-by-letter) decoding, in which meaning construction is impaired and the overall flow of the text cannot be understood. Also, it seems that even though advanced L2 learners can match native performance in certain ways (e.g. answering comprehension questions after reading), they find it difficult to do this at a native-like speed (McMillion and Shaw, 2008, 2009).
Concept 2.4 Milliseconds
A millisecond is one one-thousandth (1/1,000th) of a second. Since human processing is very fast, it is the standard unit of measurement for timed experiments in psycholinguistics.
Speed of processing is often referred to as automaticity, although automaticity can also contain the notion of an absence of attentional control in the execution of a cognitive activity. Segalowitz and Hulstijn (2005) give the example of the recognition of the single letter A. It is thought that recognition of a letter like this by a proficient reader requires no conscious effort or effortful attention, is extremely rapid, and cannot be interfered with by other ongoing activities. In fact, a fluent reader cannot help but recognize the letter, and so the process is thought of as ‘automatic’. In contrast, Segalowitz and Hulstijn relate the case of a novice L2 reader who may require considerable consciously directed effort, applied slowly over an interval much longer than it takes to recognize a letter in their L1. ‘Thus, the relatively rapid, effortless, and ballistic (unstoppable) activities underlying fluent letter recognition are said to be automatic, standing in contrast to slower, effortful activities that can be interrupted or influenced by other ongoing internal processes (e.g. distractions, competing thoughts)’ (2005: 372).
Quote 2.11 de Bot on lexical automaticity in speaking
When we consider that the average rate of speech is 150 words per minute, with peak rates of about 300 words per minute, this means that we have about 200 to 400 milliseconds to choose a word when we are speaking. In other words: 2 to 5 times a second we have to make the right choice from those 30,000 words [in the productive lexicon]. And usually we are successful; it is estimated that the probability of making the wrong choice is one in a thousand. (de Bot 1992: 11)
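The timing figures in de Bot's quote follow directly from the speech rates he gives, and the arithmetic is easy to check. The short sketch below simply converts words per minute into milliseconds per word and words per second. The function names are illustrative, and the final calculation (how often an error would occur) is my own extrapolation from the one-in-a-thousand estimate, not a figure from de Bot.

```python
def ms_per_word(words_per_minute):
    """Average time available per word, in milliseconds, at a given speech rate."""
    return 60_000 / words_per_minute

def minutes_between_errors(words_per_minute, error_rate=1 / 1000):
    """Expected minutes of speech per wrong word choice (an extrapolation)."""
    return 1 / (error_rate * words_per_minute)

for rate in (150, 300):   # average and peak speech rates from the quote
    print(
        f"{rate} wpm: {ms_per_word(rate):.0f} ms per word, "
        f"{rate / 60:.1f} words per second, "
        f"one error roughly every {minutes_between_errors(rate):.1f} minutes"
    )
```

At 150 words per minute this gives 400 ms per word (2.5 words per second), and at 300 words per minute 200 ms per word (5 words per second), which matches the ranges in the quote.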
At the moment, not much is known about the acquisition and development of automaticity, with empirical research into the facilitation of automaticity and its impact on subsequent skills just beginning (Segalowitz and Hulstijn, 2005). As such, there is certainly great scope for research into the processing speeds of virtually all aspects of lexical mastery, and how to facilitate its development. For example, in addition to measuring whether a person knows a word’s meaning, it would also be informative to measure how quickly they could recognize its written form when reading. Measuring automaticity typically entails using techniques which have a timed element, i.e. measuring the time taken by participants to complete some task. The type of timed task varies, but they are usually quick judgement tasks which are less likely to be contaminated by conscious thought processes. (See Section 5.4 for methodologies to measure automaticity/speed of processing.) Connections between lexical items in input or in the mind Word association research has illustrated the connections between lexical items in the mental lexicon. Research has also shown that lexical items are not processed in isolation, but are affected by their surrounding context. This is particularly true of the preceding context. For example, when reading a James Bond story about international espionage, words like spy are recognized more quickly than non-espionage words of the same frequency level. That is, the story context primes the word spy, i.e. facilitates its processing in terms of speed and accuracy.
Quote 2.12 McDonough and Trofimovich on priming
Priming refers to the phenomenon in which prior exposure to language somehow influences subsequent language ... Priming is believed to be an implicit process that for the most part occurs with little awareness on the part of individual language users ... In other words, the exact forms and meanings that speakers use can be affected by the language that occurred in discourse they recently engaged in. (2008: 1–2)
Priming can accrue from either repetition of form, or from meaning relationships between words. In terms of form, people have a tendency to process a word or word combination more quickly and more accurately when they have had previous exposure to that word or word combination. For example, if participants listen to a list of words such as glasses, chair, and picture spoken one at a time, and then are asked to listen and repeat words like mug, printer, and chair, they will repeat the words that were on the initial list (chair) more quickly and accurately than the words that did not appear on that list (mug and printer). Similarly, there is a tendency for people to process a word more quickly and more accurately when they have been previously exposed to a word that is related in meaning (semantic priming). For example, participants will correctly identify boy as a word more quickly if they recently read the word girl as opposed to an unrelated word like road (McDonough and Trofimovich, 2008). It is useful to note, however, that priming effects are short-lived and last a matter of seconds rather than days or weeks. Because priming is a well-known and robust effect, it can be used to determine the effects of repetition in language exposure, and whether there are semantic relationships between words. This can be illustrated by a study inquiring into whether L2 learners form collocational memory traces. Durrant and Schmitt (in press) selected a number of adjective-noun combinations which could logically occur together, but do not in fact do so in the BNC. Therefore the word combinations (wonderful drink) are not collocations in the English speech community, but were plausible nonetheless. Non-collocation pairs were selected so that participants would have no prior collocational mental link between the two words, as the purpose of the experiment was to determine if these were formed from exposure. 
Durrant and Schmitt exposed their L2 participants to various adjective-noun combinations in sentence contexts, including the non-collocation pairs. They then administered a priming test to determine whether using the adjective component of a non-collocation pair (e.g. wonderful) as a prime allowed the participants to produce the related noun in a completion task (dr— — —). Durrant and Schmitt found that even one exposure to the non-collocation pair in the input stage led to a small, but significant, facilitation of noun completion. In other words, even one exposure to a particular word combination led to an initial collocational memory trace. Furthermore, they found that two repetitions of the word combinations led to a large facilitation effect. Thus the very sensitive priming paradigm allowed the measurement of the very initial stages of collocation formation. (For a good source for advice on a number of different priming techniques, including a discussion of language studies using them, see McDonough and Trofimovich, 2008.)
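In timed priming tasks, the size of a priming effect is commonly expressed as the difference between mean response times in the primed and unprimed conditions. The sketch below illustrates that calculation only. The response times are invented for illustration and do not come from any study cited here, and the function name is hypothetical. Note also that Durrant and Schmitt's own measure was completion success rather than response time.

```python
from statistics import mean

# Hypothetical lexical decision times in milliseconds, invented for illustration.
related_prime_rts = [520, 540, 500, 560]     # e.g. target "boy" after prime "girl"
unrelated_prime_rts = [600, 580, 620, 600]   # e.g. target "boy" after prime "road"

def priming_effect(unrelated_rts, related_rts):
    """Facilitation in ms: how much faster responses are after a related prime."""
    return mean(unrelated_rts) - mean(related_rts)

effect_ms = priming_effect(unrelated_prime_rts, related_prime_rts)
print(f"Priming effect: {effect_ms} ms")
```

A positive value indicates facilitation. In a real study the difference would of course need to be tested statistically across participants and items rather than read off two means.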
Processing A number of techniques now make it possible to ‘look inside’ the brain (both figuratively and literally), and to gain insights into how language is
processed. They are potentially some of the most exciting ways of exploring vocabulary questions which have long evaded answers, such as whether different types of vocabulary are stored in different areas of the brain, and whether acquisition/attrition involves physiological change in the brain. Below are three techniques with considerable potential for researchers with the vision to use them creatively.
Eye movement studies track the movement of the eye while the participant carries out linguistic tasks. The apparatus can either track the eye as it reads text on a monitor screen, or note which of several picture images it flicks to first when doing a language task. One major advantage is that the participant does not have to do anything (like push a button), which removes physiological ‘noise’ from the experiment (e.g. slow reflexes, pushing the wrong button by mistake). If the task focuses on reading, the eye-tracking paradigm is as close to normal reading as is possible in an experimental setting (Duyck, van Assche, Drieghe, and Hartsuiker, 2007).
Reading-based eye movement studies
To illustrate an eye movement study which focuses on lexical issues, let us look at a follow-on study to the Conklin and Schmitt (2008) self-timed reading study discussed in Section 5.4. The study is particularly interesting because it demonstrates how the same research questions can be researched with different methodologies. The researchers (Siyanova, Conklin, and Schmitt, under review) were interested in how formulaic sequences are read in context. While the Conklin and Schmitt self-timed reading methodology showed that formulaic sequences were read more quickly than non-formulaic sequences, the eye-movement technique provided a much richer description of the reading behavior. This included measures of not only the ‘first pass’ at reading a particular word or phrase (in the region of interest), but also successive rereadings in that region. Siyanova et al.
examined the following five measures, which give an indication of the type of information which the eye-movement paradigm provides (illustrated in Figure 2.10).
Total Reading Time (TRT) – the sum of all fixation durations made within a region of interest. This measure indicates how much time the participant spent reading the target lexical items and includes all fixations which landed on those items.
First Pass Reading Time (1PRT) – the sum of all fixation durations made within a region of interest until the point of fixation leaves the region either to the left or to the right (also known as gaze duration). This measure tells us how long the reader fixated on the target item the first time it was encountered, and excludes any possible rereadings and regressions.
Fixation Count (FC) – the number of all fixations made within a given region of interest. This measure indicates how many times the target was fixated upon and includes all possible regressions to and rereadings of the target item.
Regression Path Duration (RPD) – the sum of all fixation durations starting with the first fixation within a region of interest up to but excluding the first fixation to the right of this region. This measure gives us the durations of all fixations that were made on the target item plus all later regressions to the left of the target.
Rereading (RR) – the regression path duration for the region of interest minus first pass reading time for this region. Rereading time gives an indication of the time the participant spent rereading the text after having encountered a problem.
Stimuli similar to the Conklin and Schmitt passages were used, containing idioms (left a bad taste in my mouth) and matched control phrases (the bad taste left in his mouth). Whole passages were presented on the monitor screen and the participants were eye-tracked while they read these passages. The eye-movement analysis found that the native-speaking participants processed the idioms significantly faster than the non-formulaic controls, and that there were no differences between figurative and literal readings of idioms, as indicated by TRT, FC, RPD, and RR.5 This result replicates Conklin and Schmitt’s findings, but is much more powerful, as it includes measures not only of total reading time (similar to self-paced reading), but also of the amount of reading time including regressions and rereading, as well as number of fixations. For the nonnative participants, there was no evidence on any of the measures for idioms being processed any differently from the matched controls, but the figurative readings seemed to be read more slowly than literal readings. The sum result from all of the eye-movement measures builds a strong case that native speakers and L2 learners process idiomatic language quite differently.
Figure 2.10 Hypothetical eye-movement record. Shaded area represents the region of interest. Example sentence, with fixations numbered 1–8: She’s always been as cold as ice with her children. Total Reading Time (TRT) = 3 + 4 + 6; First Pass Reading Time (1PRT) = 3 + 4; Fixation Count (FC) = 3 + 4 + 6; Regression Path Duration (RPD) = 3 + 4 + 5 + 6; Rereading (RR) = 5 + 6.
Visual world paradigm
The eye movement methodology can also be used to determine how the eye fixates on non-linguistic features, such as pictures on a screen. This is useful because research has shown that
when participants are simultaneously presented with spoken language whilst viewing a visual scene, their eye movements are very closely synchronised to a range of different linguistic events in the speech stream ... eye movements are closely synchronised to the referential processing of the concurrent linguistic input ... the probability of fixating one object amongst several when hearing that object’s name is closely related, and the fixations closely timelocked, to phenomena associated with auditory word recognition. (Altmann and Kamide, 1999: 249)
In other words, when the mind is processing speech, eye movements reflect that underlying processing. This can be used in a number of ways in lexical research, including researching how lexical items constrain the subsequent items which can logically follow, helping the mind to ‘predict’ those subsequent items, facilitating fast lexical access. For example, the modifiers frizzy and blonde will cumulatively constrain the choices for the follow-on noun down to little more than hair. Altmann and Kamide (1999) explored whether this type of constraint also applies to verbs. They set up an eye-movement study where participants looked at pictures and listened to a number of sentences. One sentence included a verb for which only one of the objects in the picture made sense, while another sentence included a verb for which any of the four or five objects could logically follow. This is exemplified by Figure 2.11, which shows a boy sitting next to a toy car, a ball, a toy train, and a cake. The stimulus sentences for this picture included The boy will move the cake and The boy will eat the cake. The researchers found that the verb eat allowed the participants to shift their fixation to the picture of a cake much faster than did the verb move. Moreover, in 54% of the cases, the shift to the picture of the cake started before the onset of cake in the speech stream. This provides evidence that knowledge of a lexical item is not only used in the recognition and processing of that individual item, but also aids in the processing of downstream items as well.
The visual world paradigm has also been used to look at how knowledge of a first language influences processing in a second language. Conklin, Dijkstra, and van Heuven (under review) looked at processing by high-proficiency Dutch learners of English and an English monolingual control
Figure 2.11 Visual world paradigm picture stimulus (Altmann and Kamide, 1999: 250).
group to determine whether knowledge about words in a L1 influences the processing of words that mean the same thing (either cognates or noncognates) in the L2. Participants listened to English while viewing a corresponding visual scene containing a cartoon character (Donald Duck) and an inanimate object (tractor). Interpretation of pronouns was investigated following the presentation of inanimate nouns which have gender in Dutch, but not English (i.e. tractor is masculine in Dutch). To refer to a previously mentioned tractor in Dutch, the masculine singular pronoun hij is used. However, in English he refers to an animate object. Upon hearing the pronoun he, the Dutch native speakers had increased looks and fixation durations to inanimate objects for cognates (e.g. tractor) relative to the animate character (Donald Duck) that the English pronoun referred to. However, there were only increased looks to inanimate non-cognates initially (e.g. kite, which is vlieger in Dutch), while the effect for cognates was long-lasting. Monolinguals only had looks to the animate character. Results indicate Dutch learners of English activate information about Dutch gender when processing spoken discourse in English. Further, this demonstrates that the amount of overlap between the two languages influences processing, with a closer relationship (i.e. cognateness) leading to more crosslinguistic influence.
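Both kinds of eye-movement study described in this section reduce to simple computations over a record of timed fixations. As a rough sketch, the five reading measures defined for Figure 2.10 can be computed from an ordered list of fixations, and the visual world result (the share of trials in which the look to the target began before the noun was heard) from pairs of onset times. Everything specific here is an assumption made for illustration: the fixation durations, the word positions, the choice of "as cold as ice" as the region of interest, the trial timings, and the function names. The sketch also ignores cases where the region is skipped entirely.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    word: int         # position of the fixated word in the sentence (0-based)
    duration_ms: int  # hypothetical fixation duration

def reading_measures(fixations, first_word, last_word):
    """Return TRT, 1PRT, FC, RPD and RR for the region first_word..last_word."""
    def inside(f):
        return first_word <= f.word <= last_word

    # Index of the first fixation that lands in the region of interest.
    start = next(i for i, f in enumerate(fixations) if inside(f))

    trt = sum(f.duration_ms for f in fixations if inside(f))
    fc = sum(1 for f in fixations if inside(f))

    # First pass: fixations from `start` until the eye leaves the region.
    first_pass = 0
    for f in fixations[start:]:
        if not inside(f):
            break
        first_pass += f.duration_ms

    # Regression path: from `start` up to, but excluding, the first
    # fixation to the right of the region.
    rpd = 0
    for f in fixations[start:]:
        if f.word > last_word:
            break
        rpd += f.duration_ms

    return {"TRT": trt, "1PRT": first_pass, "FC": fc, "RPD": rpd, "RR": rpd - first_pass}

def anticipatory_proportion(trials):
    """Share of trials in which the look to the target began before the noun.

    Each trial is a pair (look_onset_ms, noun_onset_ms)."""
    early = sum(1 for look, noun in trials if look < noun)
    return early / len(trials)

# Fixations 1-8 on "She's always been as cold as ice with her children."
# Words 3-6 ("as cold as ice") are assumed to be the region of interest.
record = [
    Fixation(0, 200), Fixation(1, 180),   # fixations 1 and 2
    Fixation(4, 220), Fixation(6, 210),   # fixations 3 and 4 (first pass)
    Fixation(2, 150),                     # fixation 5 (regression to the left)
    Fixation(4, 190),                     # fixation 6 (rereading)
    Fixation(7, 200), Fixation(9, 230),   # fixations 7 and 8
]
measures = reading_measures(record, 3, 6)
print(measures)

# Invented visual world trials: (onset of look to target, onset of noun), in ms.
trials = [(900, 1000), (1100, 1000), (950, 1000), (1200, 1000)]
print(anticipatory_proportion(trials))
```

With these invented durations the record yields TRT = 620 ms, 1PRT = 430 ms, FC = 3, RPD = 770 ms and RR = 340 ms, which corresponds to the fixation sums listed under Figure 2.10. The four invented trials give an anticipatory proportion of 0.5.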
Event-related potentials (ERP) ERP methodology measures the brain’s electrical activity during language comprehension through a number of sensors placed around the scalp (normally attached to a kind of swimcap-type headgear). It gives a very precise millisecond-by-millisecond record of brain activity, and so can provide a window into the online processing of language. The electrical activity is plotted onto graphs with the time element on the x-axis, and the amplitude of the electrical activity plotted on the y-axis, such as in Figure 2.12. The figure illustrates the two major language-based ERP patterns which have emerged in the research. The first is called N400 (N = negative voltage wave; 400 = 400 milliseconds after a word is read or heard). (Note that negative waves are indicated above the x-axis and positive waves are below it.) N400 seems to be generated by lexical and semantic processing (both form and meaning), but does not seem to be affected by syntactic variables. Conversely, the second pattern (P600: positive 600 millisecond onset) is sensitive to syntactic but not lexical variables, and so will not be discussed further. The N400 wave is particularly sensitive to semantic anomaly between a target word and its preceding context.6 For example, in Figure 2.12, the word bake, which does not make sense in the context, generates a much higher N400 peak than the word eat, which generates a noticeable, but smaller, peak. For native speakers, the results are usually as follows: ‘the N400 amplitude is largest for pronounceable, orthographically legal nonwords
Figure 2.12 ERP plots showing N400 and P600 phenomena (Osterhout, McLaughlin, Pitkänen, Frenck-Mestre, and Molinaro, 2006: 204).
(pseudowords; e.g. flirth), intermediate for words preceded by a semantically unrelated context, and smallest for words preceded by a semantically related context’ (Osterhout et al., 2006: 209). So overall, the N400 effect grows with stimuli that are harder to integrate semantically. It also informs about how well the word is known, in the sense that the N400 amplitude indicates whether a participant has picked up on the congruency/incongruency of the word with its preceding text. This is expanded upon by Kutas, Van Petten, and Kluender (2006: 668), who believe that ‘The correct characterization of the N400 context effect is thus not that anomalous or unrelated words elicit unusual brain responses, but rather that a large negativity between 200 and 500 ms or so (N400) is the default response, and that its amplitude is reduced to the degree trial context aids in the interpretation of a potentially meaningful stimulus.’
114 Foundations of Vocabulary Research
ERP offers the exciting potential method of measuring implicit lexical knowledge, without being confounded with declarative knowledge. It also offers a method of documenting the incremental development of lexical knowledge over time, particularly at the beginning stages of the learning process. In a very interesting study, Osterhout et al. (2006) show how this can be accomplished using ERP methodology. They studied beginning L2 French learners (L1 = English), and measured their knowledge of French words and nonwords at 14 hours, 60 hours, and 140 hours of instruction. They asked the participants to make word/nonword judgements, and took ERP readings while they did this. At 14 hours they found that the conscious lexical judgements were at chance level, but that the N400 results were more robust for pseudowords than real words. They interpret this to mean that ‘the French learners rapidly extracted enough information about French word forms so that their brains could discriminate between actual words and pseudowords, even if the learners themselves could not do so’ (2006: 211). In other words, they were learning about word form after only 14 hours. By the 60th hour of instruction, the participants were learning about meaning, which was indicated by smaller N400s when the target items were preceded by related words than when preceded by unrelated words. By the 140th hour, the amplitude of the N400s approximated native results. In contrast, the conscious lexical judgements remained very poor. Thus, while the learners were not able to demonstrate much learning in the lexical decision task (and so would be unlikely to do well on typical vocabulary size and depth tests either), the ERP methodology was able to show that vocabulary learning was accruing below this ‘conscious threshold’. ERP would appear to be a very useful technique for tapping into the earliest stages of vocabulary learning. 
It can show that a participant has some sort of lexical entry that has a semantic component. For nonnatives, N400 is an ideal indicator of whether learners have started to create a lexical representation. ERP might also be useful for obtaining measures of implicit vocabulary knowledge which could be contrasted with explicit knowledge obtained from conscious declarative techniques. However, ERP cannot show how complete the knowledge of a lexical item is.

Functional Magnetic Resonance Imaging (fMRI)

Blood flow and blood oxygenation are closely linked to neural activity. This is because neurons in the brain do not have internal reserves of energy, and so after they fire, more energy needs to be brought in quickly via the bloodstream. This causes an increase in blood flow to regions of increased neural activity. These active brain regions take more oxygen out of the blood than less active regions. There is a difference in the magnetic signature between oxygenated and deoxygenated blood, and an fMRI scan can be used to detect this difference. Although the difference is very small, through numerous repetitions of a thought, action or experience, statistical procedures make
Figure 2.13 fMRI brain location results (Pulvermüller, 2005: 578). Panel a, Movement: foot, finger, and tongue movements. Panel b, Passive reading of action words: leg-, arm-, and face-related words.
it possible to determine the areas of the brain which reliably have more of this difference, and therefore which areas of the brain are active during that mental process. The fMRI technology was developed in the early 1990s, and has become one of the neural imaging methods of choice, because it is non-invasive, safe (not using radiation like some other techniques) and has excellent spatial resolution. In terms of vocabulary research, fMRI can be used as a tool to locate the parts of the brain which are active during various types of lexical processing. Pulvermüller (2005) gives an illustration of what fMRI can do. He discusses the processing of ‘action’ verbs like lick, pick, and kick. The fMRI scans showed that these verbs were processed not in a single area of the brain, but rather near the areas of the brain which control movement of the tongue, fingers, and feet, respectively (Figure 2.13). In addition, when face-, arm-, and leg-related words were given to participants, they also activated the physiological centers in the brain for that related part of the body, e.g. silently reading kick activated the part of the brain which controls the leg. It thus appears that at least some lexis is intimately connected with the physiological response, and part of ‘knowing’ action words includes this automatic activation of the brain’s motor control centers. This is early research, but hints that it may be possible to study the learning of word meaning through large-scale neurophysiological techniques. For one introduction to fMRI research methodology, see Jezzard, Matthews, and Smith (2003).
3
As introduced in Section 1.1.3, we are now aware that vocabulary typically behaves not as single words which are held together by syntax, but rather has a strong tendency to occur in multiple word phraseological units. The phenomenon of formulaic language has long been recognized in the case of idioms, because they have the very noticeable trait of non-compositionality, i.e. the meaning of an idiom cannot be derived from the meaning of its component words. However, as research on formulaic language developed (largely through the use of computerized corpora), it became obvious that it was no mere peripheral feature, but rather was ubiquitous and must be a core characteristic of language (e.g. Biber, Johansson, Leech, Conrad, and Finegan, 1999; Sinclair, 1991). Not only is formulaic language very common in language overall, a great deal occurs in both spoken and written modes.1 While much of the research has been done on written discourse (largely because it is more easily turned into computer-readable data), it is equally, if not more, important in spoken discourse (Altenberg, 1998; Davou, 2008; McCarthy and Carter, 2002; Kuiper, 2004; O’Keeffe, McCarthy and Carter, 2007). Oppenheim (2000) counted the multi-word stretches of talk that occurred identically in practice and final renderings of a short speech on the same topic. She found that between 48% and 80% (overall mean of 66%) of the spoken output produced by six nonnative participants consisted of these identical strings. Some of this considerable amount must surely be formulaic in nature, but it is impossible to tell from her report how much. Sorhus (1977) calculated that speakers in her corpus of spontaneous Canadian speech used an item of formulaic language once every five words. (This includes one-word fillers, such as eh, well, and OK, but even without these, there is still a very high frequency of formulaic sequences like for example, at times, and a lot of.) 
Erman and Warren (2000) calculated that 52–58% of the language they analyzed was formulaic, and Foster (2001) came up with a figure of 32% using different procedures and criteria. Biber et al. (1999) found that around 30% of the words in their conversation corpus consisted of lexical bundles, and
Formulaic Language
about 21% of their academic prose corpus. Howarth (1998) looked at frequent verbs in a social science/academic corpus and found that they occurred in either restricted collocations or in idioms in 31–40% of the cases. Rayson (2008) found that 15% of text is formulaic according to a Wmatrix analysis (see Section 6.3). Furthermore, formulaic language has been found in a range of languages, including Russian, French, Spanish, Italian, German, Swedish, Polish, Arabic, Hebrew, Turkish, Greek, and Chinese (Conklin and Schmitt, 2008). Although this does not prove that formulaic language is a universal trait of all languages (and most of the research has still been done on English), the widespread existence of formulaicity in the above languages strongly suggests that it is a common phenomenon. Being such a big part of language, it is not surprising that formulaic language as a category is not homogenous (although many researchers treat it as if it is). It realizes different purposes in language use, including transacting routinized meanings (That’ll be X dollars = typical way for American shopkeepers to state the cost of a bill), lexicalizing various functions (Pardon me = a short conventionalized form of apologizing), and smoothing social interaction (yeah, it is = a routinized way of agreeing with an interlocutor’s assertion) (Schmitt and Carter, 2004). It also provides the building blocks upon which one can create more extended strings of language (e.g. with collocations (valid point), and with lexical bundles: it should be noted that = a standard academic phrase which highlights a point of interest (Biber et al., 1999)). Idioms are one type of formulaic language which, being very salient, have attracted perhaps the greatest amount of research. Although they are generally not frequent as individual items, there are a large number of them, with Moon (1997: 48) noting that ‘the largest specialist dictionaries of English multi-word items ... 
contain some 15,000 phrasal verbs, idioms and fixed phrases, but the total number of multi-word items in current English is clearly much higher’. Despite the low frequencies of the individual items, such large numbers of idioms inevitably mean that they are going to be a noticeable element in language, at least in some forms of discourse. For example, Nippold (1991, cited in Cain, Oakhill, and Lemmon, 2005) found that 6–10% of sentences in (American) reading programme books designed for 8–12-year-olds contained idiomatic expressions. Moreover, in genre-specific corpora (e.g. meetings, supermarket checkout-operator talk), the frequency of the central formulaic sequences, including idioms, can be very high (Keller, 1981; Kuiper and Flindall, 2000). This means that, while there are all-purpose formulaic sequences which might cross genre boundaries, each individual genre might have its own formulaic characteristics with its own particular formulaic sequences. However, as Kuiper, Columbus, and Schmitt (2009) note, idioms are only a small part of the phrasal lexicon of both a language and individual speakers
of a language. In fact, there seem to be many types of formulaic language, varying in degree of fixedness, institutionalism/conventionality, and opacity/non-compositionality. This lack of homogeneity is one reason for the wide range of terminology in the area. For example, when researching what she termed formulaic sequences, Wray (2002: 9) found over 50 terms to describe the phenomenon of formulaic language, such as: chunks, formulaic speech, multi-word units, collocations, formulas, prefabricated routines, conventionalized forms, holophrases, and ready-made utterances.

Just as the five blind men of Hindustan who went out to learn about an elephant felt different parts of the elephant’s body and came to very different conclusions about what an elephant is like, researchers seem to be looking at different aspects of formulaic language and using terminology that makes sense for that aspect. For example, Nattinger and DeCarrico (1992) stressed the relationship between formulaic language and its functional usage and called the forms lexical phrases. Work on collocations mainly looks at the relationships between two-word pairs. Terms like prefabricated expressions and chunks focus on the holistic storage of the forms. However, when looking at formulaic language as a phenomenon, one must include all of these types. Moreover, one reason that formulaic language is so widespread is that it realizes a wide range of referential, textual, and communicative functions in discourse. Formulaic sequences can be used to express a concept (Get out of Dodge = get out of town quickly, usually in uncomfortable circumstances), state a commonly believed truth or advice (Too many cooks spoil the broth = it is difficult to get a number of people to work well together), provide phatic expressions which facilitate social interaction (Nice weather today is a non-intrusive way to open a conversation), signpost discourse organization (on the other hand signals an alternative viewpoint), and provide technical phraseology which can transact information in a precise and efficient manner (two-mile final is a specific location in an aircraft landing pattern) (Schmitt and Carter, 2004). Likewise, Nattinger and DeCarrico (1992) argue that formulaic language fulfils the functions of maintaining conversations (How are you?, See you later), describing the purposes for which the conversations take place (I’m sorry to hear about X, Would you like to X?), and realizing the topics necessary in daily conversations (When is X? (time), How far is X? (location)). In fact, one might suppose that for every conventional activity or function in a culture, there will be associated phrasal vocabulary. If that is so, there are bound to be a large number of formulaic expressions, perhaps even a larger number than that of single word vocabulary. Formulaic sequences become particularly important in language use when we consider their pragmatic value. For instance, they are very often used to accomplish recurrent communication needs. These recurrent communicative
needs typically have conventionalized language attached to them, such as I’m (very) sorry to hear about ——— to express sympathy and I’d be happy/glad to ——— to comply with a request (Nattinger and DeCarrico, 1992: 62–63). Because members of a speech community know these expressions, they serve as a quick and reliable way to achieve the desired communicative effect. Formulaic sequences also realize a variety of conversational routines and gambits and discourse objectives (Coulmas, 1979, 1981). They are typically used for particular purposes and are inserted in particular places in discourse. For instance, formulaic sequences regularly occur at places of topic-transition and as summaries of gist (Drew and Holt, 1998). Most (all?) conventional speech acts are realized by families of formulaic language and not normally by original expressions (I’m very sorry versus I am feeling apologetic towards you). Overall, understanding the pragmatic role of formulaic language can tell us much about the nature of interaction (McCarthy and Carter, 2002). Moreover, formulaic sequences do more than just carry denotative meaning and realize pragmatic function. They can often have a type of register marking called semantic/collocational prosody (Hunston, 2007; Sinclair, 2004; Stubbs, 2002). This is often negative, for example, the verb cause frequently has a negative evaluation (cause pain, cause inflation). However, semantic prosody can also be positive, as in collocations that form around the word provide (provide information, provide services). This semantic prosody is one means of showing a speaker/writer’s attitude or evaluation. 
For example, his/her stance can be indicated concerning the knowledge status of the proposition following the formulaic item (I don’t know if X indicates uncertainty about X), his/her attitude towards an action or event (I want you to X shows a positive attitude towards this action), and his/her desire to avoid personal attribution (it is possible to avoids a directly attributable suggestion) (Biber, Conrad, and Cortes, 2004). Likewise, the choice of formulaic sequences can reflect an author’s style and voice (Gläser, 1998). Formulaic sequences can also be used to encode cultural ideas, as Teliya, Bragina, Oparina, and Sandomirskaya (1998) have demonstrated for Russian.
3.1 Identification

The extent and diversity of formulaic language make it very tricky to define and identify. In fact, identification is probably the biggest problem in researching formulaic language. Definitions like Wray’s (2002: 9) oft-cited one for formulaic sequences were intentionally broad and inclusive, and were meant to capture the widest range of formulaic language possible:
a sequence, continuous or discontinuous, of words or other elements, which is, or appears to be, prefabricated: that is, stored and retrieved whole from memory at the time of use, rather than being subject to generation or analysis by the language grammar.
However, for the purposes of identification, tighter definitions are helpful in guiding the selection decisions. Wray (2008: 12) proposes a more restricted definition, termed morpheme equivalent unit:

a word or word string, whether incomplete or including gaps for inserted variable items, that is processed like a morpheme, that is, without recourse to any form-meaning matching of any subparts it may have.

There have been several approaches to identification based on various research purposes, and on the richness of the data available (e.g. is it possible to access phonological information?). It is possible to discern at least four of these. In L1 acquisition studies, the criteria tend to focus around the lexis that children repeat. Much of this is formulaic in nature. Sometimes the formulaic sequences are fused strings which the child has constructed from grammar rules and lexical items and stored whole for later use, and which may or may not be fully adult-like. Sometimes the sequences are extracted from the input the child receives, and which may or may not yet be fully analyzed into the component words (Wray, 2002).

Quote 3.1
Kuiper on the nature of formulaic language
Koenraad Kuiper was most helpful in pointing out that there are two underlying properties which define [formulaic sequences]: a) the units of formulaic language are not merely any sequence of words, but phrases, and b) they are lexical items exactly like other lexical items such as words, and with the same properties as words would have if they were phrases. (In Schmitt and Carter, 2004: 4)

The acquisition approach is related to what might be called the ‘psycholinguistic’ approach, where formulaic language is assumed to be holistically stored in the mind. There is evidence for this on the phonological front: formulaic sequences are typically spoken more fluently, with a coherent intonation contour, to the extent that this has been accepted as one criterion of formulaicity (e.g. van Lancker, Canter, and Terbeek, 1981; Peters, 1983: 10). This criterion means that there should be no hesitation pauses within the chunk when it is spoken (Kuiper, 1996), and neither should there be any internal errors or transformations (to be honest with you; *to be honest [pause] with you; *to be honest on you; *to be with you honestly), although Nooteboom (1999) notes that the pronunciation may be more ‘sloppy’, possibly because chunks may not get as much attentional resources as novel utterances do in production. (However, see Ashby, 2006, and Lin (in preparation) for a somewhat different view on phrasal prosody, and for possible differences between child and adult formulaic speech.)
Although hesitation pauses and errors are relatively easy to identify, variation within a formulaic sequence is a trickier issue. While we would indeed expect a chunk that is preformulated in a speaker’s mind to be articulated without any changes, how do we know whether it is in fact preformulated? Using a criterion of ‘no transformation’ would be circular. To get around this, we often use corpus data to establish norms of the way formulaic sequences are used in the speech community. The problem is that despite the overall idea that formulaic language is fixed, in fact it tends to be quite variable. For example, Moon (1997, 1998) shows that formulaic language (illustrated here with idioms) can vary across a number of factors:
● British/American variations: not touch someone/something with a bargepole (British); not touch someone/something with a ten foot pole (American)
● Varying lexical component: burn your boats/bridges
● Unstable verbs: show/declare/reveal your true colours
● Truncation: every cloud has a silver lining/silver lining
● Transformation: break the ice/ice-breaker/ice-breaking
(Moon, 1997: 53)
This widespread tolerance for different types of variation makes it difficult to define the ‘standard form’ of any formulaic sequence, and therefore to set a criterion for whether a person’s output of that sequence is appropriate or not. This means phonological features are usually a more practical identification criterion. Thus the psycholinguistic approach mainly focuses on spoken discourse, because of the time-sensitive nature of the on-line output. Also, it usually works with spontaneous data, because a chance to rehearse may allow the production of strings resembling formulaic sequences even though they are not preformulated. Another approach to identification, which could be called a ‘phraseological approach’, has been taken up by scholars of the ‘Russian tradition’ like Vinogradov, Amosova, Kunin, Mel’cuk and Cowie, who define formulaic language in terms of transparency and substitutability (see Cowie, 1998, Chapters 1 and 10). For example, they look at words like cut and slash, and note how they are constrained by particular collocation restrictions: cut/*slash one’s throat versus slash/*cut one’s wrists. They might also note that several modes of transportation collocate with the verb drive: drive a car/bus/truck, but some do not: drive a *bicycle/*motorcycle. However, the phraseological approach to identification is problematic for several reasons (Durrant, 2008). First, it is not easy to operationalize the criteria of transparency and substitutability (see, e.g. Nesselhauf, 2005: 25–33). Second, phraseological approaches rely on human analysts to identify formulaic language. This makes analysis extremely labour intensive, and so only suitable for limited enquiries. It is difficult to see, for instance, how an entire corpus of any
size could be analyzed. Third, the human analysis also makes the process rather subjective. Unless multiple raters were used and coordinated, it would be impossible to know whether any results were the outcome of an individual analyst’s preferences and prejudices, or whether they had more general validity. It is also not clear if manual identification can capture the whole range of formulaic sequences, e.g. would only ‘interesting’ sequences be identified, or could the process also capture frequent, but perhaps less salient, sequences as well?

Concept 3.1
Concordances and concordancers
Concordances (or concordance lines) are lines of data from a corpus which are lined up so that comparison of the queried item (node) is facilitated. (See Chapter 6, Research Project 6, for an example of concordance lines.) This allows quick and easy comparison of the immediate context surrounding the node item. Concordancers are the software programs which search the corpus and sort the data for the user into concordance lines. However, the term concordancer has become the generic name for software programs which do all types of corpus enquiry. Thus concordancers typically build frequency lists, compare texts, and various other things, as well as building concordance lines.

By far the most common means of identifying formulaic language is through corpus statistics.2 The main idea here is identifying sequences which recur in a corpus, based on the underlying criterion of frequency, and this has the great advantage of being easily automated. To extract formulaic sequences, concordancers can be asked to identify all of the word combinations according to a predetermined frequency criterion. For example, Biber et al. (1999) interrogated the 40 million word Longman Spoken and Written English Corpus to find strings of various lengths which occurred a minimum number of times (five to ten). This produced numerous three-word (I don’t think; in order to), four-word (I don’t want to; in the case of), five-word (I see what you mean; the aim of this study), and six-word (do you know what I mean; from the point of view of) combinations. Taken together, these combinations were very frequent, making up about 30% of conversation (almost 45% if two-word contractions are included (I don’t)) and about 21% of academic prose. Biber et al. refer to these combinations as lexical bundles, but other scholars refer to the results from this kind of procedure as N-grams, i.e. fixed strings of N length. It is interesting to note that this kind of procedure tends to produce strings which are not complete structural units (e.g. the end of the), as opposed to the Russian school, which focuses on formulaic language strings which relate to identifiable meaning or function correlates, and so tend to be structurally complete (e.g. on the other hand is structurally complete because the form fully realizes the notion of contrast). Simple frequency can also be used to inform whether word combinations are collocations or not. In this approach, the concordancers count
how many times these combinations occur in a corpus. For example, nice day occurs 189 times in the 170 million word New Longman Corpus, while crazy day occurs only four times. On the basis of this, one might assume that nice day is likely to be a collocation, while crazy day is much less so. However, this method suffers from two problems. First, the most frequent combinations consist of function words (which are the most frequent words in language) (Hunston, 2002: 69–70) and so the results (e.g. of the (1,069,458 occurrences); and a (86,772)) will be made up of words which co-occur by chance simply because they are so frequent, and not because they have an interesting relationship. Second, this method also misses real collocations at the other end of the frequency spectrum, simply because the combinations are so infrequent (cloven hoof (4), sheer lunacy (7)). To get around these problems, researchers usually use strength of association measures. These measures compute the likelihood of two words occurring together as opposed to the likelihood of their occurring separately. However, there are two conceptually different approaches to making these calculations (asymptotic hypothesis tests and mutual information), and the resulting collocation lists can be quite different (Durrant, 2008). The different methods, their formulas, and characteristics are discussed below.
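The frequency-based extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`tokenize`, `ngram_counts`, `recurrent_ngrams`), the toy sample text, and the use of a raw count as the threshold are all hypothetical choices made here, not the procedure of any particular concordancer or of Biber et al. (1999). Note that this simple version lets strings run across sentence boundaries, which a real study would probably want to prevent.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping
    word-internal apostrophes so that contractions stay whole."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def ngram_counts(tokens, n):
    """Count every contiguous string of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def recurrent_ngrams(tokens, n, min_freq):
    """Return the n-grams that occur at least min_freq times,
    most frequent first (a raw-count threshold is assumed here)."""
    counts = ngram_counts(tokens, n)
    return [(" ".join(gram), c) for gram, c in counts.most_common() if c >= min_freq]

# Hypothetical toy text standing in for a corpus.
sample = ("I don't think so. I don't think he knows. "
          "In order to win, I don't think we can wait.")
tokens = tokenize(sample)
bundles = recurrent_ngrams(tokens, 3, min_freq=3)
print(bundles)
```

On a real corpus the same counts would be dominated by function-word strings, which is exactly the weakness of raw frequency noted above.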
3.2 Strength of association – hypothesis tests
The principal ‘hypothesis testing’ strength of association measures include z-score, t-score, chi-squared and log-likelihood tests. These all test the null hypothesis that words appear together no more frequently than we would expect by chance alone. All of these methods start by calculating how many times we would expect to find word pairs together in a corpus of a certain size by chance alone, given the frequencies of their component words. This is calculated by first determining the probability that a word combination, if chosen at random from the corpus, would occur:

$$P(\text{Word 1 Word 2}) = P(\text{Word 1}) \times P(\text{Word 2})$$

This simply states that the probability that any randomly selected pair of words will be the combination (Word 1 Word 2) is equal to the probability of Word 1 occurring on its own multiplied by the probability of Word 2 occurring on its own. In the case of black coffee, the word black appears in the New Longman Corpus 44,422 times, and the word coffee appears 10,305 times. Since the New Longman Corpus consists of 179,012,415 words, the probabilities of occurrence for each of these words are:

$$P(\text{black}) = \frac{44{,}422}{179{,}012{,}415} \approx 0.00025$$

$$P(\text{coffee}) = \frac{10{,}305}{179{,}012{,}415} \approx 0.000058$$

Thus, if we selected any word at random from the New Longman Corpus, the probability that it would be black is 0.00025 and the probability that it would be coffee is 0.000058. Using the above formula, we can calculate that the probability that a two-word combination, picked at random from the New Longman Corpus, would be the pair black coffee is:

$$P(\text{black coffee}) = 0.00025 \times 0.000058 = 1.45 \times 10^{-8}$$

This is a very low probability, but given the very large size of the New Longman Corpus, the word combination black coffee will still occur by chance alone at some point. By multiplying the above probability by the size of the corpus, we can predict that it will occur about 2.60 times:

$$1.45 \times 10^{-8} \times 179{,}012{,}415 \approx 2.60 \text{ times}$$

Consulting the New Longman Corpus, we find that in fact it occurs 139 times. This strongly suggests that the pair collocates more frequently than by chance. However, in research, results that appear obvious are not always reliable. The various hypothesis testing methods determine whether the apparent differences are statistically significant, thus giving a much firmer basis for interpretation. The following formulas and discussion are taken from Durrant (2008), Manning and Schütze (1999: 162–163), and Evert (2004). We can calculate the z-score with the formula:

$$z\text{-score} = \frac{O - E}{\sqrt{E}}$$

O = the observed frequency of occurrence of the combination
E = the expected frequency of occurrence based on the null hypothesis that there is no relationship between the words

For black coffee, the figures are:

$$z\text{-score} = \frac{139 - 2.60}{\sqrt{2.60}} = 84.59$$
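The expected-frequency and z-score arithmetic above can be reproduced in a few lines. The following is only a minimal sketch in Python using the corpus figures quoted in the text; the function names are illustrative and do not come from any particular corpus tool. It also shows that computing the expected frequency without first rounding the two probabilities gives about 2.56 rather than 2.60, and so a slightly higher z-score.

```python
import math

# Frequencies quoted in the text for the New Longman Corpus.
N = 179_012_415      # corpus size in words
F_BLACK = 44_422     # occurrences of "black"
F_COFFEE = 10_305    # occurrences of "coffee"
OBSERVED = 139       # occurrences of "black coffee"


def expected_frequency(f1, f2, n):
    """Expected co-occurrences under independence: P(w1) * P(w2) * n."""
    return (f1 / n) * (f2 / n) * n


def z_score(observed, expected):
    """z = (O - E) / sqrt(E)."""
    return (observed - expected) / math.sqrt(expected)


e_exact = expected_frequency(F_BLACK, F_COFFEE, N)

print(round(e_exact, 2))                     # 2.56 (probabilities not rounded)
print(round(z_score(OBSERVED, 2.60), 2))     # 84.59 (rounded E, as in the text)
print(round(z_score(OBSERVED, e_exact), 2))  # 85.32 (unrounded E)
```

The difference between the two z-scores is immaterial for ranking purposes, but it explains small discrepancies when the figures are recomputed from raw counts.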
A problem with z-score is that, because it takes expected occurrence as its denominator, a misleadingly high score can be returned if the words involved are infrequent in the corpus (Evert, 2004). The t-score test tries to avoid this problem by taking observed occurrence as the denominator. It is calculated as follows:

$$t\text{-score} = \frac{O - E}{\sqrt{O}}$$

Thus, for black coffee:

$$t\text{-score} = \frac{139 - 2.60}{\sqrt{139}} = 11.57$$
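The effect of the two denominators is easiest to see side by side. The sketch below, again in Python with illustrative function names, recomputes the t-score for black coffee and then applies both measures to an invented rare pair (observed twice, expected 0.001 times). The rare-pair numbers are purely hypothetical and are not taken from any corpus.

```python
import math


def z_score(observed, expected):
    """z = (O - E) / sqrt(E): expected frequency in the denominator."""
    return (observed - expected) / math.sqrt(expected)


def t_score(observed, expected):
    """t = (O - E) / sqrt(O): observed frequency in the denominator."""
    return (observed - expected) / math.sqrt(observed)


# Figures from the text for "black coffee".
print(round(t_score(139, 2.60), 2))        # 11.57

# Hypothetical rare pair (invented numbers): seen twice, expected 0.001 times.
rare_o, rare_e = 2, 0.001
print(round(z_score(rare_o, rare_e), 2))   # 63.21
print(round(t_score(rare_o, rare_e), 2))   # 1.41
```

With only two occurrences the z-score is still very large, whereas the t-score stays low, which is the behaviour the text attributes to the change of denominator.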
Both of these statistics can be criticized on the grounds that they assume an approximately normal distribution of results. However, this is probably not the case for rare events like collocations (Dunning, 1993). The hypothesis testing methods conceive of a corpus as a series of bigrams, each of which may have a value of 1 (the bigram is the word pair being examined) or 0 (the bigram is not the word pair being examined). Two-outcome tests of this sort (analogous to a series of coin tosses) generate a binomial distribution. Where the mean number of positive outcomes is relatively high (as in the case of getting heads from a coin toss), the binomial distribution approximates the normal distribution. However, where the mean number of positive outcomes is relatively low (as in the case of collocation), the binomial distribution is heavily skewed (Dunning, 1993: 64–65).

One way of getting around this problem is by using non-parametric tests, which do not rely on the assumption of normality. One such test is Pearson’s chi-squared. This relies on the following 2 × 2 contingency tables showing the expected and observed occurrences in the corpus of each word and its collocate:

Expected:

| | Word 2 = X | Word 2 ≠ X |
|---|---|---|
| Word 1 = Y | $E_{11} = \frac{R_1 C_1}{N}$ | $E_{12} = \frac{R_1 C_2}{N}$ |
| Word 1 ≠ Y | $E_{21} = \frac{R_2 C_1}{N}$ | $E_{22} = \frac{R_2 C_2}{N}$ |

Observed:

| | Word 2 = X | Word 2 ≠ X | |
|---|---|---|---|
| Word 1 = Y | $O_{11}$ | $O_{12}$ | $= R_1$ |
| Word 1 ≠ Y | $O_{21}$ | $O_{22}$ | $= R_2$ |
| | $= C_1$ | $= C_2$ | $= N$ |
If we plug in values for the word pair black coffee, we get the tables:

Expected:

| | Word 2 = coffee | Word 2 ≠ coffee |
|---|---|---|
| Word 1 = black | 2.6 | 44,419.4 |
| Word 1 ≠ black | 10,302.4 | 178,957,690.6 |

Observed:

| | Word 2 = coffee | Word 2 ≠ coffee | |
|---|---|---|---|
| Word 1 = black | 139 | 44,283 | R1 = 44,422 |
| Word 1 ≠ black | 10,166 | 178,957,827 | R2 = 178,967,993 |
| | C1 = 10,305 | C2 = 179,002,110 | N = 179,012,415 |
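Both tables can be derived from just four numbers: the frequency of each word, the frequency of the pair, and the corpus size. The following Python sketch shows one way of doing this; the function name and the nested-list layout are assumptions made for illustration only.

```python
def contingency_tables(f1, f2, pair, n):
    """Return (observed, expected) 2 x 2 tables as nested lists.

    Rows: word 1 present / absent; columns: word 2 present / absent.
    """
    r1, r2 = f1, n - f1    # row totals R1, R2
    c1, c2 = f2, n - f2    # column totals C1, C2
    observed = [[pair, r1 - pair],
                [c1 - pair, n - r1 - c1 + pair]]
    # Each expected cell is (row total * column total) / N.
    expected = [[r * c / n for c in (c1, c2)] for r in (r1, r2)]
    return observed, expected


observed, expected = contingency_tables(44_422, 10_305, 139, 179_012_415)

print(observed)
# [[139, 44283], [10166, 178957827]]
print([[round(e, 1) for e in row] for row in expected])
# [[2.6, 44419.4], [10302.4, 178957690.6]]
```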
On the basis of these tables, chi-squared is calculated as follows:

$$\chi^2 = \frac{N (O_{11} - E_{11})^2}{E_{11} E_{22}}$$

Thus, for black coffee:

$$\chi^2 = \frac{179{,}012{,}415 \, (139 - 2.6)^2}{2.6 \times 178{,}957{,}690.6} = 7282.3$$

(The result of 7282.3 reflects the unrounded expected frequency, $E_{11} \approx 2.557$; the rounded values shown here give approximately 7157.9.)
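As a check on the chi-squared calculation, the sketch below computes the statistic in two ways: with the shortcut formula given above, and as the ordinary cell-by-cell Pearson sum of (O − E)² / E, which is algebraically equivalent for a 2 × 2 table. It is written in Python with illustrative function names, and it works from the raw counts, so the expected frequencies are not rounded.

```python
def chi_squared(f1, f2, pair, n):
    """Chi-squared for a 2 x 2 table via N * (O11 - E11)^2 / (E11 * E22)."""
    e11 = f1 * f2 / n
    e22 = (n - f1) * (n - f2) / n
    return n * (pair - e11) ** 2 / (e11 * e22)


def pearson_sum(f1, f2, pair, n):
    """The same statistic as the sum over all four cells of (O - E)^2 / E."""
    rows, cols = (f1, n - f1), (f2, n - f2)
    obs = [[pair, f1 - pair], [f2 - pair, n - f1 - f2 + pair]]
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = rows[i] * cols[j] / n
            total += (obs[i][j] - e) ** 2 / e
    return total


N = 179_012_415
print(round(chi_squared(44_422, 10_305, 139, N), 1))   # 7282.3
print(round(pearson_sum(44_422, 10_305, 139, N), 1))   # 7282.3

# Substituting the rounded table entries (2.6 and 178,957,690.6) instead:
print(round(N * (139 - 2.6) ** 2 / (2.6 * 178_957_690.6), 1))   # 7157.9
```

The last line shows how sensitive the statistic is to rounding the small expected value, since that value sits in the denominator.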
All procedures seem to have some drawback, and chi-squared is known to be inaccurate when small numbers are involved. Dunning (1993) therefore recommends using the log-likelihood ratio instead, which is more robust at lower frequencies. Log-likelihood also makes use of the contingency tables described above, and is calculated as follows (this version of the equation comes from Evert (2004)):

$$\text{log-likelihood} = 2 \sum_{ij} O_{ij} \ln \frac{O_{ij}}{E_{ij}}$$

Thus, for black coffee:

$$2 \times \left[ 139 \ln\frac{139}{2.6} + 10{,}166 \ln\frac{10{,}166}{10{,}302.4} + 44{,}283 \ln\frac{44{,}283}{44{,}419.4} + 178{,}957{,}827 \ln\frac{178{,}957{,}827}{178{,}957{,}690.6} \right] = 840.1$$

(As with chi-squared, the result of 840.1 reflects unrounded expected frequencies; the rounded values shown here give approximately 835.6.)

As noted above, the rationale behind all of these statistics is that of testing the null hypothesis that a word combination appears together no more frequently than we would expect by chance alone. If we take this conception literally, we can consult tables of critical values to determine a confidence level in rejecting the null hypothesis. A t-score of greater than 2.576, for example, would enable us to reject the null hypothesis with 99.5% confidence (Manning and Schütze, 1999: 164). However, Durrant (2008) cautions us to note exactly what is meant by a word pair’s being more frequent than we would expect ‘by chance’. The calculation of expected occurrence in the above formulas is based on a model in which words are drawn entirely at random, as if from a hat. But language is far more regular than a ‘random word generator’ (Manning and Schütze, 1999: 166). Semantics, grammar, discourse organization, and real-world occurrences all mold and constrain the construction of language, and so it is not uncommon for word combinations to co-occur ‘more frequently than random’, regardless of collocational relations. With this in mind, levels of ‘statistical significance’ are probably not best used as cut-off points in identifying collocations. Rather, they are much more useful in ranking word combinations according to their relative collocational strength (Durrant, 2008; Manning and Schütze, 1999: 166; Stubbs, 1995: 33). In fact it is difficult to set minimum scores for the
identification of collocations, although a figure of 2 has been suggested for t-score (Hunston, 2002). However, one interesting study looked at this issue by comparing several strength of association measures and determining how well they predicted word association results. Durrant (2008) calculated the minimum score from each measure for predicting which word pairs would be linked on association lists (i.e. whether one word pair member as stimulus would elicit the other word pair member as a response in association tasks). He did this using all association responses (including any association given by at least one respondent), and also with only more ‘robust’ associations (those given by at least 5% of the respondents). The minimum scores are given in Table 3.1, and provide some initial guidance as to the lower end of the values which could be acceptable for identifying collocations, at least those that have psychological (word association) correlates. Values for identifying collocations that describe textual collocation, but not psychological association, may be different, as collocation cannot always be equated with association. Of course they are often linked, but there are also many associations which are not strong collocations. Reanalyzing two widely-used sets of association norms, Hutchison (2003: 787) finds only 15.7% of associates to be ‘phrasal associates’. Similarly, Fitzpatrick (2006) found only about a quarter of native-speaker word associations to be based on collocation. This merely reinforces the point that formulaic language (including collocation) is difficult to unambiguously identify, and that few firm selection criteria currently exist.
Table 3.1 Scores at which each measure becomes informative about collocations which are also word associations

| Measure | All associations | Robust associations |
|---|---|---|
| Raw frequency a | 16 | 16 |
| Z-score | 38 | 56 |
| T-score | 3.9 | 4.2 |
| Chi-squared | 1,520 | 3,112 |
| Log-likelihood | 60 | 142 |
| MI b | 3.7 | 5.0 |
| Conditional probability b | 0.21 | 0.056 |

a Occurrences per 100 million words.
b See discussion of these statistics below.
(Adapted from Durrant, 2008).
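Returning to the log-likelihood formula given earlier in this section, the calculation for black coffee can be sketched as follows. This is a minimal Python illustration working from the raw counts quoted in the text; the function name is an assumption, and the treatment of empty cells (skipping them, since 0 × ln 0 is taken as 0) is a common convention rather than something stated in the source.

```python
import math


def log_likelihood(f1, f2, pair, n):
    """Log-likelihood: 2 * sum over the four cells of O * ln(O / E)."""
    rows, cols = (f1, n - f1), (f2, n - f2)
    obs = [[pair, f1 - pair], [f2 - pair, n - f1 - f2 + pair]]
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = obs[i][j]
            e = rows[i] * cols[j] / n
            if o > 0:   # an empty cell contributes nothing to the sum
                total += o * math.log(o / e)
    return 2 * total


score = log_likelihood(44_422, 10_305, 139, 179_012_415)
print(round(score, 1))   # 840.1
```

On these figures the score lies well above both log-likelihood values listed in Table 3.1 (60 and 142).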
3.3 Strength of association – mutual information
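The other strength of association approach is mutual information (MI) (Church and Hanks, 1990). It employs the following formula:

$$\text{MI} = \log_2 \frac{O}{E}$$

For black coffee, the figures look like this:

$$\text{MI} = \log_2 \frac{139}{2.6} = 5.8$$

(The figure of 5.8 reflects the unrounded expected frequency of about 2.557, which gives 5.76; the rounded 2.6 gives 5.74.)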
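The MI calculation is a one-line function. The Python sketch below (with an illustrative function name) computes it for black coffee both from the raw counts and from the rounded expected frequency of 2.6, to show where the small difference in the second decimal place comes from.

```python
import math


def mutual_information(observed, expected):
    """MI = log2(O / E)."""
    return math.log2(observed / expected)


N = 179_012_415
expected = 44_422 * 10_305 / N   # unrounded expected frequency, about 2.557

print(round(mutual_information(139, expected), 2))   # 5.76
print(round(mutual_information(139, 2.6), 2))        # 5.74
```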
Although the formula also compares expected and observed occurrences of word pairs, the results are often quite different from the hypothesis testing statistics. Mutual information can be thought of as a ‘measure of how much one word tells us about the other’ (Manning and Schütze, 1999: 178). That is, when we encounter one part of a word pair which has a high mutual information score, there is a high probability that the other member of the pair is nearby. This is fundamentally different from the hypothesis testing methods described above. Clear (1993: 279–282) lucidly spells out this difference: ‘MI is a measure of the strength of association between two words’, while hypothesis testing methods are measures of ‘the confidence with which we can claim there is some association’ (emphases in original). The practical effect of this is that different types of word pairs tend to be retrieved by the two methods. This can be illustrated with a word pair having a high MI score: tectonic plates. This pair is not particularly frequent in general English, and occurs 73 times in the New Longman Corpus. (Note this figure has been pushed higher than one might expect by the science component of the corpus.) However, crucially, 73 out of the 214 occurrences of tectonic in the corpus appear with the word plate(s). This means the two words are strongly associated, because where we find tectonic, we are also likely to find plate(s). Conversely, every day is typical of a pair with a high score based on the hypothesis testing statistics. The pair appears together much more frequently (5,991 occurrences) than tectonic plate, and thus the connection is more reliable, even though the strength of association between the two words is weaker. 
In short, MI tends to highlight word pairs which may have relatively low frequency, but which are relatively ‘exclusive’ to one another; hypothesis testing methods highlight items which may be less closely associated but which occur with relatively high frequency.
A commonly cited threshold for statistical significance for MI is 3 (e.g. Hunston, 2002; Evert and Krenn, 2001). However, it is important to note that, like z-score, MI uses ‘expected occurrence’ as its denominator. It can therefore give very high scores for collocations which include low-frequency words, even if the total number of occurrences of the collocation is very low. To safeguard against accepting word pairs as strong collocations on the basis of minimal evidence, MI needs to be used in conjunction with a minimum frequency threshold³ (e.g. Church and Hanks, 1990: 24). Minimum figures of 3–5 occurrences have been suggested (e.g. Church and Hanks, 1990; Clear, 1993; Stubbs, 1995). Another suggestion is that MI be used in conjunction with a minimum t-score threshold to ensure the MI collocations are valid (Church and Hanks, 1990) with Stubbs (1995) suggesting a t-score cut-off of 2.

It is debatable which strength of association measure is the best, although t-score and MI are probably the ones most commonly used. The choice will depend on what type of collocation you wish to work with. Hypothesis testing statistics like t-score tend to highlight frequent collocations made up of relatively frequent words (e.g. fresh air), while MI score tends to highlight collocations made up of less frequent words, but those with stronger and more exclusive links (e.g. cloven hoof).
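The safeguards described above (an MI threshold of 3, a minimum of 3–5 occurrences, and a t-score cut-off of 2) can be combined into a simple filter. The Python sketch below is one possible way of doing so, not a procedure prescribed by any of the cited authors; the function name and default parameter values are assumptions chosen from the ranges mentioned in the text, and the second word pair is invented to show how a single occurrence can produce a very high MI score.

```python
import math


def passes_thresholds(observed, expected, min_mi=3.0, min_freq=3, min_t=2.0):
    """Accept a pair only if raw frequency, MI and t-score all clear their cut-offs.

    Assumes observed > 0, since MI is undefined for a pair that never occurs.
    """
    mi = math.log2(observed / expected)
    t = (observed - expected) / math.sqrt(observed)
    return observed >= min_freq and mi >= min_mi and t >= min_t


# "black coffee", using the figures from the text.
print(passes_thresholds(139, 2.6))        # True

# Hypothetical pair (invented numbers): seen once, expected 0.001 times.
print(round(math.log2(1 / 0.001), 2))     # 9.97, a very high MI score
print(passes_thresholds(1, 0.001))        # False, rejected on frequency and t-score
```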
3.4 A directional measure of collocation
The measures of collocation discussed above are all non-directional, in the sense that it makes no difference which part of the word pair is taken as node and which as collocate. However, for some collocations, directionality may be a feature. Stubbs (1995: 35) points out that though the pair kith and kin has the same score on all of the measures regardless of which word is taken as the node, the relationship between the two words is clearly not symmetrical: kith predicts kin with virtually 100% certainty, whereas kin can stand alone. The same asymmetry is found with to and fro, and starlit night. The non-directionality of the above measures may not be problematic in most corpus research; after all, flexibility is a feature of many (most?) collocations: suspicious circumstances; the circumstances were suspicious. The situation is different when identifying collocations for use in psycholinguistic experiments. In this case, one member of a word pair may well prime the other better than vice versa. For example, it seems highly likely that any associative links running from kith to kin will be stronger than those running in the opposite direction, and this could have an effect in experiments using techniques like word associations or timed judgements. It would therefore be useful to have a statistic which could describe any directionality bias. Durrant (2008) proposes one potential procedure for calculating the conditional probability of one word, given the other. He suggests dividing the
frequency of the word pair by the frequency of the node. Since the conditional probabilities will usually be rather small, this figure is multiplied by 100 for ease of reading:

$$P(\text{Collocate Word} \mid \text{Node Word}) = 100 \times \frac{\text{Frequency of the Word Pair}}{\text{Frequency of Node}}$$

For kith and kin, the New Longman Corpus frequencies are kith (17), kin (994), and kith and kin (16). Thus, the conditional probability of the collocate kith, given the node kin, is:

$$100 \times \frac{16}{994} = 1.61$$

Conversely, the conditional probability of the collocate kin, given the node kith, is:

$$100 \times \frac{16}{17} = 94.12$$
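The directional calculation is simple enough to script for a whole list of candidate pairs. A minimal Python sketch, using the kith and kin frequencies given above and an illustrative function name, might look like this:

```python
def conditional_probability(pair_freq, node_freq):
    """100 * (frequency of the word pair) / (frequency of the node)."""
    return 100 * pair_freq / node_freq


# New Longman Corpus frequencies quoted in the text.
KITH, KIN, KITH_AND_KIN = 17, 994, 16

print(round(conditional_probability(KITH_AND_KIN, KIN), 2))    # 1.61  (kith given kin)
print(round(conditional_probability(KITH_AND_KIN, KITH), 2))   # 94.12 (kin given kith)
```

Running the function in both directions for each pair gives a quick indication of how asymmetrical the relationship is.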
This provides quantitative evidence supporting our previous impression that the associative links running from kith to kin will likely be much stronger than those running from kin to kith.
3.5 Formulaic language with open slots
Up until now we have looked at procedures for analyzing the fixed elements of formulaic language. However, as Moon (1997, 1998) shows, there is actually much about formulaic language that is not fixed. One of the most important types of non-fixed formulaic language is the ‘open slot’ variety. This type combines a number of words which are frozen, but also allows variety in one or more slots. These slots can be filled with various words or phrases, but they also involve semantic constraints. For example, the phrase a(n) ——— ago is usually completed with a word or words which have the meaning of ‘time’, e.g. hour, year, very long time. The phrase is a common way of expressing a particular meaning, i.e. signifying a point in the past. The useful thing about this phrase is that the slot allows the ‘time point’ to be adjusted, and so is maximally useful in describing many different temporal settings. We can see the same thing in a longer formulaic item with two slots: ——— thinks nothing of ———. Again we find that the slots are semantically
constrained, with the first being filled by an ‘animate object’, and the second with some ‘activity which is surprising, unexpected, or unusual’. It is commonly used to express the meaning ‘someone habitually does something which we would not expect’. It is very flexible in allowing us to express this underlying meaning about a wide range of situations:

Diane thinks nothing of running 20 miles before breakfast.
He thinks nothing of sleeping 15 hours a day.
The body builder thinks nothing of having eight raw eggs for a post-workout snack.

The semantic constraints of each slot are easy to see if they are violated, as the results are either amusing or strange:

The house thinks nothing of standing on the side of a cliff. (not animate)
She thinks nothing of eating lunch every day. (not very unusual)

In teaching seminars (although not in his published work), Sinclair called these flexible open slot phrases variable expressions and argued that they are very widespread, simply because they realize semantic concepts that people commonly wish to use, and because their flexibility allows their use in a wide range of contexts. They may well be a major (perhaps the major) component of language, but research is still embryonic. (See Sinclair, 2004, for some analyses of variable expressions.)

The main problem with researching variable expressions is that their variable slots make them difficult to identify and describe. Unlike N-grams, where computers can be told to automatically extract contiguous strings of various lengths, the slots in variable expressions can be filled with a wide variety of different words. As computers cannot currently search for semantic categories (such as ‘person’ or ‘unusual activity’), this type of analysis needs to be done manually. It can take hours to identify and describe a single variable expression, and this can severely limit the scope of any study. An alternative is to develop semantic/functional tagging of corpora, so that concordancers can use these tags in their searches. This approach is now being pursued with the International Corpus of Learner English by Sylviane Granger (personal communication). (See also Fellbaum, 2007.)

There is one automatized approach which has considerable promise. It is called ConcGrams, and is designed to find ‘all of the permutations of constituency variation and positional variation generated by the association of two or more words’ (Cheng, Greaves, and Warren, 2006: 414). That is, the program searches for the patterning which forms around a number of specified words, rather than only a single node word. Furthermore, the patterning does not have to be contiguous, but rather within a preset span (e.g. +/– 5 words). The ConcGram procedure, being automatized, has real potential for
exploring the extent of variable expression in language. This is especially true as it is available on the web and is now part of a mainstream concordancing software package, Mike Scott’s WordSmith Tools (see Section 6.3). Variable expressions may be part of even more widespread patterning in language. Extensive phraseology has been a prominent feature of recent accounts of the interrelationship between lexis and grammar. For example, Hunston and Francis (2000) propose a description of language in terms of patterns. They give the example of the word matter, which is found often to occur in the expression ‘a matter of -ing’ (as in a matter of developing skills; a matter of learning a body of information; a matter of being able to reason coherently). The structure ‘a ——— of -ing’ may therefore be described as a characteristic pattern of matter. It is possible that this type of patterning (combining some fixed and some open components) will turn out to be the major feature of the lexico-grammatical system of language organization.
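Two of the ideas in this section lend themselves to a small illustration: pulling out the fillers of an open slot in a fixed frame, and finding two words that co-occur within a span regardless of order or contiguity. The Python sketch below is only a rough approximation of these ideas and is not the ConcGram software or its algorithm. The function names, the regular expression for the frame a(n) ... ago (which allows one to three words in the slot), and the sample sentences are all assumptions made for illustration. Note that the pattern match says nothing about whether a filler satisfies the semantic constraint of the slot, which still has to be judged manually.

```python
import re

# Frame "a(n) ... ago": one to three words between the article and "ago".
A_AGO = r"\ban?\s+((?:\w+\s+){0,2}\w+)\s+ago\b"


def fill_slot(frame_regex, text):
    """Return the strings that occupy the open slot of a fixed frame."""
    return [m.group(1) for m in re.finditer(frame_regex, text, flags=re.IGNORECASE)]


def cooccurrences(tokens, word_a, word_b, span=5):
    """Return (i, j) index pairs where word_b lies within `span` tokens of
    word_a on either side, so neither order nor contiguity is required."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != word_a:
            continue
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        hits.extend((i, j) for j in range(lo, hi) if j != i and tokens[j] == word_b)
    return hits


sample = "She left an hour ago, and he moved here a very long time ago."
print(fill_slot(A_AGO, sample))
# ['hour', 'very long time']

tokens = "the circumstances were suspicious and suspicious circumstances arose".split()
print(cooccurrences(tokens, "suspicious", "circumstances", span=2))
# [(3, 1), (5, 6)]
```

The second result picks up both "the circumstances were suspicious" and "suspicious circumstances", the kind of positional variation that a contiguous N-gram search would miss.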
3.6 Processing formulaic language
Formulaic language is very common in language overall, but one might ask why it is so widespread. The answer must be that it achieves some useful purpose in communication. Several of these purposes can be discerned. First, the syntagmatic aspects of phraseology help to shape, define, and enhance meaning, following Firth’s (1935) proposal that some of a word’s meaning is derived from the sequences in which it resides. This can be shown with the word border. In isolation it means ‘the edge or boundary of something’. It might also be assumed that the various inflections of border (bordered, bordering, borders) carry a similar meaning, but this would be wrong. Schmitt (2005) looked at the behavior of the border lemma⁴ in the 100 million word British National Corpus and came up with the following figures:

| | BNC frequency | X + on | Figurative sense |
|---|---|---|---|
| border | 8,011 | 89 (1%) | |
| borders | 2,539 | 84 (3%) | |
| bordering | 367 | 177 (48%) | 71% |
| bordered | 356 | 99 (28%) | 75% |

From these figures we can see that border and borders (mainly noun forms) are the most frequent members of the family. This is not at all surprising as most word families have more and less frequent members. However, once we put the words into phrases (in this case by adding the preposition on), the behavior changes dramatically. Only 1–3% of the cases of border and borders occur in combination with on, but about one-quarter of the
occurrences of bordered do, as do almost one-half of the occurrences of bordering. Clearly there is a strong tendency for bordered and bordering to occur in a pattern with on. But the patterning involves not only the combination of the words; it also affects the meaning. Whereas border and borders almost always refer to the expected or literal meaning of ‘edge or boundary’ (even when in combination with on), in about three-quarters of the cases bordering on and bordered on refer to some ‘figurative’ meaning not to do with edges or boundaries. In fact, when we look at concordance lines from the BNC, we find quite a different usage:

– His passion for self-improvement bordered on the pathological.
– But his approach is unconscionable, bordering on criminal.

For further evidence of this usage, here are some other words which occur to the right of bordered/ing on:

a slump, a sulk, acute alcoholic poisoning, antagonism, apathy, arrogance, austerity, bad taste, blackmail, carelessness, chaos, conspiracy, contempt, cruelty, cynicism

There is clearly a trend here, and I would argue that it stems from an underlying variable expression that looks something like this:

SOMETHING/SOMEONE | (be) bordered/bordering on | AN UNDESIRABLE STATE (OFTEN OF MIND)
The point to take from all this is that the lexical patterning is intrinsically linked to meaning, and in this case, changes the meaning of border from ‘boundary’ to ‘nearing an undesirable state’. Phraseology also serves to separate synonyms. Although their underlying meaning is similar, near synonyms like sheer, pure, complete, utter and absolute can be distinguished in terms of their typical collocates (Partington, 1998: Chapter 2). Similarly, Hoey (2005: Chapter 5) shows how the different meaning senses of polysemous words are systematically distinguished by their characteristic co-occurrences, and how violation of these distinct preferences may lead to ambiguity or humor. However, the main reason for widespread formulaicity must be that formulaic language typically is attached to common meanings or functions which people need to use. As we have seen, formulaic language is tightly connected to functional and transactional language use and much of the communicative content of language is tied to these phrasal expressions. As such, they ease the cognitive burdens of language production and comprehension;
i.e. they represent an easy-to-employ and easy-to-understand conventionalized way of realizing real-world needs. This ‘easiness’ has often been asserted since Pawley and Syder (1983) and Kuiper and Haggo (1984) made the case for the notion that formulaic sequences offer processing efficiency because single memorized units, even if made up of a sequence of words, are processed more quickly and easily than the same sequences of words which are generated creatively. In effect, the mind uses an abundant resource (long-term memory) to store a number of prefabricated chunks of language that can be used ‘ready-made’ in language production. This compensates for a limited resource (working memory), which can potentially be overloaded when generating language on-line from individual lexical items and syntactic/discourse rules. The case for this has always looked convincing, but it is only recently that the assertion has been put to empirical test. There is now considerable converging support for the notion that formulaic language provides processing advantages over creatively generated (i.e. non-formulaic) language. Results into the processing of idioms (e.g. Gibbs, Bogdanovich, Sykes, and Barr, 1997) provide evidence that L1 readers quickly understand formulaic sequences in context and that they are not more difficult to understand than literal speech. Formulaic sequences are read more quickly than non-formulaic equivalents, as shown by eye-movement studies (Underwood, Schmitt, and Galpin, 2004; Siyanova, Conklin, and Schmitt, under review) and self-paced reading tasks (Conklin and Schmitt, 2008). Grammaticality judgements of formulaic items were both faster and more accurate than judgements for matched non-formulaic control strings (Jiang and Nekrasova, 2007). 
Finally, looking at actual language use in the real world, Kuiper (1996, 2004) found that the speech of ‘smooth talkers’ (people who need to produce fluent speech under severe time pressure, such as auctioneers and sports announcers) was largely formulaic in nature. This mirrors findings by Dechert (1983), who found that the spoken output of a German learner of English was smoother and more fluent when she used formulaic language. These formulaic sequences were so useful in providing a platform for more fluent and accurate output, that Dechert called them ‘islands of reliability’, suggesting that they may anchor the processes necessary for planning and executing speech in real time.
3.7 Acquisition of formulaic language
The learning of individual words is incremental and each word has its own particular learning burden (Schmitt, 2000; Nation, 1990), and there is no reason to believe that formulaic language is any different in this respect. This would suggest that many formulaic sequences are partially known for a number of exposures until the point where they become mastered.
While some may be learned quickly as wholes, especially short salient ones like Go Away!, there are good arguments for why some formulaic sequences are not learned in an ‘all-or-nothing’ manner. Some first language (L1) acquirers seem to acquire an initial phonological mapping of formulaic sequences proceeding from the whole to the individual parts, but with some elements still incompletely grasped, especially the unstressed phonemic constituents (Wray, 2002, Chapter 6). In these cases, the formulaic sequences are learned over time, with the later stages of acquisition consisting of ‘filling in’ the gaps in the initial incomplete rendering of the sequence. Likewise, some of the component words in the formulaic sequence, as well as the syntactic structure may not be known initially either. Peters (1983) suggests that these elements may be later extracted from the formulaic sequence through a process of segmentation (see also Myles, Hooper, and Mitchell, 1998). Another way formulaic sequences are learned over time involves the flexible slots in variable expressions. If the formulaic sequences are initially acquired with these slots as part of the structure, one might expect that it would take longer to learn the appropriate language insertions for these slots than to learn the fixed elements of the sequence. Alternatively, if the slots are created when paradigmatic variation is noticed at one location in a previously fully-fixed string, then this learning is also incremental in the sense that a fixed formulaic sequence must first be acquired before it is analyzed to form a formulaic sequence with slots. Moreover, shorter formulaic sequences can be combined together into longer and more complex formulaic sequences (Peters, 1983: 73), which means that the component formulaic sequences need to be learned as the initial step to acquiring the subsequent formulaic sequence. 
Certainly, some L1 acquirers do learn and use formulaic sequences before they have mastered the sequences’ internal makeup. Moreover, the acquisition of formulaic sequences might depend to some extent on whether children are referential or expressive learners, that is, whether they are ‘system learners’ more than they are ‘item-learners’ (Cruttenden, 1981) (see also Brown, 1973; Peters, 1983). Nelson (1973) found that children who had referential preferences (naming things or activities and dealing with individual word items) usually learned more single words, particularly nouns. Conversely, children who had more expressive tendencies (having interactional goals; focusing on the social domain) were more likely to learn whole expressions which were not segmented. The reason for these preferences may be psycholinguistic in nature (Bates and MacWhinney, 1987), or may only reflect what the child ‘supposes the language to be useful for’: predominantly naming things in the world or engaging in social interaction (Nelson, 1981: 186). It may also reflect the input a child receives: games for naming things in the world or social control clumps such as ‘D’ya wanna go out?’ (Nelson, 1981). Regardless of the underlying reason, there seems to
be a link between the need and desire to interact and the use of formulaic sequences. In L2 acquisition, formulaic sequences are also relied on initially as a quick means to be communicative, albeit in a limited way. This can lead to quicker integration into a peer group, which can result in increased language input. Wong Fillmore (1976) found this was the case with five young Mexican children trying to integrate into an English-medium school environment. She identified eight strategies the children used, and at least three of them directly involved formulaic language:

● Give the impression, with a few well-chosen words (phrases), that you speak the language
● Get some expressions you understand, and start talking
● Look for recurring parts in the formulas you know.
The use of formulaic sequences enabled the realization of these strategies even though the children’s language capabilities were quite limited. Furthermore, the use of formulaic sequences to facilitate language production is not restricted to L2 children. Schmidt’s (1983) study of ‘Wes’ is a good example of the phenomenon in L2 adults; Wes’s speech is filled with formulaic language as a means of fulfilling his desire to be communicative, but not necessarily accurate. Additionally, Adolphs and Durow (2004) found that the amount of social integration into the L2 community (with presumably a commensurate need to be communicative in the L2) was linked to the amount of formulaic language produced in the speech of L2 postgraduate students. But formulaic sequences may provide language learners with more than an expedient way to communicate; they might also facilitate further language learning. For L1 learners, it has been proposed that unanalyzed sequences provide the raw material for language development, as they are segmented into smaller components and grammar (Peters, 1983). For example, when a child realizes that the phrase I wanna cookie (previously used as a holistic unit) is actually I wanna + noun, he or she gains information about the way syntax works in the language, as well as the independent new word cookie. Wray (2000) looked at a number of studies and concluded that some children segment formulaic sequences into smaller units, and in doing so, advance their grammatical and lexical knowledge. It seems that formulaic sequences serve the same purpose for L2 learners (e.g. Bardovi-Harlig, 2002; Myles, Hooper, and Mitchell, 1998). Moreover, there is little doubt that the automatic use of acquired formulaic sequences allows chunking, freeing up memory and processing resources (Kuiper, 1996, and Ellis, 1996, who explores the interaction between short-term and long-term phonological memory systems). 
These can then be utilized to deal with conceptualising and meaning, which must surely aid language learning.
Quote 3.2 Wood on the possible double role of formulaic sequences in language acquisition
They are acquired and retained in and of themselves, linked to pragmatic competence and expanded as this aspect of communicative ability and awareness develops. At the same time, they are segmented and analyzed, broken down, and combined as cognitive skills of analysis and synthesis grow. Both the original formulas and the pieces and rules that come from analysis are retained. (2002: 5)
L1 children are exposed to many formulaic sequences in their input, but how do they decide what to analyze and what to keep at the holistic level? Wray (2002) suggests that a ‘needs-only analysis’ is the mechanism. Rather than segmenting every sequence into the grammar system, children will operate with the largest possible unit, and only segment sequences when it is useful for social communication. Thus the segmentation process is driven by pragmatic concerns (communication), rather than an instinctive urge to segment in order to push grammatical and lexical acquisition. The default would be to not analyze, and to retain holistic forms. Thus children maintain many formulaic sequences into adulthood, even though the components of those sequences are likely to be stored individually as well (perhaps being acquired from the segmentation analyses of other formulaic sequences). This suggests that dual storage is the norm.
Of course, relying on holistic versus analytical approaches to language acquisition and use is not an either/or proposition, and children will use both approaches in varying degrees. However, Wray and Perkins (2000) and Wray (2002) suggest that the relative ratios between the approaches may change according to age. During Phase 1 (birth to around 20 months), the child will mainly use memorized vocabulary for communication, largely learned through imitation. Some of this vocabulary will be single words, and some will consist of sequences. At the start of Phase 2 (until about age 8), the child’s grammatical awareness begins, and the proportion of analytic language compared to holistic language increases, although with overall language developing quickly in this phase, the amount of holistically-processed language is still increasing in real terms. During Phase 3 (until about age 18), the analytic grammar is fully in place, but formulaic language again becomes more prominent. ‘During this phase, language production increasingly becomes a top-down process of formula blending as opposed to a bottom-up process of combining single lexical items in accordance with the specification of the grammar’ (Wray and Perkins, 2000: 21). By Phase 4 (age 18 and above), the balance of holistic to analytic language has developed into adult patterns.
The course of formulaic sequence development is more difficult to chart in L2 learners. Typically there is early use of formulaic sequences, often after
a silent period. As learners’ proficiency improves, there is the reasonable expectation of language which is more accurate and appropriate. In natives, this is achieved to a large extent through the use of formulaic sequences. Unfortunately, the formulaic language of L2 learners tends to lag behind other linguistic aspects (Irujo, 1993), but this is not so much a case of the amount of formulaic language use, but rather a lack of native-like diversity. This is probably largely due to a lack of sufficient input. In 1986, Irujo suggested that one specific class of formulaic language (idioms) is often left out of speech addressed to L2 learners, leading to a lack of idioms in learner output. More recently, Durrant and Schmitt (2009) show that a more general type of formulaic language (collocations) seems to be tuned to frequency, with L2 learners producing frequent, but not infrequent, collocation pairs. Furthermore, Siyanova and Schmitt (2008) showed that spending a year in an English-speaking country (with presumably a great increase in the amount of L2 input) led to better intuitions of collocation. However, it may not be just the amount of input that is crucial, but also the quality. Siyanova and Schmitt (2007) found that the amount of exposure to native-speaking environments did not have an effect on the likelihood of using the multi-word verbs. This, however, might be explained by Adolphs and Durow’s (2004) findings that sociocultural integration was the key to their case study learner’s acquisition. This suggests that it may not be exposure per se that is important, but the kind of high-quality exposure that presumably occurs in a socially-integrated environment.
Quote 3.3 Dörnyei, Durow, and Zahran on the sociocultural aspects of L2 formulaic language acquisition
Success in the acquisition of formulaic sequences appears to be the function of the interplay of three main factors: language aptitude, motivation and sociocultural adaptation. Our study shows that if the latter is absent, only a combination of particularly high levels of the two former learner traits can compensate for this, whereas successful sociocultural adaptation can override below-average initial learner characteristics. Thus, sociocultural adaptation, or acculturation, turned out to be a central modifying factor in the learning of the international students under investigation. (2004: 105)
The nature of formulaic language and its acquisition is likely to become of ever-greater interest as the field turns to more pattern-based models of language acquisition (e.g. pattern grammar (Hunston and Francis, 2000) and construction grammar (Tomasello, 2003)), which posit that the human facility for language learning is based on the ability to extract
patterns from input, rather than being under the guidance of innate principles and parameters which determine what aspects of grammar can and cannot be acquired (see Ellis, 1996, 2002; SSLA 24, 2). This line of thinking suggests that we learn the letter sequences which are acceptable in a language (the consonant cluster sp can be word-initial in English, but hg cannot) simply by repeatedly seeing sp at the beginning of words, but not hg. This learning is implicit, and may not be amenable to conscious metalinguistic explanation. Of course, learners may eventually reach the point where they can declare a ‘rule’ for this consonant clustering, but the rule is an artefact of the pattern-based learning, rather than the underlying source of learning. This pattern-based learning also works for larger linguistic units, such as how sequences of morphemes can combine to form words (un-questionable, un-reli-able, un-fathom-able). Moving to words, we gain intuitions about which words collocate together and which do not (blonde hair, *blonde paint; auburn hair, but only for women, not men). Many of these collocations must be based solely on associative pairing, because there is often no semantic reasoning behind acceptable/non-acceptable combinations (*blonde paint makes perfectly logical sense). Neither are most collocations likely to be learned explicitly, because they are not normally taught, and even if they are, only possible cases are illustrated, not inappropriate combinations. Longer formulaic strings, which are also based on patterns rather than rules, seem to fit very nicely with such sequence-based models of acquisition as well. Time will tell whether this kind of model best captures the mechanics of formulaic sequence acquisition (and that of language in general), but one thing seems certain. 
Given the increasingly evident importance of formulaic sequences in language use, convincing explanations of the mechanics of their acquisition must become an essential feature of any model of language acquisition.
3.8 The psycholinguistic reality of corpus-extracted formulaic sequences
We know that formulaic language occurs very frequently in language output, as evidenced by corpus data, and that formulaic language is an important part of overall language processing. This makes the relationship between the two (formulaic sequences extracted from corpora and their psycholinguistic bases in the mind) a very interesting issue. Some scholars believe that collocation is an entirely textual phenomenon, and is not indicative of how language is represented in the mind (e.g. Bley-Vroman, 2002). They believe that collocations arise spontaneously in text as an epiphenomenon of the meaningful use of language in context, rather than being linguistic patterns which can be learned and used. For example, the words dark night occur together simply because nights are dark, and so people will naturally
use these words together when speaking about the nighttime. However, given all the evidence for the processing advantages of formulaic language, it is difficult to believe that it does not somehow exist in the mind. For one thing, if all language were generated without any formulaic component, we would not expect the degree of conventionality and repetition that we find in corpus evidence. So the question is whether the formulaic sequences extracted from corpus data are psycholinguistically real. There is indirect evidence from word association studies that they may be, because collocation is one of the main categories of association response (Aitchison, 2003; Fitzpatrick, 2006). However, to date there have been few studies which have addressed the issue directly. An exception is Schmitt, Grandage, and Adolphs (2004). They identified a number of different types of formulaic sequence and embedded them in a spoken dictation task. Each ‘burst’ of dictation was longer than short-term memory could hold (i.e. 20–24 words), so the respondents were not able to repeat a burst from rote memory. This meant they were forced to reconstruct the language. The researchers assumed that if the formulaic sequences in the bursts were stored holistically, then they would be reproduced intact, with no hesitation pauses or transformations. The results showed that many of the formulaic sequences in the dictation responses did meet this ‘holistic’ criterion, but also that many did not. A sort of continuum of holisticness seemed to emerge. The authors concluded that many of the corpus formulaic sequences were not stored holistically, but that this varied from individual to individual. From this one study, it seems that any particular formulaic sequence extracted from a corpus may or may not be stored holistically by any particular person.
3.9 Nonnative use of formulaic language
We have seen that formulaic language is very widespread in L1 language use (e.g. Biber et al., 1999; Erman and Warren, 2000; Foster, 2001). In other words, native speakers use formulaic sequences a lot. But what about nonnative speakers? There is a widespread feeling that formulaic language is especially problematic for L2 learners, and its lack/misuse is a major reason why L2 output can feel unnatural and nonnative-like, at least in their compositions (most research on formulaic language has focused on written discourse). Research has only partially supported this impression. We can look at nonnative mastery of formulaic language along at least three dimensions: amount of use, accuracy/appropriacy of use, and goodness/speed of the underlying formulaic intuitions. There is a growing literature about the first two dimensions (based largely on learner corpus data; see Magali Paquot’s web-based bibliography – Section 6.6) but only embryonic research on the last. Let us look at each dimension in turn.
Amount of use
It is easy to assume that the problem with nonnatives is that they simply do not use as much formulaic language as natives. This is largely incorrect, although there can be an element of avoidance (Laufer and Eliasson, 1993; Laufer, 2000b). A series of studies⁵ have found that L2 usage depends on which formulaic sequences one is focusing on. It is now clear that nonnatives actually use more of certain favorite formulaic sequences which they know well and tend to overuse as ‘safe bets’, compared to natives (De Cock, 2000; Foster, 2001; Granger, 1998). Conversely, they use fewer of other sequences, presumably because they do not know them as well and are not as confident in their use (e.g. Foster, 2001; Granger, 1998; Howarth, 1998). One type of formulaic sequence which seems to be particularly underused is multi-word verbs. For example, Altenberg and Granger (2001) found that their EFL learners had great difficulty with the verb make, especially the delexicalized uses, such as make a decision and make a claim. This is particularly problematic as high frequency verbs like make, look, and do are used in numerous formulaic sequences. Interestingly though, Granger, Paquot, and Rayson (2006) compared formulaic sequences in a 1-million-word native-speaker academic corpus and 1 million words from the ICLE (International Corpus of Learner English) learner corpus and found more cases of overused formulaic sequences than underused ones.
Durrant and Schmitt (2009) go some way in explaining which formulaic sequences are overused and which underused. Using a corpus composed of written academic output from Turkish and Bulgarian university EFL students and a mixed group of international university students studying in the UK, they found that these students tended to use frequent premodifier-noun collocations at a rate similar to native students. (Congruently, Siyanova and Schmitt (2008) found that their nonnatives used adjective-noun collocations in frequencies similar to natives.) These are the kind of collocations which are identified by the hypothesis testing measures which focus on frequency such as t-score (good example, long way, hard work). However, the nonnatives produced many fewer low-frequency collocations (densely populated, bated breath, preconceived notions), even though these were very strongly linked (the kind identified by MI). Because of their strong ties, and relative infrequency, they are likely to be especially salient for natives, and so their absence in nonnative output is particularly noticeable. The authors conclude that the lack of these ‘MI’ collocations is one key feature which distinguishes native from nonnative production. In terms of acquisition, L2 learners seem to be able to acquire and use the collocations which appear frequently, but do not seem to pick up as many non-frequent collocations, whose individual component words may also be infrequent in themselves. This is highly suggestive of the role of frequency in the acquisition process. This is supported by Ellis, Simpson-Vlach, and Maynard (2008), who found that for natives, it is predominantly the MI of
a formula which determines processability, while for nonnatives, it is predominantly the frequency.
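The contrast between frequency-oriented measures (t-score) and strength-of-association measures (MI) discussed above can be illustrated with a minimal sketch. It assumes the formulations commonly used in corpus linguistics, MI = log2(O/E) and t = (O − E)/√O, where O is the observed co-occurrence count and E the count expected by chance; the studies cited may have used different software or variants of these formulas. All counts below are invented for illustration and are not taken from any real corpus, and the function names are hypothetical.

```python
import math


def expected_cooccurrence(freq_x, freq_y, corpus_size):
    """Co-occurrence count expected by chance if the two words were independent."""
    return freq_x * freq_y / corpus_size


def mutual_information(observed, freq_x, freq_y, corpus_size):
    """MI = log2(O / E): rewards exclusive pairings, even when they are rare."""
    expected = expected_cooccurrence(freq_x, freq_y, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, freq_x, freq_y, corpus_size):
    """t = (O - E) / sqrt(O): rewards pairings that are frequent overall."""
    expected = expected_cooccurrence(freq_x, freq_y, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Invented counts for illustration only (not real corpus figures).
CORPUS_SIZE = 1_000_000
pairs = {
    "hard work": {"observed": 200, "freq_x": 3000, "freq_y": 5000},
    "bated breath": {"observed": 8, "freq_x": 10, "freq_y": 400},
}

for pair, counts in pairs.items():
    mi = mutual_information(corpus_size=CORPUS_SIZE, **counts)
    t = t_score(corpus_size=CORPUS_SIZE, **counts)
    print(f"{pair}: MI = {mi:.2f}, t-score = {t:.2f}")
```

With these invented counts, the frequent pair receives the higher t-score while the rare but exclusive pair receives the higher MI, which mirrors the distinction between the two kinds of collocation drawn in the text.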
Accuracy/appropriacy of use
Oppenheim (2000) found that much of the language which her nonnative subjects produced in consecutive speeches on the same topic consisted of the same recurrent word strings, but that most of these were idiosyncratic in comparison to native-speaker norms. Thus, just because L2 learners produce formulaic language, it is not necessarily formulaic in the sense of matching what natives would produce. Nesselhauf (2003, 2005) gives us some idea of how formulaic language can be ‘nonnative’. She extracted 1072 English verb-noun combinations from 32 essays in the ICLE written by German university students. Almost one-quarter of these collocations were judged to be incorrect; moreover, the L1 was deemed to be an influence in 45% of the errors.⁶ However, the incorrect usage was often the result, not of combining words in an unconventional way, but of using conventional word pairs which are not appropriate (Nesselhauf, 2005). This suggests that the difficulty learners have is not only that of learning which words go together, but also learning how to employ the chunks they know. Therefore, at least for the more frequent collocations, the problem may not be so much in the amount of formulaic language learners use, but in using the formulaic sequences they know appropriately in the right contexts.
Goodness/automaticity of intuitions of formulaic language
So formulaic sequences can be overused, underused, and misused by nonnative writers (most of this research has been based on analysis of written text), but they are definitely used; there is no question of L2 output being devoid of formulaic language. But how good are the nonnative intuitions of this language? There is little research which addresses this; however, three studies found that nonnative intuitions were not as well-developed as native intuitions.
Siyanova and Schmitt (2008) directly compared native and nonnative judgements of the frequency of high-frequency, mid-frequency, and low-frequency adjective-noun collocations on a six-point Likert scale. They also measured how long it took to make these judgements. They found that the natives had fairly good intuitions of the collocation frequency, and that they made their frequency judgements relatively quickly. Compared to these native norms, the nonnatives judged the high-frequency collocations as being lower frequency, and judged the low-frequency collocations as being much higher. Furthermore, natives were able to distinguish the frequency difference between mid- and high-frequency collocations, but the nonnatives as a whole were not. Interestingly though, the nonnatives who spent a year or more in an English-speaking country were able to make this distinction. Also, the
nonnatives took much longer to make their frequency judgements. Taken together, Siyanova and Schmitt conclude that the nonnatives’ intuitions were not as developed as the natives’, nor were they as automatized.
Hoffman and Lehmann (2000) elicited native and nonnative speakers’ intuitions regarding 55 collocations from the BNC with high log-likelihood scores (mainly adjective-noun and noun-noun pairs). Respondents were presented with each node in a questionnaire, and were to supply the collocates. On average, the native speakers supplied the ‘correct’ collocate in 70% of cases, which, like the results in Siyanova and Schmitt, indicates relatively good intuitions by the natives. The nonnatives did far less well, achieving an average accuracy of only 34%. This shows a major gap between the native and nonnative intuitions, although in absolute terms, the nonnative results (producing about one-third of the collocates) still indicate considerable knowledge.
Phongphio and Schmitt (2006) found that their 21 Thai university undergraduates were quite confident of their ability to recognize multi-word verbs when listening or reading, but they scored only 55% on a multiple choice test. Indeed, there was little relationship between the self-rating intuitions and multiple choice test percentages, showing that the learners did not know the verbs as well as they thought they did. However, when given a context to guess the meaning of the verbs, the learners were able to produce a Thai definition 75% of the time. This 20 percentage-point increase indicates that the Thai learners were able to use the context relatively successfully to infer the meanings of many of the unknown multi-word verbs. This suggests the usefulness of lexical inferencing as a strategy in dealing with formulaic language.
This poorer intuitive mastery is reflected in learners’ production.
While natives tend to resort to formulaic language to get through time-pressurized communicative situations, nonnatives do not seem to make greater use of formulaic language in such cases, either in speech or writing (Foster, 2001; Nesselhauf, 2005). In terms of speech, nonnatives tend to use many recurrent dysfluency markers (such as filled pauses and hesitation markers), although it seems that extensive interaction with native speakers enables them to overcome this (Adolphs and Durow, 2004; De Cock, Granger, Leech, and McEnery, 1998). However, in terms of writing, neither amount of use nor accuracy of collocation appears to increase with time spent in an English-speaking country (Nesselhauf, 2005; Yorio, 1980). So, although a year or more spent in an English-speaking country can lead to better intuitions of collocation (Siyanova and Schmitt, 2008), it seems difficult to extend this into increased production of formulaic language.
Summary
It seems that mastery of formulaic language takes a long time to acquire, and is a hallmark of the highest stages of language mastery. Language testers
have picked up on this and often include items which focus on phraseology in their highest level examinations. Formulaic language is an important element of language overall, perhaps the essential element. Research into it is only now gaining momentum, but given its ubiquitousness and demonstrated processing advantages, it looks to be one of the most important areas of enquiry in the applied linguistics field for the foreseeable future. This will become even more so as interest increases in newer approaches to language description which focus on larger lexico-grammatical units, such as pattern grammar and construction grammar. For the moment, there are many important questions about its acquisition and use that remain unanswered. For example, one basic one concerns the size of the formulaic lexicon, i.e. just how many formulaic sequences do natives and nonnatives know (Kuiper et al., 2009; Schmitt, Dörnyei, Adolphs, and Durow, 2004)? Furthermore, the questions posed by Schmitt and Carter in 2004 cover some other key areas which are still awaiting research attention, and which would make excellent topics for PhD research:
1. How are formulaic sequences acquired in naturalistic and formal settings? What is the same/different about learning formulaic sequences in these settings? What is the best way to teach formulaic sequences? Can they be taught at all?
2. What is the relationship between knowledge of formulaic sequences and knowledge of their individual component words?
3. How many exposures are necessary to learn formulaic sequences with various kinds of input? Is it the same as for individual words?
4. What is the nature of attrition of formulaic sequences? Are some elements retained better than others, or is the whole chunk either retained or forgotten?
5. Which elements of a formulaic sequence are most salient? Do formulaic sequences cluster around a key word or core collocation?
6. Are formulaic sequences learned in an all-or-nothing manner?
7. Does giving attention to formulaic sequences increase the chances of their acquisition?
Part 3
Researching Vocabulary
4 Issues in Research Methodology
4.1 Qualitative research
Although this book focuses mainly on quantitative research, it is worth noting the value and uses of qualitative research. Current best research practice is to use multiple measurements to triangulate results from more than one approach in order to achieve more robust findings. Qualitative methodologies can often enhance the information we get from quantitative approaches. For example, interviewing a sample of the participants used in a quantitative study can often provide much rich information which supplements the statistical findings. A good example of this is in the validation study for the Vocabulary Levels Test (VLT), where Schmitt, Schmitt, and Clapham (2001) interviewed a subset of the participants who completed the VLT and asked them in individual interviews about the meanings of the words on the test. In this way, the researchers could confirm whether the words were actually known or not, and this could then be compared to the results on the test. Such in-depth interviews can also be very informative about self-report data, such as questionnaire responses. Interviewing a sample of informants can tell much about how carefully and accurately they completed their surveys. Similarly, observation can be useful in verifying self-report behavioral data. For example, this is an obvious methodology for confirming the self-report questionnaire data usually used in strategy research, but is hardly ever taken up, leading to serious questions about the veracity of most of the questionnaire results.
Rich qualitative descriptions of vocabulary (such as from case studies) can go some way towards providing an account of how well lexical items are used. They can be particularly useful in studies of productive lexical output, where post hoc statistical analyses of vocabulary (e.g. type-token analyses) have often proved to be less than informative. In one such case study, Li and Schmitt (2009) followed a single advanced learner (‘Amy’) and described her acquisition of formulaic sequences. They reported that Amy learned
166 sequences over a ten-month MA course. However, their in-depth interviews and analyses also allowed them to discuss the source of acquisition for these sequences, and Amy’s improvement in her appropriacy of usage with them. This is the kind of information that can often only be acquired from concerted qualitative engagement. In short, vocabulary researchers should consider what value qualitative methodologies might add to their studies. A number of research manuals provide guidance on qualitative approaches, including Dörnyei (2007), Heigham and Croker (2009), and Wray and Bloomer (2006).
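For readers unfamiliar with the type-token analyses mentioned above, the following is a minimal sketch of the basic calculation: the number of different word forms (types) divided by the total number of running words (tokens). The tokenizer is a deliberately crude assumption (lowercased alphabetic strings), the function names are hypothetical, and the sample sentence is invented. A single ratio of this kind says nothing about whether the words are used appropriately, which is the gap that qualitative description can help to fill.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms divided by the number of running words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


# Invented learner sentence, for illustration only.
sample = "I want to make a decision and I want to make a claim"
print(f"Type-token ratio: {type_token_ratio(sample):.2f}")
```

Note that the raw ratio is sensitive to text length, so values from texts of different lengths are not directly comparable.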
4.2 Participants
One of the most difficult aspects of doing research is often getting subjects. Even if you have an unusually cooperative subject pool, it is important to find a ‘carrot’ for the informants (and often the teachers of student informants), for, without the feeling that there is something ‘in it for them’, it is difficult to maintain motivation and interest of the participants, or even gain access to them in many cases. This is particularly true in studies which require large numbers of subjects, or require longitudinal data. A good example of this is Albrechtsen, Haastrup, and Henriksen (2008), who had to maintain interest in their 140 participants over four four- to five-hour sessions completed within two weeks (i.e. 15 hours and 30 minutes time-on-task plus breaks). Personal conversation with the researchers revealed the full extent of their motivational techniques, including payment, cookies, reminders of each upcoming session, some pleading, plus a party and a lottery for a trip to London! Unsurprisingly, there was still a certain amount of attrition, but the researchers’ efforts kept this to a minimum. This points out the wisdom of starting out with more subjects than you wish to end up with in longitudinal studies, as attrition is inevitable, although it can be managed to some extent if participant motivation is addressed in the initial research design.
Meara (1996a) notes that lexical researchers need to carefully consider the number of subjects required for a study. This should be enough to iron out the variation due to individual differences. For psycholinguistic tasks with very precise measurement (Section 2.11), this may require only 10 or 20 subjects. For some statistical procedures (e.g. IRT (Item Response Theory) analysis, factor analysis) it may take hundreds. Even more crucially, the characteristics of the learners need to be considered. The L1 is of paramount importance, as the following example by Meara clearly illustrates:
... different types of language present quite different learning problems to individual learners. Take for instance, the cases of a Dutch speaker, a Spanish speaker, an Arab and a Vietnamese learning English. By and large, the Dutch speaker will find basic English vocabulary easy, since
most of it is cognate with items in his own language. He might have problems with less frequent vocabulary, but by the time he gets to that stage, he probably has reached a high level of independence and autonomy anyway. In contrast, a Spanish speaker will generally find basic English vocabulary difficult: it is structurally very different from basic vocabulary in Spanish, and there are few cognates. However, Spanish speakers have a huge latent vocabulary of low frequency English words which are cognate with Spanish items, and this should mean that their ability to acquire new words improves dramatically with their general level of competence in English. The Arab and the Vietnamese speakers have no such help from their L1, and the process of acquiring new words will never get any easier for them. At the same time, however, these two learners will find English vocabulary difficult in different ways because of the way their L1 lexicons are shaped and structured. (1996a: 6)
Just as lexical items have different characteristics which affect their acquisition and use, so do participants. As part of controlling potential confounding factors in a study, it is necessary to consider informant characteristics as well. Perhaps the most obvious one is the informants’ L1. Another learner variable is L2 language proficiency, as acquiring words in a completely new language is quite different from acquiring the same number of words in a language that is moderately well-known. Subjects who are stark beginners will not have any background knowledge of the target language to help them learn. Conversely, more advanced learners will have previous language knowledge which facilitates their learning.
Such learners will already have developed a good feel for the formal aspects of words in the target language. This should reduce the learning burden considerably and make it easier to acquire the target language words, the more proficient the learner is.
At the same time, morphological information and comparisons with known words of similar meaning should also make it easier to fix the meaning and form of a target language word. (Meara, 1996a: 5) Language proficiency also determines to what degree learners can take advantage of any contextualization in language learning tasks or tests. Unless it is carefully adjusted to their level, beginners may not understand much of the language in a contextualized task/test, and so it may offer no more support than a non-contextualized one. However, lexical acquisition and use can also vary systematically according to several other factors, including: experience, exposure, type of exposure (e.g. academic texts, casual conversation), and type of strategy used (e.g. reading technical manuals in one’s field, talking with native taxi drivers). It is important to think about all these factors in the initial research design
stage, so that there is a good match between the research goals and the informant base. In addition, it is useful to gather detailed biodata during the study, as otherwise unexplainable cases of vocabulary variation can sometimes be resolved from this information. For example, the type of previous exposure and strategy preferences might explain why some participants might know more academic and technical vocabulary (exposure mainly from technical written discourse), while others might know much more about casual spoken conversation (strategy of talking to native speakers). From a practical standpoint, the information gathered might turn out to be crucial to the analysis, but even if not, its non-use will not spoil the study. The above considerations also mean that researchers need to be very careful when generalizing from a group of participants in a study to the wider population of L2 language learners in general. Most often, generalizations will have to be constrained in terms of various learner characteristics mentioned above, most particularly L1 and proficiency.
Quote 4.1 Meara on individual differences in vocabulary acquisition It seems to me that the question of how much individual variation there is in vocabulary skills really needs to be made a top priority in L2 vocabulary acquisition research. (1996a: 8)
4.3 The need for multiple measures of vocabulary
We have seen that vocabulary knowledge is a complex construct, and that any single measure of it will give only a very minimal impression of the overall lexical knowledge constellation. This means that good vocabulary research is advantaged by multiple measures of vocabulary, to better capture a wider range of lexical knowledge. This can include facets of vocabulary size, depth, automaticity, and network richness. Of course, it is virtually impossible to measure all of these aspects at the same time. But vocabulary researchers should be committed to using more thorough measures when their research contexts allow, in order to provide a better description of the lexical effects they are exploring. The need to use multiple measurement takes on several aspects. One is describing vocabulary knowledge according to receptive/productive mastery. If using a productive measure, it might be reasonable to assume that a demonstration of productive knowledge also implies receptive mastery, based on research which shows that this is generally true. The real danger is making generalizations in the other direction. Meara (1996a) warns of
making generalizations about lexical items being ‘learned’ on the basis of only receptive multiple choice tests, as is very often the case in vocabulary studies. Most receptive multiple choice tests measure only the form-meaning link, and then only at a recognition level. This can probably be considered the most basic initial stage along the incremental learning process, and so should be described as such when this type of item is interpreted. Researchers need to take the ideas in Section 2.8 on board and report clearly the degree of receptive/productive mastery their study is addressing. This caveat about overgeneralization can probably be extended to any receptive test format.
Another reason to include both receptive and productive measures is that they can give quite different impressions of vocabulary knowledge. For example, Groot (2000) describes tests on participants who learned vocabulary with word lists and a computerized vocabulary training program. When tested with receptive translation/definition tests, it was found that word lists led to better learning. However, when tested with a productive cloze format, the computer program led to better results. Thus, each individual test format provided only partial information, and if interpreted by itself, would give a misleading picture of the vocabulary learning which took place. By combining these measures, Groot concluded that it is probably best to use both teaching methods in conjunction. The word list builds on L1 knowledge, and the learning program reinforces this knowledge and goes on to enhance it.
Quote 4.2 Meara on translation as a vocabulary test format
[The studies Meara critiqued] simply ask for the Target Language word to be translated into English, and this means that even in the experiments where words were initially learned in contexts, only the ability to recognise decontextualised words was measured. It is not obvious to me that this measure is a good test of how well vocabulary items have been learned. At best it tests passive recognition skills rather than active acquisition of items ... Testing in this way gives no indication of whether a particular word can be put to active use, or whether some partial knowledge might have been acquired which could facilitate learning in future encounters. Furthermore, this kind of testing gives no indication of how resistant the word might be to forgetting or to confusion with other words, both problems which increase as the number of words to be learned gets larger. (1996a: 7–8)
Another aspect involves describing lexical knowledge according to the various types of word knowledge. Again, most studies measure only the form-meaning link, and if only one thing can be measured for practical reasons, this is a logical choice, because it is the essential ‘core’ knowledge,
without which little constructive use can be made of the lexical item. Furthermore, measuring the form-meaning link makes sense for beginners, who are unlikely to have developed much of the more advanced types of word knowledge for any of the items in their lexicon. In fact, this is true even for more advanced learners and native speakers: any time a new lexical item has just been learned, little more is likely to be known than the form-meaning link. However, for learners beyond the beginning stage, there are likely to be lexical items which are more advanced along the incremental pathway, and form-meaning test items are likely to miss much of the additional knowledge in place. Thus, for more advanced learners (or even some of the lexical items relatively well-known by beginners), it may be appropriate to measure word knowledge types beyond just form and meaning. It is also useful to note that there is an interaction between word knowledge aspects and receptive/productive mastery. For example, learners may know the form-meaning link of a lexical item productively, but only know its various derivative forms receptively. This makes it important to discuss both receptive/productive mastery and word knowledge when reporting results. Similar kinds of observations can be made concerning knowledge of spoken versus written vocabulary. Up until now, the discussion has focused on individual items, but based on the ideas in Section 2.4, it may be appropriate to also consider how well lexical items cohere to make up a systematic lexical network (i.e. lexical organization). Likewise, automaticity is a key requirement of real-time vocabulary use. This aspect has generally been ignored in applied linguistic lexical research (although a mainstay in the psycholinguistic approach), and is deserving of attention. Again, we can hypothesize an interrelationship with the other aspects of lexical knowledge (e.g. Henriksen, 1999).
Improvement of overall lexical mastery is likely to consist of both declarative knowledge (e.g. the form-meaning link), and the ability to use this knowledge with ever-greater (hopefully) speed and ease. An improvement of mastery may well involve a plateau of knowledge, but an increase in speed of processing. We might speculate¹ that an increase in knowledge would speed up processing by adding connections between the lexical item and the rest of the lexicon. (Or would the additional lexical information slow down the processing initially, until it is integrated?) Measures of only the ‘knowledge’ aspects of lexical mastery will fail to spot any improvements in automaticity. The exception might be if there is a timed element added to the measurement. The upshot is that different measures tap into different facets of lexical knowledge. When reporting results, it is essential to report exactly what lexical knowledge can be inferred from the item format(s) used, and preferable to also highlight what knowledge cannot be inferred, so that readers have a very clear idea of the extent of lexical knowledge/gain in the study. If all researchers do this, the various studies will be much more comparable, and there will be a much greater chance of subsequent studies being able
to build upon previous ones in a logical and coherent manner, in the way Meara (1996a) envisages.
4.4 The need for longitudinal studies and delayed posttests
Although there have been non-treatment studies which sought to describe the development of vocabulary over time (e.g. the research reported in Albrechtsen et al., 2008), most vocabulary acquisition studies involve some form of treatment (various kinds of instruction or input) and then an immediate posttest to determine the effect of the treatment. While this ‘immediate posttest’ information can be informative, it is limited in a number of ways. First, one or a small number of exposures are unlikely to lead to long-term acquisition, and so an immediate posttest has the very real danger of overestimating the degree of durable learning. Similarly, learning is always liable to attrition. In fact, one of the most reliable findings in vocabulary studies is that scores on immediate posttests almost inevitably drop when later measured on a delayed posttest. This means it is not possible to interpret scores on an immediate posttest as long-term learning. Immediate posttests can only give a snapshot of vocabulary knowledge, and cannot inform about the dynamic and incremental nature of the learning process. We know that attrition occurs in any learning, and so need to include delayed posttests in order to capture the long-term learning. Second, learning is not linear, and so the rate of learning achieved in a study may or may not be applicable over the longer term. For example, if a student was able to ‘learn’ five new words from a 300-word reading passage, this does not mean she would necessarily be able to learn another five new words from the next 300 words in the passage tomorrow. In some cases, the learning rate may decrease because the input becomes less effective over time. In other cases, the learning may accelerate, as learners become more skilled in using the input. Only by measuring participants’ learning after subsequent sessions can the effect of cumulative learning be effectively determined.
Third, it is a well-known phenomenon that the effectiveness of practice decreases with the amount. This is known as the power law of learning, where the effects of practice are greatest at early stages of learning, but then eventually reach asymptote (e.g. Ellis, 2006a). This means that the initial practice will be more effective than later practice. Thus the rate of learning from a small amount of vocabulary learning practice will probably not be maintained as the amount of practice increases; there will be diminishing returns. Where studies typically have a limited amount of engagement with the vocabulary items for practical reasons, the resulting learning rate may be higher than if the study examined a longer-term/more intensive treatment regimen. Non-longitudinal studies thus need to be interpreted carefully when generalizing about possible effects of increased practice.
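The diminishing returns described above can be illustrated numerically. What follows is only a minimal sketch, assuming that cumulative learning after n practice sessions follows a simple power function a * n**b with 0 < b < 1. The function names and the parameter values are hypothetical and chosen purely for illustration; they are not taken from Ellis (2006a), and a pure power function of this kind flattens steadily rather than reaching a true asymptote.

```python
def cumulative_gain(n_sessions: int, a: float = 5.0, b: float = 0.5) -> float:
    """Hypothetical cumulative learning after n practice sessions: a * n**b."""
    if n_sessions < 0:
        raise ValueError("n_sessions must be non-negative")
    return a * n_sessions ** b


def marginal_gain(n_sessions: int, a: float = 5.0, b: float = 0.5) -> float:
    """Extra learning contributed by the n-th session on its own."""
    if n_sessions < 1:
        raise ValueError("n_sessions must be at least 1")
    return cumulative_gain(n_sessions, a, b) - cumulative_gain(n_sessions - 1, a, b)


if __name__ == "__main__":
    # Each later session adds less than the one before it.
    for n in (1, 2, 5, 10, 20):
        print(n, round(cumulative_gain(n), 2), round(marginal_gain(n), 2))
```

Under these assumed values, the learning rate observed in a one-session study would considerably overstate the rate obtained per session in a twenty-session study, which is the interpretive risk the paragraph above describes.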
Fourth, the different aspects of word knowledge will require longer or shorter periods of time to master. While an initial form-meaning link might be established with small amounts of input over a short period of time (habit = a tendency to behave in a particular way, especially regularly and repeatedly over a period of time), it will probably require a large amount of input to develop intuitions about the more nuanced meaning connotations of the word (habit very often refers to negative behavior: Smoking is a nasty habit; annoying habit). In summary, vocabulary learning is longitudinal and incremental in nature, and only research designs with a longitudinal element can truly describe it. There are two main ways of incorporating a longitudinal element. The first is simply by adding one or more delayed posttests. The value of delayed posttests is self-evident, having the key attribute of confirming durable learning. Given the incremental nature of vocabulary learning, and its susceptibility to attrition (especially if not well-established), only delayed posttests can demonstrate if long-term retention (i.e. learning) has occurred. I would suggest that immediate posttests should be interpreted mainly as showing whether the treatment had any effect (e.g. did the process of acquisition begin?, were the target lexical items enhanced in any way?, did learners notice the target items in the treatment?), and only delayed posttests interpreted as showing learning. Of course, it is not always practical to use delayed posttests (participants disappear or are unwilling to repeat an assessment), but if immediate posttests are used as the sole measurement, researchers must be extremely cautious when interpreting the type and amount of knowledge enhancement that these tests demonstrate. When interpreting learning, any interim posttest exposures should be factored into the interpretations. 
That is, the immediate posttest is an additional exposure which will increase learning in a delayed posttest, compared to a research design with only a delayed posttest. This is especially true as participants tend to give tests a great deal of focused attention, which generally facilitates learning. We would therefore expect better delayed scores in a T1-treatment-T2-delayed T3 design than a T1-treatment-delayed T2 design, even though the treatment is equivalent. There is then the practical question of the length of the delay. The short answer is that there is no standard period of delay, and that any delay beyond the immediate posttest is better than nothing. Research by Gaskell and colleagues (Davis, Di Betta, Macdonald, and Gaskell, 2008; Dumay and Gaskell, 2007; Gaskell and Dumay, 2003; Dumay, Gaskell, and Feng, 2004) has shown that the integration of a new lexical item into the mental lexicon begins to take place within 24 hours after exposure, with the key factor being a night’s sleep. This indicates that the delayed posttest must be a minimum of two days after the treatment in order to capture this integration. Furthermore, memory research shows that most attrition occurs relatively
soon after a learning event, and then the rate of forgetting decreases over time (Baddeley, 1990). This suggests that the delayed posttest needs to be administered after the period of initial large loss. Most of the colleagues I have spoken with feel confident that a delayed posttest of three weeks should be indicative of learning which is stable and durable. If this three-week ideal cannot be met for practical reasons, I would suggest that any delayed posttest of less than one week is likely to be relatively uninformative, and should be avoided if possible. But whatever interval is practical, delayed posttests should be included in all acquisition research designs.
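One way of keeping immediate and delayed results apart when reporting them is to calculate, for each participant, how much of the immediate posttest score survives to the delayed posttest. The following is a minimal sketch, assuming that scores are raw counts of correct items. The function name, the dictionary keys and the figures in the usage note are illustrative assumptions rather than data from any study; the one-week flag simply encodes the suggestion made above.

```python
from typing import Optional


def retention_summary(immediate: int, delayed: int, delay_days: int) -> dict:
    """Compare an immediate posttest score with a delayed posttest score."""
    if immediate < 0 or delayed < 0 or delay_days < 0:
        raise ValueError("scores and delay must be non-negative")

    # Proportion of the immediate score still present at the delayed posttest.
    # Undefined (None) when nothing was scored on the immediate posttest.
    retained: Optional[float] = delayed / immediate if immediate else None

    return {
        "attrition": immediate - delayed,          # items lost over the delay
        "retained_proportion": retained,
        "delay_under_one_week": delay_days < 7,    # flagged as less informative
    }
```

For example, retention_summary(20, 12, 21) describes a hypothetical participant who scored 20 immediately and 12 after three weeks: eight items were lost, and 60 per cent of the immediate score was retained.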
Concept 4.1 Cloze
Cloze is simply another name for the fill-in-the-blank format. It is commonly used in teaching materials and in language tests. The name cloze comes from the psychological notion of closure, where the mind abhors a vacuum, and attempts to fill in any noticed gap with a logical concept or linguistic feature.
There is also the issue of test format type for measuring durable learning. Groot (2000) notes that if the assessment goal is long-term learning (as opposed to results from immediate posttests), then productive formats such as cloze tests might be more suitable than receptive test formats. (See Quote 4.3). However, balanced against this, I think that learning of some lexical items (e.g. low-frequency words) to only a receptive level of mastery is a perfectly reasonable goal, and for these, receptive tests would be more appropriate.
Quote 4.3 Groot on measuring long-term retention
Knowing a word may be seen in operational terms as a continuum ranging from vague recognition of its spelling to (semantically, syntactically, stylistically) correct and contextually appropriate productive use. Retrieval of a word from the mental lexicon for productive use requires a higher degree of accessibility or, in other words, a more solid integration in various networks than is needed for receptive use. For measuring this higher level of mastery, a test which asks testees to simply recognise a word and give its meaning is unsuitable; a test [such as] using the cloze technique, which measures testees’ ability to produce the word themselves, is much more valid for that purpose ... [F]or a meaningful interpretation of the data, it is essential to give an accurate description of what one understands by the trait ‘knowing a word’ and of what trait is intended to be measured by what testing method.
(Groot, 2000: 76)
The second way of incorporating a longitudinal element into vocabulary studies is to design longitudinal studies which track the learning process over a period of time through a series of measurements. Such studies can be much more difficult to set up, and require the longer-term cooperation of participants, but are often very worthwhile. If participant numbers are a problem, longitudinal case studies can provide some of the best information on the way vocabulary learning progresses because of the level of detail that can be achieved. Including longitudinal elements is particularly important when studying word knowledge types that take longer to learn, such as connotation or collocation.
4.5 Selection of target lexical items
In vocabulary research, one of the most basic (and critical) steps is selecting the target lexical items. If they are mismatched to the research purpose for whatever reason, the resulting study is likely to be severely compromised. It is thus important to consider the vocabulary characteristics discussed in Chapter 2 when making this selection. Some of the implications for selection are outlined below.
Single words and multi-word items
One decision is whether to include only single word lexemes, only multi-word items, or both. Most vocabulary research has focused on one-word items, because they have been traditionally seen as the ‘words’ that make up vocabulary, with multi-word items being only a relatively unimportant peripheral category of lexis, mainly made up of low-frequency items like idioms or proverbs. However, corpus research has demonstrated that much more lexical patterning exists in language than previously realized, and that this patterning plays an important role in the way language is both used and processed (see Chapter 3). With this in mind, not including multi-word lexemes runs the risk of excluding a major component of vocabulary. However, if you choose to use multi-word items, several hurdles must be overcome. First, there are many types of formulaic sequence, including but not limited to: idioms, proverbs, clichés, sayings, explanations, lexical phrases, lexical bundles, collocations, and phrasal verbs. There is no principled way to decide which of these types to include, unless the research is interested in a particular category, for instance, idioms. Moreover, there is no comprehensive listing of these multi-word items to refer to, and in any case, new ones are always coming into play, and old ones dropping out of the language. This is one factor that makes a principled sampling of formulaic sequences almost impossible.
In addition, although formulaic sequences as a category occur frequently in discourse, with the exception of a few high-frequency items (e.g. on the
other hand), any particular individual multi-word item is likely to be relatively infrequent. This also makes the selection of a representative sample extremely difficult. Just because a certain formulaic sequence on a measurement instrument is correctly answered, it is not necessarily possible to extrapolate from this that other formulaic sequences are also known. The lack of information on the scope and makeup of the phrasal lexicon means that chance can play a large, yet unknowable, role in whether any particular formulaic sequence is selected for a study, and how representative it might be. In the end, frequency may be the best way of deciding which (if any) formulaic sequences to include in a study. When studying the acquisition of L2 learners, there will probably be a focus on the most frequent vocabulary, which excludes most multi-word items (which are generally low frequency). However, the class of phrasal verbs might be included, as many are quite frequent, particularly in spoken discourse. Conversely, formulaic sequences can be a very good class of lexis to use when distinguishing between very advanced learners. Of course, if research is focused on formulaic sequences, they may be the only vocabulary items included in the study (e.g. the series of studies on formulaic sequences in Schmitt, 2004).
Formal similarity
Beginning learners of English often confuse words that are orthographically similar with one another. This occurs in reading, when learners come across a word and confuse it with a similarly-spelled word. Unfortunately, once a word is misrecognized, learners often do not realize their error, even if the context is totally inappropriate for the mistaken meaning they have assigned to the word (Haynes, 1993). There is relatively little research into whether this happens in listening as well (slips of the ear?), but there is no reason to believe that it does not.
Association tasks have also shown that learners, especially beginners, often confuse words of a similar form in what are called clang associations (Section 2.4). Based on this tendency to misrecognize formally similar words, it is probably best not to select target words which have a similar form to other words in the target word pool (unless you are specifically studying this phenomenon). Also, L2 words with large orthographic neighborhoods (i.e. having formal similarities to many other words in the L2) might be more difficult than L2 words which are similar to few other L2 words (Section 1.1.8). A testing implication is that even though a learner indicates that they know a word (such as on a self-report Yes/No test), this does not necessarily mean they know that word, for they might be mistaking it for another word they know with a similar form. Therefore some check for this is advisable when validating the measurement instrument, in the case of Yes/No tests, usually through the use of plausible nonwords.
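The nonword check mentioned above can be sketched in a few lines. This is only a minimal illustration, assuming a Yes/No test in which every item is marked as either a real word or a plausible nonword. The simple ‘hit rate minus false-alarm rate’ adjustment, the function name and the sample items in the usage note are all assumptions made for illustration; published Yes/No tests use a variety of correction formulas.

```python
from __future__ import annotations


def yes_no_score(responses: dict[str, bool], nonwords: set[str]) -> dict[str, float]:
    """Score a Yes/No vocabulary test that includes plausible nonwords.

    responses maps each test item to True ("Yes, I know it") or False.
    nonwords is the set of test items that are invented, non-existent words.
    """
    real_items = [item for item in responses if item not in nonwords]
    fake_items = [item for item in responses if item in nonwords]
    if not real_items or not fake_items:
        raise ValueError("need at least one real word and one nonword")

    # Proportion of real words claimed as known.
    hit_rate = sum(responses[i] for i in real_items) / len(real_items)
    # Proportion of nonwords claimed as known (over-claiming or misrecognition).
    false_alarm_rate = sum(responses[i] for i in fake_items) / len(fake_items)

    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        # Assumed adjustment: penalize "Yes" responses to nonwords.
        "adjusted_score": hit_rate - false_alarm_rate,
    }
```

A learner who says ‘Yes’ to three of four real words but also to one of two invented items (say, the made-up forms plander and mabey) would have a hit rate of 0.75 and a false-alarm rate of 0.5, so the raw ‘Yes’ count clearly overstates what is known.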
Word class
Although Laufer’s (1997) review of early vocabulary research concluded that word class made little difference in the ease or difficulty of learning words, psychological researchers have found it does make a difference, at least in their research paradigms. It thus seems prudent to control for word class in all vocabulary research. This can be done by making sure that each word class being addressed in a study has the same number of target items. Another way is to only work with one word class in a study, to ensure that this factor does not confound the results.
Homonymy and polysemy
Homonymy and polysemy come into play mainly when counting vocabulary (see also Section 2.1.3). For instance, should the homonyms bank (financial), bank (river), and bank (plane turning) be counted as one lexeme or three? Similarly, should the different polysemous versions of chip be counted as one or several lexemes? The choices made can lead to quite different size figures, and so the method used must be clearly stated in your research report. For more details see Section 5.2.1. The method of counting these multiple form-meaning correspondences should also affect the way we interpret the results of such counting studies. A large percentage of the most frequent words (at least in English) are polysemous, and this has enormous implications for the learning load of these words. Clearly mastering a highly polysemous word will involve more learning burden than learning a technical word with one specific meaning. Therefore researchers need to be careful about equating the learning of different types of vocabulary.
Literal and idiomatic meaning
The knowledge of idiomatic meaning senses is a hallmark of more proficient language users, and so a researcher must normally match target words with idiomatic meanings to participants of a relatively higher proficiency. If one is interested in whether idiomatic targets are known or not, then these targets are appropriate with any level of student.
However, if one is using a vocabulary test to discriminate between learners, then learner proficiency is crucial. Idiomatic targets can make up some or most of the test items for discriminating between learners with a very good knowledge of lexis who can operate comfortably in most situations, and those with an excellent lexical knowledge, who can understand and use the less common meaning senses in a language. In fact, many of the standardized proficiency tests use idiomatic targets, particularly phrasal verbs, to separate the very best examinees from the rest. Conversely, it makes little sense to include idiomatic targets on a test meant to discriminate between beginners. This is because very few beginners will be successful with these items, and as a result, all of the ‘zero’
scores will tell you little about the differences in lexical knowledge between the beginners in your study group.
Frequency
Frequency is probably the most important aspect of vocabulary you must control for in your research designs. This is because frequency is the best indicator we have of how likely people are to know words, and in what order those words will be learned. Frequent words are for the most part not inherently any easier than nonfrequent words, but, on average, they will be encountered more often, which means that they are more likely to be known than nonfrequent words. (Or more precisely, the most frequent meaning senses of these mostly-polysemous high-frequency words will be more likely to be known.) One might argue that frequent words have some ‘ease advantage’ due to Zipf’s law, which roughly states that the shorter a word is, in terms of syllables or letters, the more frequent it is. However, shorter, more frequent words also suffer from a greater degree of polysemy than longer, less frequent words, and so any ease in terms of learning the form of a word might be offset by the difficulty in dealing with multiple meaning senses. Although lexemes can be inherently more or less difficult to learn due to other factors than word length (Laufer, 1997), the effects of frequency can easily override this inherent ease/difficulty, and so if you wish to create a pool of target words that are of equivalent difficulty, then you must ensure that they are roughly the same frequency. This does not mean that they must appear in adjacent positions on a frequency list (e.g. useful – 1,000th most frequent word in English; extent – 1,001st most frequent word), but they should normally be within 100–200 positions of one another for most study designs. If one is working within a psycholinguistic paradigm, e.g. using reaction times measured in milliseconds to measure familiarity with words, then the words should be as similar as possible in frequency, because frequency is a powerful influence in such experimental designs. Frequency is important in a range of other issues as well.
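The rank-window guideline for a pool of target words can be checked mechanically once frequency ranks are available. The following is a minimal sketch, assuming a dictionary that maps each candidate word to its rank on whatever frequency list is being used. The function names and the default window of 200 positions are illustrative choices, and any ranks other than those quoted above for useful and extent are hypothetical.

```python
from __future__ import annotations


def rank_spread(ranks: dict[str, int]) -> int:
    """Distance in list positions between the most and least frequent target."""
    if not ranks:
        raise ValueError("the target word pool is empty")
    return max(ranks.values()) - min(ranks.values())


def within_rank_window(ranks: dict[str, int], max_spread: int = 200) -> bool:
    """True if all targets fall within max_spread positions of one another."""
    return rank_spread(ranks) <= max_spread


if __name__ == "__main__":
    # The rank for 'habit' is invented for the sake of the example.
    pool = {"useful": 1000, "extent": 1001, "habit": 1150}
    print(rank_spread(pool), within_rank_window(pool))
```

For a psycholinguistic design, where the targets should be as close in frequency as possible, the same check could be run with a much smaller value for max_spread.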
A few examples will serve to give an idea of the many ways frequency needs to be considered when setting up a vocabulary study. In most studies it is useful to take a measurement of the general vocabulary knowledge of your participants. This is usually done with a vocabulary size test (see Section 5.2). There is a range of such tests, with the Vocabulary Levels Test (Nation, 1990; Schmitt et al., 2001) being commonly used in second-language vocabulary studies. It samples English vocabulary at four frequency bands: 2,000 frequency level, 3,000 level, 5,000 level, and 10,000 level. It also includes an estimate of academic vocabulary. The entire test takes between 30 and 75 minutes to administer, depending on the examinee. However, it is not usually necessary to give all five sections. If given to beginners, certainly the 10,000 level,
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
Frequency
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_05_cha04.indd 161
4/13/2010 2:52:12 PM
Researching Vocabulary
and probably the 5,000 level as well, should be excluded. They will provide little useful information, because examinees are likely to know very few of the words at these lower-frequency levels, and are only likely to become discouraged and frustrated in attempting them. Conversely, the 2,000 and 3,000 levels are good candidates for exclusion for advanced learners, as these examinees should obtain maximum scores at these levels. While they will not get frustrated by getting all of the items correct at these levels, a better use of time might be to measure more of the lexical feature you are focusing on in the study. Another example of the effects of frequency is in association tasks. In these, a stimulus lexeme is given, and participants asked to give one or more responses as quickly as they can (Section 2.4). The general behavior from a group of native speakers is that a small number of association responses are relatively frequent (e.g. for the stimulus sun: moon, shine) and a larger number of responses are relatively infrequent (warmth, ray). However, the response behavior depends to a great degree on the stimulus lexemes. Stimuli which have a high frequency of occurrence in a language tend to have more stable responses. For example, in the Edinburgh Associative Thesaurus, the high-frequency stimulus night elicits day over half of the time (52 out of 99 responses; 26 different response types). On the other hand, its lowerfrequency near-synonym evening elicits such a wider range of responses (37), with the most frequent being night (20/98) and morning (17/98). Because of this, it is better not to use very high-frequency words as stimuli (Fitzpatrick, 2007), unless perhaps if you are working with beginners (Henriksen, 2008). Characteristics of lexical items for use in psycholinguistic experiments The selection of target lexis important in all vocabulary studies, but it becomes even more critical in psycholinguistics experiments. 
When measuring lexical processing in milliseconds, any uncontrolled confounding factor that affects processing speed can jeopardize the entire experiment. Thus it is especially essential to control as much as possible for all factors which can affect processing speed. A number of authors have commented on these factors (e.g. de Groot and van Hell, 2005; de Groot, 2006; Ellis and Beaton, 1993; see also Chapter 2), of which the following are among those usually highlighted:
● phonotactic regularity
● structural complexity (e.g. containing more or less complex consonant clusters)
● degree of correspondence between graphemes and phonemes
● familiarity of formal features
● conformity of L2 word form to L1 norms
● morphological complexity
● word length
● word class
● concrete/abstract words
● imageability of concept
● word meaningfulness
● cognate status
● frequency.
The effect of some of these factors has been shown to be substantial. For example, de Groot and van Hell (2005) review the literature and report that recall scores for concrete words are from 11% to 27% higher than for abstract words. Likewise, cognate words produced scores 15–19% higher for highly experienced foreign-language learners, and 25% (receptive test) to 50% (productive test) higher for less-experienced learners. This indicates that factors like these need to be controlled for, but getting measures for all the above factors would be quite a task. For instance, quantifying just imageability would entail having groups of participants rate lists of words according to how easy or difficult it is to form a mental image of the underlying concept. Doing something like this for each potential confounding factor would be prohibitive. Luckily, there is a website with data for the different word factors. It is the MRC Psycholinguistic Database (Coltheart, 1981) and is available at . To create lists of target words, the researcher checks the boxes of the factors which need to be controlled for, and then sets the minimum and maximum parameters, e.g. check the ‘Number of letters’ box, and set the length between three and five letters long. Below are the factors which are available on the site.
Number of letters
Number of phonemes
Number of syllables
Kučera–Francis written freq
Kučera–Francis no. of categories
Kučera–Francis no. of samples
Thorndike–Lorge written freq
Brown verbal frequency
Familiarity rating
Concreteness rating
Imagability rating
Meaningfulness: Colorado Norms
Meaningfulness: Paivio Norms
Age of acquisition rating
Word Type
Comprehensive syntactic category
Common part of speech (N/V/adj/Other)
Morphemic status (Prefix/Suffix/Abbrev/etc.)
Contextual status (Colloquial/Dialect/etc.)
Pronunciation variability
Capitalization
Irregular plural
Stress pattern
There is one caveat to using the MRC database, or others like it. You may have noticed that it uses quite old frequency information, i.e. the Kučera–Francis (1967) and Thorndike–Lorge (1944) counts (see Section 5.1.2). Because frequency is one of the most important factors affecting processing, it makes sense to refer to more current frequency counts, such as Leech, Rayson, and Wilson (2001), which are based on more contemporary English and much larger corpora. Researchers can also choose to use nonwords in their studies (see Section 5.1.2). A website which automatically generates nonwords formed to the norms of English is the ARC Nonword Database (Rastle, Harrington, and Coltheart, 2002), available at .
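The min/max style of selection described for word databases, combined with the earlier advice to keep target words within roughly 100–200 positions of one another on a frequency list, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` class and `select_targets` function are hypothetical names, the ranks for useful and extent are the ones mentioned in this section, and the remaining ranks are invented for the example rather than taken from any real frequency count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    form: str
    freq_rank: int  # position on a frequency list (1 = most frequent)


def select_targets(candidates, rank_min, rank_max, min_letters, max_letters):
    """Keep candidates inside a frequency-rank band and a word-length range."""
    return [
        c for c in candidates
        if rank_min <= c.freq_rank <= rank_max
        and min_letters <= len(c.form) <= max_letters
    ]


# Ranks for 'useful' and 'extent' follow the text; the others are invented.
CANDIDATES = [
    Candidate("useful", 1000),
    Candidate("extent", 1001),
    Candidate("wall", 1080),      # hypothetical rank
    Candidate("policy", 1450),    # hypothetical rank
    Candidate("fanbelt", 25000),  # hypothetical rank
]

# A band of 900-1100 keeps every selected word within 200 positions of the others.
targets = select_targets(CANDIDATES, rank_min=900, rank_max=1100,
                         min_letters=3, max_letters=6)
print([c.form for c in targets])
```

In a real study the ranks would come from a current frequency count, and further filters (e.g. word class or concreteness) could be added in the same way.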
4.6
Sample size of lexical items
Time is precious. You never have enough when doing vocabulary research. This is because the more samples you obtain from your participants, the more valid and reliable your results should be. This measurement truism is always in tension with the practical reality that the amount of time to administer a study is always limited. It is usually limited by practical considerations, e.g. teachers will give you one class period to do your research with their students, and this quite often is around an hour long. Even if you are doing one-to-one interviews and there is no ‘official’ time limit, you will find that there is a limit of how long even the most enthusiastic participant can concentrate on a task. This means there is still an effective time limit, beyond which data becomes invalid, or at least suspect. The upshot of this is that you must carefully design your study with time limitations in mind. This will entail prioritizing which information you are able to measure. You will need to pilot your instruments, not only to see if they work, but also to see if they can be completed in the time period you have available to administer them. Time constraints have an inevitable impact on the number of lexical items that can be incorporated into a study. Most lexical studies try to generalize to quite large amounts of vocabulary (e.g. the size of a learner’s lexicon, the number of words in a book, the first 2,000 word families in English). It is therefore important to have a large sample population of words (e.g. the target words on a test) from which to generalize to the whole population of vocabulary one wishes to make statements about (e.g. a learner’s total vocabulary size). Thus, in terms of generalizability, the more target items the better, with the (usually) unachievable ideal being the measurement of all the lexical items in question. In practice, a researcher is usually able to measure only a small proportion of all of these lexical items.
Also, the test type has an effect. Some item formats, like checklist tests, allow relatively more lexical items to be measured, while more involved formats, like cloze items or think-aloud methodologies, will take more time and so fewer items can be included. There is also a tension between the amount of depth of knowledge which can be measured, and the number of items included. This leaves the tricky question of how large (or how small) a target sample is acceptable. Unfortunately, there is no set answer, only the observation that ‘The more, the better’. The only concrete guidance is that the researcher needs to be able to argue that the sample which is measured gives reliable and meaningful information about the vocabulary being discussed. In other words, the sample size needs to be large enough to realistically model the behavior of the total vocabulary population. A discussion about this needs to be included in all research reports, including any limitations to the generalizations drawn from the sample to the whole. One should take account of the points made in Quote 4.4, and carefully consider how the acquisition and use of larger sets of vocabulary may differ from the much smaller sets of target lexical items commonly used in studies. For example, the time required to learn a word in a small set of lexis may be much less than the time required to learn the same word in a larger set of vocabulary, due to the possible decreasing efficiency of whatever learning strategy is being used. It would thus be erroneous to generalize the faster rate of acquisition from a small set of words to a much larger set of words.
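One rough way to see why ‘The more, the better’ holds is to treat the target items as a simple random sample from the vocabulary population and look at how the margin of error shrinks as the sample grows. The sketch below is an illustration only and is not a procedure taken from this chapter: it assumes random sampling, uses the normal approximation for a proportion, and ignores any finite-population correction. The function name `estimate_known` and the example figures are hypothetical.

```python
import math


def estimate_known(correct, sampled, population):
    """Estimate how many items in `population` are known, with a ~95% margin.

    Assumes the `sampled` items are a simple random sample of the population
    and uses the normal approximation for a proportion (an assumption).
    """
    p = correct / sampled
    standard_error = math.sqrt(p * (1 - p) / sampled)
    margin = 1.96 * standard_error * population
    return p * population, margin


# Hypothetical learner who knows half of the sampled items from a 2,000-word band.
for n in (20, 40, 100, 400):
    estimate, margin = estimate_known(correct=n // 2, sampled=n, population=2000)
    print(f"{n:>3} items: about {estimate:.0f} words known, plus or minus {margin:.0f}")
```

Under these assumptions, quadrupling the number of target items only halves the margin of error, which is one reason the trade-off between sample size and testing time is so difficult.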
Quote 4.4 Meara on the problems of using small numbers of target words in lexical studies
The basic problem is that ... [researchers assume] it is possible to model the acquisition of an entire vocabulary by looking at how effectively a tiny subset of this vocabulary is acquired in tightly controlled conditions. There are a number of obvious reasons why this position is untenable. Firstly, learning a set of 20–40 words may pose some difficulties for short-term memory, but seen from a long-term perspective, and in comparison with the number of words a fluent speaker needs to know, such numbers are basically trivial. Many people can handle a vocabulary of a few tens of words by using simple mnemonic techniques, for example. It is not obvious, however, that these techniques would enable a learner to handle, say, two thousand new words – the number of words you need to handle about 80% of English text. Secondly, and more importantly, a vocabulary of 30–40 words can be efficiently handled by treating it as an unconnected list of discrete items. Bigger vocabularies on the other hand will contain subsets of words which are linked together either on semantic or morphological grounds, and these linkages must make it inefficient to treat the vocabulary as a simple list. At the very least some sort of network structure must develop in a large vocabulary which reflects these relationships between the component items of the total vocabulary. Presumably, what makes it difficult to acquire a large vocabulary is that it takes time and effort for these connections to develop, and for a properly organised lexicon to emerge. This problem does not arise when the target lexicon contains only a handful of words.
(1996a: 6–7)
4.7
Interpreting and reporting results
A key requirement of good vocabulary research is that researchers need to be quite clear about which aspects of vocabulary they are addressing, since vocabulary knowledge is too complex to capture everything within a single study. This entails thinking precisely about target lexical items, and what one wishes to find out about them. Rather than just considering a construct as broad as ‘vocabulary knowledge’, it is necessary to specify which elements of vocabulary knowledge are being addressed. Many of the contradictory and confusing results have stemmed from imprecise research definitions and reporting. For example, the wide variation in the estimates of the vocabulary size of the average native or nonnative speaker stems mainly from the fact that different studies used different units of counting (e.g. individual word forms, lemmas, word families) (see Section 5.2.1), but regardless of whatever unit was used, the term employed in the report was usually word. Likewise, there is a large range of results for how much vocabulary can be learned from various input techniques, but more of this variation can usually be attributed to the measurement technique than the method of input (e.g. a relatively easy multiple choice meaning-focused format versus a more difficult productive spelling task). Furthermore, it is known that performance is usually better when the test conditions mirror the learning task than when they are incongruent (Lotto and de Groot, 1998). The point is that research reporting needs to be very explicit about what measures are used, and what they tell us about the degree of lexical knowledge or learning. This entails reporting whether the knowledge is receptive or productive in nature, what facets of word knowledge are being measured, and the degree to which the learning is stable (e.g. through the use of delayed posttests).
Beyond more precise reporting in general, a number of other issues come into play in the interpretation and reporting of vocabulary studies. The first is reporting the results of statistical tests. Previously, when the significance of statistics was determined manually with tables, it was standard practice to set a p value, and stick to it throughout the research report (e.g. p < .05). Nowadays, virtually everyone uses statistical packages that give the exact p values. Given that exact p values provide more information than a range value, the APA Publication Manual (fifth edition) now recommends providing exact p values. One of the things an exact p value can provide is an indirect indication of the strength of effect. However, the standard practice is to now provide a separate effect size measure as well as a p value if possible. This is to be encouraged, as an effect size statistic makes the magnitude of an observed effect explicit, and makes comparison between studies much easier. There are various different effect size statistics (e.g. Cohen’s d and Pearson’s correlation coefficient), and in general, Cohen’s figures for effect size have been widely accepted (Field, 2005: 32):
● r = .10 (small effect): the effect explains about 1% of the total variance
● r = .30 (medium effect): the effect accounts for 9% of the total variance
● r = .50 (large effect): the effect accounts for 25% of the variance.
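The variance figures in these benchmarks are simply the square of r, and Cohen’s d can be computed directly from two sets of scores. The following is a minimal sketch using only the Python standard library; the function names and the example scores are hypothetical, and the pooled-standard-deviation form of d is assumed here (other variants exist).

```python
import math
from statistics import mean, stdev


def variance_explained(r):
    """Proportion of total variance accounted for by a correlation of size r."""
    return r ** 2


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


# The three benchmark values of r listed above.
for r in (0.10, 0.30, 0.50):
    print(f"r = {r:.2f} -> {variance_explained(r):.0%} of variance")

# Invented posttest scores for two small groups, for illustration only.
print(f"d = {cohens_d([2, 4, 6], [1, 3, 5]):.2f}")
```

Reporting a figure like this alongside the exact p value lets readers judge the size of a difference and not only whether it is statistically significant.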
It is quite common to make comparisons in lexical studies, e.g. comparing learning from paired-associates input versus keyword method input. However, in addition to reporting the statistical comparisons (e.g. the significance of t-test comparisons), it is also important to report and interpret the absolute results. It is possible to have statistically significant results that are essentially meaningless. I have read numerous studies where Learning Method A was compared to Learning Method B, and a great deal was made of the significant result that one was better than the other. The discussion section then typically went on to discuss the great facility of using this method. However, when the descriptive statistics were checked carefully, it became obvious that neither method was of much value. It might be that Method A leads to the learning of two new words out of 50, while Method B leads to four new words being learned. While Method B might have been significantly better, the important news is that both methods were ineffective. Always consider acquisition/use in terms of the broader view; is there enough real learning/use occurring to make meaningful statements about? In short, report the absolute learning and use in addition to the comparisons, as they may be the more important result. Second-language learners vary considerably in their mastery of most linguistic aspects, and vocabulary is the prime case, as even native speakers vary a great deal in the size and content of their lexicons, especially at the lower-frequency levels. While frequency gives a good rough indication of the likelihood of a lexical item being known, even people with very similar lexical sizes will have a proportion of their lexicon which is idiosyncratic. 
For example, a learner with a vocabulary size of 1,000 items will likely know words like cat, look, wall, and pretty, but may also know much lower-frequency words like carburettor and fanbelt if they have a special interest in automobile engines. Conversely, they may not have yet picked up quite frequent words like policy or management simply due to the chance lack of exposure. When doing group research, it is important to indicate the central tendency of the group behavior, but it is likewise important to note any diversity in individual scores. This is usually indicated by the standard deviation statistic, and sometimes by the range of the scores. However, it is essential to interpret the variation for the reader, and indicate the extent to which the mean/ median figures reported represent the behavior of the group, and the extent to which individuals vary around the central tendency. Individuals typically vary widely in their lexical knowledge; a report should always consider the extent to which this is true in the particular study. The way that mean scores can hide individual variation is dramatically shown in a study by Li and Schmitt (in press). They followed four Chinese
students (‘LH’, ‘TT’, ‘WL’, ‘YJ’) doing an MA-ELT course at a British university. The feature of interest was adjective-noun combinations in the students’ written assignments. To determine whether these combinations were in fact collocations typical of academic discourse, the academic subcorpus of the BNC (see Section 6.2) was searched to see if the combinations occurred there, and a t-score (Section 3.2) calculated. Higher t-scores indicate very typical, higher-frequency collocations (e.g. black coffee, future role). Li and Schmitt found that the t-score means for assignments submitted over three university terms indicated a plateau from Term One to Term Two, but then an improvement at Term Three. This suggested that the students as a group used relatively more typical and frequent adjective-noun collocations at the end of the academic year. However, when individual behavior was examined, it was found that every student had a completely different profile, none of which were represented even vaguely by the mean scores! This is illustrated in Figure 4.1. The lesson seems clear: central tendency figures usually are useful in indicating group trends, but they can sometimes hide a great deal of important information.
[Line graph: T-Score of bigrams over time, plotted for LH, TT, WL, YJ, and the Mean across Term One, Term Two, and Term Three. Values shown beneath the graph:]
        Term One   Term Two   Term Three
LH      5.20       5.47       5.10
TT      5.09       5.40       5.68
WL      5.60       5.29       5.94
YJ      5.36       5.05       5.04
Mean    5.31       5.30       5.44
Figure 4.1 T-scores of adjective-noun combinations over three terms
To finish this section, I will present an example of what I consider misinterpretation of research results due to a lack of consideration of the amount of vocabulary engagement various tasks engender. I suggest you first read the study for yourself and come to your own conclusions, before reading the following critique. The study is:
B. Mason and S. Krashen (2004). Is form-focused vocabulary instruction worthwhile? RELC Journal 35, 2: 179–185. Available at .
The study
Mason and Krashen compared vocabulary learning from two different input conditions. The first was a Story-only condition, where L2 learners listened to a 15-minute story The Three Little Pigs. The Story-plus-study condition entailed listening to the story plus doing explicit exercises to learn 20 target words from the story. The two conditions were carried out in two separate classes in a Japanese junior college with students of relatively low English proficiency. The procedures are detailed below. The first three steps were the same for both groups. Time on task in minutes is shown in parentheses. The pretest, posttests, and delayed posttest are identical.
Story-only condition
1. 20 target words from a story were written on the board in front of the class
2. The students took a translation test (pretest) on the target words (5 minutes)
3. The students listened to the story (15)
4. Posttest on the 20 words (5)
Story-plus condition
1. 20 target words from a story were written on the board in front of the class
2. The students took a translation test (pretest) on the target words (5 minutes)
3. The students listened to the story (15)
4. Comprehension questions (10)
5. Posttest on the 20 words (5)
6. Students correct posttest with a partner and the teacher (10)
7. Students read a written version of the story (10)
8. The students retold the story with encouragement to use the target words, but in fact they make no special effort in this regard (20)
9. Students retake the posttest (5)
10. Teacher gives correct answers (5)
Both groups took the same delayed posttest five weeks later.
The results
These are the results (means) reported in the study for the various administrations of the test. Maximum score is 20. Standard deviations are in parentheses.
                    Pretest     1st posttest   2nd posttest   5 week delayed posttest
Story-only          4.6 (2.3)   13.9 (3.4)     –              8.4 (3.5)
Story-plus-study    4.7 (1.7)   15.1 (2.6)     19.7 (.6)      16.1 (2.2)
Mason and Krashen also calculated the efficiency of the learning, in terms of words learned per minute of treatment. The calculations are based on a posttest score minus the pretest score (pre-existing knowledge) divided by the number of minutes spent on task. For example, the 1st posttest Story-only efficiency figure was calculated as (13.9 − 4.6) ÷ 15 minutes = 9.3 ÷ 15 = .62 words per minute.
                    1st posttest   2nd posttest   5 week delayed posttest
Story-only          .62            –              .25
Story-plus-study    .42            .23            .16
The interpretation
Mason and Krashen acknowledge that the Story-plus condition led to greater vocabulary learning overall, but question its efficiency. They conclude, on the basis of their calculations, that the learning from the Story-only condition was actually more efficient in learning per time expended, and that ‘additional focus on form in the form of traditional vocabulary exercises is not as efficient as hearing words in the context of stories’ (p. 179).
A re-evaluation
Schmitt (2008) suggests that one of the most important elements in determining whether various tasks facilitate vocabulary learning is the degree of engagement, a cover term for all the exposure, attention, manipulation, and time spent on lexical items. If we carefully consider the level of engagement of the various tasks in the Story-only and Story-plus procedures, we are led to quite different conclusions than Mason and Krashen came to. Let us consider several points. The first, which Mason and Krashen briefly discuss in a footnote, is the effect of testing. Research shows that an increase in virtually any aspect of engagement leads to better vocabulary learning, and there are few tasks which capture and focus learners’ attention like tests. This indicates that the time spent on testing needs to be included in the calculations for vocabulary exposure, as it is likely to have engendered deeper engagement than any of the other tasks. Second, some of the tasks in the Story-plus procedure which Mason and Krashen include in the ‘vocabulary learning time’ did not actually generate engagement with the target vocabulary, and therefore should be excluded from the efficiency calculations. In particular, when the students read the written version of the story, they were instructed to underline the words they wanted to learn. So while they should have been briefly exposed to the 20 target words again within the story, there is no indication that they focused on these words, and so counting three minutes out of the ten required for completion of the task as engagement with the target words is probably generous. Furthermore, Mason and Krashen indicate that students made no special effort to use the target vocabulary in their story retellings, and so all of the time on this task (20 minutes) cannot be counted as learning time for those target words. Again, there might have been some engagement, so let us generously count five minutes as the vocabulary learning time in this task. A recalculation of the time on task where engagement occurred with the target vocabulary looks like this:
                                    Story-only    Story-plus-study
Pretest                             (5)           (5)
Listening to story                  (15)          (15)
Comprehension questions             –             (10)
Posttest                            (5)           (5)
Correction of posttest              –             (10)
Reading written version of story    –             (3)
Story retelling                     –             (5)
2nd posttest                        –             (5)
Correction of 2nd posttest          –             (5)
Total                               25 minutes    63 minutes
Based on these revised time figures the efficiency table looks like the following:
                                  1st posttest   2nd posttest   5 week delayed posttest
Story-only          Gain (a)      9.3            –              3.8
                    Time (b)      20             –              25
                    Effic. (c)    .47            –              .15
Story-plus-study    Gain (a)      10.4           15.0           11.4
                    Time (b)      30             53             63
                    Effic. (c)    .35            .28            .18
a Gain score. b Time on task in minutes. c Efficiency in words learned per minute.
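The revised efficiency figures can be checked with a short calculation: gain (posttest mean minus pretest mean) divided by the revised minutes of engagement. The sketch below uses only the means and revised times reported above; the names `PRETEST`, `RESULTS`, and `efficiency` are hypothetical, and decimal arithmetic with half-up rounding is an assumption made here so that the two-decimal results are not disturbed by floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP

# Pretest means reported in the results table.
PRETEST = {"Story-only": Decimal("4.6"), "Story-plus-study": Decimal("4.7")}

# (condition, test) -> (mean score, revised minutes of engagement)
RESULTS = {
    ("Story-only", "1st posttest"): (Decimal("13.9"), 20),
    ("Story-only", "delayed posttest"): (Decimal("8.4"), 25),
    ("Story-plus-study", "1st posttest"): (Decimal("15.1"), 30),
    ("Story-plus-study", "2nd posttest"): (Decimal("19.7"), 53),
    ("Story-plus-study", "delayed posttest"): (Decimal("16.1"), 63),
}


def efficiency(score, pretest, minutes):
    """Words gained per minute of engagement, rounded to two decimal places."""
    gain = score - pretest
    return (gain / Decimal(minutes)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


table = {
    key: efficiency(score, PRETEST[key[0]], minutes)
    for key, (score, minutes) in RESULTS.items()
}

for (condition, test), value in table.items():
    print(f"{condition:18} {test:17} {value}")
```

Run as written, this reproduces the efficiency row of each condition in the revised table, which makes it easy to see how sensitive the comparison is to the minutes counted as engagement.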
While the Story-only condition had a somewhat higher efficiency figure at the first posttest, the discussion in Section 4.3 argues that an immediate posttest should be mainly interpreted in terms of the effect of the treatment. The figures of importance are the delayed posttest scores. After a period of five weeks, any remaining knowledge can surely be classed as durable learning. Looking at these ‘learning’ figures, we find that, contrary to Mason and Krashen’s conclusions, the Story-plus-study condition actually produces slightly better figures, meaning that the insertion of explicit study leads to more efficient learning. It is important to also look at the absolute figures: after 53 minutes of listening plus explicit exercises, the relatively weak Japanese learners had scored virtually maximum scores on the second posttest, with 74% of the students achieving perfect scores. Clearly, the addition of explicit teaching was not only relatively efficient, it was highly effective. It was also durable: a gain of 11.4 words out of 15.3 possible (75%) is an excellent result after five weeks (4.7 out of 20 target words were already known at the pretest, and so not available for learning). We can compare this to the durable learning of the Story-only condition: a gain of 3.8 out of 15.4 possible = 25%. We should also note that one element which likely led to this good learning is the effect of the additional posttest (second posttest) in the Story-plus-study treatment, which may well have helped the learners to consolidate the learning achieved in the listening and explicit exercises. This suggests that tests can act as a useful form of recycling in the vocabulary learning process. In sum, contrary to Mason and Krashen’s general questioning of the efficiency of the traditional vocabulary exercises in this study, the Story-plus-study condition led to students learning three times as many words with slightly better efficiency.
This is in line with almost all other studies which compare incidental learning with incidental-plus-explicit learning. However, this would not have shown up without a careful consideration of the tasks and the engagement they entailed. This is a warning to both be very careful with one’s own interpretation and reporting, and cautious about too easily accepting other authors’ interpretation of their data.
5
5.1
Global measurement issues
A great deal of vocabulary research involves measurement of lexical items in one way or another. This means that vocabulary researchers must be able to choose or design measurement instruments or procedures which validly and accurately describe the aspects of vocabulary being explored. This section will discuss a number of measurement issues specifically related to vocabulary. It will start with some general issues which need to be considered. In discussing vocabulary measurement, it is useful to first explore ways in which the various measurements’ formats differ. Read (2000: 9) proposes that formats can vary along three clines as illustrated in Figure 5.1. Tests which focus specifically on vocabulary knowledge are likely to be discrete in the sense that particular lexical items are highlighted. However, vocabulary measures can be a component of measures of broader linguistic proficiency, and in this case, the test would be embedded. Receptive vocabulary measures are almost always selective, because the test writer needs to select the lexical items to measure, determine their characteristics, and then write test items for them. On the other hand, a measure of the complete vocabulary output of learners’ speaking or writing production would be comprehensive. If this is ‘free’ output, it poses difficulties for the tester, as there is no way to know in advance exactly what the produced vocabulary will be. In terms of context, vocabulary items can range from completely context-independent (e.g. an L2-L1 translation task), to completely contextdependent (e.g. define the target word according to the meaning sense used in Passage X). Context-dependent formats will obviously provide a better way of tapping into the ‘contextualized’ facets of word knowledge like collocation and register.
Discrete: A measure of vocabulary knowledge or use as an independent construct
Embedded: A measure of vocabulary which forms part of the assessment of some other, larger construct
Selective: A measure in which specific vocabulary items are the focus of the assessment
Comprehensive: A measure which takes account of the whole vocabulary content of the input material (reading/listening tasks) or test-takers’ response (writing/speaking tasks)
Context-independent: A vocabulary measure in which the test-taker can produce the expected response without referring to any context
Context-dependent: A vocabulary measure which assesses the test-taker’s ability to take account of contextual information in order to produce the expected response
Figure 5.1 Dimensions of vocabulary assessment
5.1.1 Issues in writing vocabulary items
Concept 5.1 Test terminology
The field of language testing has its own technical vocabulary to describe different parts of a test. Although I have tried to avoid much of this jargon, a quick overview of basic terms is probably useful. Each individual ‘question’ on a language test is called an item, simply because most items are not in fact questions. In multiple choice items, the part that is given is called the stem, and the possible answers are called options. The correct option is the key and the others are called distractors (their purpose is to distract the examinees away from the key if it is not known).
When measuring knowledge of a lexical item, it is necessary to ensure that the test format does not limit the ability of participants to demonstrate whatever knowledge they have of the item. One of the most basic problems to avoid is using unknown vocabulary (or any other linguistic feature for that matter) in measurement instruments. This includes the instructions and any context provided in the test items. If the purpose of the measure is to ascertain knowledge of the target vocabulary, it does no good for participants to underachieve due to constraints not related to those targets, e.g. not recognizing a word that they know due to its being placed in an unknown syntactic construction (e.g. passive sentence). This principle is most critical in devising definitions, where participants must know the defining vocabulary in order for items to be effective. This contrived example illustrates an extreme case of definitions being ‘harder’ than the word being tested.
cow
(a) a feline animal
(b) a bovine animal
(c) a porcine animal
(d) a marsupial animal
Avoiding this kind of problem is not completely straightforward, as vocabulary difficulty is not an absolute intrinsic characteristic (as we have seen in Section 2.3), and depends to some extent on the similarity or difference with a participant’s L1 (Laufer, 1997). If you are researching in a homogeneous L1 environment, you may be able to consider the L1-L2 relationship in determining lexical difficulty. For example, while the above example may seem absurdly difficult, for speakers of some Romance languages it may not be quite so extreme, as bovine is cognate with their L1 (Spanish bovina; French bovin; Italian bovino). (See Jarvis, 2000, for more on the measurement of L1-L2 transfer.) However, many research contexts will involve mixed groups with speakers of many L1s, which makes considering the various L1-L2 relationships infeasible.
The most common method of ensuring that defining vocabulary is easier than the target vocabulary is through using higher-frequency words in the definitions. It is a well-attested phenomenon that learners typically learn more frequent lexical items before less frequent ones (e.g. Schmitt, Schmitt, and Clapham, 2001), and so it is a reasonable assumption that if the defining vocabulary is of higher frequency than the target vocabulary, the participants will know it if they know the lexical item being measured. This frequency approach usually works well, both because frequency data is now readily available (either through published lists or corpus analysis), and because it is not L1-specific. Using higher-frequency defining vocabulary, the following example largely avoids the problem of unknown vocabulary in the definition.
bovine
(a) cat-like
(b) cow-like
(c) pig-like
(d) kangaroo-like
The only potential exception is kangaroo. A check of the 179 million word New Longman Corpus shows that bovine occurs 293 times, while there are 241 instances of kangaroo. In terms of simple frequency, this suggests that kangaroo is somewhat less frequent than bovine, and so a bad component for a definition. Conversely, the VocabProfiler on the Lextutor website puts kangaroo in the 7,000 frequency band, while bovine is in the 12,000 band. This shows that raw frequency information needs to be used with some caution and common sense. The makeup of a corpus obviously has a great effect on frequency counts, and researchers need to carefully consider whether a particular corpus (and the resultant frequency data) is suitable for their needs. For example, in the ACE and ICE-AUS corpora of Australian English, kangaroo occurs 33 and 51 times per million words respectively, making it a high-frequency word for learners of that national variety of English. Given the inevitable differences in frequency figures between even very comparable corpora, it is prudent to consult several appropriate corpora and their frequency counts when determining frequency information (e.g. Schmitt, Dörnyei, Adolphs, and Durow, 2004).
It is also useful to consider how lexical items are used in the corpora. An analysis of the concordance lines in the Longman corpus indicates that bovine mainly occurs in medical or scientific contexts, while kangaroo occurs in general English contexts. Moreover, kangaroo is likely to be a loanword in many languages and so known through the L1 for many learners. Taken together, this is additional evidence that most participants would probably know kangaroo before they knew bovine, unless perhaps they were medical students.
While the defining vocabulary should always be of higher frequency than the target vocabulary, in general it is best to limit oneself to the highest frequency words when possible. For example, even though the relatively rare word lithe (Lextutor 14,000 band) could be defined with the higher-frequency word supple (8,000 band), it would probably be better to use the even higher-frequency words flexible or graceful from the 2,000 band.
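The two checks discussed above (normalizing raw corpus counts, and confirming that defining words sit in a more frequent band than the target) can be sketched in a few lines of Python. This is only an illustration: the counts and Lextutor bands are the ones quoted in the text, while the names `per_million`, `definers_easier_than_target`, and `BANDS` are hypothetical and not part of any existing tool.

```python
# Figures quoted in the text; everything else here is illustrative.
CORPUS_SIZE = 179_000_000  # New Longman Corpus, in running words
RAW_COUNTS = {"bovine": 293, "kangaroo": 241}

# Lextutor frequency bands quoted in the text (lower band = more frequent).
BANDS = {
    "kangaroo": 7_000,
    "bovine": 12_000,
    "lithe": 14_000,
    "supple": 8_000,
    "flexible": 2_000,
    "graceful": 2_000,
}


def per_million(count: int, corpus_size: int) -> float:
    """Normalize a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


def definers_easier_than_target(target: str, definers: list[str]) -> bool:
    """True if every defining word is in a more frequent band than the target."""
    return all(BANDS[word] < BANDS[target] for word in definers)


for word, count in RAW_COUNTS.items():
    print(f"{word}: {per_million(count, CORPUS_SIZE):.2f} per million words")

# Band evidence points the opposite way from the raw Longman counts.
print(definers_easier_than_target("bovine", ["kangaroo"]))
print(definers_easier_than_target("lithe", ["flexible", "graceful"]))
```

Normalizing to a per-million rate makes the Longman figures directly comparable with the per-million figures cited for the Australian corpora, which is one reason to consult several sources before settling on a defining vocabulary.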
It is possible to define most words with only vocabulary from the most frequent 2,000–3,000 words of English, as demonstrated by learner dictionaries, which tend to have a defining vocabulary of around this size (e.g. Macmillan English Dictionary for Advanced Learners, 2002: < 2,500 words; Longman Advanced American Dictionary, 2000: ≈ 2,100 words). In fact, learner dictionaries are good sources from which to extract accessible definitions for research instruments, as the lexicographers have already taken great pains to make them as easy and transparent as possible. Also, as copyright issues make it difficult for different dictionaries to have exactly the same definitions, a perusal of several learner dictionaries will provide you with a number of definition options to choose from. This is illustrated by the following similar, but subtly different definitions of lithe, most of which could probably be suitably adapted for a discrete item test of the form-meaning link:
● having a body that moves easily and gracefully (Longman Advanced American Dictionary, 2000)
● moving and bending in a graceful way (Macmillan English Dictionary for Advanced Learners, 2002)
● young, healthy, attractive, and able to move and bend gracefully. (Cambridge Advanced Learner’s Dictionary, 2003)
● A lithe person is able to move and bend their body easily and gracefully. (COBUILD English Dictionary for Advanced Learners, 2001)
The ease of these definitions is illustrated with a comparison to a popular online monolingual dictionary. For example, the second definition below is clearly more difficult than the above definitions and thus less suitable for L2 language research.
1. easily bent or flexed
2. characterized by easy flexibility and grace; also athletically slim. (Merriam-Webster Online Dictionary)
It is also important that the test items used in measurement instruments are natural and make sense to the participants. Too often, they are created to fulfil the requirements of the research design, but end up being awkward and atypical of the target language. This can happen when contriving context for lexical items, or when developing distractors. One way around this is to use or mimic authentic language in the test items. Another way is to elicit native judgements of ‘naturalness’ to ensure that the target items do not have any unintended ‘strangeness’ about them. For example, if creating sentences for target lexical items, a panel of native speakers can be asked to rate each of the sentences on a six-point Likert scale according to how ‘natural’ they are (1 = ‘very unnatural’, 6 = ‘completely natural’). Then only sentences receiving a mean rating of 5 or above can be retained. This is an easy way to avoid uninterpretable results based on unnatural stimuli.
Best practice is to use delayed posttests in acquisition studies (Section 4.4), but this raises the issue of memory effect. Many researchers are inclined to use different versions of a test in order that participants won’t score artificially highly because they remember a test from a previous administration. However, the problem is that it is difficult to make different versions of a vocabulary test which are truly equivalent. For example, in the validation report of two versions of the Vocabulary Levels Test (Schmitt et al., 2001), the researchers concluded that while the two VLT versions were statistically equivalent for groups of learners, they were not for individual learners. In fact, with the possible exception of psycholinguistic designs that use lists of individual words which can be precisely matched in their characteristics, it is virtually impossible to create multiple versions of a vocabulary test on which individuals will produce the same score. This indicates that pretest-posttest(s) designs need to use the same test in all administrations to avoid unknown variation due to use of not-fully-equivalent tests.
If memory effect is a concern because two or more administrations are given close together, a good technique is to give participants a cognitively challenging task (e.g. a math exercise) immediately after the test administration, in order to get them thinking about something else, and ‘flush out’ their memories of the test. It is also useful to disguise the true targets
of the research by adding ‘red herring’ distractor elements to the pretest. Let us take the example of a research design which is interested in the incidental learning of vocabulary from reading, where it is necessary to give a pretest to measure pre-existing knowledge of the target lexical items. The researcher can add some reading-based test element(s) (e.g. a short test of reading speed) to avoid participants guessing that vocabulary is in fact the target. It might also be useful to add some extra words to the vocabulary pretest, so that the actual target items are less prominent. These additional pretest elements will take up extra time, but would be worthwhile if they distract the participants from the true nature of the research design.
Although many commentators would have vocabulary always taught and tested in context, this probably depends on what is being measured. If the construct is the form-meaning link, then non-contextualized formats like L2-L1 translation can work well. However, if the desired construct is the ability to recognize a lexical item from context in written texts or speech, then context is obviously required, presumably the type (level) of context in which you expect your participants to meet these items during non-testing (i.e. real usage) conditions. This follows language testing views that emphasize the need for testing situations to replicate situations of language use or learning (e.g. Bachman and Palmer, 1996). Similarly, if the construct is productive ability, the demonstration of the lexical items in contextualized speech or writing would also be required. Of course, this makes testing harder and messier, as it can be difficult to tell if you are testing the target items or the context.
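The naturalness screening described earlier in this section (native-speaker ratings on a six-point scale, with only sentences averaging 5 or above retained) could be carried out along the following lines. The sentences, the ratings, and the function name `retain_natural` are all invented for illustration.

```python
from statistics import mean

# Hypothetical data: each candidate sentence is rated by a panel of native
# speakers on a six-point scale (1 = very unnatural, 6 = completely natural).
ratings = {
    "She gave a succinct summary of the report.": [6, 5, 6, 5],
    "The succinct was given by her to the report.": [1, 2, 1, 2],
    "His answer was succinct and clear.": [5, 5, 4, 6],
}

THRESHOLD = 5.0  # retain sentences whose mean rating is 5 or above


def retain_natural(rated: dict[str, list[int]], threshold: float = THRESHOLD) -> list[str]:
    """Return the sentences whose mean naturalness rating meets the threshold."""
    return [sentence for sentence, scores in rated.items() if mean(scores) >= threshold]


kept = retain_natural(ratings)
for sentence in kept:
    print(sentence)
```

A mean is used here because that is the criterion named in the text; a researcher worried about a single dissenting rater might additionally inspect the minimum rating, but that would be a design choice beyond what is described above.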
Quote 5.1 Cameron on test items which measure vocabulary in isolation
In the process of acquiring a vocabulary item, the meeting and making sense of a new word in context is likely to be the first step in a longer process; initial encounters with a word do not necessarily lead to that word being recognized on further occasions. Further meaningful encounters will be needed to establish the full range of a word’s meaning possibilities, and to engrave the word in memory. Eventually, after sufficient contextualized encounters, a word will be recognized when it is met in a new context or in isolation. If we then think about the process of completing a word recognition test, we can surmise that decontextualized presentation of a word in a test does not imply that a testee makes sense of the test word in a ‘decontextualized’ mental void. Rather, the recognition process may activate recall of previous encounters and their contexts. Since we would expect and want secondary-level [or any other L2] students to be able to operate this way with large sections of their vocabulary, it does not seem unreasonable to test to see how much vocabulary can be recognized without extended linguistic or textual contexts.
(Cameron, 2002: 150–151)
Although most vocabulary tests have been of the paper-and-pencil variety until now, it is worthwhile to consider the advantages of computerized testing and internet-based testing. They can make it easier to develop participant pools, as it is possible to set up the tests so that participants can take them at their convenience. It is also possible to set up the tests so that they are automatically scored, with immediate feedback to informants. Such tests can also be multimedia, with spoken, written, and graphic elements. They can also be interactive, and can adjust to the proficiency levels of the participants. It is beyond the scope of this book to discuss the development of computerized tests, but one relatively easy way into this area is Mike Scott’s freeware WebQuiz, which is available at .
5.1.2 Determining pre-existing vocabulary knowledge
Research into the acquisition of vocabulary necessitates determining what vocabulary knowledge exists at a point in time (usually before an experimental treatment), and then establishing what the state of knowledge is at a later point. This is often explored with some form of T1 (Test 1/pretest)–treatment–T2 (Test 2/posttest) research design. The need for the pretest is obvious, because if pre-existing knowledge is not established at the beginning, it is impossible to know whether T2 knowledge is new acquisition, or simply knowledge that was in place before the study began. (Pretests are also important for determining whether the groups being compared are actually similar before the treatment.) There are two main ways in which the degree of pre-existing knowledge can be controlled for. The first involves ensuring none exists, and the second entails measuring and then adjusting for it.
Ensuring no pre-existing knowledge exists
One case where no pre-existing vocabulary should be in place is with rank beginners.
If learners have had no exposure to an L2, then it can be assumed that they will have no knowledge of its vocabulary, as long as it is not a cognate language and there are no loanwords from the learners’ L1. Picking the lexis from an unknown language as targets can be a useful technique, as in the Clockwork Orange studies, where non-Russian speakers were tested to see how much Russian slang vocabulary they picked up from reading this novel (Saragi, Nation, and Meister, 1978). However, when English is the target language, it is probably unsafe to assume that the learners have had zero experience with it, given its wide usage and influence around the world. Another approach is to use very low-frequency words from the L2 the learners are studying, e.g. glimmer and coalesce in English. This usually works well, but researchers should be aware that learners do not learn words in strict frequency order, and will very likely know some low-frequency
words even if they are only beginners. While the numbers of low-frequency vocabulary known will be quite small, it is prudent to check for any pre-existing knowledge by pretesting the participant group or a sample thereof, or a similar group with the same type of participant. It is also important to ensure that the low-frequency vocabulary has no word parts which can make guessing feasible, e.g. fishhook, which is not frequent in itself, but is still relatively transparent if learners know the very frequent words fish and hook.
An increasingly common approach is using nonwords (sometimes called pseudowords or nonce words). This is effective, as it guarantees informants do not know the invented items in advance (e.g. Hall, 2002; Webb, 2007a). There are no technical reasons not to use nonwords, as de Groot and Keijzer (2000) report that the success rate with nonword training is comparable to that of training words from an existing language. This is on the condition that the nonwords are phonologically/orthographically legal in the learners’ L1. One way of doing this is to manually take real words and then change one or more letters to make nonwords (e.g. prod → prok). However, it is easier and more comprehensive to use a nonword generator, such as the ARC Nonword Database (Rastle, Harrington, and Coltheart, 2002). This website creates lists of nonwords based on a wider range of criteria than it would be feasible to control for manually, including: the number of letters or phonemes, the number of real words with a similar form, and the frequency of bigrams and trigrams within the nonword. The success rate with nonword training also suggests that learners can be motivated to learn non-real material, but there may be ethical issues, especially if the subjects are actively involved in language learning and expecting that participation in a study will enhance their language skills.
In such a case, it is probably unreasonable to trade on their goodwill to teach them imitation vocabulary. However, if the subjects are paid, get some other credit or benefit from participation, or agree in advance to learning nonwords, then there would be no ethical barrier to the use of nonwords.
Measuring and adjusting for pre-existing knowledge
This approach entails using a pretest to measure any pre-existing knowledge of the target lexical items. As mentioned in Section 5.1.1, it is usually best to use the same test for the pretest and subsequent posttest(s), often with a memory-flushing task after the pretest. It is also useful to have distractors in the pretest to minimize the chances of learners becoming aware of the target items. The practice in psychology studies is to have about one-third targets and two-thirds nontarget distractors. If the target vocabulary is at the level which the learners would naturally be learning at their level of proficiency, there will probably be considerable variation in pretest knowledge, which often makes it sensible to report gain scores (T2 scores minus T1 scores) rather than raw T2 scores.
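The gain-score calculation mentioned above is simple arithmetic, but a small sketch shows why it can tell a different story from raw posttest scores. The participant IDs, the scores, and the function name `gain_scores` are hypothetical.

```python
# Hypothetical pretest (T1) and posttest (T2) scores per participant,
# both obtained with the same test, as recommended in the text.
t1 = {"P01": 4, "P02": 12, "P03": 0}
t2 = {"P01": 15, "P02": 18, "P03": 9}


def gain_scores(pre: dict[str, int], post: dict[str, int]) -> dict[str, int]:
    """Gain = T2 score minus T1 score, for participants present at both tests."""
    return {pid: post[pid] - pre[pid] for pid in pre if pid in post}


gains = gain_scores(t1, t2)
print(gains)
```

In this invented data set, P02 has the highest raw T2 score (18) but the smallest gain (6), because much of that knowledge was already in place at T1. Participants missing from either administration are simply left out here; how to handle attrition is a separate decision for the researcher.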
5.1.3 Validity and reliability of lexical measurement
Vocabulary measurement needs to be both valid and reliable. These are technical terms in the field of language assessment, and have been widely discussed (e.g. Bachman, 1990; Bachman and Palmer, 1996). Although validity is a complex construct, for the purposes of our short discussion, let us think of it as whether vocabulary tests actually measure what they purport to measure. This can be established in different ways. One common way is through criterion validity. In this method, a new test is judged according to how closely it correlates with an already established measure. This can work well if an accepted standard measure already exists to compare against. However, in vocabulary, few such standards exist, with the Vocabulary Levels Test being perhaps the closest we have at the moment. Moreover, the complex nature of vocabulary knowledge dictates that any particular test would be severely limited as a criterion measure. For example, the VLT is a receptive test, measuring knowledge of the form-meaning link only, and is based on four frequency levels (2,000, 3,000, 5,000, 10,000). For an alternative test that has these precise characteristics, the VLT will serve as an adequate criterion. But it would probably not be suitable as a criterion for a productive vocabulary test, one which measures other word knowledge facets, or one which samples vocabulary from different frequency levels. Therefore, a criterion validity approach has serious limitations at the moment.
Quote 5.2 Read questioning whether language tests always measure what they purport to
We need to be cautious in making assumptions about what aspect of language is being assessed just on the basis of the label that a test has been given.
(2000: 99)
This means that the validity of a vocabulary test will usually have to be demonstrated through its own development and performance. The development part starts with specifying the content, i.e. content validity. It is here where one must apply the themes running through this book, and not specify a ‘general’ vocabulary, but be much more specific about what lexical items the test includes and what is being measured about those items. Some of these specifications will include:
● whether the test measures only a specific set of lexical items (such as a group of items which have been previously taught), or whether the lexical items on the test are supposed to represent a wider population of vocabulary
● if the test items represent a wider population, what is the basis for this generalization, and is it supportable (e.g. frequency band, type of vocabulary, word class)
● what word knowledge aspects are being addressed
● whether the test measures recall/recognition/receptive/productive levels of mastery.
It should be possible to develop detailed and focused specifications for new vocabulary tests, based on the literature in the field and previous research. After test items have been written for these specifications, it is time to gauge how well the test captures the specified content. That is, when examinees take the test, how well do their scores represent this content? There are numerous statistical analyses available which shed light on this, but I feel the best way is through a separate direct, in-depth assessment of the underlying knowledge. Perhaps the best way of determining ‘true’ underlying knowledge is through interactive face-to-face interviews where the interviewer can probe the examinees’ lexical knowledge in detail, and come to a very confident determination of this knowledge. This method was carried out in the validation of the VLT. A subsample of learners who had taken the VLT were asked to attend individual interviews with a two-person interview panel. The two interviewers probed the learners about their knowledge of a 50-word subsample from the test. The learners could demonstrate their knowledge in a number of ways, and the probing continued until both interviewers were satisfied the learner either did or did not know the form-meaning link for the word. Once the interviews were completed, the results from the interviews were compared to the results from the test. This comparison is shown in Table 5.1. Using the interview procedure, the raters could be confident of the learners’ knowledge of the target test words. This was reflected in high inter-rater reliability figures of .95–.97. Note that interviewers did not know the learners’ responses to the VLT, and so this could not bias their interview judgements.
Table 5.1 Comparison of interview results with levels test results
                      Levels Test
Interview         Correct    Incorrect    Total
Knew              A 731      B 47         778
Did not know      C 65       D 255        320
Total             796        302          1,098
(Schmitt et al., 2001: 75).
The contingency table gives a wealth of information about the validity of the test. Ideal performance entails all responses being either in Boxes A or D. While not perfect, the VLT performs well in this regard: 90% (986/1,098) of the responses were so placed. What is more, the table illustrates the problems with the remaining 10% of responses, where the VLT did not indicate the examinees’ true lexical knowledge, as indicated by the interview. The mismatches came in two types. The first, knowing a word but not matching the correct option on the Levels test (Box B), did not seem to be too much of a problem with an occurrence rate of only about 4%. Thus, if learners knew the word, they usually were able to answer the test item correctly. The second, not knowing a word but still matching the correct Levels option (Box C), occurred slightly more often, at about 6%. This box relates to guessing, and a 6% ‘guess rate’ must be considered quite acceptable in a matching test format where guessing can never be eliminated completely. The interviews also suggested that many of the mismatches were not the result of guessing, but of partial lexical knowledge. For example, one learner believed that collapse meant ‘to break’. While this was not a full enough understanding to warrant a ‘know’ rating in the interview, it did allow him to choose the correct option, ‘to fall down suddenly’, on the VLT. The information from the table, combined with additional insights like the one above, demonstrates the value of this type of methodology in providing rich data from which to fashion a validity argument for a vocabulary test.
Another requirement of good measurement is that testing instruments give consistent (i.e. reliable) results. In other words, if a participant takes a test today and scores 50%, he/she should also score 50% on that test tomorrow, assuming no learning has taken place. In reality, no test will be able to deliver 100% reliability, because participants are human and vary day-to-day in their performance.
However, if tests produce scores which vary widely even though the underlying participant abilities remain the same, it becomes impossible to interpret them: are the higher scores closer to ‘true’ ability? Or the lower scores? Or an average? Because reliability is essential to valid testing, reliability should be determined for all your instruments, and reported.
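Returning briefly to Table 5.1, the percentages quoted in the validity discussion above can be re-derived directly from the four cell counts. The following is a minimal sketch; the variable names and the helper `share` are illustrative and are not taken from the validation study.

```python
# Cell counts from Table 5.1 (Schmitt et al., 2001: 75).
a = 731  # interview: knew,         Levels Test: correct
b = 47   # interview: knew,         Levels Test: incorrect
c = 65   # interview: did not know, Levels Test: correct
d = 255  # interview: did not know, Levels Test: incorrect

total = a + b + c + d


def share(count: int, whole: int) -> float:
    """Percentage of all responses, rounded to one decimal place."""
    return round(100 * count / whole, 1)


agreement = share(a + d, total)     # test and interview agree (Boxes A and D)
missed_known = share(b, total)      # known in interview, wrong on test (Box B)
correct_unknown = share(c, total)   # unknown in interview, right on test (Box C)

print(total, agreement, missed_known, correct_unknown)
```

The three shares come out at 89.8%, 4.3%, and 5.9%, which correspond to the rounded figures of 90%, about 4%, and about 6% given in the text.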
Quote 5.3 Bachman on reliability
We can all think of factors such as poor health, fatigue, lack of interest or motivation, and test-wiseness, that can affect individuals’ test performance, but which are not generally associated with language ability, and thus not characteristics we want to measure with language tests ... When we minimize the effects of these various factors, we minimize measurement error and maximize reliability. In other words, the less these factors affect test scores, the greater the relative effect of the language abilities we want to measure, and hence, the reliability of the test scores.
(1990: 160)
Because developing valid and reliable measurement instruments is time consuming, it can be useful to use established tests if they are available and appropriate for your purpose. One advantage of this is that it makes your research more comparable to previous research because the tests are the same and can be interpreted in similar ways. Another advantage is that the tests (hopefully) have been through a validation process where some reliability evidence has been collected. Having previous reliability evidence is useful and offers some quality assurance; however, test performance can vary from one participant population to another, even if they appear to be similar on the surface. It is not difficult to understand why this is: people are inherently individuals, and vary according to factors such as L1, language proficiency, motivation, how well they like their teacher, parental support for learning, and many others. Because of this, best practice dictates that reliability needs to be established for the measurement instruments for each participant population in their own environment. That is, although previous reliability evidence can suggest the consistency of a test, you do not really know how it will work with your particular participants until you try it with them. Therefore, even when using existing measurement instruments, you should report the reliability figures of those measures with your population.
The reliability of most research instruments can be established using different methods. The most easily conceptualized is the test-retest method. A test is given one day, and then again quite soon, before any learning can occur.
However, there are several problems with this method: it requires two administrations, participants may remember something of the test in the second administration if it is given too soon after the first administration, they may have forgotten some of their knowledge if the second administration is given too long after the first, and participants are usually not keen to take the same test twice in a row. It is also possible to establish reliability if there are two equivalent versions of the test, which can be compared against each other. However, in my experience, it is difficult to create two tests which are truly equivalent (e.g. Schmitt et al., 2001), and so this may not be a viable option, although Rasch analysis can be a useful aid.
Concept 5.2
Rasch analysis
Rasch (also known as one-parameter) analysis is part of the item response theory approach to language measurement. It involves complex statistical modelling, but, in essence, it ranks examinees according to their ability (as determined by their total test scores) and simultaneously ranks the test items according to difficulty (in terms of how many examinees were successfully able to answer them). Comparisons between the examinees and test items can then be made (e.g. we would expect the most difficult items to be answered by only the strongest examinees). Any items which ‘misfit’ in these comparisons can be flagged for further consideration of their merit.
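The core of the Rasch model can be sketched in a few lines. In its standard form, the probability of a correct answer depends only on the difference between an examinee's ability and an item's difficulty, both on a logit scale. The ability and difficulty values below are invented, and a real analysis would estimate them from data with dedicated software.

```python
from math import exp

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter model."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))

# Hypothetical logit values: a weak and a strong examinee, an easy and a hard item.
weak, strong = -1.0, 2.0
easy, hard = -1.5, 1.5

for person_label, ability in (("weak", weak), ("strong", strong)):
    for item_label, difficulty in (("easy", easy), ("hard", hard)):
        p = rasch_probability(ability, difficulty)
        print(f"{person_label} examinee, {item_label} item: {p:.2f}")
```

An item that strong examinees miss far more often than these probabilities predict is the kind of ‘misfit’ that would be flagged for review.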
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt
In practice, reliability is usually established via an internal consistency approach. Instead of giving a test twice as in the test-retest method, it is given only once, but then split into smaller parts which can then be compared. For example, the split-half method divides the test into two halves, each considered an alternative form in its own right. The scores from the two halves can be compared, and if the scores are similar, the overall test is considered to be reliable. Internal consistency methods have the obvious advantage of only requiring a single administration, and Hughes (2003: 40) concludes they work well providing ‘that the alternate forms are closely equivalent to each other’.
When the construct being measured is a single rule or ability (third person -s, reading speed), it is possible to create tests where all the items address that single construct. However, vocabulary is largely item-based learning, and so each item addresses a separate construct, i.e. knowledge of a single lexical item. Therefore, just because one lexical item is known (e.g. the word succinct) does not mean that another will be known (pithy). This is true even if the words are similar in terms of form, meaning, or topic area. In fact, the best guide for whether words are known or not seems to be frequency of input. The easiest way of determining input is frequency lists, but these can only ever be a general guide. For example, the words pencil, eraser, and notebook are not among the most frequent words in English, but they will be especially frequent in classrooms of beginning language learners. Furthermore, while L2 learners tend to learn words in frequency order, their lexicon will follow a frequency profile rather than strict frequency order. This is illustrated by the vocabulary profiles of three Japanese learners (Figure 5.2).
Figure 5.2 Vocabulary profiles of three EFL learners (max score = 30). [Plot: scores for Student 1, Student 2, and Student 3 (y-axis, 0 to 30) by frequency band (x-axis: 2,000, 3,000, 5,000, 10,000).]
From Figure 5.2, we see that knowledge of vocabulary generally decreases as frequency decreases, in a type of stair-step pattern. That is, learners typically know fewer families in the lower-frequency bands than they do in the higher-frequency bands, but they usually still know some. Student 1 is a beginner, and knows only a few of the 2,000 and 3,000 families. Student 2 still has not mastered the 2,000 level, but knows some words at the 3,000 and 5,000 levels. Student 3 has a much larger lexicon, and scored maximum points on the 2,000 and 3,000 levels (and nearly so on the 5,000 level), and knows about two-thirds of the 10,000 level. These profiles show that it cannot be assumed that all of the lexical items within a frequency band will be learned together. To state this another way, just because some lexical items in a frequency band are known, it does not follow that other items at that band will also be known. This is illustrated by five words from the Leech, Rayson, and Wilson (2001) BNC frequency counts. Film, process, useful, conference, and operation are five words in frequency sequence, but knowing one or more of them does not mean that the others are known simply because of their adjacent frequency placement. Frequency is a good tool to gauge the probabilities of a target word being known, but is not strong enough to predict knowledge with certainty.
This causes problems for internal consistency methods. They work by assuming that the division of a test results in equivalent forms. However, because it is difficult to establish whether the lexical items of one part of a test are of equal difficulty to another part, this assumption is suspect. For vocabulary, it would be ideal to establish reliability using a test-retest format, because the lexical items on both administrations of the test would have exactly the same characteristics, as they would be exactly the same items. However, practical limitations may make this infeasible in many contexts.
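For readers who want to see what a split-half calculation involves, here is a minimal sketch. It assumes an odd/even item split and applies the standard Spearman-Brown correction to estimate full-test reliability; both choices, and the tiny response matrix, are illustrative assumptions rather than a recommended procedure. As argued above, the resulting figure is only meaningful if the two halves really are equivalent.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(responses):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    `responses` holds one row per learner, each row a list of 0/1 item scores.
    """
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)

# Hypothetical data: five learners, six items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
reliability = split_half_reliability(responses)
print(round(reliability, 3))
```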
However, if internal consistency measures are used, it may be better to manually determine that the alternate forms are as equivalent as possible, using criteria such as frequency, typical acquisition order of lexical items for your type of participant, and similarity with L1. A similar problem lies with using item-total score correlations. Again, when a single underlying construct is being assessed, every single item should reflect the total score on a test, and high correlations are one indication of a well-working item. However, when a test is made up of individual lexical items, with knowledge of each essentially a different construct, then item-total correlations do not make sense as a measure of an item’s goodness. A perfectly good item may well behave differently from others, e.g. when the item measures a word that for some reason has occurred less often than other words in a frequency band in a particular learning environment or for a particular age group. Even though this item would generate lower scores than those for other ‘higher frequency in environment’ words, it is not necessarily a bad item. The point that matters is whether that test item truly reflects testees’ knowledge of the lexical item, and not whether it matches the results from other test items. The methods of establishing reliability are just another case where vocabulary might require different techniques than other linguistic features which are more rule- or system-based.
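To make the item-total idea concrete, the sketch below correlates each item's 0/1 scores with learners' total scores. The response matrix is invented, and the function names are hypothetical. For vocabulary tests the output should be read with the caution urged above: a low value may simply reflect a word that is less frequent in a particular learning environment, not a faulty item.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; returns None if either list has no variance."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

def item_total_correlations(responses):
    """Correlate each item's scores with learners' total test scores."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson_r([row[i] for row in responses], totals)
            for i in range(n_items)]

# Hypothetical data: five learners, four items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
correlations = item_total_correlations(responses)
for number, r in enumerate(correlations, start=1):
    print(f"item {number}: r = {r:.2f}")
```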
5.1.4 Placing cut-points in study
Although this is not strictly an issue specific to lexical research, it occurs often enough to warrant a brief mention. When one has a set of data or participants, and wants to divide them into separate groups, it is necessary to divide them at one or more places. However, where the ‘cut’ is made is crucial. Typically, it is made at some point, and all of the cases below this point are put into the lower group and everything above it is placed in the higher group. The problem is the borderline cases. We can see this in the small contrived dataset below.
40, 44, 45, 48, 49, 50, 59, 60, 62, 65
Many researchers would make the cut-point at 50, and so have five cases below the cut-point and five above. But when dividing into groups, we need to be able to argue that the members of the various groups are exhibiting separate behavior, and are clearly different from each other. With this cut, though, 49 and 50 would be in different groups even though they are only one point apart. Clearly, they are much more similar to each other than the 40 and 44 scores in the same lower group are. In this data set, there is a ‘natural’ cut-point, where the data above and below are clearly differentiated: between 50 and 59. A researcher can use such a natural cut-point when it exists to form groups that are supportably different. However, if no natural cut-point exists, or if it is necessary/desirable to retain the same number of cases in the different groups, it may be necessary to delete some of the close borderline cases in order to clearly differentiate the groups. For this dataset, deleting 49 and 50 would result in two groups of four cases (40–48 and 59–65) which have an obvious gap and should represent truly different behaviors. However one places the cut-point, it is important to be able to argue that the resulting groups in fact represent separate behavior.
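The two options described above can be sketched with the same contrived dataset. The first function treats the widest gap between adjacent scores as the ‘natural’ cut-point, which is one simple way of operationalizing the idea and is an assumption of this sketch. The second removes borderline cases around a chosen cut. The function names and the margin value are illustrative only.

```python
def natural_cut(scores):
    """Return the (low, high) pair of adjacent scores with the widest gap."""
    ordered = sorted(scores)
    pairs = list(zip(ordered, ordered[1:]))
    return max(pairs, key=lambda pair: pair[1] - pair[0])

def drop_borderline(scores, cut, margin):
    """Remove cases lying within `margin` points of the cut-point."""
    return [s for s in scores if abs(s - cut) > margin]

scores = [40, 44, 45, 48, 49, 50, 59, 60, 62, 65]

# Option 1: cut at the widest gap, which falls between 50 and 59.
low, high = natural_cut(scores)
lower_group = [s for s in scores if s <= low]
upper_group = [s for s in scores if s >= high]
print(lower_group, upper_group)

# Option 2: keep equal group sizes by deleting the borderline cases 49 and 50.
trimmed = drop_borderline(scores, cut=49.5, margin=1)
print(trimmed)
```

Whichever option is used, the deleted cases and the rationale for the cut would need to be reported.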
5.2 Measuring vocabulary size
Much vocabulary research involves counting lexical items for some reason, e.g. to discover how many items a learner has acquired, to find out how many items a person needs to know to understand a conversation, or to count how many academic words a book contains. This section will first focus on issues involved in measuring vocabulary size, and will then review a number of vocabulary size testing instruments.
Quote 5.4
Meara on the importance of vocabulary size
All other things being equal, learners with big vocabularies are more proficient in a wide range of language skills than learners with smaller vocabularies, and there is some evidence to support the view that vocabulary skills make a significant contribution to almost all aspects of L2 proficiency.
Meara (1996b: 37)
5.2.1 Units of counting vocabulary
The main issue in counting vocabulary is in setting the unit of measure. Different ways of counting lexical items will lead to vastly different results, and a persistent problem in lexical studies is that size figures are reported, but without a clear indication of how they were derived. The following describes the main methods of counting vocabulary.
Tokens and types
Two useful terms when discussing vocabulary, particularly corpus research, are token and type. Tokens are the number of running words in a text, while types are the number of different words. Thus there are five tokens in the following example, but only four types, as the two occurrences of fat belong to the same type.
Fat cats eat fat rats.
Word forms
The most basic way to calculate vocabulary is to count each type (also called an individual word form) separately. In the nearby box, I list all of the variations of teach that I could find in the BNC (see Section 6.2). You will see that there are 11 different word forms in the box. This is the easiest way to count, as lexemes with even slightly different spellings are counted as new word forms. No judgements need to be made about issues like meaning, word class, or frequency. There is also some evidence that word forms are the basic psycholinguistic element. The speech production research done by Levelt and his associates shows that both in terms of production and perception, speakers generally activate the base word form. This is true even if it is an inflection or derivation (i.e. base form + inflectional or derivational affix). This evidence suggests that people acquire mainly base word forms, although in some cases inflected forms of those base forms are acquired where the frequency is high. (See Kuiper, van Egmond, Kempen, and Sprenger, 2007, for more on lexical activation, especially of formulaic sequences.)
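The token/type distinction described in Section 5.2.1 is easy to check by machine. The sketch below counts both for the example sentence. It assumes that case is ignored (so Fat and fat are one type) and that a word is a run of letters, optionally joined by hyphens or apostrophes. Real corpus tools make their own tokenization decisions, which is one reason reported counts differ.

```python
import re

def tokens_and_types(text):
    """Return (number of tokens, number of types) for a piece of text."""
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    return len(tokens), len(set(tokens))

n_tokens, n_types = tokens_and_types("Fat cats eat fat rats.")
print(n_tokens, n_types)
```

Counting word forms, as described above, amounts to counting types in this sense.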
Variations of teach found in the BNC with number of occurrences
teach        3,298     teachable        12
taught       4,224     teachability      1
teaching     9,581     teachableness     1
teaches        536     unteachable       8
teacher      9,145     teacher-like      0
teachers    12,370
Lemmas
It is clear that the items in the box are all semantically related, and in addition to this, some have a very close grammatical relationship. The first item is the base (or root) form of the verb teach. The next three items (taught, teaching, teaches) are grammatical inflections of the base form. It is not difficult to argue that these four forms are so closely related that they should be counted as one item – a lemma. Teacher and its plural inflection teachers combine to form another lemma, but the rest of the items in the box would each be a separate lemma. One good reason to count vocabulary using lemmas concerns the way the mind processes vocabulary. Some psycholinguistic research indicates that the mind stores only the base form of a lemma and then attaches inflectional suffixes (which are usually regular and consistent) on-line when they are needed (Aitchison, 2003). Thus, to the mind, a lemma operates as one lexeme, albeit one which can be grammatically manipulated. The exception is irregular forms (e.g. taught), which need to be stored separately as individual lexemes. There are only around 200 verbs with irregular past forms and many fewer irregular plural nouns in English (Schmitt and Marsden, 2006), so individual storage of irregular lemma members is definitely the exception compared to the vast majority of regular nouns and verbs which are inflected on-line. It is important to note, however, that individual irregular word forms can be quite frequent (e.g. be, men, ran). Another reason to use lemmas concerns learning burden. If a student knows the inflectional system and the base form of a word (sew), then learning its inflected forms (sewed, sewing, sews) should be relatively easy. This is not the case for irregular forms though, and so it is unclear whether they should be included in a lemma, or be counted as a new one.
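The effect of the counting unit can be seen by regrouping the figures in the box. The sketch below uses the BNC counts listed for teach and a hand-built mapping of inflected forms to lemmas that follows the grouping described in the text; the mapping is an assumption of this example, not the output of a lemmatizer.

```python
from collections import defaultdict

# Occurrences of each word form, as listed in the box for teach.
form_counts = {
    "teach": 3298, "taught": 4224, "teaching": 9581, "teaches": 536,
    "teacher": 9145, "teachers": 12370, "teachable": 12,
    "teachability": 1, "teachableness": 1, "unteachable": 8,
    "teacher-like": 0,
}

# Hand-built mapping from inflected forms to their lemma (illustrative).
lemma_of = {
    "taught": "teach", "teaching": "teach", "teaches": "teach",
    "teachers": "teacher",
}

lemma_counts = defaultdict(int)
for form, count in form_counts.items():
    lemma_counts[lemma_of.get(form, form)] += count

print(len(form_counts), "word forms")
print(len(lemma_counts), "lemmas")
print(dict(lemma_counts))
```

The same eleven word forms collapse into seven lemmas, which shows how far size figures can shift with the unit of counting.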
Word families
While lemmas are groups of related word forms within a word class, words from other word classes can also be related to a base form. For the verb teach,
for instance, there is the noun teacher and the adjective teachable. These word forms have related meanings, and clearly fit with the lemma teach. We call all of the word forms which are semantically related a word family. Thus, the word family formed around the base form teach includes all of the words in the box. This unit of counting is the best at capturing all of the word forms related to a concept. There is some psycholinguistic evidence that the mind processes word forms like these together in some way, e.g. Nagy, Anderson, Schommer, Scott, and Stallman (1989) found that the speed at which a native speaker could recognize a word was better predicted by the total word family frequency than by the frequency of the individual word form. (See also Bertram, Baayen, and Schreuder, 2000, and Bertram, Laine, and Virkkala, 2000.)
However, Kon Kuiper (personal communication) points out that the psycholinguistic reality of word families is not straightforward. The question as to whether there are word families in the mental lexicon depends in part on how one looks at formal similarity/idiosyncrasy. Let us suppose that two words are lexically related, such as silly and silliness. There is some formal similarity here, but there is also idiosyncrasy. Some related lexical items will have only very small amounts of idiosyncrasy (i.e. the forms are very similar: activate/activation), while others will have as much, or even more, idiosyncratic information than there is predictable information (deceive/deception). But even if one assumes that the learner has less to learn when there is much formal similarity, it does not necessarily follow that speakers also store lexically related items under a single mental entry. Psycholinguistic evidence shows that activation of a lexical item spreads to related lexical items, but that the activation of each item is unique. So it seems that the psycholinguistic status of word families is still undetermined.
In practical terms, word families can be much more difficult to use than other units of counting, especially in deciding which word forms belong in the family and which do not. While relatively frequent word forms like teacher are easy to include, other seemingly acceptable forms like teachable are relatively infrequent. Then we have very uncommon forms like teachableness which still follow the rules of morphology and can be found in a native corpus. Finally, there are forms like teacher-like which one might intuitively place in the word family, but which might not occur in a corpus at all. There are no hard-and-fast rules to determine the members of a word family, but if one has a good idea of their research purpose, it is still possible to make the selection in a principled way. Bauer and Nation (1993) give some guidance with a hierarchy of inflection and derivation affixes based on the criteria of frequency, regularity of form, regularity of meaning, and productivity.1
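Once the members of a family have been decided, counting by family is largely a matter of matching the forms in a text against that list. The sketch below assumes a researcher-supplied membership list for teach (here simply the forms from the box) and an invented sample sentence. It is not a description of how any particular concordancer works.

```python
import re
from collections import Counter

# Researcher-defined family membership (illustrative; criteria will vary).
families = {
    "teach": {
        "teach", "taught", "teaching", "teaches", "teacher", "teachers",
        "teachable", "teachability", "teachableness", "unteachable",
        "teacher-like",
    },
}

def family_frequencies(text, families):
    """Count tokens per word family, ignoring forms not on any list."""
    head_of = {form: head
               for head, forms in families.items() for form in forms}
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    return Counter(head_of[t] for t in tokens if t in head_of)

sample = "The teacher taught well; good teachers teach what is teachable."
counts = family_frequencies(sample, families)
print(counts)
```

Changing the membership list changes the count, which is why the inclusion criteria need to be reported alongside any family-based figure.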
Deciding the best unit of counting
The best unit of counting will depend somewhat on the technical resources available. Using word forms is often the easiest option, as concordancing
programs can count only items which have an exact formal match. However, there are now lemmatizers available which can automatically ‘tag’ a corpus and lemmatize frequency counts (see Sections 6.2 and 6.3), but they still need some manual oversight. I am aware of no program which can reliably count word families automatically on its own. For families, it is necessary to first give the program a list of family members, against which it can match forms in the corpus. However, it is better to decide the unit of counting based on the research questions, and perhaps one’s conceptualization of lexical storage. There are some research situations where it makes the most sense to distinguish between individual word forms. Charting the early vocabulary acquisition of L1 children is a case in point. Each new word form spoken at this initial stage has significance, and the acquisition of past and plural forms are important milestones in a child’s linguistic development. Thus a counting measure which captures each new individual word form is the appropriate methodology for this research setting. However, when studying the acquisition of new lexemes by older native speakers, then lemmas may well be a better measure, because counting word forms might give an overestimate of lexical growth. We would expect older natives to have mastered the basic morphology of their language, and so be able to attach grammatical affixes relatively automatically to newly learned word forms. This might result in several new word forms being counted (using word form as counting unit), when in reality perhaps only one word form was newly learned, and the rest generated via the on-line attachment of affixes. With nonnatives, the interpretation of the numbers becomes trickier. 
They are likely to vary widely in their ability to attach grammatical affixes, and so we cannot always assume that all members of a lemma are available to nonnative learners, particularly beginners, just because one member has been demonstrated. To be truly confident in our count, word forms seem a better option. Despite this caveat, lemmas are widely used in acquisition studies for several reasons. First, if we use word forms as our unit of counting, any measurement instrument is likely to have a variety of the various forms of inflection on it (e.g. boy, cars, walk, carried, seeing, sleeps). This is potentially confusing for the participant, and so measurement instruments typically include only base forms (or sometimes the most frequent forms) of a lemma or word family (boy, car, carry, see, sleep). Also, with more proficient learners, we can be more confident that they will have some mastery over the morphology of a language, and so for them lemmas are an appropriate measure. Lemmas have a transparent definition, and so the readers of research studies can form a clear idea of exactly what vocabulary is being discussed. This is not always true of word families, where the criteria for word form inclusion vary considerably. However, word families have their own advantages as a unit of counting. They correspond most closely to the notion of headwords used in
dictionaries, a concept most participants are familiar and comfortable with. This makes word families a reasonable measure when using dictionaries as a baseline for how many words exist in a language. Word families also reduce the redundancies that can occur in word counting, i.e. because all the semantically related members are included in a word family, they do not have to be handled again in another category, as they would have to be with lemmas, and especially word forms. Nevertheless, word families inevitably also have drawbacks as a unit of counting. Some of these have already been alluded to: that they are difficult to interpret because the criteria for word form inclusion vary, and that concordancing software does not usually automatically tabulate word family figures. Moreover, we cannot assume that even relatively advanced L2 learners will have good control over the derivational affixes used to form the members of a word family (e.g. permit + -ion = permission; + -ive = permissive). Bauer and Nation (1993) suggest that if one member of a word family is known, then the other members can probably be recognized as well. There is evidence to show that this holds true, at least to some extent, for receptive knowledge. However, we cannot make the same assumption for productive knowledge. Schmitt and Zimmerman (2002) found that learners studying to enter English-medium universities typically did not know all of the members of a word family productively, even though they knew one or more. If they knew any of the forms of a word family, it was usually the noun and verb forms, but they had much more trouble producing the adjective and adverb forms. There is also the issue of receptive versus productive use to consider. Paul Nation (personal communication) suggests that, in general, for receptive use, word families are the best unit to use, with the definition of what is included in the word family being related to the proficiency level of the participants involved.
This makes sense because learners should be able to perceive the similarities between members of a word family in the receptive mode, at least to some extent. For productive use, Nation feels that the lemma, or even word form, is the best unit of counting to use. This is because productive use is more difficult, and having productive mastery over one member of a word family does not necessarily entail having it over other members. He also points out that John Sinclair would probably argue for individual word form, because the collocates are often different for the different word forms included in a lemma, and thus each word form would require a slightly different kind of knowledge for productive use. While this argument is persuasive, it does create problems for vocabulary measurement. If a researcher uses families for counting receptive research and lemmas/word forms for productive research, the different units of counting would mean that there could be no truly parallel receptive/productive tests. This would make it difficult to directly compare the receptive and productive knowledge of participants.
So what are we to make of all this? I am personally developing a feeling that lemmas might be the best general unit of counting for four reasons: (1) the unit is relatively straightforward, which means that consumers of research studies will know what it means; (2) this relative simplicity also makes replication and comparison of studies more feasible; (3) lemmas might be a reasonable compromise for counting both receptive and productive vocabulary, and thus making receptive and productive studies comparable; and (4) it takes a lot of vocabulary to function in a language, and estimates based on word families may give the impression that less is necessary than is the case, especially as many consumers may simply interpret word family figures as ‘words’. Using lemmas as the counting unit counteracts this because the lemmas are easier to conceptualize, and because the figures will be higher than word family figures in any case. In sum, the unit of counting vocabulary must reflect the goals, participants, and resources of your study. If it is a corpus study, then choose the unit that best matches the facet of lexis you are exploring, the amount of tagging in your corpus, and capabilities of your concordance software. However, if you are running a study using participants, you must take their background and abilities into account. You should consider how many affixes they are likely to know, and how well they can use them. For most aspects of lexis, word form counts make sense for less proficient participants, while lemma or word family counts can be more appropriate for more proficient participants. There is also the issue of standardization. Vocabulary research will be much more comparable, and thus useful, if all researchers used the same unit of counting. On balance, I feel that lemmas are probably the best unit overall, as it is relatively easy to lemmatize words and they are unambiguous to interpret. 
Whichever unit you choose to use, it is critical that you report this unit prominently in your research account, and clearly explain how it was used in making the size counts. If not using individual word forms as your unit, it is also probably useful to include a discussion describing how your unit compares to word forms, as many consumers will think in terms of word forms as a default.
5.2.2 Sampling from dictionaries or other references
One of the most common ways of establishing the vocabulary size of both natives and nonnatives is to first establish the overall population of lexical items which could be known, select a sample of these items to fix on a test, and then assume that the percentage of items answered correctly on the test represents the percentage of items known in the total population. There is no one reference source which includes all of the lexical items which could possibly be known, including high-, mid-, and low-frequency words, formulaic sequences, technical words and phrases, brand names, names of people and places, etc. The closest we have to a comprehensive source is dictionaries, although these vary widely in their
coverage. In English, the most comprehensive dictionary available is the Oxford English Dictionary. It is a massive multi-volume work, but still does not include many items, such as scientific terminology or the names of many geographical features. Nation (1993) reviews early size measurements and concludes that none sampled from dictionaries in a manner that produced valid results. Most included far too many high-frequency words (which participants are more likely to know), which generally led to inflated estimates of vocabulary size. He outlines a number of essential, but rarely followed, procedures which are necessary for a representative sampling from dictionaries and other reference sources.
1. Choose a dictionary that is big enough to cover the known vocabulary of the people being investigated. For educated adult native speakers, Nation suggests using a dictionary with at least 30,000 base words. Most modern dictionaries, both native and learner, meet this criterion. In fact, with second language learners, smaller dictionaries or word frequency lists may be more appropriate, so that learners have some chance of knowing a reasonable proportion of the target lexical items. If a full-sized dictionary were sampled, then beginning learners would know only a few of the sampled items, leaving most of the lower-frequency vocabulary items far beyond their level, and so effectively wasted. It is better to choose a more limited vocabulary population, so there are more samples of vocabulary within the examinees’ potential knowledge range.
2. Use a reliable way of discovering the total number of entries in the dictionary. Dictionary makers are not particularly concerned about the number of entries in their dictionary, except in terms of keeping the length of the dictionary to a reasonable size. However, in advertising, a greater number of entries is more impressive, and Nation found that publishers’ statements greatly exaggerated the number of entries. Therefore it is not possible to rely on these publishers’ figures. The true total number of entries can be found by counting each entry either manually or with a computer, or by counting a sample of the dictionary.
3. Use explicit criteria for deciding and stating (a) what items will not be included in the count and (b) what will be regarded as members of a word family. All dictionary publishers have criteria for what will be included in their dictionaries. However, criteria based around lexicographic purposes will probably not be suitable for research purposes. Therefore researchers need to set exclusion criteria to delete entries which are not appropriate for their studies into vocabulary size. Entries can be deleted for several reasons:
● because they may not be important for the participant population (e.g. scientific terminology, geographical place names)
● in order to lower the number of words in the total population, so that the resulting test can have a greater percentage of feasible items (see above)
● to avoid counting multiple headwords with related forms (legal, legalese, legalism, legality, legalize). Such duplication can lead to inflated estimates of vocabulary size, and so such forms are better coalesced into word families.
4. Use a sampling procedure that is not biased towards items which occupy more space and have more entries. Some lexical items have many meaning senses, and will take up much more space than other items. These longer entries tend to be high-frequency vocabulary, and so it is important not to over-sample these items, or inflated estimates of vocabulary size will occur, because participants are more likely to know the high-frequency vocabulary. Ways to compensate for this problem include using numbered entries (choosing every nth word), choosing every nth complete entry (which is not a homograph of a previous entry) on every mth page, random sampling, and stratified sampling based on letters of the alphabet.
5. Choose a sample that is large enough to allow an estimate of vocabulary size that can be given with a reasonable degree of confidence. More items on a vocabulary test lead to more confident estimates of vocabulary size. In general, it is advisable to include the greatest number of items possible. This will depend primarily on the amount of time available for the test and the test item format (e.g. checklist items are very quick to answer; gap-fill items less so).
6. The sampling should be checked for the reliability of the application of the criteria for exclusion and inclusion of items. There are several ways of checking whether the inclusion criteria for lexical selection are being consistently applied. For example, the sample can be done in sections and the figures for each section compared, or more than one rater used and inter-rater reliability established.
7. The sample should be checked against a frequency list to make sure that there is no bias in the sampling towards high-frequency items. Once the sample is taken, it should be checked against frequency counts to see if it contains the appropriate number of lexical items at each frequency level. For example, if each item in the sample represents 100 words in the dictionary, then there should be ten words in the sample within the first 1,000 frequency band (10 x 100). Similarly, there should be ten words in the second 1,000 band, etc.

8. In the written report of the study, describe clearly and explicitly how each of the previous seven procedures was followed in sufficient detail to allow replication of any or all of the procedures. This reflects the same need for clear and precise reporting discussed in Section 4.7.

5.2.3 Recognition/receptive vocabulary size measures

The next two sections will overview a number of vocabulary tests which measure recognition/receptive and recall/productive mastery of lexical items. The tests discussed have all been used in research to various degrees, and some, but certainly not all, have a degree of validity evidence in place.

Peabody Picture Vocabulary Test (PPVT)

Most vocabulary tests have been used for measuring L2 lexical knowledge, simply because most native speakers have a considerable-sized vocabulary, with Goulden, Nation, and Read (1990) estimating sizes of about 17,000 word families for their New Zealand university students. With relatively large vocabularies forming comparatively early in life, a more interesting facet of lexical knowledge for most native speakers is their relative lexical automaticity. However, for young children where vocabulary is in an early stage of development (two to six years old), and in the elderly (90+ years), where language is attriting, measures of vocabulary size can be interesting. One test used in these L1 cases is the Peabody Picture Vocabulary Test. It is a meaning-recognition test in which examinees listen to words spoken by the test administrator, and then point to the picture which best represents this meaning from a group of four simple, black-and-white illustrations. The test takes about 11–12 minutes on average. It is available in two parallel forms, each containing four training items followed by 204 test items divided into 17 sets of 12 items each. The sets are progressively more difficult. The PPVT test is commercially available at .
Vocabulary Levels Test (VLT)

Perhaps the most widely used vocabulary size test in the ESL context is the Vocabulary Levels Test. In 1996, Meara called it the ‘nearest thing we have to a standard test in vocabulary’ (1996b: 38), and it may still hold that distinction today after going through several iterations (Nation, 1983, 1990; Beglar and Hunt, 1999; Schmitt et al., 2001) (although see the Vocabulary Size Test below). It is called the Levels test because it focuses on vocabulary at four frequency levels: 2,000, 3,000, 5,000, and 10,000. These bands coincided with the then current consensus of how much vocabulary was necessary for achieving key goals. Based on Schonell, Meddleton, and Shaw (1956), it was thought that around 2,000 word families were sufficient to engage in daily conversation; 3,000 families were thought to enable initial access to authentic reading, and 5,000 families independent reading of that material. In addition, 5,000 families represented the upper limit of general high-frequency vocabulary; 10,000 families was a round figure for a wide vocabulary which would enable advanced usage in most cases. (Note that current estimates of the vocabulary size requirements of English are much higher; see Section 1.1.2.) In addition, there is a section focusing on academic vocabulary, which is not frequency-based.

The VLT test uses a form-recognition matching format, in which the stem is the definition, and the options are the target words. Each cluster of items contains three stems and six options. In the latest Schmitt et al. versions, each level has ten clusters (i.e. 30 items). Below is a sample cluster:

You must choose the right word to go with each meaning. Write the number of that word next to its meaning.

1. concrete
2. era          _____ circular shape
3. fiber        _____ top of a mountain
4. hip          _____ a long period of time
5. loop
6. summit

A number of points can be made about this format.

● Based on the definitions in Section 2.8, the VLT is a form recognition test.
● For consistency, each cluster contains words from only one word class. Roughly reflecting the distribution of word classes in English, there are five noun clusters, three verb clusters, and two adjective clusters per frequency level.
● Both the three target words and the three distractors are from the particular frequency band, which means that the examinees are considering six words from the band in each cluster.
● The definitions are kept short, so that there is a minimum of reading, allowing for more items to be taken within a given period of time.
● The test is designed to tap into the initial stages of form-meaning linkage. Therefore, the option words in each cluster are chosen so that they have very different meanings. Thus, even if learners have only a minimal impression of a target word’s meaning, they should be able to make the correct match.
● The clusters are designed to minimize aids to guessing. The target words are in alphabetical order, and the definitions are in order of length. In addition, the target words to be defined were selected randomly from the six options in each cluster.
● The words used in the definitions are always more frequent than the target words. The 2,000 level words are defined with 1,000 level words, and wherever possible, the target words at other levels are defined with words from the General Service List (GSL) (essentially the 2,000 level) (see Nation, 1990: 264, for more details). This is obviously important as it is necessary to ensure that the ability to demonstrate knowledge of the target words is not compromised by a lack of knowledge of the defining words.
● The word counts from which the target words were sampled typically give base forms. However, derived forms are sometimes the most frequent members of a word family. Therefore, the frequency of the members of each target word family was checked, and the most frequent one attached to the test. In the case of derivatives, affixes up to and including Level 5 of Bauer and Nation’s (1993) hierarchy were allowed.
● As much as possible, target words in each cluster begin with different letters and do not have similar orthographic forms. Likewise, similarities between the target words and words in their respective definitions were avoided whenever possible.
The test is not really designed to provide an estimate of a person’s overall vocabulary size, although some studies have combined the frequency levels to produce a total size figure. The test is better used to supply a profile of learners’ vocabulary, which is particularly useful for placement and diagnostic purposes. The profiles illustrated in Figure 5.2 in Section 5.1.3 are products of the VLT. Validity evidence for the VLT is available in Read (1988), Beglar and Hunt (1999), and Schmitt et al. (2001). The two (relatively) equivalent Schmitt et al. versions are available in Section 6.1.
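To make the idea of a profile concrete, the following minimal Python sketch tallies a learner's correct answers separately for each level rather than summing them into a single size figure. It assumes the 30 items per level of the Schmitt et al. versions; the function name, the input format (one level label and one correct/incorrect flag per item), and the example learner are hypothetical and are not part of the VLT itself.

```python
from collections import defaultdict

ITEMS_PER_LEVEL = 30  # ten clusters of three items per level (Schmitt et al. versions)


def vlt_profile(responses):
    """Summarize VLT results level by level.

    responses: iterable of (level_label, is_correct) pairs, one pair per item.
    Returns {level_label: (number_correct, proportion_correct)}.
    """
    correct = defaultdict(int)
    for level, is_correct in responses:
        # Adding 0 for a wrong answer still registers the level in the profile.
        correct[level] += int(is_correct)
    return {level: (n, n / ITEMS_PER_LEVEL) for level, n in correct.items()}


# Hypothetical learner: strong at the 2,000 level, no knowledge at 10,000.
example = [("2,000", True)] * 27 + [("2,000", False)] * 3 + [("10,000", False)] * 30
for level, (n, proportion) in vlt_profile(example).items():
    print(f"{level}: {n}/{ITEMS_PER_LEVEL} ({proportion:.0%})")
```

Keeping the levels separate in the output reflects the point made above: the per-level scores are the diagnostic information, and adding them together would discard it.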
Vocabulary Size Test (VST)

The Vocabulary Size Test made its first appearances in Appendix 2 of Focus on Vocabulary (Nation and Gu, 2007) and Appendix 4 of Teaching Vocabulary (Nation, 2008). It is now also available in an interactive web format on Tom Cobb’s Lextutor website. It employs a traditional four-option multiple choice meaning-recognition format, with the target word and a non-defining example sentence as the stem. The VST is broken into 1,000-word frequency bands, and ranges from the first 1,000 band to the thirteenth or fourteenth 1,000, depending on the version. Each 1,000 word frequency band contains ten items, so each item represents 100 words within that frequency band. The item format is illustrated below:

sheriff: The sheriff was friendly.
(a) person who flies aeroplanes
(b) person who takes care of babies
(c) person who makes sure the law is obeyed
(d) person who teaches children at home

The words on the test were randomly selected from the Collins English Dictionary and sequenced into the 14 frequency bands based on range and frequency figures from the spoken section of the BNC. Beglar (2010) carried out a Rasch validation study on the VST on 178 Japanese EFL learners and 19 native speakers and found that the examinees’ scores generally decreased towards the lower-frequency bands (i.e. highest scores on first 1,000 band; lowest score on fourteenth 1,000 band). The Rasch model was able to account for 86% of the total variation in the test scores, and the test items generally had good technical characteristics. The reliability figures were very high (.96–.98). Although this is only an initial validation analysis, the results are promising, and give no indication why the test should not be used.

As opposed to the VLT, which produces a profile of knowledge at various frequency levels, the VST is intended as a test of overall vocabulary size. It should have value in measuring learners’ progress in vocabulary learning. The most frequent 14,000 words of English along with proper nouns account for over 99% of the running words in written and spoken text (Nation, 2006). Although adult native speakers’ vocabularies are much larger than 14,000 words, these 14,000 words include all the most important words. Initial studies using the test indicate that undergraduate nonnative speakers successfully coping with study at an English-speaking university have a vocabulary around 5,000–6,000 word families. Nonnative-speaking PhD students have around a 9,000 word family vocabulary.

Checklist tests from Meara and colleagues
A number of checklist tests have been developed by Paul Meara and his colleagues. Checklist tests utilize the simplest format of any vocabulary test, where examinees read lists of lexical items in isolation and simply indicate whether they think they know the items or not. For this reason, they are also called Yes/No tests, and probably should be considered meaning-recall items, even though the meaning does not have to be demonstrated. The
basic format will look something like the following extract from the Yes/No test, an interactive checklist test created by Paul Meara (1992), placed into web format by Tom Cobb, and available on the Lextutor website.

Test 1, Level 1:

1 obey        2 thirsty     3 nonagrate   4 expect      5 large
6 accident    7 common      8 shine       9 sadly       10 balfour

A number of points can be made about the checklist test format.

● The test is easy to take. The examinees simply decide whether they think they know an item or not, and either ‘check’ (✓) known words on a paper-and-pencil version, or click on the box in a computerized version. This simplicity means that the items are quick to take, and so many more items can be included on a test compared to more time-intensive formats. This makes it possible to have relatively higher sample rates with checklist tests.
● The test rubrics usually ask examinees to judge whether they ‘know’ the lexical items. This means that examinee variability can have an effect on participants’ scores. If examinees are conservative, and only check items they are completely sure of, their scores will be relatively lower than examinees who are less rigorous and check items they have some sense they might know. In fact the true underlying knowledge may be the same, but the test results can be different based on the examinees’ relative judgement behavior. In addition to this ‘degree of confidence’ issue, examinees can differ in how they understand the notion of ‘know an item’. Most will probably take this to mean that they know the form-meaning link, but some may think in terms of being able to recognize the item when listening or reading, or being able to produce the item in their speech and/or writing. To avoid these problems, the test rubrics should spell out more precisely what the criteria of ‘knowing’ are. For example, the rubrics could specify that examinees should check any items for which they know at least one meaning. Alternatively, a can-do approach can be taken: e.g. check any item which you are confident that you can use in your own writing without using a dictionary.
● A checklist test has no direct demonstration of knowledge, and there is always the chance that examinees will overestimate their vocabulary knowledge, i.e. check items that they do not in fact know. This connects with the above point that examinees may be relatively more or less careful in checking words as known. This is usually controlled for by adding plausible nonwords to the test to see if examinees check them. The ratio is usually around 25–33% of the total items. There is no set number or percentage of checked nonwords which invalidate a test, but if more than
a few are selected, it raises serious doubts about the real items which are checked. For example, on the Yes/No test illustrated above, there are 40 real words and 20 nonwords. If only four nonwords are checked out of the 20 (20%), this suggests that 20% of the real words might also be at risk, i.e. eight. Since the 40 real words on the test represent 1,000 words in the total population (the first 1,000 word band), each of the target words represents 25 words. Thus selecting four nonwords indicates that the estimate of total vocabulary size might be overestimated by 200 words. Thus even a few nonwords checked is a potential problem. There are two approaches to dealing with this problem. The first is to set a maximum of nonwords, over which the data is discarded as unreliable. Schmitt, Jiang, and Grabe (in press) took this approach and deleted all participants who checked more than three nonwords out of a total of 30 (i.e. 3/30 (10%) maximum nonwords allowed to be checked). The other approach is to use some formula to adjust the test score downwards according to the number of nonwords checked. The problem is deciding which adjustment formula to use, as it is still unclear how well the various adjustment formulas work (Beeckmans, Eyckmans, Jansens, Dufranne, and Vande Velde, 2001; Huibregtse, Admiraal, and Meara, 2002; Mochida and Harrington, 2006). An example of a simple formula is Anderson and Freebody’s (1983):
\[ \text{True } h = \frac{h - f}{1 - f} \]
This formula follows from Signal Detection Theory, which compares hits (appropriate responses correctly selected) and false alarms (inappropriate responses incorrectly selected). In checklist testing, this translates to: h = hit rate (real words selected as known) f = false alarm rate (nonwords selected as known) A more complex formula (Index of Signal Detection) which attempts to correct for sophisticated guessing and response style was developed by Huibregtse et al. (2002):
\[ I_{SDT} = 1 - \frac{4h(1 - f) - 2(h - f)(1 + h - f)}{4h(1 - f) - (h - f)(1 + h - f)} \]
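The two adjustment formulas, together with the ‘words at risk’ arithmetic from the Yes/No example above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the figure of 30 real words checked is invented for the example, since the text specifies only the 40 real words, the 20 nonwords, and the four nonwords checked.

```python
def true_hit_rate(h, f):
    """Anderson and Freebody's (1983) correction: (h - f) / (1 - f)."""
    if f >= 1:
        raise ValueError("undefined when every nonword has been checked")
    return (h - f) / (1 - f)


def isdt(h, f):
    """Index of Signal Detection (Huibregtse et al., 2002)."""
    numerator = 4 * h * (1 - f) - 2 * (h - f) * (1 + h - f)
    denominator = 4 * h * (1 - f) - (h - f) * (1 + h - f)
    if denominator == 0:
        raise ValueError("undefined for this combination of h and f")
    return 1 - numerator / denominator


def words_at_risk(n_real, n_nonwords, nonwords_checked, words_per_item):
    """Possible overestimate if real words are over-checked at the nonword rate."""
    return nonwords_checked * n_real * words_per_item / n_nonwords


# Yes/No example from the text: 40 real words, 20 nonwords, 4 nonwords checked,
# each real word standing for 25 words in the first 1,000 band.
print(words_at_risk(40, 20, 4, 25))   # 200.0, the overestimate given in the text

# Hypothetical examinee who also checked 30 of the 40 real words.
h, f = 30 / 40, 4 / 20
print(true_hit_rate(h, f))            # adjusted hit rate under the simple formula
print(isdt(h, f))                     # adjusted score under the more complex formula
```

Both functions return the maximum value of 1 for an examinee who checks every real word and no nonwords, and a value of 0 when the hit rate and the false alarm rate are equal, which is the behavior one would want from any correction of this kind.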
There is even some question whether such adjustments are necessary for all types of participants. Shillaw (1996) found that his Japanese university EFL subjects were careful enough that a Rasch analysis of checklist results using only the real words on the test was not substantially different from results
including both the real words and the nonwords. Thus Shillaw concludes that ‘on these [checklist] tests and for these students, the presence of nonwords had little effect on their test performance’ (p. 7). Overall, it is still unclear which nonword adjustment method is best to calculate final checklist test scores, as the various adjustment formulas seem to lead to very similar results (Huibregtse et al., 2002; Mochida and Harrington, 2006). However, unless participants are predisposed to answer very carefully, it will be necessary to use some adjustment formula, or to delete tests with too many nonwords chosen, in order to control for overestimation.

In addition to the Yes/No test on Lextutor, the following two checklist tests are available on Paul Meara’s lognostics website.

X_Lex

This computerized checklist test measures words up to the 5,000 level, and provides a profile of vocabulary known at the first five 1,000 frequency bands, as well as an overall vocabulary size estimate. At the time of writing, it was at Version 2.05.

Y_Lex

Y_Lex is the advanced companion test to the X_Lex, and is aimed at more advanced speakers. It measures words in the 6,000–10,000 frequency range. Like X_Lex, it provides an overall vocabulary size estimate and a profile of vocabulary knowledge, but in this case with the 6,000, 7,000, 8,000, 9,000, and 10,000 frequency bands. It was also at Version 2.05 when this book was written.

Computer Adaptive Test of Size and Strength (CATSS)

Laufer and Goldstein (2004) developed a computerized test of vocabulary knowledge called CATSS. It uses four different item formats to give estimates of both vocabulary size and depth of knowledge, in the sense that an indication is given of how ‘strong’ the form-meaning link is. Several aspects of the test have already been discussed in Section 2.8, but an additional point is worth making. The test has the advantage of being adaptive in two ways.
First, if an examinee does well on the early high-frequency words, the computer program quickly advances the test to the next frequency band. This repeats until the examinee starts missing a number of the words in a band. The computer can then concentrate on words at around that frequency level to get a more accurate picture of the examinee’s vocabulary size. This avoids wasting time on words which are far too easy or too difficult for the examinee, and allows a far better sampling at the level where there is uncertainty whether the words are known or not. This adaptiveness has a great advantage over static tests, where either the test administrators must guess the frequency levels to give to the examinees, or the examinees must work their way through the whole
test in a lockstep fashion. CATSS is also adaptive in terms of the four levels of form-meaning link ‘strength’. If an examinee knows a higher level (e.g. form recall), then the easier ones are not tested (e.g. form recognition).

5.2.4 Recall/productive vocabulary size measures

The Productive Vocabulary Levels Test (PVLT)

Laufer and Nation (1995, 1999) used the words and frequency bandings from the form-recognition version of the VLT to create a form-recall version. The item format is a defining sentence context with a blank for the target word which examinees fill in. In order to disambiguate between the possible synonyms which could be inserted into the blank, enough initial letters are given at the beginning of the blank to hopefully limit the possible answers down to the target word.

1. Every working person must pay income t_____.
2. The differences were so sl_____ that they went unnoticed.
3. There are a doz_____ eggs in the basket.
4. The telegram was deli_____ two hours after it had been sent.
5. The pirates buried the trea_____ on a desert island.
6. The afflu_____ of the western world contrasts with the poverty in other parts.
7. Farmers are introducing innova_____ that increases the productivity per worker.

A number of points can be made about this format. One of the most noticeable is that some of the target words have only one letter to disambiguate them, while others have up to six. What effect this variation has on the relative difficulty of the various target words is unclear. It may make little difference, but given the potential difficulty of learning form (Sections 1.1.8 and 2.1), one would expect that it might have a considerable effect. Another issue is the ‘transparency’ of the answers. In Example 1, the very strong collocation income tax serves to make the answer rather obvious in this sentence context (if it can be assumed that the examinee has intuitions about this type of more frequent collocation; see Durrant and Schmitt (2009) in Section 3.9). The same would appear to be true for Examples 3 (dozen eggs), 4 (telegram delivered), 5 (pirates buried the treasure). Examples 6 and 7 have associations which may help in answering the items (affluence – poverty; innovation – productivity). However, Example 2 lacks such a strong collocation, or for that matter, any obvious schema from which to fill in the gap. For this reason, it might be expected that this item would be more difficult than the others. In addition, it only has two given letters, which may make it more difficult to identify the target word. Thus, the items on this test vary in both their formal characteristics (number of letters), and the defining power of the context sentences.
These issues may or may not be problematic, and the only way to know is to carry out a careful validation study to ensure that individual items and the test overall are working as desired. Laufer and Nation (1999) carried out a small validation study and found some initial positive evidence. The different levels produced scores in an expected ‘stair-step’ profile, with higher-frequency levels being known better than lower-frequency levels. (Note that the University Word List (UWL)2 is not frequency-based, and so cannot be considered as part of this evidence, but is still shown for interest.) Furthermore, examinees at higher-grade levels (and presumably higher L2 English proficiencies) scored better than lower-grade examinees (Table 5.2). Thus, the levels as a whole appear to behave as one would expect. However, there was no inquiry into the behavior of the individual items within the frequency levels. Thus, we have little idea of the effects of the differing numbers of prompt letters and differing contextual defining power.

There is also the question of what the test is measuring. I describe it above as a form-recall test, but this is not totally accurate, as some of the form is already given in the items, and in some cases, a great deal. Laufer and Nation (1999) describe the measure as a test of active vocabulary, but it is not entirely clear how this is to be interpreted (words which are available for productive use in writing, or only words which can be produced when prompted). In an earlier study, Laufer and Nation (1995) found some moderate correlations between the PVLT and the Lexical Frequency Profile (LFP – a frequency-based measure of learner writing; see below). This suggests that there is some relationship between the scores on the PVLT and participants’ ability to produce vocabulary in their writing. Unfortunately, this suggestion is not straightforward.
The LFP only measures at the first 1,000, second 1,000, UWL, and Not in Lists (all other words not on the first three lists). This does not map well with the frequency band breakdowns in the PVLT, and so the correlations between the two measures are not direct comparisons. In another study, Laufer (1998) found no correlations between the 2,000+ level of the LFP and the PVLT, which also potentially raises doubts about the ability of PVLT to measure the productive ability to use vocabulary.

Table 5.2 Scores on the Productive Vocabulary Levels Test (max = 18)

              2,000   3,000   UWL     5,000   10,000
10th grade    11.8    6.3     2.6     1.0     0.0
11th grade    15.0    9.3     5.3     3.9     0.0
12th grade    16.2    10.8    7.4     4.7     0.9
University    17.0    14.9    12.6    7.4     3.8

(Adapted from Laufer and Nation, 1999: 39)

This leads Read (2000) to wonder whether the
PVLT might be better considered an alternative way to measure receptive vocabulary knowledge rather than as a measure of productive vocabulary. All of this is not to say that the PVLT is problematic; it is simply pointing out that there is not enough evidence to know its true value. As with all vocabulary tests, the confidence we can place in interpretations drawn from this test is directly reliant on the rigor and comprehensiveness of the validation argument. I feel that a direct validation procedure, as suggested in Section 5.1.3, is an essential component of building such an argument. To further explore the PVLT, I would have examinees take the test, and then have them try to use the target words in a writing context as part of an interview process. Their ability or inability to do this would be very insightful into whether the PVLT is actually a productive measure as claimed. I would also run an item analysis, and, controlling for frequency, check whether the context/number of letters given has an impact on the difficulty of the individual test items. This fuller validation analysis would surely go some way towards understanding how the PVLT should be interpreted, and the degree to which those interpretations can be relied upon.

The PVLT has been promoted as an active/productive test, but according to my proposed terminology (Section 2.8), it is closest to a form-recall test which assesses the form-meaning link. Further, it is not productive in the sense that it does not require examinees to produce the lexical items in the course of their spoken or written output. Similarly, in Read’s (2000) terms, it is a selective test, in that target items were preselected by the test creators. This preselection is necessary if the test developer wishes to create tests with particular lexical items. However, it would be extremely useful, and more ecologically valid, if a means were available to measure all of the lexical items in an examinee’s output, i.e.
a comprehensive measure in Read’s terms (a measure which takes account of the whole vocabulary content of the examinees’ response (writing/speaking tasks)). There have been several approaches to creating comprehensive measures, with some of the major efforts outlined below.

Frequency-based comprehensive methods

One of the major methods is to classify the lexical output according to frequency. Several measures have taken somewhat different approaches using frequency.
Lexical Frequency Profile (LFP)

One of the best known frequency-based measures is the Lexical Frequency Profile. It utilizes the VocabProfile software developed by Paul Nation, Alex Heatley, and Averil Coxhead from the Victoria University of Wellington, and is available on Nation’s website. VocabProfile breaks the vocabulary of inputted language into four categories: first 1,000 frequency band (1K),
second 1,000 frequency band (2K), words in the AWL (the UWL was used in early versions), and all remaining vocabulary not in any of these three categories (Off-list/2K+). The idea is that less proficient learners will produce texts mostly made up of the highest-frequency vocabulary (first 1,000), and very little of the lower-frequency bands and the AWL. Conversely, more advanced learners would have a larger vocabulary and so produce more of this lower-frequency vocabulary. Laufer and Nation (1995) trialled the measure with 22 mixed L1 low intermediate ESL learners in New Zealand, 20 Israeli first-year first-semester university students, and 23 Israeli first-year second-semester university students, which the authors believe to be three clearly-distinct proficiency levels. The means of these three groups at each LFP frequency level are shown below:
                          1st 1,000   2nd 1,000   AWL    Not in Lists
Low intermediate          87.0a       7.1         3.7    3.1
University 1 semester     79.6        6.8         8.0    6.1
University 2 semesters    75.5        6.1         9.1    8.1

a The figures are averages of scores from two compositions, and so the total percentages do not add up to 100% due to rounding.
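The four-way classification behind such a profile can be illustrated with a short Python sketch. It is not the VocabProfile software: it counts running words (tokens) against plain sets of word forms, whereas the real lists are organized by word family, and the function name, the tokenizing rule, and the tiny word lists in the usage example are all hypothetical stand-ins.

```python
import re

BANDS = ("1st 1,000", "2nd 1,000", "AWL", "Not in Lists")


def lexical_frequency_profile(text, first_1000, second_1000, academic):
    """Return the percentage of running words falling into each LFP category.

    The three list arguments are sets of lower-case word forms (assumed inputs).
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = dict.fromkeys(BANDS, 0)
    for token in tokens:
        if token in first_1000:
            counts["1st 1,000"] += 1
        elif token in second_1000:
            counts["2nd 1,000"] += 1
        elif token in academic:
            counts["AWL"] += 1
        else:
            counts["Not in Lists"] += 1   # everything not on the three lists
    total = len(tokens)
    if total == 0:
        return dict.fromkeys(BANDS, 0.0)
    return {band: 100 * counts[band] / total for band in BANDS}


# Toy word lists for illustration only; the real lists hold about 1,000 families each.
profile = lexical_frequency_profile(
    "The students analyze the data",
    first_1000={"the"},
    second_1000={"students"},
    academic={"analyze", "data"},
)
print(profile)
```

Because each word is assigned to the first list on which it appears, the four percentages always sum to 100 for a single text, apart from floating-point error; the totals in the table above differ slightly from 100 only because they are rounded averages over two compositions.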
These results illustrate that all of the learners produce a typical stairstep profile, with the vast majority of words produced coming from the first 1,000 frequency band. Moreover, the more advanced learners do use increasingly higher percentages of lower-frequency vocabulary. This is evidence that the LFP is tapping into the ability to produce compositions with a richer vocabulary. However, this conclusion must be tempered somewhat by several observations. First, the AWL is not solely frequency-based, and so one cannot view the profile in a sequential order, with Not in Lists words being considered less frequent than AWL words. In fact, academic vocabulary varies widely in frequency, with some words like major being very frequent (426th most frequent) and others being much rarer (reluctance 5,455th) (figures according to Adam Kilgarriff’s BNC frequency counts, see Section 6.4). It is probably best to interpret the LFP profile as first 1,000, second 1,000, and then everything else, broken into two concurrent categories: academic support vocabulary (AWL) and general English vocabulary (Not on Lists).3 With this in mind, the LFP only makes frequency distinctions at three levels: first 1,000, second 1,000, and others (AWL or Not on Lists). In practice this may be too crude to be all that informative. Beginning learners of English will know little vocabulary beyond the 2,000 level, and so the LFP will only be able to give a rather unsophisticated distinction of
the percentage of vocabulary produced at the first 1,000 versus the second 1,000 levels. More advanced learners will produce vocabulary at the 2,000+ level, but the LFP lumps all of this together, and so does not make any distinctions at this level, with the exception of the academic/nonacademic dichotomy. Another problem is that the vocabulary produced by even very advanced learners (or native speakers for that matter) will still be largely made up of the first 1,000 words. For example, 71.4% of Coxhead’s Academic Corpus consisted of the first 1,000 headwords of the GSL, with the second 1,000 making up 4.7%. Thus the percentages of 2000+ words are always going to be relatively small in comparison. It can be considered problematic to base the major part of the analysis on a relatively small percentage of the lexical output. Another problem is that the LFP only indicates whether lower-frequency vocabulary appears in compositions; it can give no information about how well it is used. The words could be used inappropriately, or even totally incorrectly, and the measure would still indicate their usage. Thus the LFP procedure itself gives little indication of the degree of mastery of the productive lexical items. To avoid this problem, the texts need to be manually edited beforehand and words used incorrectly deleted because they cannot be considered as known. While this works, the necessity for previous manual analysis (i.e. correction) by a proficient language user before the data can be analysed by the software must be considered a weakness. Also, I think it is preferable to have measures that address all learner output, whether appropriately used or not, in order to come up with an estimate of the quality of vocabulary knowledge/usage. 
But in my own experience, the main problem is that I have found the LFP too blunt a measure to consistently indicate the lexical differences between compositions with more- and less-advanced use of vocabulary, as judged by rater judgement or intuition. It can also have problems showing vocabulary improvement. Horst and Collins (2006) looked at narrative texts produced by 210 beginner French learners of English over four 100-hour intervals of intensive language instruction. The learners made substantial progress in language proficiency during this time, but an LFP analysis of the longitudinal compositions did not reflect this improvement. However, a more detailed analysis did show lexical improvement in terms of using fewer French cognates, a greater variety of frequent words, and more morphologically developed forms. In other words, there was clearly improvement in lexical production, but just not of the frequency-based type which the LFP could discern. So although I feel the frequency-based methodology behind the LFP is worth pursuing, I have to wonder just how informative this early operationalization can be. For more discussion on the merits of the LFP, see Meara’s (2005) critique, Laufer’s rebuttal (2005b), and Section 2.8.
BNC 20,000 Profile
It is possible that a more fine-grained frequency analysis would have better measurement characteristics. The BNC-20 version of VocabProfile provides such an analysis. It is based on the work of Paul Nation and Tom Cobb, and was launched in 2007 on the Lextutor website. The program gives a frequency breakdown of vocabulary inputted into the website, but has several advantages over the older version of VocabProfile upon which the LFP was based. First, BNC-20 gives frequency bandings at each 1,000 level up to and including the 20,000 level. This is a much fuller frequency description of the vocabulary, although the key additions are the 3,000–10,000 bandings, as the amount of vocabulary falling into the 10,000+ bands is relatively minor, sometimes making up only a handful of words. The second advantage is that there are no academic vocabulary categories in the BNC-20; therefore all of the banding categories are solely frequency-based, and so directly comparable. Third, the frequency information comes from the BNC, and so is probably a better representation of current English than the GSL information used by VocabProfile. Another advantage is that BNC-20 gives a wealth of frequency information: types, tokens, percentage of coverage of each band, cumulative percentage of coverage, and word families. The output is color-coded (as is output from the ‘classic’ VocabProfile also available on the website), with words in each frequency band indicated by a different color. This holds true both for a reproduction of the inputted text with each word color-coded for frequency, and for lists of words broken down per frequency level in three different ways: tokens, types, families. Table 5.3 gives an illustration of the summary frequency table, based on a little over 10,000 words taken from a draft version of this unit on measuring vocabulary.
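The coverage and cumulative coverage columns of such a summary table are simple arithmetic on the token counts, as the following Python sketch suggests. It uses the token counts reported in Table 5.3; the function name `coverage_table` is an assumption of the sketch, and the two bands with 0.00% coverage (K15 and K16) are entered as zero tokens.

```python
# Token counts per 1,000-word band from Table 5.3 (K1 to K20), then Off-list.
TOKENS = [8065, 881, 236, 313, 86, 33, 47, 23, 21, 11,
          6, 15, 7, 65, 0, 0, 1, 1, 1, 4]
OFF_LIST = 310


def coverage_table(band_tokens, off_list):
    """Return (band, tokens, coverage %, cumulative %) rows."""
    labels = [f"K{i}" for i in range(1, len(band_tokens) + 1)] + ["Off-list"]
    counts = list(band_tokens) + [off_list]
    total = sum(counts)
    rows, running = [], 0
    for label, n in zip(labels, counts):
        running += n  # cumulative token count up to and including this band
        rows.append((label, n,
                     round(100 * n / total, 2),
                     round(100 * running / total, 2)))
    return rows


rows = coverage_table(TOKENS, OFF_LIST)
```

The first row comes out as 79.65% coverage for K1, and the cumulative figure after K2 as 88.35%, in line with the table.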
It would seem that the BNC-20 is a much improved tool from which to draw an LFP-like frequency analysis of written compositions. However, it only came on line less than a year before the writing of this book, and so little research based upon it had reached publication. Nevertheless, while it still cannot address the ‘quality of use’ issue, it should prove to be a very valuable tool for describing the frequency distribution of written output. The BNC-20 should also be of use for analyzing spoken discourse, as the K1 and K2 levels are essentially based on the spoken component of the BNC.
Table 5.3 BNC-20 frequency analysis
Freq. level   Families   Types   Tokens   Coverage   Cum. %
K1 Words      509        901     8,065    79.65      79.65
K2 Words      217        337     881      8.70       88.35
K3 Words      74         92      236      2.33       90.68
K4 Words      61         88      313      3.09       93.77
K5 Words      38         47      86       0.85       94.62
K6 Words      16         18      33       0.33       94.95
K7 Words      18         20      47       0.46       95.41
K8 Words      18         18      23       0.23       95.64
K9 Words      7          11      21       0.21       95.85
K10 Words     8          9       11       0.11       95.96
K11 Words     6          6       6        0.06       96.02
K12 Words     7          7       15       0.15       96.17
K13 Words     4          4       7        0.07       96.24
K14 Words     12         14      65       0.64       96.88
K15 Words                                 0.00       96.88
K16 Words                                 0.00       96.88
K17 Words     1          1       1        0.01       96.89
K18 Words     1          1       1        0.01       96.90
K19 Words     1          1       1        0.01       96.91
K20 Words     1          2       4        0.04       96.95
Off-list      ?          152     310      3.06       100.00
Total         999+?      1,728   10,126   100        100
Words in text (tokens): 10,126
Different words: 1,728
Type-token ratio: 0.17
Tokens per type: 5.86
Pertaining to on-list only
Tokens: 9,816
Types: 1,576
Families: 999
Tokens per family: 9.83
Types per family: 1.58
(Lextutor, October 2008)
P_Lex
Another measure using the frequency-band approach is P_Lex (Meara, accessed 2008). Similarly to the LFP, it measures the lexical complexity of texts in terms of the amount of vocabulary beyond the 2,000 level. It divides the text into ten-word segments and determines how many 2,000+ words are in them. It then graphs the number of 2,000+ words per segment for each segment. This creates a curve, such as the one illustrated in Figure 5.3. In this case, a proportion of .4 (i.e. 40%) of the segments contained no (0) 2,000+ words, .4 contained one 2,000+ word, .2 contained two 2,000+ words, and no segment contained more than two 2,000+ words. This curve is then fitted to already-established theoretical curves. Each of the theoretical curves has a lambda value (λ), and so we can assign the lambda value of the closest fitting curve to the data inputted into the P_Lex program. This has the advantage of reducing a complex frequency profile to a single parameter – lambda.4 The lambda values usually range from about .5 to about 4.5, although higher and lower values are possible. The lambda values get more reliable with longer texts, but even relatively short texts seem to produce lambda values that are workable.
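The segment-counting procedure described above can be sketched in Python as follows. This is only an illustration, not the P_Lex program itself: it assumes that the theoretical curves are Poisson distributions indexed by lambda, that a trailing segment of fewer than ten words is discarded, and that the best fit is found by least squares over a grid of candidate values. The function names and the `easy_words` set (standing in for the first 2,000 words) are hypothetical.

```python
import math
import re


def segment_counts(text, easy_words, segment_len=10):
    """Count words outside `easy_words` in each complete ten-word segment."""
    tokens = re.findall(r"[a-z']+", text.lower())
    usable = len(tokens) - len(tokens) % segment_len  # drop a partial last segment
    return [
        sum(token not in easy_words for token in tokens[start:start + segment_len])
        for start in range(0, usable, segment_len)
    ]


def observed_proportions(counts, segment_len=10):
    """Proportion of segments containing 0, 1, ..., segment_len difficult words."""
    return [counts.count(k) / len(counts) for k in range(segment_len + 1)]


def poisson_pmf(k, lam):
    """Poisson probability of k events when the mean is lam (assumed curve)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fit_lambda(proportions):
    """Grid-search the lambda whose Poisson curve is closest (least squares)."""
    candidates = [i / 100 for i in range(1, 601)]  # 0.01 to 6.00

    def error(lam):
        return sum((p - poisson_pmf(k, lam)) ** 2
                   for k, p in enumerate(proportions))

    return min(candidates, key=error)


# The distribution described above: .4 of segments with 0, .4 with 1, .2 with 2.
example = observed_proportions([0, 0, 1, 1, 2])
best_lambda = fit_lambda(example)
```

Under these assumptions the example distribution is matched by a lambda somewhat below 1, toward the lower end of the range of values mentioned above.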
Figure 5.3 P_Lex output (Meara, accessed 2008: 1). [Two panels; x-axis: No. of difficult words per segment (0–10); y-axis: P(n words per segment) (0–0.45).]
It should be possible to match the theoretical curves with the vocabulary sizes of norming subjects. That is, norming participants can be asked to write compositions which are analyzed and lambda values derived. These values can then be compared to the vocabulary sizes of the participants as established by vocabulary tests. Thus, with adequate norming, it may be possible to accurately estimate productive vocabulary size from the lambda values. This is an ingenious method of estimating total vocabulary size from a single sample of participant written output. Furthermore, because it samples text in ten-word segments, it seems to work even with shorter texts, which is an advantage when assessing lower-proficiency learners, who typically write shorter compositions. (This method could be carried out for spoken output as well, but would almost certainly require new calculation of spoken curves, as spoken and written lexical discourse differs substantially in terms of frequency of lexical content.) However, at the moment, the P_Lex manual states that there is not yet enough good normative data to allow this type of vocabulary size extrapolation (Meara, accessed 2008). For more information on this tool, and general approach, see Bell (2002), Meara (accessed 2008), Meara and Bell (2001), and Miralpeix (2007, 2008).
V_Size
A somewhat different approach to generating estimates of productive vocabulary size from relatively small amounts of writing has been developed by Paul Meara and Imma Miralpeix (accessed 2008a). Their V_Size software program creates lexical frequency profiles from inputted texts, and then estimates what these profiles tell us about the size of the productive vocabulary of the people who produced those texts. Like P_Lex, the inputted text is matched against theoretical profiles, but unlike P_Lex, V_Size uses
calculations based around Zipf’s Law distributions of vocabulary (see Section 2.5.2) to form the theoretical profiles.
Zipf argued that language was one of very many things where a simple relationship could be found between the rank order of an event and the size of the event. For language, Zipf argued that the pattern of frequencies exhibited by words followed this law, and he claimed that there was a straightforward relationship between the number of times a word occurred in a corpus and its rank order in a frequency list generated from the corpus. In simple terms, Zipf claimed that some words are more frequent than others, and you could tell roughly how many times a word would occur in a large corpus by looking at its rank order. (Meara and Miralpeix, accessed 2008a: 1)
Also unlike P_Lex, there are some initial vocabulary size norms for the various theoretical Zipf curves. These are based on Miralpeix’s PhD thesis (Miralpeix, 2008) which compared V_Size estimates for groups of Spanish EFL learners with differing amounts of exposure to L2 classes, and different starting ages. Also, Meara (in preparation) carried out a more technical analysis using longer texts than Miralpeix, and found that the norms appeared to provide reasonable estimates of vocabulary size. However, the norms should probably still be seen as tentative until more data is collected, and anyone using V_Size with non-Spanish learners should establish norms for the L1(s) of their participants, or at least confirm that the V_Size size estimates have sensible correspondences with any other evidence of lexical size available.
To use V_Size, a text (.txt) version of the learner output is selected and the program initially analyzes it according to five frequency categories: A (500), B (1,000), C (1,500), D (2,000), and E (2,000+). The researcher is able to reclassify any words which are believed to be misclassified, including numerals and proper names. The program then matches the frequency profiles to the theoretical profiles in its memory, and produces an estimate of vocabulary size. This is illustrated in a screen shot from the V_Size Manual (Figure 5.4), which shows the text words with their classification, a table which shows the percentages of vocabulary at each frequency band, a graph which shows the profile and the best-matching theoretical counterpart, and a vocabulary size estimate. To further illustrate the program, I ran a passage from a draft version of this productive vocabulary measurement section, using a word form analysis based on BNC frequency data (the bnc~strict database shown in the screenshot), with the following results:
BAND   data   model
A      55     62
B      7      7
C      5      4
D      3      3
E      30     24
Err    86
Size estimate 19,400
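To give a feel for how Zipf’s Law can generate a theoretical profile, the following Python sketch assumes the simplest form of the law (frequency proportional to 1/rank) and computes the share of tokens expected in each of the five bands for a vocabulary of a given size. This is not the V_Size algorithm, whose details are not given here, and the function names are hypothetical. The sketch also shows that the Err figure in the results above is consistent with a sum of squared differences between the data and model rows; that reading is an inference rather than something stated in the manual.

```python
def zipf_band_profile(vocab_size, band_edges=(500, 1000, 1500, 2000)):
    """Percentage of tokens expected in bands A-E if frequency ~ 1/rank.

    Assumes vocab_size is larger than the last band edge (2,000).
    """
    weights = [1 / rank for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    edges = [0, *band_edges, vocab_size]
    return [100 * sum(weights[lo:hi]) / total
            for lo, hi in zip(edges, edges[1:])]


def squared_error(data, model):
    """Sum of squared differences between two band profiles."""
    return sum((d - m) ** 2 for d, m in zip(data, model))


data = [55, 7, 5, 3, 30]    # 'data' row from the results above
model = [62, 7, 4, 3, 24]   # 'model' row from the results above
error = squared_error(data, model)    # 49 + 0 + 1 + 0 + 36 = 86
profile = zipf_band_profile(19400)    # bands A-E for the estimated size
```

A matching routine along these lines would generate such profiles for many candidate vocabulary sizes and report the size whose profile gives the smallest error against the learner’s data.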
Figure 5.4 V_Size screenshot (Meara and Miralpeix, accessed 2008a: 6).
Although I have tried to write this book in an accessible style, I am glad that the vocabulary size estimate puts me in the range of educated native speakers! (See Section 1.1.2.)
Type-token-based methods (lexical diversity)
Another method of quantifying participant output is by measuring its lexical diversity/variation. That is, determining the variation in the number of individual types produced compared to the total number of tokens (i.e. establishing the type-token ratio). A relatively greater number of word types means that a wider range of vocabulary has been demonstrated, and the plausible assumption is that this reflects a larger, richer lexicon. The most
basic way of quantifying the relationship between types and tokens is computing the type-token ratio. The formula for this is:
\[ \text{type-token ratio} = \frac{\text{number of different types}}{\text{total number of tokens}} \times 100 \]
This gives a simple idea of lexical diversity, but the problem is that it is strongly affected by the length of the text. As a text gets longer, there is less and less chance for new word types to appear, as a greater percentage of the frequent types have already appeared before. Thus, longer texts tend to have increasingly lower type-token ratios as an artefact of text length alone. This means that a basic type-token ratio cannot be used unless the text is controlled for length. As different participants tend to write different lengths of text (e.g. more proficient learners tend to write longer texts), a researcher would have to cut all of the texts to the length of the shortest one received, which has the effect of wasting a lot of valuable data from the longer texts.
Standardized type-token ratio
One way around this problem is to use a standardized type-token ratio. A program (such as WordSmith Tools) can divide a participant’s text into a number of 100-word samples (in some cases, every possible 100-word sample), and then compute the type-token ratios for all of these. The average of all these computations is the standardized type-token ratio. This method avoids the problem of length attenuation.
Vocd
Another way to avoid the length problem is by using a curve-fitting method. David Malvern and Brian Richards have pioneered this approach (e.g. Malvern, Richards, Chipere, and Durán, 2004), and have created a statistic called vocd (i.e. vocabulary d statistic) and the software to compute it. It has now become the most accepted way of doing type-token analyses, in both L1 and L2 contexts. The software is available as part of the CHILDES L1 child database (see also Section 6.2), but requires texts to be formatted in the CHILDES format. A more user-friendly version called D_Tools (Meara and Miralpeix, accessed 2008b) is available on the _lognostics website, which accepts text files (.txt).
The process behind vocd takes several steps. The program generates 100 samples of 35 randomly-selected words from a text, and calculates a type-token ratio for each of these. These 100 ratios are then averaged to produce a composite mean ratio for all 100 samples. The program goes on to do the same thing for samples of 36 randomly-selected words, 37, 38, ... all the way to samples of 50 words. The end result is a list of 16 means for the 35–50 word samples. These means form a curve, and it is compared to a number of theoretical curves generated by the D formula. The value of D which produces the best matching curve is assigned to the source text. D typically varies between 0 and around 50, with lower values indicating more repetition and a vocabulary which is not lexically rich, and vice versa for higher values.
Although vocd is a relatively sophisticated form of type-token analysis, and for the most part counteracts the length effect problem (although see McCarthy and Jarvis, 2007), it still has the same ‘quality of use’ limitations of other type-token analyses. Meara and Miralpeix illustrate this clearly with the following three examples:
1. The man saw the woman.
2. The bishop observed the actress.
3. The prelate glimpsed the wench.
Each of these sentences will produce the same type-token ratio, even though they clearly demonstrate different degrees of vocabulary usage. This is not only a matter of differences in frequency; Meara and Miralpeix note that bishops observing actresses would be highly marked in English (perhaps indicating some form of inappropriate behavior), and so production of this sentence would require considerable command of culture and background knowledge beyond the mere denotative meaning of the individual words. For this reason, they suggest that type-token analyses are probably best used with low-level learners who produce texts with lots of repetition and high-frequency vocabulary (i.e. type-token ratios of 10–30). More advanced users tend to produce texts with higher type-token ratios which are ‘not easy to distinguish from each other’, which led Meara and Miralpeix to question whether ‘D has good measurement properties at higher levels’ (p. 6). Similarly, Jarvis (2002) found that D had trouble discriminating between groups of speakers with obvious differences in vocabulary (native speakers, Swedish-speaking EFL learners, Finnish-speaking EFL learners at Grades 5, 7, and 9).
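The type-token calculations discussed in this section can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, the standardized ratio uses non-overlapping windows, the random samples are drawn without replacement, and the theoretical curve is assumed to be the model usually given for D, TTR = (D/N)[√(1 + 2N/D) − 1], fitted here by a simple grid search rather than by the vocd software’s own procedure.

```python
import random


def ttr(tokens):
    """Type-token ratio as a percentage."""
    return 100 * len(set(tokens)) / len(tokens)


def standardized_ttr(tokens, window=100):
    """Mean TTR over consecutive, non-overlapping windows.

    Needs at least one full window of tokens.
    """
    starts = range(0, len(tokens) - window + 1, window)
    return sum(ttr(tokens[s:s + window]) for s in starts) / len(starts)


def d_curve(n, d):
    """Theoretical TTR (as a proportion) for a sample of n tokens, given D."""
    return (d / n) * ((1 + 2 * n / d) ** 0.5 - 1)


def estimate_d(tokens, trials=100, sizes=range(35, 51), seed=0):
    """Fit D to the mean TTRs of random samples of 35-50 tokens.

    `tokens` must be a list of at least 50 tokens.
    """
    rng = random.Random(seed)
    means = {
        n: sum(len(set(rng.sample(tokens, n))) / n for _ in range(trials)) / trials
        for n in sizes
    }
    candidates = [x / 10 for x in range(10, 2001)]  # D from 1.0 to 200.0
    return min(
        candidates,
        key=lambda d: sum((means[n] - d_curve(n, d)) ** 2 for n in sizes),
    )
```

Applied to each of the three example sentences above, `ttr` returns the same value (four types over five tokens), which is exactly the ‘quality of use’ limitation under discussion.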
This point about lack of sensitivity reflects my own experience with these measures, where they often have trouble differentiating between compositions with clearly different degrees of lexical quality, as judged by intuition or raters. Ultimately, I feel that the type-token approach (in all its guises) can only offer very limited information about lexical output, and certainly nothing about the appropriateness of use of that output. It should therefore only be used to supplement other lexical measures, and rarely (if at all) as the sole means of measurement. For more discussion of vocd, see Meara and Miralpeix (accessed 2008b), Read (2005), Richards and Malvern (2007), and van Hout and Vermeer (2007). It can also be pointed out that all of the comprehensive analyses in this section focus on individual word forms. They cannot cope with the multiword units which are so common in language (Chapter 3). For example, the
term concentration camp can be analyzed as two individual words, but the sum of these two meanings will never add up to the composite meaning of the dastardly forceful internment that has been used in so many wars. To whatever extent the analysis methodologies discussed in this section can inform about individual words, they are likely to miss a large portion of the lexical behavior and meaning that is tied into formulaic language. These ideas about the limitations of type-token methods of analysis, and indeed, all other vocabulary measurement, highlight the necessity for good validation, too briefly discussed in Section 5.1.3. Quote 5.5 reminds us that researchers need to provide rigorous tests of their measures, and indicate their limitations as well as their advantages when presenting them to the wider field to use. Although the authors are discussing measures of lexical diversity, the same point is appropriate for all types of vocabulary measure.
Quote 5.5 Malvern, Richards, Chipere, and Durán on the need for rigorous validation of vocabulary measures [Validation issues] matter. Much of the research based on flawed measures has significant implications for theory, practice, and policy. It is important therefore that methodological issues of measuring vocabulary richness are understood and that these confusions are cleared up. (2004: 180)
A comprehensive measure based on other criteria
Coh-Metrix
All of the above comprehensive measures rely on statistical counts of frequency or types and tokens. One other program gives a fuller description of the language in source texts by counting a wider range of linguistic characteristics: Coh-Metrix, developed by McNamara, Louwerse, Cai, and Graesser. Coh-Metrix is a multi-component computerized analysis program that produces 60 indices of the linguistic and discourse representations of a text. Of these, a number are focused on the words in a text, and the characteristics of those words. The ones most obviously applicable to vocabulary study are listed below:
● Number of words
● Average syllables per word
● Average words per sentence
● Raw frequency, mean for content words (0–1,000,000)
● Log frequency, mean for content words (0–6)
● Raw frequency, minimum in sentence for content words (0–1,000,000)
● Log frequency, minimum in sentence for content words (0–6)
● Type-token ratio for all content words
● Proportion of content words that overlap between adjacent sentences
● Latent semantic analysis, sentence to sentence
● Latent semantic analysis, paragraph to paragraph
● Concreteness, mean for content words
● Concreteness, minimum in sentence for content words
● Mean hypernym values of nouns
● Mean hypernym values of verbs
● Flesch reading ease score (0–100)
● Flesch-Kincaid grade level (0–12)
Just as the MRC Psycholinguistic Database (Coltheart, 1981) (see Section 4.5) can give detailed information about individual lexical items, Coh-Metrix can give enhanced information about the vocabulary in texts. It is a relatively recent tool, but is worth reviewing by any vocabulary researcher. It is available on-line after free registration. For more information, see the Coh-Metrix website and Graesser, McNamara, Louwerse, and Cai (2004).
5.3 Measuring the quality (depth) of vocabulary knowledge
The measures in the previous section focus on how many lexical items are known. This section will look at measures which attempt to quantify how well items are known. However, the two notions are not discrete. The observant reader may have noticed that all size measures have a (sometimes implicit) criterion of minimum knowledge for a lexical item to be counted as ‘known’. In the Vocabulary Levels Test, it is the ability to recognize the word forms which match the definitions given. In the frequency-based and type-token-based analyses, it is the fact that a lexical item was produced (however inaccurately or inappropriately) in a person’s written or spoken output. Thus, it can be said that all size measures are also depth measures in the sense that some quality of knowledge, no matter how minimal, must be operationalized as the criterion of sufficient knowledge. This size/depth connection is explicitly made in the CATSS test (Laufer and Goldstein, 2004; see Section 2.8), which deliberately uses four different criteria (i.e. four different types of test item) in order to give an indication of the depth of knowledge of the form-meaning link. Read (2000) suggests that there are two main ways to conceptualize the quality of knowledge of individual vocabulary items. (Let us disregard issues of automaticity and organization for the moment). The first is describing the incremental acquisition of a word along a continuum of mastery ranging from ‘Do not know at All’ at the beginning end, all the way to ‘Full Mastery of a Lexical Item in All Contexts of Use’ at the advanced end. Read calls this the developmental approach. The second approach is specifying the various types of word knowledge one can have about lexical items. This has
usually been termed the dimensions or components approach. We will look at each in turn.
5.3.1 Developmental approach
Developmental scales
It is undeniable that vocabulary is learned incrementally, and so using a developmental scale to model this would appear sensible. However, the problem lies in operationalizing the developmental process into a workable scale. In fact, we have little idea about how vocabulary development advances, so creating a valid scale is rather speculative at the moment. Vocabulary acquisition theory is not advanced enough to guide the creation of a principled developmental scale, and previous research has not really been that helpful either. The result is that current developmental scales are often based on pedagogical rationales. This has the advantage of their being useful in learning contexts, but they have a number of problems resulting from their atheoretical development. First, for a scale to exist, there must be rational beginning and ending points. Having absolutely no knowledge of a lexical item seems a clear-cut beginning, but even this is not straightforward. If a person knows the spelling, pronunciation, and morphological rules of a language, then they will already know something about almost any new lexical item they meet. More problematic is the ending point. It must be something like ‘full knowledge of an item’, but how does one quantify this? There is no test imaginable which can verify that a word can be used accurately, appropriately, and fluently in every possible context. Thus any beginning and ending points will necessarily be approximations. This is despite the fact that the end points of scales are usually easier to establish than the gradations in the middle. We then come to the question of how many stages there are in the acquisition process. As vocabulary learning is slow and gradual, built up over many, many meetings with a lexical item (although big jumps in knowledge can occur from focused, intentional learning), I tend to think that vocabulary learning is a continuum, with an uncountable number of small knowledge increments.
But this is no good for developing a scale; we must have reasonable stages that are identifiable. However, there is currently no principled way of knowing how many stages an acquisition scale should contain. At a minimum, there must be the beginning ‘no knowledge’ stage, the ending ‘acceptable mastery’ stage, and one stage in between corresponding to receptive, but not productive, knowledge. But even this is problematic, as words can be known productively and not receptively. For example, I knew the word indict and used it productively in my speech, but did not know it was spelled i-n-d-i-c-t, and so did not recognize it in written discourse. A three-point scale may be the minimum, but there is no way to determine the maximum, or more importantly the appropriate, number of stages. Is it five stages like in the Vocabulary Knowledge Scale (VKS) below, or four stages,
as in Schmitt and Zimmerman’s scale? Or will we eventually find there are ten or more stages? In the end, we have to decide what kinds of vocabulary knowledge are important, and build the best scale we can around our decision. While setting the number of steps in a scale is not an insurmountable problem, the question of equal intervals between scale steps may be. Researchers have typically assigned a numerical value to each of the stages, and then proceeded to run statistical analyses on the resulting data. The analyses are almost always inferential (t-tests, ANOVAs, correlations, etc.), which require the use of interval scales, where the distance between the intervals is equivalent and consistent. With the scales currently available, this assumption cannot be met. This problem does not disqualify the use of scales, but it does mean that we should be very wary of analyses that use parametric statistics on data derived from these scales. It would seem much more appropriate to use non-parametric procedures with such data, because they rely only on rank hierarchies (i.e. ordinal data), which such scales should be able to provide. Another possibility would be to illustrate vocabulary knowledge gains graphically by showing pre-post patterns of movement from one scale stage/category to another (e.g. Paribakht and Wesche, 1997: 191).
The Vocabulary Knowledge Scale (VKS)
The best known and most widely-used depth-of-knowledge scale is the Vocabulary Knowledge Scale (see Paribakht and Wesche, 1997, and Wesche and Paribakht, 1996, for the most complete descriptions of the instrument). Although developmental scales have been around for a long time (e.g. Dale’s (1965) scale), Paribakht and Wesche were instrumental in reintroducing this measurement approach in more recent times. The VKS was designed to track the early development of learners’ knowledge of specific words at a given time in an instructional or experimental situation.
As such, it was designed to provide a relatively efficient means of demonstrating certain changes in the receptive and initial productive knowledge of specific words resulting from instructional interventions (e.g. vocabulary exercises) or activities (e.g. reading), and in showing comparative gains resulting from different treatments (Wesche and Paribakht, personal communication). The VKS was designed to capture initial stages in word learning that are amenable to accurate self-report or demonstration through the use of a five-category Elicitation Scale that provides information for scoring using a five-level Scoring Scale. For Categories III–V, there is also a requirement for a demonstration of knowledge.
I. I don’t remember having seen this word before.
II. I have seen this word before, but I don’t know what it means.
III. I have seen this word before, and I think it means ———. (synonym or translation)
IV. I know this word. It means ———. (synonym or translation)
V. I can use this word in a sentence: ———. (Write a sentence.) (If you do this section, please also do Section IV.)
The testee judgements and performance data are then evaluated according to the separate Scoring Scale, with levels 2–5 depending on the quality of the synonym, translation, or sentence responses (Figure 5.5). The VKS was originally developed to measure vocabulary learning in the English language programs at the University of Ottawa (Paribakht and Wesche, 1993, 1997; Wesche and Paribakht, 1996). More recently, Paribakht (2005) and Wesche and Paribakht (2009) used it to seek evidence of retention of new vocabulary knowledge by university ESL students after they attempted to infer the meanings of unknown words. The VKS proved useful in these studies, and seems to have value for its intended purpose of tapping into the early stages of vocabulary learning, rather than more advanced knowledge. Also, it seems to have good reliability (.89), as demonstrated using a test-retest method (Wesche and Paribakht, 1996). In addition, the fact that there is a requirement to demonstrate knowledge may well enhance the care with which the self-reports are made on the instrument.
Self-report categories   Possible scores   Meaning of scores
I                        1                 The word is not familiar at all.
II                       2                 The word is familiar but its meaning is not known.
III                      3                 A correct synonym or translation is given.
IV                       4                 The word is used with semantic appropriateness in a sentence.
V                        5                 The word is used with semantic appropriateness and grammatical accuracy in a sentence.
Figure 5.5 VKS scoring scale (Paribakht and Wesche, 1997: 181).
However, the VKS, like all developmental scales, suffers from a number of limitations, and these need to be clearly understood before it can be used appropriately. These have been pointed out by Meara (1996c) and Schmitt (2000), and perhaps most fully outlined in Read’s (2000) critique of the VKS. First, as Paribakht and Wesche point out, the VKS is not an appropriate instrument to estimate lexical knowledge in general, nor does it provide a precise characterization of the process of learning individual words (see Henriksen, 1999, for a discussion of multiple developmental scales). The second potential limitation is that the initial two stages of the Elicitation Scale are unverified, but after this, stages require a demonstration of knowledge. There is no obvious way to elicit verification of knowledge at the first two stages, but this may not matter much in practice. Testees should be able to provide an accurate self-assessment at these stages, and there is no reason why they should wish to fake them. Furthermore, the demonstration of knowledge in Stages 3–5 should provide more valuable information than would self-judgement scores alone. Third, the knowledge constructs addressed between stages are not consistent. Categories I–IV of the Elicitation Scale essentially deal with various degrees of knowing the form-meaning link, but Category V jumps to mastery strong enough to use the word in a semantically appropriate way in a sentence. This can involve a constellation of lexical knowledge, including collocation, register, derivation (correct word family member), and grammatical knowledge (noun, verb, etc.). Thus, it seems that the scale is not unidimensional. Similarly, the intervals between the stages of the Scoring Scale do not seem to be consistent, which echoes the point made above about the intervals in developmental scales not necessarily being equidistant, and so inappropriate for parametric statistics. 
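One rank-based option for such data is a paired sign test on pre- and post-treatment scale scores, which uses only the direction of each change and so does not need equal intervals. The following is a minimal sketch, not a recommended analysis plan: the function names and the sample scores are invented for illustration, and the transition count simply tabulates movement between stages in the spirit of the pre-post patterns mentioned earlier.

```python
from collections import Counter
from math import comb


def sign_test(pre, post):
    """Two-sided exact sign test for paired ordinal scores.

    Returns (gains, losses, p_value). Pairs with no change are dropped,
    as is conventional for the sign test.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must be paired")
    gains = sum(b > a for a, b in zip(pre, post))
    losses = sum(b < a for a, b in zip(pre, post))
    n = gains + losses
    if n == 0:
        return gains, losses, 1.0
    k = min(gains, losses)
    # Probability of a split at least this lopsided if gains and
    # losses were equally likely (binomial with p = 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return gains, losses, min(1.0, 2 * tail)


def transitions(pre, post):
    """Count how many items moved from each stage to each other stage."""
    return Counter(zip(pre, post))


# Invented scale scores (stages 1-5) for eight target words.
pre_scores = [1, 1, 2, 2, 3, 1, 2, 3]
post_scores = [2, 3, 2, 4, 4, 1, 3, 5]

gains, losses, p_value = sign_test(pre_scores, post_scores)
print(gains, losses, p_value)
print(transitions(pre_scores, post_scores))
```

Because only the sign of each difference is used, a move from stage 1 to 2 counts the same as a move from 1 to 3, which is the price paid for not assuming equal intervals.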
The different stages of the Scoring Scale do, however, appear to be ordinal in nature (i.e. representing the progressive initial steps in acquiring knowledge of a given lexical item). The Elicitation Scale also mixes receptive and productive elements in ways that are not necessarily straightforward. Category I involves form-recognition, Categories II–IV elicit meaning-recall, and Category V requires full productive output. Likewise, the amount of contextualization varies among the categories, with Categories I–IV dealing with the lexical item in isolation, and only Category V involving context. Another potential limitation is that Categories III and IV of the Elicitation Scale require a judgement of degree of mastery: I think I know the meaning versus I know the meaning. Such metalinguistic judgements can be difficult for some learners to make, and many learners are better at judging what they can do (see below). The point about examinee judgement variability raised in the discussion of checklist tests (Section 5.2.3) also pertains here: some examinees will only select Category IV if they are absolutely positive of their meaning knowledge, while other examinees might select it if they
barely know the lexical item’s meaning. It is for this reason that testees are asked to respond to Category III if they attempt IV. A potential practical problem with the Elicitation Scale resides at Category V, where examinees are asked to produce a sentence which illustrates the meaning of the target lexical item. Unfortunately, as Read notes, examinees all too often write sentences which do not clearly demonstrate knowledge of the item. For example, one of my respondents once produced the following sentence for the target word access: I like the word access. Read cites McNeill’s (1996) finding that Chinese trainee teachers of English in Hong Kong could often produce plausible, and even quite sophisticated, sentences without really knowing the target words. So it seems that sentence writing is an uncertain method of eliciting evidence of productive knowledge, but in its favor, the VKS has the backup of the meaning definition at Stage IV as an explicit means of dealing with this uncertainty. However, some of the limitations of the Elicitation Scale might be mitigated in the process of interpreting the responses according to the Scoring Scale. The various forms of lexical demonstration potentially provide raters with enough information to place the testee responses at the appropriate levels of the Scoring Scale. Nevertheless, this involves a degree of rater subjectivity (e.g. ‘Is the synonym/translation “close enough” to show knowledge of meaning?’ ‘Does the sentence clearly show semantic appropriateness?’). Thus the final testee score on the Scoring Scale comes from a potentially complex combination of testee interaction with the Elicitation Scale and rater judgement of the testee responses. The variability in interpreting the VKS is well-illustrated in the VKS Scoring Scale (Figure 5.5). 
While Elicitation Scale Categories I and II map directly onto discrete interpretations, all of the other categories have multiple interpretations, with Category V having four possible outcomes. This illustrates both the strength and weakness of the VKS. The strength derives from the knowledge demonstration components of the Elicitation Scale, which should provide more trustworthy information than self-judgement, and the chance to adjust scores where the demonstration result disagrees with the judgement result. In the hands of a researcher who has a good understanding both of the instrument and of the learners being studied, this should lead to more accurate ratings on the Scoring Scale. On the other hand, it is always desirable to have a direct, consistent, and unambiguous relationship between learner output on a test and the scoring interpretation of that output.
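The way a self-report category combines with rater judgements to yield a score can be sketched in code. This is a hedged illustration only: the text states that Categories I and II map directly onto scores, that the other categories have multiple interpretations, and that Category V has four possible outcomes, but the exact fall-back rules below (for example, dropping to 2 when the synonym is judged wrong) are assumptions, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class Response:
    category: int                  # self-report category chosen, 1-5 (I-V)
    meaning_correct: bool = False  # rater: synonym/translation acceptable?
    semantic_ok: bool = False      # rater: sentence semantically appropriate?
    grammar_ok: bool = False       # rater: sentence grammatically accurate?


def vks_score(r: Response) -> int:
    """Map one response onto the five-level Scoring Scale (assumed rules)."""
    if r.category == 1:
        return 1  # word not familiar at all
    if r.category == 2:
        return 2  # familiar, meaning not known
    if r.category in (3, 4):
        # Meaning demonstration required: either accepted (3) or not (2).
        return 3 if r.meaning_correct else 2
    if r.category == 5:
        # Four possible outcomes, from best to worst demonstration.
        if r.semantic_ok and r.grammar_ok:
            return 5
        if r.semantic_ok:
            return 4
        if r.meaning_correct:
            return 3
        return 2
    raise ValueError(f"unknown category: {r.category}")


# A learner claims Category V, gives a good synonym and a semantically
# appropriate but ungrammatical sentence.
print(vks_score(Response(5, meaning_correct=True, semantic_ok=True)))
```

The three boolean fields are where rater subjectivity enters: two raters who disagree on whether a synonym is "close enough" will produce different scores from the same learner response.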
Schmitt and Zimmerman scale
An example of a less complex scale is the one developed by Schmitt and Zimmerman (2002), based on the earlier Test of Academic Lexicon scale (Scarcella and Zimmerman, 1998). Schmitt and Zimmerman recognized the limitations of the VKS, and opted for a simpler scale, utilizing a can-do paradigm. Can-do protocols are common in the field of assessment (e.g. the Can-do statements connected with the Common European Framework of Reference (CEFR)), where examinees self-evaluate what they can do with their language proficiency, rather than providing metalinguistic judgements of what they know. For many examinees, it is easier to say what they are able to achieve with a language rather than making a statement about how well they know it. This can-do idea is incorporated into Stages C and D of the scale, which essentially translate into receptive and productive knowledge respectively.
A. I don’t know the word.
B. I have heard or seen the word before, but am not sure of the meaning.
C. I understand the word when I hear or see it in a sentence, but I don’t know how to use it in my own speaking or writing.
D. I know this word and can use it in my own speaking and writing.
Like the VKS, the scale has advantages and limitations. One advantage might be its apparently transparent descriptions. Rather than making judgements about degrees of metalinguistic knowledge, the C and D stages are written so participants only need to reflect on whether they can understand a word, and whether they can use it in their speaking and writing. This might make them easier to judge, but as Kirsten Haastrup (personal communication) has pointed out to me, it may not be straightforward to conflate speaking and writing mastery. Testees may well have facility with a lexical item in one mode, but not the other (e.g. can produce a word in compositions when not under time pressure, but cannot produce it in on-line speech). This would make judgements more difficult, and consequently, the level descriptions may be less transparent than we originally thought. Read (2000) notes that this type of inconsistency might be inevitable when trying to reduce the complexity of vocabulary knowledge down to a single scale, but it still poses problems for interpretation.
(Note that the VKS avoids this by being specifically worded for written vocabulary, although Joe (1995, 1998) adapted it for oral use.) Overall, I think the ‘can-do’ approach is valid, but it might be better to specify it for either speaking or writing, but not both in developmental scales. Another stage which might be considered problematic is Stage B, which covers the somewhat tricky stage of knowledge where the word form is recognizable but has not yet been connected to a meaning. If a form-meaning link is considered the minimum criterion of useful vocabulary knowledge, then it could certainly be argued that Stage B be deleted, leaving a three-stage scale corresponding to No knowledge/Receptive knowledge/Productive knowledge.
As with all scales, deciding on the optimal number of acquisition stages is difficult, with little theoretical guidance to rely on. At the moment, it is probably best to select/develop scales which cover the aspects of acquisition which are important to the particular study at hand, as no scale currently describes the whole incremental acquisition process. Even if one could be developed, it may turn out to be too complex for any practical measurement usage. The Schmitt and Zimmerman scale would seem to have rational beginning and ending points. Having no knowledge of a word is the obvious beginning point, and the ability to use a word in one’s speaking and writing would appear to be a reasonable end goal for learners (although the caveat above suggests it might be better to focus on a single mode). Of course, the intended simplicity of the scale prohibits the testing of notions of full semantic and collocational appropriacy, as learners would probably find this extremely high level of mastery extremely difficult to judge. Despite the rational beginning and end points, it is not clear that the scale has interval spacings. Certainly the A→B increment intuitively feels smaller than the C→D increment, for instance. However, a three-stage version (A→C→D) may be closer to an interval scale, although it is difficult to think how this could be empirically established. Until a convincing argument can be made for equidistant stages, it is probably best to avoid parametric statistics with this and all scales. Compared to the VKS, the scale has the limitation of no demonstration of knowledge. This can be addressed in at least three ways. A first approach is to simply build demonstrations of knowledge into the scale. Second, a sample of participants could be interviewed after completing the instrument in order to confirm the accuracy of their self-reports (as discussed in Section 4.1). Alternatively, nonwords can be used. 
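The nonword check can be sketched as a simple screening step. The rule follows Schmitt and Zimmerman's procedure as described in this section (a nonword rated at Stage C or D removes the participant, while a Stage B rating is tolerated); everything else, including the function name, the data layout, and the nonwords themselves, is invented for illustration.

```python
def screen_participants(ratings, nonwords, disqualifying=("C", "D")):
    """Return IDs of participants who never claimed to know a nonword.

    ratings: {participant_id: {item: stage}}, with stages "A" to "D".
    """
    kept = []
    for participant, by_item in ratings.items():
        claimed = [w for w in nonwords if by_item.get(w) in disqualifying]
        if not claimed:
            kept.append(participant)
    return kept


# Invented English-like nonwords and invented ratings.
NONWORDS = {"plendic", "brastle"}
ratings = {
    "P01": {"analyze": "D", "plendic": "A", "brastle": "B"},
    "P02": {"analyze": "C", "plendic": "C", "brastle": "A"},
    "P03": {"analyze": "B", "plendic": "B", "brastle": "B"},
}

print(screen_participants(ratings, NONWORDS))
```

Here P02 is dropped for rating a nonword at Stage C, whereas P03 is retained because Stage B ratings of nonwords are treated as plausible.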
Schmitt and Zimmerman realized that their scale, like all self-evaluation measures, suffered from potential learner overestimation of knowledge. To control for this, they borrowed an idea from checklist test methodology. Along with the AWL words they were testing, they included a number of nonwords, and alerted the participants to this fact in order to encourage them to be careful in their judgements. Participants who rated a nonword at Stage C or D were deleted from the data pool. Judgements at Stage B were allowed to remain, as the nonwords were purposely English-like, and so the researchers did not consider it unreasonable for learners to believe that they had seen a nonword before but did not know it. Of course, these last two techniques are not confined to the Schmitt and Zimmerman scale, but could be used with other developmental scales as well. It may seem that I have been quite harsh in my critique of developmental scales. This is because I think it is very important for researchers to be aware of these scales’ limitations. No current scale gives a full account of the incremental path to mastery of a lexical item, and perhaps lexical
acquisition is too complex to be so described. I have also reviewed numerous journal submissions where the VKS was used inappropriately, without any understanding of its limitations. However, I still feel that developmental scales have considerable research potential, but a sustained cycle of research is required to better determine these scales’ psychometric properties, so that they can be consistently interpreted with confidence. Until this happens, researchers need to carefully consider what these scales can and cannot do, and whether they are suitable for their research purposes. If there is any doubt, my personal feeling at the moment is that simpler, more transparent scales are probably more useful than more complex ones.
5.3.2 Dimensions (components) approach
The second approach to measuring the quality of vocabulary knowledge involves specifying some of the types of word knowledge one can have about lexical items, and then quantifying participants’ mastery of those types. Schmitt (1998a) outlines some of the potential advantages of a dimensions/components approach. The first is its possible comprehensiveness. While measuring knowledge of several types of word knowledge is time consuming and limits the number of lexical items that could be studied, it can produce a very rich description of vocabulary knowledge, and so can be well worth the effort. The dimensions approach can also have a simplifying effect of breaking complex behavior (vocabulary acquisition) into its more manageable components for analysis. Furthermore, analyzing the components separately allows the possibility of discerning their relationships. A number of these relationships have long been obvious (e.g. between frequency of occurrence and formality of register; between word class and derivational suffixes), and one study has empirically demonstrated certain word knowledge interrelationships correlationally (Schmitt and Meara, 1997).
An intriguing possibility is that some of these relationships are hierarchal; that is, learned in some type of developmental order. Developmental sequencing has been posited in other areas of language, syntactic structures (e.g. Pienemann and Johnston, 1987) and morphemes (e.g. Larsen-Freeman, 1975), so it would not be surprising if the principle obtained in the area of lexical acquisition as well. In fact, it seems counterintuitive that word knowledge is not at least partially hierarchal. It is unlikely that the initial exposure to a word yields much more than some partial impression of its written or phonological form and one of its meaning senses. After more exposures (or some explicit study), a learner would gradually learn the other kinds of word knowledge, with perhaps collocational and stylistic knowledge being the last. Indeed, it doesn’t seem reasonable that a learner would have a rich associative and collocational network built up without a knowledge of the word’s form, for instance. Research designs based on a word knowledge framework allow
investigation into whether some kinds of word knowledge are acquired before others. Finally, such word knowledge research may lead to a better understanding of the movement of vocabulary from receptive to productive mastery. This movement is still not well understood (Section 2.8), and researchers are not even sure whether receptive and productive knowledge forms a continuum, as Melka (1997) argues, or whether it is subject to a threshold effect, as Meara (1997) has suggested. Part of the problem is the typical assumption that lexical items are either receptively or both receptively and productively known. The actual situation is probably that, for any individual item, each of the different types of word knowledge is known to different receptive and productive degrees. For example, an item’s spelling might be productively known, some of its meaning senses receptively known, and its register constraints totally unknown. Thus, research into the underlying receptive/ productive word knowledge states should prove informative about learners’ overall ability to use words in a receptive versus productive manner. Of course, the dimensions approach has limitations as well. It is impossible in practical terms to measure all word knowledge aspects, as a test battery that comprehensive would soon become unwieldy, especially if both receptive and productive facets were addressed. Therefore researchers following this approach have typically focused either on one, or a limited number, of word knowledge aspects. Another limitation of this approach is that some word knowledge aspects seem much more amenable to testing than others. For example, I know of no test for the register/stylistic appropriacy of lexical items, and this probably has much to do with the difficulty in devising a test for this knowledge aspect. Similarly, I can personally attest that devising a test tapping into frequency intuitions is not easy (Schmitt and Dunham, 1999). 
This section will first discuss measures of individual types of word knowledge which have been used in research, and then overview a number of studies that have concurrently measured multiple aspects.
Quote 5.6 Read on the limitations of dimensions approach to vocabulary measurement
Further work in this area [the dimensions approach to vocabulary assessment] has value for research purposes, in helping us to understand better the complex nature of vocabulary knowledge at the microlevel of individual items. It is not so clear what the role of such measures is in making decisions about learners. If a whole set of them [dimensions tests] is created, there is a danger of finding out more and more about the test-takers’ knowledge of fewer and fewer words, unless we have a definite assessment purpose in mind.
(2000: 248)
Word Associates Format (WAF)
The test format which has been most utilized as a depth of knowledge measure is probably the Word Associates Format. The format was originally created by John Read and has evolved over several versions by Read (1993, 1998, 2000), and other scholars (see below). As the name indicates, it measures word associations, and so could logically be presented in Section 5.5, but due to its popularity as a depth test, it is discussed here. The 1993 version consists of eight options for each target word. Four of the options are associated to the target word in three ways: paradigmatic link (synonym – group), syntagmatic link (collocation – scientists), and analytic link (component – together). The other four options are unrelated distractors.
team
alternate   chalk        ear     group
orbit       scientists   sport   together
Read’s 1998 version contains eight options within two boxes for each target word, all of which are adjectives. The examinees are required to find four words that associate with the target word out of the eight options.
sudden
[ beautiful   quick   surprising   thirsty ]   [ change   doctor   noise   school ]
common
[ complete   light   ordinary   shared ]   [ boundary   circle   name   party ]
The associates in the first box are paradigmatic in nature: either synonyms or representing an aspect of the target item’s meaning (sudden – quick, surprising). The second box includes syntagmatic associates (sudden – change, noise). The four associates can be evenly divided (2–2) between the two boxes as in the examples, but they can also be split 1–3 or 3–1. This is to make guessing more difficult. Read developed these receptive formats in response to the difficulty in judging the appropriateness of free associations (see Section 5.5). With a selective format (Section 5.1), the target words can be analyzed in advance and clear associates found via piloting of test items. The main advantage of the format is that it covers both meaning and collocation. Moreover, it can tap into multiple instances of both. For example, the two main meaning senses of common (ordinary and shared) are both addressed, as well as two collocations (common boundary, common name), in the above 1998 item. Also, while it focuses on individual words, the fact that it includes a collocation element means that it taps into knowledge of formulaic language to some limited extent.
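Because the eight-option format asks for four selections when four options are associates, blind guessing alone will produce some correct answers. A rough sketch of the chance distribution follows, under the simplifying assumption (mine, not Read's) that an examinee picks exactly four options uniformly at random; the function name and defaults are illustrative.

```python
from fractions import Fraction
from math import comb


def chance_distribution(n_options=8, n_associates=4, n_picks=4):
    """Probability of each number of correct associates under pure guessing.

    This is the hypergeometric distribution: drawing n_picks options
    without replacement from n_options, of which n_associates are correct.
    """
    n_distractors = n_options - n_associates
    total = comb(n_options, n_picks)
    lowest = max(0, n_picks - n_distractors)
    highest = min(n_associates, n_picks)
    return {
        k: Fraction(comb(n_associates, k) * comb(n_distractors, n_picks - k), total)
        for k in range(lowest, highest + 1)
    }


dist = chance_distribution()
expected = sum(k * p for k, p in dist.items())

for k, p in dist.items():
    print(k, p, round(float(p), 3))
print("expected correct associates by chance:", expected)
```

Under this assumption the expected number of correct associates is two out of four, so a raw count of correct selections cannot by itself separate partial knowledge from guessing.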
The format has been used by a number of scholars in their study of vocabulary depth of knowledge (see Greidanus, Bogaards, van der Linden, Nienhuis, and de Wolf, 2004, for one listing). For example, Qian (2002) used it in a study relating vocabulary size and depth knowledge to the reading comprehension component of the TOEFL (Test of English as a Foreign Language) test. He found the WAF was reliable (.88) and correlated with the TOEFL reading test at .77 and with vocabulary size (based on an early Vocabulary Levels Test) at .70. Most researchers have modified the WAF for their own purposes, including type of vocabulary assessed, whether the test was targeted at L1 or L2 participants, number of options (most researchers have opted for six-option (three response and three distractor) versions), and nature of the distractors. Much of the research with the WAF has been carried out in northern Europe, with scholars considering various characteristics of the format. For example, Beks (2001, cited in Greidanus et al., 2004) found that it did not matter much whether there was a fixed number of correct responses per item (e.g. three), or whether the number varied across items. Greidanus and Nienhuis (2001) explored distractor type (semantically-related versus semantically-non-related), association type (paradigmatic, syntagmatic, analytic), and frequency. The semantically-related distractors worked better for the researchers’ advanced learners, who also showed a preference for paradigmatic responses. As expected, the learners showed more knowledge on the test for more frequent words. Schoonen and Verhallen (2008) used a six-item version with participants aged 9–12 years. They found that the test appeared valid for use with their younger learners on the basis of IRT evidence, reliability of .75–.83, and concurrent correlations with a definition test of .82.
They also found that their WAF could distinguish between students with previously-known differences in level of more advanced word knowledge.
One issue that has not been satisfactorily addressed to my knowledge is the problem of how to interpret ‘split’ WAF scores. If an examinee correctly chooses all associated options, this is good evidence that some advanced (receptive) knowledge of the target item is in place. Likewise, the inability to select any ‘correct’ associates indicates that little, if any, knowledge of the items exists. But how should a score of two associates and two distractors be interpreted (on an eight-option version), given that an examinee would be expected to select two correct associates simply by random guessing? As with most multiple choice formats, if examinees are not prone to guessing, then this issue is probably not a problem. But active guessing can make the results of the test difficult to interpret. Unfortunately, Read (1998) found that guessing played a role in examinee performance on the test. Most researchers simply take the number of correct associates selected as their score (e.g. Qian, 2002), but this practice is questionable if examinees are guessing. If so, then a better approach might be to accept only scores above ‘chance level’ (e.g. two associates), but this approach would require a validation study to confirm its efficacy. Another approach is to only count items as correct in which all appropriate options are marked, and none of the distractors (e.g. Schoonen and Verhallen, 2008).
The amount of interest in the WAF gives weight to the notion that it is a viable depth-of-knowledge approach, and indeed there is a growing amount of validation evidence (e.g. Schoonen and Verhallen, 2008). Most studies have found it to be reliable and to provide useful information about vocabulary knowledge. Thus, the WAF would appear to have plenty of potential, and has an emerging track record of use. The downside (in addition to the uncertainty surrounding scoring and interpretation) is that there is no single ‘accepted’ test version which is available to be used for a wide variety of lexical research purposes. Rather, the format is available, but must be carefully adapted to individual research purposes, which entails becoming familiar with the previous attempts to use the format.
Test of English Derivatives (TED)
Given the importance of form in vocabulary acquisition, it can often be a reasonable word knowledge aspect to measure, and has been in several studies (e.g. Schmitt, 1998a; Webb, 2005, 2007a, 2007b). At the beginning stages of acquisition, it can be sensible to elicit demonstration of an item’s spelling or pronunciation. However, for lexis which is past the very beginning stages of acquisition, a more advanced facet of form to measure is knowledge of the different derivative forms within a word family. This can be informative because Schmitt and Zimmerman (2002) found that even relatively advanced learners (students studying in presessional courses preparing to enter English-medium universities) typically did not know the main derivatives of AWL target words. Their measurement instrument, Test of English Derivatives, illustrates how this type of knowledge can be tapped.
The items in TED consist of four sentences with blanks for the participants to write in the appropriate derivative form of the target item.
1. philosophy
Noun        She explained her ——— of life to me.
Verb        She was known to ——— about her life.
Adjective   She was known as a ——— person.
Adverb      She discussed her life ———.
2. ethnic
Noun        The people in his neighborhood shared the same ———.
Verb        The neighborhood ———.
Adjective   The people lived in ——— neighborhoods.
Adverb      The neighborhoods were divided ———.
The test measures form-recall, and is productive to the extent that this recall is done in the context of sentences. The researchers did not want to rely on participants’ metalinguistic knowledge by framing the prompts in metalinguistic terms, e.g. by asking, ‘What is the noun form of ethnic?’, as Alderson, Clapham, and Steel (1997) found that even native speakers often lack this kind of grammatical metalinguistic knowledge. Rather, they drew on an idea from Nagy, Diakidoy, and Anderson (1993), presenting a series of four similar, contextualized sentences for each prompt word, to which participants could respond whether or not they had the respective metalinguistic knowledge. However, the word class information was also provided for participants who did possess this knowledge. The participants were instructed to write the appropriate derivative form of the target word in each blank and were informed that the prompt word could be the proper form without alteration. The sentences were written to be similar semantically and to recycle as much vocabulary as possible. The vocabulary was drawn exclusively from the 2,000-word GSL (West, 1953), with the exception of a few other words of relatively high frequency. The sentences were mainly designed to constrain the possible derivatives for each sentence to one word class. A key concern with this format is producing a list of derivatives which would be accepted as correct responses. In order to build this list, dictionaries, corpora, and native judgements were employed. Sometimes there was more than one possible answer for a word class (adjective: philosophical, philosophic), and in these cases, either possibility was accepted. There were also many cases where a particular word class did not have a typical derivative (e.g. *ethicize), and participants were instructed to fill the blank with an ‘X’ in these cases, to show positive knowledge that no such possibility existed. 
However, given the fact that native speakers are often creative with language, piloting showed that many natives sometimes ‘made up’ words which did not occur in any dictionary or corpus (verb: ?traditionalize), even while many other natives indicated that no derivative existed. Given this split opinion, decisions were made on a case-by-case basis, balancing the information from all three input sources. Given this potential fuzziness, any researcher developing new versions of this instrument will need to be careful about developing their answer list, and consider it an inventory of ‘typical’ derivative forms, rather than a list of ‘correctness’ in any absolute sense.
Collocation measures
One of the most important types of ‘contextualized’ word knowledge is collocation, which makes it a good candidate for a depth-of-knowledge test. Most researchers of collocation have used strength of association measures (e.g. t-score, MI) to identify and analyze collocations in learner output (i.e. a comprehensive approach in Read’s (2000) terms) and these have already
been discussed in Sections 3.2–3.4. However, a smaller number of researchers have explored the assessment of selected target collocations. Though there is not yet a test which can be put forward as an accepted approach, this section will look at the various test formats which have been used in the collocational studies.
One measurement method used to elicit productive collocation knowledge in early studies was translation. For example, Bahns and Eldaw (1993) used this format when testing German speakers on their knowledge of 15 English verb + noun collocations. Prompt sentences in German containing the translation equivalents of the target English collocations were given, with the instructions to translate these into English. In principle, this format would seem to be an ideal way to elicit a productive demonstration of target collocations, but only if participants choose to use the collocations (if known) in their translations. There are usually many ways of making a translation, and there is no guarantee that the translated sentences will contain the target collocations. There is also no way of knowing whether the absence of a collocation indicates a total lack of knowledge of the collocation, some knowledge but avoidance of it, or just that the sentence was translated in a way that the collocation was not required. These issues can be mitigated by careful writing and piloting of prompt sentences, but probably never fully eliminated.
Another early method to prompt the production of collocations was with cloze items. Farghal and Obiedat (1995) used this technique in their study of the collocation knowledge of Arabic-speaking students. The following example is their attempt to elicit the collocation weak tea: I prefer ——— tea to strong tea. As with translation, there is always the chance that participants will foil the test designer’s intentions, and fill in the blank with an acceptable answer not related to the target collocation, such as herbal in the above example.
The addition of a translation of the key target collocation could help guide the participants towards the intended collocation, but this would only work if the L1 translation is not a word-for-word calque of the L2 version, otherwise the task would then become a simple translation of individual words rather than the collocation overall. For example, for the English collocation take a picture, the German translation ein Foto machen would be a good translation prompt, as a direct translation would be *make a photo. Thus, in order to produce the appropriate English equivalent, collocation knowledge is required, rather than simply the ability to translate the German prompt word-for-word. Also, items like the above example only require production of one element of a collocational pair. It may be possible to have cloze blanks for the entire collocation, but the relative lack of structure may make it difficult to constrain the possible answers down to the desired one. Again, a translation would help in this regard, and in principle, it may prove workable to even provide multiple translations for different L1s (for use with mixed L1 participant groups), as long as the direct translation problem is avoided. A researcher might also include initial letter(s) in the blanks to constrain choice.
I like to drink a ——— ——— in the morning in order to wake up. (French: café fort)
I need to ——— some ——— from my savings account into my checking account. (Czech: převod peněz; German: Geld überweisen; Swedish: överföra pengar)
(strong coffee, transfer money)
Unfortunately, early researchers using these approaches did not address measurement issues, and so their studies provide little guidance concerning either the formats’ validity or problems. Gyllstad (2005, 2007) highlights several limitations of these early studies: usually only a small number of items were tested (typically 10–20), which makes it difficult to draw any firm conclusions; selection of target collocations was usually made in an unsystematic way or not described at all; often no reliability values were reported for the test instruments, and few of the studies compared learners at different levels of formal instruction.
Happily, research moves on, and some later researchers have been much better in describing the validity evidence for their instruments, and so we have a clearer idea of those instruments’ behavior. One of these researchers is Bonk (2001), who explored three different collocation test formats. Two utilized sentence clozes, the first focusing on verb + object collocations with a gap for the verbs to be inserted:
Punk rockers dye their hair red and green because they want other people to ——— attention to them. (pay)
The second focused on verb + preposition collocations, with a gap for prepositions to be inserted:
Many of the birds in the area were killed ——— by local hunters. (to exterminate) (off)
The third test utilized multiple-choice items. The target collocations were verb phrases, with each item containing four option sentences. The test-takers’ task was to identify which of the four sentences did not contain a correct usage of the verb.
(a) Are the Johnsons throwing another party?
(b) She threw him the advertising concept to see if he liked it.
(c) The team from New Jersey was accused of throwing the game.
(d) The new information from the Singapore office threw the meeting into confusion.
(b)
After checking the three tests with ten native speakers, Bonk gave them to 98 mainly Asian EFL university students, along with a condensed TOEFL test to measure general language proficiency. He found that the students scored similarly on the three collocation tests (8.7/17, 8.8/17 and 7.8/16 respectively), which equates to about 50% of the maximum possible scores. A Kuder-Richardson-20 analysis showed that reliability for the three combined tests was .83, but also that the verb + preposition test was weak in this regard, with an unacceptably low figure of .47. Bonk also did classical item analyses, including item facility, item discrimination, and point-biserial coefficients. These showed that most of the items functioned and discriminated well. A Rasch (IRT) and Generalizability analysis showed that the three combined collocation tests worked reasonably well on the whole participant population, but that the verb + preposition test was relatively weak. Bonk found no instances of low proficiency (TOEFL) scores combined with high collocation scores, or vice versa. Overall, Bonk’s analyses suggest that his verb + object cloze and multiple choice formats are valid methods of assessing collocational knowledge, but that the verb + preposition format could not be recommended.
Mochizuki (2002) also used a multiple choice format, but without sentence contexts. His four-choice format listed one component of the target collocation and testees had to decide which of the four alternatives was the appropriate partner:
job (1) answer (2) find (3) lay (4) put
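For readers who wish to run the kind of classical analyses Bonk reports, the following is a minimal sketch of item facility and Kuder-Richardson-20 for dichotomously scored items. The response matrix is invented purely for illustration and is not Bonk's data; the function names are hypothetical, and population variance is assumed so that the total-score variance is on the same footing as the item p × q terms.

```python
from statistics import mean, pvariance


def item_facility(responses):
    """Proportion of test-takers who answered each item correctly."""
    n_items = len(responses[0])
    return [mean(person[i] for person in responses) for i in range(n_items)]


def kr20(responses):
    """Kuder-Richardson 20 for items scored 0 (wrong) or 1 (right)."""
    k = len(responses[0])
    facilities = item_facility(responses)
    sum_pq = sum(p * (1 - p) for p in facilities)
    # Population variance of the total scores (assumption: N, not N - 1).
    total_variance = pvariance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: five test-takers (rows) by four items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(item_facility(responses))
print(round(kr20(responses), 2))
```

With real data, a researcher would normally also inspect item discrimination and point-biserial coefficients, as Bonk did; these are omitted here to keep the sketch short.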
Mochizuki gave his collocation test as part of a vocabulary battery to 54 Japanese first-year university students in April and then in January (one academic school year in Japan), in which students received 75 hours of instruction (reading and conversation classes). Mochizuki found that although there was no significant improvement on the vocabulary size or paradigmatic knowledge tests in the battery, there was a significant gain in mean
collocation score (41.7 → 42.8 (max = 72)). The collocation test had variable reliability, .54 at T1 and .70 at T2 (Cronbach alpha). However, the low reliability scores may be partially caused by the Japanese participant group being relatively homogeneous, as the lack of variance generally results in low internal reliability values (see Brown, 1983: 86). It might be noted that although the increase in collocation scores might be significant in this case, in absolute terms, an improvement of 1.1 on a 72-point test is not particularly meaningful.
There has also been an attempt to use a developmental scale approach to assessing collocation knowledge. Barfield (2003) developed a four-stage scale, onto which participants were asked to judge the frequency of the target collocations:
I. I don’t know this combination at all.
II. I think this is not a frequent combination.
III. I think this is a frequent combination.
IV. This is definitely a frequent combination.
Barfield’s test focused on decontextualized verb + noun collocations. He took 40 lexical verbs and found three noun collocates for each (e.g. break + ground, break + record, break + rules). Similar to the methodology in checklist tests, non-collocations were substituted for some of the verb + noun collocations to check on the validity of the responses (adopt + approach, adopt + child, *adopt + profit). Barfield gave the test (100 real collocations and 20 non-collocations) to 93 Japanese university students, after first confirming that they knew the nouns in the collocations (they did). The mean result for the collocation scale was 2.56 (SD .39), with real collocations scoring 2.65 (SD .47), and non-collocations 2.15 (SD .62). The reliability for the real collocations was high (.97, Cronbach alpha), as was the reliability for the non-collocations (.93). Barfield studied two participant groups (high and low), and there was a significant difference between them on the real collocation scores, but not on the non-collocations. Thus, higher proficiency learners self-evaluated their knowledge of collocations more highly than lower proficiency learners, which suggests more advanced collocation knowledge. Conversely, both proficiency levels judged the non-collocations about the same, which suggests that both groups were similar in rejecting the non-collocations. (Note that many of these were rated at Stage 2 – I think this is not a frequent combination). Barfield’s study is an interesting attempt to use a developmental scale with collocation knowledge, since it is clearly not an all-or-nothing proposition. However, one potential problem highlighted by Gyllstad (2005: 11) is that some of the non-collocations are possible in certain contexts, e.g. ?explain address, ?approve opportunity and ?create temperature are all possible
combinations in the contexts of ‘to explain an address to someone’, ‘to approve of a job opportunity’, and ‘to create a temperature at which certain solid elements become liquid’. However, although these combinations may be possible in certain constrained contexts, compared to the other much more frequent collocations, they certainly would not be typical. Given the creativity of language users, typicality of language use is probably a more reasonable criterion than possibility of use. Another issue is the scale itself. It essentially measures self-evaluation of whether a verb + noun combination is a frequent collocation or not. This only makes sense if the target collocations are in fact frequent. If the target combinations are collocations but somewhat infrequent ones (as high MI score collocations are likely to be), then Stage 2 is actually the most appropriate judgement. It is not clear from Barfield’s report whether the collocations are frequent or not. Another issue is whether the stages are equidistant and so can be considered an interval scale. I am not sure this could be established one way or the other, and so the use of means to summarize the results is dubious.
Gyllstad (2005, 2007) notes that most collocation studies focus on collocations made up of content words. However, research has shown that delexical verbs (make, take, do, give, have) occur frequently in English and are difficult even for advanced learners (Altenberg and Granger, 2001; Nesselhauf, 2004). For this reason, he focuses on collocations containing these in his two collocational test formats. In the format he calls COLLEX 5, testees must decide which verb + noun combination is a real collocation out of three options, and check the corresponding box:
   a                   b                 c
1. do damage           make damage       run damage
2. turn out a fire     put out a fire    set out a fire
3. hold discussions    do discussions    set discussions
4. receive a cold      fetch a cold      catch a cold
5. press charges       run charges       push charges
The 50 items were created mainly from high frequency words (first 3,000), based on Kilgarriff’s frequency lists (see Section 6.4). This helped ensure that participants knew the component words of the collocations. Also, to the extent possible, the three verbs in an item were similar in frequency to each other. The targets were confirmed to be genuine collocations by setting a minimum z-score of > 3, although most of the collocations had very high scores much above this. Similarly, the distractor combinations were checked with the BNC to make sure they were not collocations. The validation of the test showed that there was a clear progression in the COLLEX 5 scores as proficiency increased, and that the test had good reliability at .89.
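To make the z-score criterion more concrete, here is a minimal sketch of how a candidate pair might be checked against corpus counts. It uses one common simplified formulation of the association measures (expected frequency E from the two word frequencies, z = (O − E)/√E, t = (O − E)/√O, MI = log2(O/E)); this is an assumption for illustration, and not necessarily the exact calculation Gyllstad used. The function name and all frequency counts are hypothetical.

```python
import math


def association_scores(f_node, f_collocate, f_pair, corpus_size, span=1):
    """Return expected frequency, MI, t-score and z-score for a word pair.

    f_node, f_collocate: corpus frequencies of the two words
    f_pair: observed co-occurrence frequency (O) within the span
    corpus_size: number of tokens in the corpus
    span: number of positions searched around the node (assumed 1 here)
    """
    if f_pair <= 0:
        raise ValueError("the pair must co-occur at least once")
    expected = f_node * f_collocate * span / corpus_size
    return {
        "expected": expected,
        "mi": math.log2(f_pair / expected),
        "t": (f_pair - expected) / math.sqrt(f_pair),
        "z": (f_pair - expected) / math.sqrt(expected),
    }


# Hypothetical counts, not taken from the BNC or any real corpus.
scores = association_scores(
    f_node=1_000, f_collocate=2_000, f_pair=50, corpus_size=100_000_000
)
is_collocation = scores["z"] > 3  # the minimum threshold mentioned above
print(scores, is_collocation)
```

A distractor check would run the same calculation and require that the combination falls below the threshold, or does not occur at all.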
The COLLMATCH 3 format presents verb + noun combinations and requires informants to make yes/no decisions as to whether they are collocations or not:
1. press charges     yes   no
2. lay pressure      yes   no
3. bend a rule       yes   no
4. score problems    yes   no
As with COLLEX 5, COLLMATCH 3 produced a clear progression across proficiency levels, and a reliability of .89. Through an extended series of piloting and validation, Gyllstad builds a considerable validation case for the COLLEX 5 and COLLMATCH 3 formats. Interestingly, as part of the validation, Gyllstad found that both tests correlated highly with the Vocabulary Levels Test (COLLEX 5: .88; COLLMATCH 3: .83), indicating a strong relationship between vocabulary size and receptive recognition knowledge of English collocations. As part of his piloting, Gyllstad used item-whole correlations, even though knowledge of any particular collocation does not necessarily entail knowledge of other particular ones. According to my ‘item-whole test’ argument in Section 5.1.3, this is questionable. The approach may be more supportable if one can ensure that the target collocations are representative of collocations in general, but this requires a principled way to ensure representativeness. However, at the moment, I know of no way this can be accomplished, simply because our understanding of the domain of collocations (its extent and nature) is too limited.
It is interesting to look at two of the earlier test formats in Gyllstad’s test development. The original format used for COLLMATCH 1 was a grid, such as illustrated below:
Check each combination which you think exists in use in English.
        charges   patience   weight   hints   anchor   blood
drop
lose
shed
The number of real combinations was not indicated to testees, making the test relatively demanding, since the number of alternatives is large. After piloting, Gyllstad decided to scrap this version, mainly because most of the items were in fact non-collocations (93/144, 65%). This meant that the test mainly measured learners’ ability to reject non-collocations rather than their ability to recognize real collocations. This ratio is inherent in the grid design, as it was difficult to find objects that partnered with two or all three of the verbs. For example, in the 18-box grid above, there are only eight
collocations (drop charges, drop hints, drop anchor, lose patience, lose weight, lose blood, shed weight, shed blood). This led to the revised COLLMATCH 2 format, where the number of real collocations could be controlled and a better ratio achieved (65 real collocations and 35 non-collocations). It presented five noun combinations with the target verb, and the participants’ task was to tick the collocations they thought existed in English, and to leave the boxes of the non-collocations blank:
a. draw the curtains
b. draw a sword
c. draw a favour
d. draw a breath
e. draw blood
Although this format had better test characteristics than COLLMATCH 1, Gyllstad decided to abandon it as well, mainly because it was hard to generalize its results to the wider domain of general collocational knowledge, as each of these extended items was based around the collocations of only a single verb.
Studies illustrating multiple word knowledge measures
Schmitt (1998a)
One of the first studies to concurrently measure a number of word knowledge aspects was Schmitt (1998a), where I measured advanced learners’ knowledge of spelling, word class/derivative form, meaning senses, and association. In fact, I also attempted to measure collocation, but was not all that happy with the measurement instrument (Schmitt, 1998b). I tried to devise word knowledge measures which captured the incremental nature of vocabulary learning. The test of spelling consisted of a four-point rating system. Zero (0) on the scale indicated that the participant demonstrated no knowledge of a word’s spelling. One (1) signified that the participant could give the initial letters of the target word, but omitted some later letters, added unnecessary letters, or transposed letters. Two (2) indicated that the word was phonologically correct, but perhaps some vowels or consonants were replaced by similar-sounding but erroneous items (brood – *brud; illuminate – *elluminate). Three (3) indicated fully correct spelling. This approach is similar to Barcroft (2002), who proposed a five-point scale.5
The association measurement procedure asked participants to give three responses for each target word stimulus. These responses were compared to a native norming list. In Category 0, none of the three responses matched any of those on the norming list, in which case, no native-like association behavior was demonstrated. In Category 1, some responses matched infrequent ones on the norming list, indicating a minimal amount of native-like association knowledge. In Category 2, the responses were similar to those typical of the native norming group, indicating native-like associations. Lastly, in Category 3, the responses were similar to those in the top half of the native norming group, indicating a native-like rating in which even more confidence could be put (see Schmitt, 1998c, for more detail).
The norms for the word class and derivational forms were obtained from three dictionaries. The participants received one point for knowing the word class of the target word, and one point for knowing how to transform it into each of the three other word classes. If a form for a word class did not exist, participants got credit for being able to state that fact. When two or more forms were possible for any word class, only one was required for credit. For example, participants were awarded one point for knowing that illuminate is a verb, and one point each for knowing illumination is the noun form, illuminated or illuminating (only one required) is the adjective form, and that no common adverb form exists. During the development of this measure, I noted that the norming data from the dictionaries sometimes conflicted with the native pilot participants’ answers, particularly for adverbials, with the dictionaries occasionally listing forms that the natives found strange. In these cases, I consulted the BNC to check those forms’ frequency of occurrence. If it was very low, I still accepted it as a possible form for that word class, but I also considered acceptable an answer that no form existed. For example, the very rare adverb form of circulate, circularly, is so uncommon that I also accepted the answer “No form exists.” Thus, the possible scores ranged from 0 (knowledge for no word class) up to 4 (knowledge for all four word classes). Because this study attempted to describe vocabulary acquisition up to the level of full mastery, it was important to measure knowledge of all of the major meaning senses of the target words. (Knowing only a single meaning sense for a polysemous word must be considered only partial knowledge.) I consulted three dictionaries to determine the major meaning senses. For cases in which they disagreed, I made decisions based on the responses from both native and nonnative pilot participants and on corpus data. 
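The word class and derivative scoring described at the start of this passage (0 to 4 points per target word) can be sketched as follows. This is only an illustrative simplification, not the procedure actually used in the study: the NO_FORM marker, the function name, and the dictionary layout are all hypothetical, and identifying the word class of the target word is treated here as simply supplying the target form under its own class. The norms for illuminate follow the example given in the text.

```python
WORD_CLASSES = ("noun", "verb", "adjective", "adverb")
NO_FORM = "NO FORM"  # hypothetical marker for the answer "no form exists"


def score_word_classes(answers, norms):
    """Award one point per word class whose answer appears in the norms.

    answers: dict mapping word class -> the participant's response
    norms:   dict mapping word class -> set of acceptable responses
             (include NO_FORM where "no form exists" is acceptable)
    """
    score = 0
    for word_class in WORD_CLASSES:
        response = answers.get(word_class)
        if response is not None and response in norms[word_class]:
            score += 1
    return score


# Norms for 'illuminate', following the example in the text.
illuminate_norms = {
    "verb": {"illuminate"},
    "noun": {"illumination"},
    "adjective": {"illuminated", "illuminating"},  # only one required
    "adverb": {NO_FORM},  # no common adverb form exists
}

full = {"verb": "illuminate", "noun": "illumination",
        "adjective": "illuminating", "adverb": NO_FORM}
partial = {"verb": "illuminate", "noun": "illumination",
           "adverb": "illuminately"}

print(score_word_classes(full, illuminate_norms))
print(score_word_classes(partial, illuminate_norms))
```

Where a rare dictionary form was also accepted, as with the adverb case discussed above, the norm set for that word class would simply contain both the rare form and the NO_FORM marker.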
Whereas I only measured the other word knowledge types productively, it was both feasible and desirable to measure both receptive and productive knowledge of word meaning, because a major part of the incremental acquisition of word meaning probably involves the move from receptive to productive mastery of different meaning senses. I asked the participants to explain all of the meaning senses they knew for each target word. After the participant could not think of any additional senses, I gave prompt words designed to elicit additional senses that the participant might know but could not recall. The prompts were designed to trigger the related sense if the participant knew it, but not to give it away if it were unknown. For example, for the target word spur, the prompt word horse was designed to suggest the meaning, “metal device worn on the heel of a boot used to guide or encourage a horse”. Unprompted explanations of a meaning sense demonstrated meaning recall and were awarded 2 points. Acceptable explanations given after a prompt were assumed to be less well mastered (i.e. still meaning recall but with the aid of a semantically-related prompt) and received 1 point. If the participant could not describe the meaning sense
after the prompt, that sense was scored ‘unknown’ (0 points). Because the target words had differing numbers of meaning senses, a way of comparing the different words was to calculate a meaning proportion by taking the participant’s total point score for each word and dividing that by the number of possible points (i.e. number of meaning senses × 2 points each). While convenient, this method was somewhat problematic in that a single proportion score could represent different constellations of meaning knowledge. For example, a meaning proportion of .50 could indicate knowing all meaning senses receptively, half productively, or some combination of the two.
Webb (2005, 2007a, 2007b)
Stuart Webb has taken the notion of concurrent multi-componential vocabulary measurement and pushed it to a new level. In a series of studies, he used an extensive test battery measuring receptive and productive knowledge of orthography, the form-meaning link, grammatical functions, collocation, and association. The ten-part battery is illustrated below. As with Schmitt’s (1998a) battery, the individual tests were carefully sequenced to avoid the risk of earlier tests affecting answers to later tests. The target items were nonwords matched with the meanings of low-frequency vocabulary (e.g. dangy = boulder; masco = locomotive). As Webb employed the battery to explore the efficacy of various learning tasks, the use of nonwords was beneficial in ensuring that participants had no previous knowledge of the target words.
Test 1: Productive knowledge of orthography: Participants heard each nonsense word pronounced twice, and then had ten seconds to write it down. Only fully correct spelling was marked as correct.
Test 2: Receptive knowledge of orthography: The correct spelling of the target nonword was given along with three distractors with similar phonetic and orthographic forms.
(a) dengie (b) dengy (c) dungie (d) dangy
Test 3: Productive knowledge of meaning and form:
This is a form-recall test using translation. The L1 translation was given and the informants were required to write the L2 nonword beside it. As the ability to spell the nonword had already been measured in Test 1, the response was not marked down for slightly deviant spelling, as long as the nonword could be clearly discerned.
Test 4: Productive knowledge of grammatical functions: Participants were asked to write a sentence using the target nonword. The instructions made clear that the sole criterion for correctness was grammatical accuracy in the usage of the nonword. For example, The girl mascoed to school would be marked as incorrect (verb form), while It is a masco (noun form) would be correct, regardless of the semantic quality of the sentence. Thus, this test essentially measured knowledge of word class and the attachment of inflections appropriate to that class.
Test 5: Productive knowledge of collocation: It is interesting to note that Webb describes this test as a ‘syntax’ test in his papers. However, it is probably most accurately labelled an association test, but one which only allows for syntagmatic associations, which are essentially collocations. This can be compared to his ‘association’ tests (Tests 6 and 9) which focus solely on paradigmatic associations. Participants were required to produce one syntagmatic association to the nonword prompt. Webb reports that it was made clear to the participants that only a syntagmatic response would be accepted, but it is not reported how this was done in practice. It is also not clear how judgments concerning the informant responses were made. He reports using a ‘common sense’ approach in deciding whether the responses are typically encountered in contexts with the meaning attached to the nonword, e.g. locomotive (masco) station, tracks, left, arrived, or whether they are less frequently found in that context (clock, ate, hard). Moreover, it is not reported whether this was based on the researcher’s sole intuition, a panel of judges, corpus evidence, or a combination. However, the description in Webb (2007a) indicates that a second rater was employed with good inter-rater reliability, so this hints at individuals using their intuition. This point holds true for all of the association-based tests.
Test 6: Productive knowledge of association: This used the same test format as Test 5, but focuses on meaning-based paradigmatic associations. Therefore responses such as masco: train, airplane, vehicle would be scored as correct, but non-associations and syntagmatic associations would not.
Test 7: Receptive knowledge of grammatical functions: This multiple choice test gave the nonword in three sentence contexts in which it had different word classes. The participants needed to choose the sentence where it was illustrated correctly. As the sentences were semantically
barren, the only type of knowledge from which to make this judgement was grammatical. Thus, knowledge that masco (locomotive) is a noun should lead to the selection of the correct answer (a).
(a) It is a masco. (b) It mascoed. (c) It is very masco.
Test 8: Receptive knowledge of collocation: This is another multiple choice test, where informants need to circle the words which are most likely to appear in context with the target nonwords. Note that the options are all from the same word class. The first item is for dangy (boulder) and the second is for hodet (lane).
dangy   (a) fall    (b) wash   (c) walk   (d) catch
hodet   (a) drive   (b) sit    (c) take   (d) know
Test 9: Receptive knowledge of association: This test is the same format as Test 8, but focuses on paradigmatic associations.
dangy   (a) stone   (b) plant     (c) tree     (d) person
hodet   (a) park    (b) highway   (c) garden   (d) building
Test 10: Receptive knowledge of meaning and form:
(2005, 2007a studies): The last test is the counterpart to Test 3, but here the L2 word form is given, and the meaning must be recalled. To demonstrate this meaning, the corresponding L1 word must be written in the blank.
masco ———
(2007b study): This study used a multiple choice meaning-recognition test. This is illustrated by the following example of ancon (hospital).
ancon   (a) hospital   (b) house   (c) car   (d) city
Several observations can be made about Webb’s test battery. The first is that it is much more comprehensive than any used before to explore pedagogical issues in second language vocabulary, which allowed him to describe vocabulary learning much more fully. His 2005 study showed that both the short reading and writing tasks he used led to more than just form-meaning
links, enhancing all of the word knowledge aspects he measured. Also, it seemed that the productive task led to somewhat more learning for all word knowledge types, in terms of both receptive and productive mastery. In contrast, the 2007a study compared vocabulary learning from an L1 translation with learning from an L1 translation plus a short context sentence. He found that the addition of a single sentence context made no difference in the enhancement of any of the word knowledge aspects over the translation input alone. However, more exposures do lead to acquisition. Webb (2007b) exposed learners to nonwords in reading texts once, three times, seven times, and ten times. He found that each additional exposure band led to enhancement of at least one type of word knowledge, and usually many/most of them.

The second point is that although his individual tests are scored dichotomously (correct/incorrect), the fact that he uses both receptive and productive formats for each type of word knowledge means that he can show a progression from no receptive knowledge→receptive knowledge→productive knowledge. This is very useful, as he was able to show not only that different pedagogical tasks enhanced different types of word knowledge, but that they did so to different degrees for the different word knowledge types. This much more detailed description gives a much clearer insight into the mechanics of vocabulary acquisition and what factors tend to enhance it and how.

Almost inevitably with such an ambitious measurement program, there are limitations to the measurement methodology. The first is that only a very limited number of target items could be addressed (10–20). In none of the studies is it reported how long the entire battery took, but it must be assumed to be a considerable time, which presumably constrained the number of target items.

Summary

All of the tests in this section illustrate different ways of tapping into depth of knowledge. Although none of them are yet established as accepted standards, it seems clear that they provide much richer information about informants’ lexical knowledge than typical form-meaning formats. This being the case, I feel that this approach definitely should be followed up and included to the extent possible in vocabulary research. While there will always be the issue of sampling rate vs. richness of measurement information elicited, many research questions require information on depth of knowledge to truly understand what is happening. Perhaps the best solution is to combine approaches, with some measures estimating the ‘quantity’ realm (e.g. size of lexicon), and others tapping into the ‘quality’ of the lexical knowledge within that realm. These combined measures could be contained within the same study, or if time is a constraint, then within consecutive studies, whose results can be linked for greater understanding.
5.4 Measuring automaticity/speed of processing

Most vocabulary research has been couched in terms of knowledge rather than in terms of how automatically that knowledge can be deployed (Section 2.11). This is partly because it is easier to measure knowledge than automaticity with paper-and-pencil tests. It is no doubt also due to the feeling that having knowledge is more important than how quickly it can be utilized. For example, many vocabulary studies have had pedagogical aims, where knowledge was considered the key attribute. Conversely, psycholinguistic research has commonly used measures of automaticity (e.g. reaction times), but in many cases the vocabulary involved was merely a convenient stimulus task, rather than the main focus of the research (e.g. the effects of the L1 lexicon on the L2 lexicon).

Regardless of this focus on knowledge (typically of form and meaning), automaticity is also a key factor in how well vocabulary can be used. This is obvious in the verbal skills, which are carried out on-line in real time. A person usually has one chance to catch the words an interlocutor has spoken; the mind cannot rewind and replay the stretch of discourse. This is unless one asks the interlocutor to repeat themselves, which is annoying if done too much, and should be considered part of discourse strategy competence. On-line processing deficits can be even more obvious in speech; if one does not have words at one’s disposal, the speech can become very disfluent indeed. Speed of processing is also important in reading and writing. Recognition speed is essential in reading, as sight vocabulary is a key requirement for fluent reading (Carrell and Grabe, 2002). Van Gelderen, Schoonen, de Glopper, Hulstijn, Simis, Snellings, and Stevenson (2004) found evidence supporting this, in the form of substantial correlations between speed of processing and reading comprehension, both in the L1 and L2 (although speed did not add additional predictive power to a regression analysis which included vocabulary, grammar, and metacognitive knowledge). In the companion study to Van Gelderen et al. (2004), Schoonen, van Gelderen, de Glopper, Hulstijn, Simis, Snellings, and Stevenson (2003) found similar results for writing: speed of processing correlated with writing proficiency in both L1 and L2, but added no unique contribution in a regression analysis.

The importance of automaticity means it is an aspect of vocabulary mastery that should be given more attention in vocabulary research. Luckily, new technology makes it increasingly possible to measure this construct. Many vocabulary measurement tasks can now be done either on a computer or online on the internet. In either case, programs can be set up to measure the response times as informants provide their answers. Researchers devising new research which is computer/internet-based (and it is likely that this will increasingly become the norm) should definitely consider the possibility of adding a timing element to their design.
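As a rough illustration of how a timing element might be added to a computer-based vocabulary task, the following minimal sketch records the time between presenting a prompt and receiving an answer. The function names (timed_trial, run_session) and the record layout are hypothetical and do not come from any of the studies discussed. Console input of this kind will also be far less precise than dedicated experiment software, so the sketch shows the logic only.

```python
import time
from typing import Callable, Dict, List


def timed_trial(prompt: str,
                get_response: Callable[[str], str] = input,
                clock: Callable[[], float] = time.perf_counter) -> Dict[str, object]:
    """Show one prompt and return the answer with its response time in ms."""
    start = clock()                      # moment the prompt is presented
    answer = get_response(prompt)        # blocks until the informant responds
    rt_ms = (clock() - start) * 1000.0   # elapsed time in milliseconds
    return {"prompt": prompt, "answer": answer.strip(), "rt_ms": rt_ms}


def run_session(prompts: List[str],
                get_response: Callable[[str], str] = input,
                clock: Callable[[], float] = time.perf_counter) -> List[Dict[str, object]]:
    """Run every prompt in order and collect one timed record per item."""
    return [timed_trial(p, get_response, clock) for p in prompts]
```

Passing the response function and the clock as parameters is a design choice made here so that the same code can be driven by a keyboard, a web form handler, or a scripted test, without changing the timing logic.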
Psycholinguistic tasks provide very precise timings, with responses measured to milliseconds (e.g. in priming studies). However, psycholinguistic experiments typically use very basic and controlled tasks, most often lexical decision, where only a simple decision needs to be made (e.g. Is the item a real word or not?) and a button pushed. Lexical decision tasks have been used in studies addressing a wide variety of lexical phenomena, but they are not the only measurement task possible. More ‘lifelike’ tasks which measure vocabulary in use could also be used (e.g. recalling a word to be used in an essay, using a target item in a sentence, recognizing target items in a passage). However, they are likely to be relatively ‘noisy’, as any differences in processing speed may be overshadowed by other factors, for example, differences in writing speed. Thus great care needs to be taken in setting up such studies to control for as much of this ‘external’ variation as possible. Still, many relatively realistic vocabulary tasks can be sensibly timed if the research design is developed with this in mind. Once the design is set, it is fairly easy to set up an automaticity measurement with a software program like E-Prime.

An example of this type of study is Siyanova and Schmitt (2008), who used a timed task to compare the speed of native versus nonnative judgements of collocation frequency. A total of 27 native and 27 advanced nonnative speakers judged the frequency of adjective-noun combinations, half of which were frequent collocations in the BNC (criminal offence), while the other half did not occur in the BNC, although they made sense and were possible combinations (exclusive crimes). Siyanova and Schmitt found that the nonnatives were much slower in making the judgements of both typical collocations (NNS: 2,813 milliseconds, NS: 1,945) and noncollocation combinations (NNS: 3,904, NS: 3,023). Based on this evidence, they concluded that not only was the nonnatives’ collocational knowledge less accurate than natives’ (based on another part of the study not reported here), but that it was less automatic as well.

Another example of an automaticity task is self-timed reading, a technique for measuring the speed at which participants can read target lexical items in context. The target items are embedded in context, which is then shown on a monitor screen one line/phrase/sentence at a time. Once this is read, the participant presses a button to bring up the next screen. The participants are instructed to read as quickly as they can, while still understanding the meaning. The reading times (between button pushes) can then be recorded. While mainly a technique to measure reading, it can be useful for measuring the speed at which longer formulaic sequences are read. Just such a use is illustrated by Conklin and Schmitt (2008), who embedded formulaic sequences and matched non-formulaic control strings in story passages. There were three conditions: formulaic sequences where the context forced an idiomatic reading (a breath of fresh air = an interesting
new situation), formulaic sequences with a literal meaning (a breath of fresh air = breathing nice air), and non-formulaic control strings which contained all/most of the words from the formulaic sequences, but in a different order (fresh breath of some air). These conditions are illustrated below in one of the study’s passages (italics = idiomatic, bold = literal, underlined = non-formulaic). (Note that this formatting is for the reader’s convenience; in the actual study, all of the text was rendered in the same regular font.)

Dave used to work in Japan. He really liked his job there but he hated riding the trains because they were so overcrowded with hundreds of people crammed into the railway cars. Every morning the people were packed in like sardines during the rush hour time. Dave hated having people pushing against him from every side so would try to get a place where he could stand with his back against the wall if there was a place there. At least the wall was cooler than a hot and sweaty body against his back. In the summer the cars were always hot and the open windows hardly seemed to help at all. On one extremely hot trip, he felt like he was choking and desperately needed a fresh breath of some air before he became ill. He managed to hold out until the next station and staggered off of the train. He decided then and there that it was time to leave Japan and escape the rush-hour craziness.

(Conklin and Schmitt, 2008: supplementary material)
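The recording step in self-timed reading (taking the time between button pushes as the reading time for each segment) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of millisecond timestamps, and the condition labels are hypothetical, and the sketch is not the procedure or software used by Conklin and Schmitt.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional, Sequence


def segment_reading_times(event_times_ms: Sequence[float]) -> List[float]:
    """Convert timestamps into per-segment reading times.

    event_times_ms[0] is when the first segment appeared; every later
    entry is a button press that replaced the segment on screen.
    """
    if len(event_times_ms) < 2:
        raise ValueError("need an onset time and at least one button press")
    return [later - earlier
            for earlier, later in zip(event_times_ms, event_times_ms[1:])]


def mean_time_by_condition(conditions: Sequence[Optional[str]],
                           reading_times_ms: Sequence[float]) -> Dict[str, float]:
    """Average reading time per labeled condition (None marks a filler segment)."""
    if len(conditions) != len(reading_times_ms):
        raise ValueError("one condition label is needed per segment")
    grouped = defaultdict(list)
    for label, rt in zip(conditions, reading_times_ms):
        if label is not None:
            grouped[label].append(rt)
    return {label: mean(times) for label, times in grouped.items()}
```

In a real design the segments containing target strings would need to be matched for length and position, since raw reading times also reflect how much text each screen contains.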
Conklin and Schmitt found that their participants read both types of formulaic sequence more quickly than the matching non-formulaic control
strings, but there was no difference between the idiomatic and literal meanings. They interpreted this as evidence that formulaic language is more easily processed than non-formulaic language. Their study demonstrates how self-timed methodology can be used to measure the processing of formulaic sequences in context.

Although I have used the terms automaticity and fluency as synonyms for relatively quick speed of processing, it is important to note that some scholars have made rather more precise distinctions. Lennon (2000) distinguishes between a lower-order fluency (essentially speed, e.g. rate of articulation) and a higher-order fluency which involves a broader proficiency: ‘the rapid, smooth, accurate, lucid and efficient translation of thought into language’ (p. 40). Likewise, automaticity has been more tightly defined. Segalowitz and Segalowitz (1993) and Segalowitz and Hulstijn (2005) point out that increased speed can accrue from two different sources. One is the simple speeding up of processes which a person already possesses. Another is through the development of new processes which allow quicker operations. They argue that this second route (which they term automaticity) is important because it represents some mental restructuring or reorganization which makes language processing more efficient (and for nonnatives, more native-like), compared to a general speeding up of existing processes. Segalowitz and Segalowitz (1993) propose a statistical method of distinguishing between these two sources of faster processing based around the coefficient of variation (CV). CV is defined as the standard deviation of the reaction times divided by the mean reaction time. The procedure is described in detail in Segalowitz and Segalowitz, who use it in studies, as do Segalowitz, Segalowitz, and Wood (1998), and Segalowitz, Poulsen, and Segalowitz (1999).
However, it is probably best illustrated by the explanation in Segalowitz and Hulstijn (2005: 374–375):

Suppose a videotaped recording of a person making a cup of tea on 50 different occasions is viewed. Each component of the action – putting the water on to boil, pouring the hot water into a cup, inserting the tea bag, and so on – will take a particular length of time. A mean execution time and a standard deviation for this mean can be calculated across the 50 repetitions for both the global action of ‘making tea’ and for each component of this event. Suppose now a new videotape is created by rerecording the original at twice the normal speed. On the new tape, the entire event will appear to be executed in half the time with half the original standard deviation overall; moreover, the mean duration of each component and the standard deviation associated with each component will also be reduced by exactly half. This corresponds to what Segalowitz and Segalowitz (1993) argued to be the null case of generalized speed up; performance becomes faster because the underlying component processes are executed more quickly and for no
other reason ... Suppose now we are shown still another videotape in which the mean time for the global action of making tea is again half of the original mean time, but the standard deviation for the 50 repetitions is far less than half the original standard deviation. This tape cannot have been produced simply by rerecording the original at twice the normal speed. Instead, there must have been some change in the way the activity of making tea had been carried out, such that some of the slower and more variable components of the action sequence had been dropped or replaced by faster, less variable components. In other words, there must have been a change that involved more than simple speed-up, namely, some form of restructuring of the underlying processes.

This speeding up versus automaticity distinction may not be important if the purpose is determining whether vocabulary is being processed more quickly and what teaching or input led to this speed increase. But it may well be important to researchers who are interested in understanding the mental lexicon and explaining the mechanisms underlying any increases in processing speed. However, Hulstijn, van Gelderen, and Schoonen (2009) wonder whether the distinction between speeding up versus automaticity can be easily made in practice by the coefficient of variation. They reviewed seven previous studies using CV, then analyzed two of their own which were part of the L1 Dutch/L2 English NELSON project. Overall, they found minimal support for the proposition that CV reliably indicates developing automaticity. They feel that it is problematic to use the CV approach as an operationalization of automaticity when interpreting the reaction-time data typically used in automaticity studies, and that the holistic nature of language acquisition makes it difficult to differentiate between gains in knowledge itself and gains in the skill of processing of that knowledge: ...
gains in knowledge itself and gains in processing it cannot be adequately disentangled in the RT [reaction-time] tasks used in the studies reviewed and reported. This may not just be an unfortunate feature of the RT tasks but may be an inherent characteristic of language learning. Although conceptually skill acquisition can be distinguished from knowledge accumulation, in reality knowledge accumulation forms part of skill acquisition because, in real L2 learning, exposure to new words goes hand in hand to exposure of words encountered before. L2 learning is both a matter of knowledge accumulation and of an increase in the efficiency with which that knowledge can be processed in knowledge-access tasks (listening and reading) and in knowledge-retrieval tasks (speaking and writing). (2009: 576)
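The logic of the CV comparison described above (standard deviation divided by mean reaction time) can be made concrete with a small worked sketch. The reaction times below are invented purely for illustration and are not data from any study; the sample standard deviation is used here, which is an assumption, since the choice between sample and population formulas is not specified in the text.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(rts_ms: Sequence[float]) -> float:
    """CV = standard deviation of the reaction times divided by their mean."""
    return stdev(rts_ms) / mean(rts_ms)


# Invented reaction times (ms) for one informant at three points in time.
baseline = [800, 1000, 1200, 900, 1100]
sped_up = [rt / 2 for rt in baseline]        # every response exactly twice as fast
restructured = [480, 500, 520, 490, 510]     # same mean as sped_up, far less spread

for label, rts in [("baseline", baseline),
                   ("sped_up", sped_up),
                   ("restructured", restructured)]:
    print(f"{label:>12}: mean = {mean(rts):6.1f} ms, "
          f"CV = {coefficient_of_variation(rts):.3f}")
```

In the 'videotape at double speed' case the mean and the standard deviation shrink by the same factor, so CV is unchanged. In the restructured case the mean is also halved but the standard deviation shrinks by more than half, so CV falls, which is the pattern Segalowitz and Segalowitz take as a sign of automaticity. As Hulstijn et al. (2009) caution, whether this pattern can be reliably interpreted in practice is open to question.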
Quote 5.7 Lennon on fluency in L1 and L2 language processing

For proficient speakers, lower-level processes, such as phonological articulation, are likely to be highly automatized, as are much lexical accessing and syntactic and morphological processing. For less proficient speakers of a foreign language, these processes will be as yet imperfectly automatized and may require much time, effort, concentration, and monitoring, especially for those who have learned the foreign language in the classroom and have had little chance to use it communicatively. Thus, correspondingly little mental energy will be freed for higher-order processes, such as the conceptualization of a message, discourse planning, and the sociolinguistic skills of turn taking, involving the interlocutor in discourse, achieving rhetorical effect, and so on.

(2000: 28)

5.5 Measuring organization

We have examined measurement of vocabulary size, depth, and automaticity, but these have tended to focus on individual lexical items (with the principal exception being the WAF test format). However, Meara (1996b, 1996c) and Nation (2001) have noted that knowledge of individual items does not operate in isolation, but rather works in conjunction with knowledge of other lexical items. That is, the lexical connections between words (and the resulting organization of the lexicon) are also important to vocabulary usage. In fact, Meara and Wolter (2004) argue that the distinction between size and depth is somewhat unfortunate, as the real distinction which should be made is between size and organization. After all, they point out that size is not really about individual words, but rather about learners’ overall lexicons. Henriksen (1999) and Aitchison (2003) argue that two central processes of vocabulary acquisition are mapping (establishing and fine-tuning the form-meaning link) and network building (forming internal links between items in the mental lexicon). If this is right, then looking at lexical networks may be a useful way of considering the incremental acquisition of vocabulary, as network building can be seen as the outcome of elaborating the initial form-meaning link. Likewise, Meara (1996b) argues that size is the key lexical component for beginners, but as learners get a bigger lexicon, lexical organization becomes increasingly important, as better organization allows better access to the mental store of lexical items.

Quote 5.8 Henriksen on meaning versus network knowledge

Acquiring word meaning involves, as we have seen, two interrelated processes of (a) adding to the lexical store via a process of labelling and packaging (i.e. creating extensional links) and (b) reordering or changing the lexical store via a process of network building. There is a need for clarification in the research literature as to which process is being described, tested, and discussed. There has, in my view, been a tendency in L2 vocabulary research to focus on the first aspect (i.e. mapping meaning onto form) and to disregard the second aspect (i.e. network building).

(1999: 309)

This all suggests that lexical organization is worthy of research, and thus needs to be measured in some way. The main method of researching lexical organization is word associations. It is clear that association data provides insights into the organization of the mental lexicon (Section 2.4), but equally, it must be said that the data is often confusing and difficult to interpret. It is still unclear just how much associations can tell us about lexical organization (as well as lexical acquisition and processing), and it seems that this approach is still waiting for a breakthrough in methodology which can unlock its undoubted potential.

One reason why concrete conclusions have eluded many previous association studies is flawed methodology. Fitzpatrick (2006, 2007) highlights a number of recurrent problems which have led to difficult-to-interpret data. The first has to do with the lexical items used as stimuli. While target word selection is important for any vocabulary study (Section 4.5), it seems to be particularly crucial for association research, as responses are influenced or even determined by characteristics of the stimulus words. Prompt words from different word classes tend to elicit responses from similar or closely related word classes, with nouns prompting nouns, verbs prompting verbs, and adjectives prompting nouns, etc. Moreover, high-frequency stimuli tend to elicit more predictable responses (Meara, 1983), which is not particularly useful in studies set up to investigate differences between subjects. Many studies have used the 100 words from the Kent-Rosanoff association list (1910), simply because the response norms are already available. Unfortunately, the words on this list may not be suitable for L2 research, as they are of very high frequency, and are almost all adjectives or nouns, and so produce very similar responses in both the L1 and L2 (e.g. black→white; noir→blanc). This makes it difficult to decide whether an association is a direct response to the L2 prompt, or whether it is produced via translation into the L1 and back again. Furthermore, almost all the words in the list are from the highest frequency band, and so will be among the first words that a learner acquires in his second language. It is not clear whether the word association behavior for these basic L2 words is the same as or different from more advanced lower frequency vocabulary, and it might be misleading to generalize from one to the other (Meara, 1983).

The way around this problem is to choose stimulus words which are less frequent, but are still known to the participants. It also makes sense to choose words which are matched to the participants. Albrechtsen, Haastrup,
and Henriksen (2008) studied participants ranging from Grade 7 to university level, and needed to ensure the lower-proficiency Grade 7 students would know the prompt words. They used 24 concrete nouns drawn from six semantic fields which the younger informants were expected to be familiar with (people – child, body – stomach, animals – eagle, house – window, food – cheese, geography – mountain) and 24 adjectives (beautiful, hungry, blue, afraid). Fitzpatrick (2006) studied more advanced L2 students who had gained entrance to a British university, and so chose words from the Academic Word List. The AWL does not include the highest frequency words, and includes relatively few concrete nouns, which tend to produce predictable responses. If the participant pool is more specific, e.g. ESP students in a particular field, it may be worth considering using the technical vocabulary of that field as stimuli.

Another problem Fitzpatrick highlights is the categorization of association responses. Previous studies have tended to use three main categories: paradigmatic, syntagmatic, and clang. These categories do not account for all possible association types, as indicated by the frequent inclusion of an ‘other’ category for the (sometimes numerous) responses which did not fit comfortably into the main categories, or for which no decipherable link was obvious. Fitzpatrick also suggests that these categorizations are too broad. She points out that the paradigmatic category includes such diverse relationships as synonymy, hierarchical relationships, and quality associations (x is a quality of y), while the syntagmatic category contains collocations (xy and yx) and words that are part of longer formulaic sequences.

To address this problem, a finer-grained categorization system can be used. In addition to the paradigmatic, syntagmatic, and clang categories, Albrechtsen et al. (2008) discuss categorizing responses according to whether they are canonical or not, i.e. very common, primary, almost ‘standard’ responses such as black→white, eat→food, and house→home from Table 2.3 in Section 2.4. These associations are so strong that they are likely to have some important role in structuring the lexicon, and so distinguishing between canonical and less common associations may have value. Namei (2004) explored clang/syntagmatic/paradigmatic responses, but also looked at the word frequency of responses. She found that frequency had a relationship to the proficiency of the L1/bilingual informants, with more advanced informants tending to supply more low-frequency and abstract associations. Fitzpatrick (2006) took the approach of developing a much more detailed system based on three sets of information. First, she looked at the categorizations from previous research. Second, she examined responses from previous studies and determined which categories were necessary to classify those responses. Third, she drew upon Nation’s (2001) word knowledge taxonomy to identify three main categories (meaning, position, and form) and 17 subcategories in total. The complete system is illustrated in Table 5.4.
Table 5.4 Fitzpatrick categories for word association responses (x = stimulus word, y = response word)

Category: Meaning-based associations
  Defining synonym: x means the same as y
  Specific synonym: x can mean y in some specific contexts
  Hierarchical/lexical set relationship: x and y are in the same lexical set or are coordinates or have a meronymous or superordinate relationship
  Quality association: y is a quality of x or x is a quality of y
  Context association: y gives a conceptual context for x
  Conceptual association: x and y have some other conceptual link

Category: Position-based associations
  Consecutive xy collocation: y follows x directly, or with only an article between them (includes compounds)
  Consecutive yx collocation: y precedes x directly, or with only an article between them (includes compounds)
  Phrasal xy collocation: y follows x in a phrase but with a word (other than an article) or words between them
  Phrasal yx collocation: y precedes x in a phrase but with a word (other than an article) or words between them
  Different word class collocation: y collocates with x + affix

Category: Form-based associations
  Derivational affix difference: y is x plus or minus derivational affix
  Inflectional affix difference: y is x plus or minus inflectional affix
  Similar form only: y looks or sounds similar to x but has no clear meaning link
  Similar form association: y is an associate of a word with a similar form to x

Category: Erratic associations
  False cognate: y is related to a false cognate of x in the L1
  No link: y has no decipherable link to x

(Fitzpatrick, 2006: 131).
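For researchers who code association responses electronically, the categories in Table 5.4 can be held in a simple lookup structure so that coded responses are tallied consistently. The sketch below is one possible arrangement and not part of Fitzpatrick's own procedure: the dictionary layout, the function name tally_by_category, and the example codings in the usage note are all illustrative assumptions. The assignment of a response to a subcategory is still a human judgement.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Categories and subcategories as listed in Table 5.4 (Fitzpatrick, 2006: 131).
FITZPATRICK_CATEGORIES: Dict[str, List[str]] = {
    "meaning-based": [
        "defining synonym", "specific synonym",
        "hierarchical/lexical set relationship", "quality association",
        "context association", "conceptual association",
    ],
    "position-based": [
        "consecutive xy collocation", "consecutive yx collocation",
        "phrasal xy collocation", "phrasal yx collocation",
        "different word class collocation",
    ],
    "form-based": [
        "derivational affix difference", "inflectional affix difference",
        "similar form only", "similar form association",
    ],
    "erratic": ["false cognate", "no link"],
}

# Reverse lookup: subcategory -> main category.
SUBCATEGORY_TO_CATEGORY = {
    sub: cat for cat, subs in FITZPATRICK_CATEGORIES.items() for sub in subs
}


def tally_by_category(coded: Iterable[Tuple[str, str, str]]) -> Counter:
    """Count responses per main category.

    Each item is (stimulus, response, subcategory), where the subcategory
    has already been assigned by a human rater.
    """
    counts: Counter = Counter()
    for stimulus, response, subcategory in coded:
        if subcategory not in SUBCATEGORY_TO_CATEGORY:
            raise ValueError(f"unknown subcategory for {stimulus}->{response}: {subcategory}")
        counts[SUBCATEGORY_TO_CATEGORY[subcategory]] += 1
    return counts
```

As a usage example, a rater might code black→white as a hierarchical/lexical set relationship and partnership→business as a consecutive yx collocation; passing those two coded triples to tally_by_category would give one meaning-based and one position-based response. Rejecting unknown subcategory labels is a deliberate choice here, so that typing errors in the coding sheet surface early rather than silently distorting the counts.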
There are a number of other methodological issues in association research. One is using an adequate number of prompt words. Some studies have used only a handful of stimulus items, e.g. Kruse, Pankhurst, and Sharwood Smith (1987) used 12, while Ruke-Dravina (1971) used only four. Clearly this is not much of a sample to extrapolate from. Much better in this respect is Henriksen (2008), which used 48 stimulus words. We know that association responses vary according to word class, semantic category, word frequency, etc., and so it is necessary to have enough stimuli to smooth out the inevitable variation in order to obtain useful results. If it is only possible for practical reasons to use a limited number of prompts, it may be necessary to constrain the prompts to a narrowly focused set, e.g. only adjectives from a certain frequency range. Another issue is how many responses to require from participants. Most studies have asked for a single response, and this makes sense if there are numerous stimuli, or if the study is interested in whether canonical responses are given. But it is also possible to ask for several responses (e.g. Schmitt, 1998c, asked for three). This may be more appropriate if a researcher is interested in depth of knowledge, as several appropriate responses give evidence of a greater breadth of lexical knowledge than a single appropriate response does, and because the simple ability to produce three responses at all will also discriminate between learners. Asking for multiple responses can also be an expedient for gathering greater amounts of data from a limited number of stimuli. Indeed, some studies have asked informants to list as many associations as possible, with the total number produced being the variable of interest. However, if too many responses are asked for, there is the danger that the later responses in the chain will actually be associations of earlier responses rather than responses to the original prompt word (e.g. 
snow→cold, winter, ski, white, black). Sometimes it is difficult to categorize a response in terms of its relationship to the prompt item. Fitzpatrick (2006: 125) gives the example of the stimulus partnership and response business. It could be a collocation (They have a business partnership) or it could be a synonymous response (Their partnership/business went bankrupt). She also gives the cautionary example of habit→red eyes, grass, big ears, where it was only the third response which gave the researchers the clue that the first two were not responses in line with a ‘drug habit’, but rather, that the informant mistook the prompt habit for rabbit! Fitzpatrick suggests conducting retrospective interviews with informants to confirm the links between prompts and responses. While this would clearly be too time-intensive if large numbers of participants were involved, a compromise solution would be to check the responses quickly after administration, and then go back only to the informants who produced responses that needed resolving. Alternatively, Henriksen (2008) added a think-aloud session which she used in the coding to help her determine the response types.
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
Measuring Vocabulary
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_06_cha05.indd 251
Word association data can be analyzed in a number of ways. Responses can be sorted into categories and then the relationships between categories explored, as in studies which compare paradigmatic and syntagmatic responses. Association responses can also be compared to norms of various types. Native data can be compared to existing norm lists, and L2 learner responses can be compared to native norms. Responses can also be analyzed according to how many other responses they relate to in a Graph Theory approach (see below).

A number of norm lists exist, although many are in out-of-print books and are difficult to access, e.g. Postman and Keppel (1970). Below are one book still in print and two internet sites.

• Edinburgh Associative Thesaurus
The Edinburgh Associative Thesaurus is an interactive site which gives word association responses for a number of stimulus words. There is a space into which to type a stimulus word, and its responses are then produced. It is also possible to type in a response in order to see the various stimuli which produced it. About 8,400 stimulus words are available, each having been administered to 100 British university students (not the same students for every word). Thus each stimulus has a maximum of 100 responses. The full report of the collection procedure is available in Kiss, Armstrong, Milroy, and Piper (1973).

• The University of South Florida word association, rhyme, and word fragment norms (D.L. Nelson, C.L. McEvoy, and T.A. Schreiber, 1998)
The website claims to be the largest database of English free associations ever collected in the United States. More than 6,000 participants produced nearly three-quarters of a million responses to 5,019 stimulus words. On average, 150 participants worked with sets of 100–120 words each. About three-quarters of the words are nouns (76%), with adjectives making up 13%, and verbs 7%. The site includes association norms, matrices showing the links between related sets of associations, and information on rhyme and assonance.

• Birkbeck Word Association Norms (Moss and Older, 1996)
This more recent book contains free association norms for over 2,000 words, collected from groups of 40–50 British English speakers between the ages of 17 and 45. There is also an index of stimulus words organized according to semantic category to aid selection of experimental materials.
Most available norms, including those above, are of native English speakers. These are usually used without question, but Fitzpatrick (2007) sounds
a word of warning. She found that her 30 native informants varied hugely in their responses to AWL prompt words, not only in terms of the responses themselves, but also in the category of association they produced (i.e. the categories in Table 5.4). Her finding of a lack of homogeneity among natives is congruent with earlier research by Rosenzweig (1961, 1964). He found that association responses from speakers of a language can vary according to education, and that responses differed somewhat between speakers of different languages. This suggests researchers need to be careful in how they determine ‘native-like’ responses. Unless a study uses very frequent stimulus words (which tend to have canonical responses), native-like behavior is likely to consist of a wide range of responses.

An important part of association research is deciding on the elicitation methodology. Free association tasks are useful in that they require a production of association responses, without the ‘hints’ available in a receptive format. However, as informants can come up with any response, these tasks have the difficulty of categorization and scoring, as we have seen above. Still, the elicitation instruments are at least easy to design, being some variation of the following basic format:

Write the first word you think of when you see each of the following words. Do not think about your answer, but write the first word that comes into your mind.
1. available ———

Henriksen (2008) developed the following scoring system, based on the various developmental shifts in response behavior across proficiency levels: (1) a shift from form-related to meaning-based responses, (2) an increase in the number of canonical responses (based on her two norming groups), and (3) an increase in the number of low frequency responses. The scoring rubric is outlined in Table 5.5. Each response from a participant was rated according to this scale, and then the scores from all of that participant’s responses were summed.
This total score was then divided by the number of responses given, which gave the response type score for the participant. Weaker participants tend to give the same response to a number of prompt words, and to minimize the effect of such repetition on her dataset, Henriksen included a lexical variation score. It was calculated by dividing the number of lexical types an individual participant used by the number of semantically related responses given by that participant, and multiplying the result by 100. For example, in a 50-item association task, if the participant produced 45 response types and repeated five of them, then the calculation would be (45 ÷ 50) × 100 = 90.

Table 5.5 Scores awarded to different response types

Response type                                             Examples from L2 with the stimulus word ‘bread’   Score
Inability to supply an L1 or L2 response (‘unqualified’)  brød (L1 translation)                             0
                                                          bread (repetition of stimulus)                    0
Form-related                                              red                                               1
Chaining                                                  table                                             1
High frequency non-canonical, but semantically related    white, birds                                      2
High frequency canonical                                  food, water                                       3
Low frequency canonical                                   toast, loaf                                       4
Low frequency non-canonical, but semantically related     grainy, flour                                     5

(Henriksen, 2008: 50)

Receptive formats have the disadvantage of not requiring active production of association responses, but also have the considerable advantage of researcher control, where associations on an instrument can be selected and manipulated in a range of ways in order to research various aspects of association knowledge. This was seen in the Word Associates Format test (Section 5.3.2), where several types of association knowledge (paradigmatic, syntagmatic, analytic) were targeted. In addition to a free association task, Henriksen (2008) also used a receptive task to elicit information about the ability to discriminate between strong and weak association links. The researchers selected the five most frequent responses to their target stimulus words from norming lists, and also five responses that were given by only one person on the norm lists (i.e. these responses represented potential, but clearly more peripheral, links in the lexical network). The task for the participants was to select the five strong associations.

Select the five words most strongly connected to the key word.

cold

water   war    frost   hand     hot
snow    warm   pain    winter   ice
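The two scores Henriksen derives from a free association task (the response type score and the lexical variation score) lend themselves to a short calculation. The following is only a minimal sketch in Python, not Henriksen’s own procedure: the category labels used as dictionary keys are hypothetical names invented here, and the point values follow Table 5.5 as it is read in this section, so both should be checked against the original rubric before use.

```python
from typing import Sequence

# Points per response type, as Table 5.5 is read here (Henriksen, 2008: 50).
# The keys are hypothetical labels invented for this sketch.
RESPONSE_TYPE_SCORES = {
    "unqualified": 0,
    "form_related": 1,
    "chaining": 1,
    "hf_noncanonical": 2,
    "hf_canonical": 3,
    "lf_canonical": 4,
    "lf_noncanonical": 5,
}


def response_type_score(coded_responses: Sequence[str]) -> float:
    """Sum the points for all coded responses, then divide by the number given."""
    if not coded_responses:
        raise ValueError("no responses to score")
    total = sum(RESPONSE_TYPE_SCORES[code] for code in coded_responses)
    return total / len(coded_responses)


def lexical_variation_score(responses: Sequence[str]) -> float:
    """(Number of distinct responses / number of responses given) x 100."""
    if not responses:
        raise ValueError("no responses to score")
    distinct = {response.strip().lower() for response in responses}
    return 100 * len(distinct) / len(responses)


if __name__ == "__main__":
    # The worked example from the text: 45 distinct responses, five repeated.
    responses = [f"word{i}" for i in range(45)] + [f"word{i}" for i in range(5)]
    print(lexical_variation_score(responses))
```

The second function expects only the semantically related responses, since that is the denominator described in the text; filtering out the other responses is left to the coder.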
A different approach to analyzing associations is related to Graph Theory. Paul Meara has pursued this approach with a number of colleagues (e.g. Meara and Wolter, 2004; Wilks and Meara, 2002). Instead of looking at individual responses and their categories, they concentrate on the interconnectivity of the responses, i.e. network density: the relative number of association links for each stimulus word. They used a software program called V_Links, which presents 10–12 stimulus words, and the participant’s task is to decide whether there is a connection between these words. The words were chosen so that there are some obvious association pairs and some less obvious pairs. The participant is also asked to indicate
the strength of any connection they choose on a four-point scale. The V_Links interface is illustrated in Figure 5.6. In this example, the participant has indicated links between quiet–morning, quiet–peace, quiet–sound, heavy–sound, rest–peace, and dream–bed. He is also in the process of indicating a link between sound and health.

This methodology clearly discriminates between native and nonnative speakers, with one pilot study showing Japanese EFL learners indicating only about half of the links of native speakers (Meara and Wolter, 2004). They also found only a modest correlation between V_Links scores and vocabulary size, indicating that V_Links is measuring something separate from size. This supports the notion that organization is a viable independent construct of vocabulary knowledge in addition to vocabulary size.
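To make the idea of counting links concrete, the sketch below tallies the links a participant has indicated and expresses them as a proportion of all possible word pairs. It is an illustration only and not the scoring used by V_Links itself: the function name is hypothetical, the four-point strength ratings are ignored, and the word set contains only the words mentioned in the example above rather than the full set of 10–12 stimuli.

```python
from collections import Counter


def link_summary(words, links):
    """Return (number of links, density, links per word) for one participant.

    Links are treated as undirected, and duplicate pairs are counted once.
    """
    word_set = set(words)
    if len(word_set) < 2:
        raise ValueError("at least two stimulus words are needed")
    unique_links = set()
    for first, second in links:
        if first not in word_set or second not in word_set:
            raise ValueError(f"unknown word in link: {first}-{second}")
        if first == second:
            raise ValueError("a word cannot be linked to itself")
        unique_links.add(frozenset((first, second)))
    possible_pairs = len(word_set) * (len(word_set) - 1) // 2
    links_per_word = Counter({word: 0 for word in word_set})
    for link in unique_links:
        links_per_word.update(link)
    density = len(unique_links) / possible_pairs
    return len(unique_links), density, dict(links_per_word)


if __name__ == "__main__":
    # Only the words named in the example; the real display has 10-12 words.
    words = ["quiet", "morning", "peace", "sound", "heavy",
             "rest", "dream", "bed", "health"]
    links = [("quiet", "morning"), ("quiet", "peace"), ("quiet", "sound"),
             ("heavy", "sound"), ("rest", "peace"), ("dream", "bed")]
    count, density, per_word = link_summary(words, links)
    print(count, round(density, 3), per_word["quiet"])
```

A weighted version could sum the strength ratings instead of counting each link once, but the text does not say how the ratings enter the V_Links score, so that step is left open here.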
Quote 5.9 Meara and Wolter on vocabulary size and the lexicon
Vocabulary size is not a feature of individual words: rather it is a characteristic of the test taker’s entire vocabulary. (2004: 87)
Meara has developed a streamlined version of V_Links called V_Quint which presents five randomly-chosen high-frequency stimulus words and asks informants to find a single link between them. His research indicates that the ability to do this can be extrapolated to estimate the total number of links in the lexicon. V_Quint is available on-line with documentation at Meara’s website _lognostics. Meara and his colleagues admit that much more research is required before we know how to use this type of approach to best effect, but the possibilities are certainly exciting. (See more about Meara’s website in Section 6.5. It also contains a free association test, where informants produce up to four responses each for 30 stimulus words, and receive a score directly after finishing the test.)

It is probably noticeable that this section contained less firm guidance on measurement methodology than other measurement sections. This is down to the fact that there is still not a consensus on the best way to use associations in language research, either in terms of how to run the methodology, or in what associations can tell us about the mental lexicon. Also, there is the major problem of finding a way of transforming patterns of association and shifts of patterns into a score which enables researchers to compare across learners and learner groups. Word associations have obvious potential, but the field awaits innovative techniques which can fully exploit this potential.
Figure 5.6 Screen shot from V_Links (Meara and Wolter, 2004: 91)
5.6 Measuring attrition and degrees of residual lexical retention

Vocabulary acquisition is dynamic, and while we hope to measure improvement in vocabulary mastery, there will inevitably be attrition as well. However, just as vocabulary acquisition is incremental and multi-faceted, we might expect that vocabulary attrition would also be complex. Thus, attrition is not an all-or-nothing concept, but may affect various types of
word knowledge differently. Likewise, it is useful to distinguish between the attrition of receptive and productive mastery of lexical items. Researchers have explored the attrition of these various lexical aspects through a number of elicitation methodologies: oral production of monologues (e.g. Cohen, 1989), production of conversation (Tomiyama, 1999), response to visual stimuli (Hansen and Chen, 2001), recognition of written and spoken form (Weltens, 1989), and measurement of speed of recognition (Grendel, 1993).

Attrition concerns vocabulary researchers in both short-term and long-term guises. The first has been a recurrent theme through this book: knowledge of lexical items learned in a study will usually decay over time after the treatment, and so only delayed posttests give a true indication of durable learning. This is the main reason why delayed posttests are so important in vocabulary research (see Section 4.4). Beyond this, short-term attrition is an important issue for SLA, and vocabulary in particular. The key issue is how long a memory trace from an exposure can endure, so that it can be subsequently built upon. If this period is exceeded, then the next exposure will merely be ‘starting over’ with no incremental gain. There is very little research to inform this question, although the answer should drive most of pedagogy, at least that concerning the earliest learning stages. For example, syllabuses should be designed so that vocabulary recycling occurs within the ‘retention period’. Another example is incidental learning from reading. A learner must read enough so that a new lexical item will be met again before its memory trace disappears. The length of the retention period will dictate the maximum number of pages which can be read before the item needs to occur again (for any particular reading rate, i.e. number of pages per day). Again, if this number of pages is exceeded, the acquisition of that item will suffer.
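The arithmetic behind this point is simple: the maximum gap in pages is the retention period multiplied by the reading rate. The sketch below shows the calculation and a check of where a text would fall short. Since, as noted above, there is very little research on the actual length of the retention period, every number in the usage example is invented purely for illustration, and the function names are hypothetical.

```python
def max_pages_between_encounters(retention_days, pages_per_day):
    """Pages that can be read before an item must recur (days x pages per day)."""
    if retention_days < 0 or pages_per_day < 0:
        raise ValueError("both values must be non-negative")
    return retention_days * pages_per_day


def gaps_exceeding_limit(occurrence_pages, page_limit):
    """Return the (page, next page) pairs where an item recurs too late."""
    pages = sorted(occurrence_pages)
    return [(earlier, later)
            for earlier, later in zip(pages, pages[1:])
            if later - earlier > page_limit]


if __name__ == "__main__":
    # Invented values: a 7-day retention period and 10 pages read per day.
    limit = max_pages_between_encounters(retention_days=7, pages_per_day=10)
    # Invented page numbers on which a target item appears in a reader.
    print(limit, gaps_exceeding_limit([3, 40, 150, 160], limit))
```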
The writers of graded readers would particularly benefit from this kind of information. It may well be that different kinds of exposure lead to stronger memory traces, with most current research showing explicit engagement outperforms incidental engagement (see Schmitt, 2008, for an overview). Thus, the retention period may vary in systematic ways. A related issue about which little is known is the number of exposures which are necessary to make vocabulary knowledge durable.

Long-term attrition and retention are also of interest to vocabulary researchers. Studies with this focus usually test people who learned a language previously in their life, but for whatever reason have not used it for a long time. Sometimes there is a record of the previous level of knowledge (e.g. length of study and grades received) and so a T1–long deactivation–T2 approach can be used. For example, Bahrick (1984) studied the loss of L2 Spanish and found that some vocabulary knowledge was retained for more than 50 years. Moreover, recognition was less affected by attrition than production. Overall, Bahrick’s cross-sectional data suggests that vocabulary knowledge
declines exponentially for an initial period of three to six years after instruction, but then remains steady for several decades, although with an additional decline in middle age. However, it appears that learners who achieved relatively high levels of proficiency are more resistant to the initial attrition, and maintain a plateau before attrition begins (e.g. Hansen, Umeda, and McKinney, 2002). Bahrick’s study focused on the form-meaning link, partially because there were no previous measurements of any other aspect of lexical knowledge, and so only a basic level of word knowledge could be assumed. This is a common problem, as any ‘previous knowledge’ indicators seldom include a precise specification of lexical knowledge, i.e. which lexical items informants knew and how well.

However, there have also been some studies where learners have been intentionally tested at the end of their language studies with the express goal of exploring their long-term retention/attrition. In these cases, dedicated lexical measurements can be given. A good example of this is Grendel’s (1993) dissertation research (reported in Weltens and Grendel, 1993), which studied the automaticity of receptive orthographic and semantic knowledge. Orthographic knowledge was measured by a lexical decision task, where Dutch learners of French were asked to judge whether stimuli were words or not. The stimuli included French words (poivre, ‘pepper’), nonwords containing a high-frequency cluster (poible), nonwords containing a low-frequency cluster (poifle) and nonwords with unusual clusters (poizye). The nonwords with high-frequency clusters were expected to be recognized faster than those with low-frequency clusters, because they look more like real words.
But if attrition set in, participants would become insensitive to the frequency of certain French vowel or consonant clusters in specific word positions, and so the high- and low-frequency cluster nonwords would eventually become indistinguishable for the subjects. The participants were measured at the end of their language instruction, then after two and four years of language disuse. The results showed that the speed difference between these two categories was maintained across both two and four years of disuse, indicating there was no attrition of the awareness of these phonotactic patterns.

The semantic test was similar, but used a priming paradigm with stimuli including words and nonwords. Half of the real French word targets were primed by semantically related words (doux–dur, ‘soft–hard’), and half of the words were primed by semantically unrelated words (genou–rue, ‘knee–street’). Semantic priming typically speeds recognition (see Section 2.11), and so one would expect words primed by semantically-related words to be recognized faster than those primed by unrelated words. If attrition occurred, it could be expected that this priming effect would decrease over time. However, the same result was obtained as in the orthographic part of the study, with the size of the priming effect remaining more or less the same over the four-year test period.
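The logic of the semantic test can be expressed as a difference in mean recognition times, tracked across the testing sessions. The following sketch shows one simple way of computing such a priming effect. It is not the analysis reported by Weltens and Grendel, and the reaction times in the usage example are invented solely to show the shape of the calculation.

```python
from statistics import mean


def priming_effect(unrelated_rts, related_rts):
    """Mean RT after unrelated primes minus mean RT after related primes (ms).

    A positive value means related primes speeded recognition.
    """
    return mean(unrelated_rts) - mean(related_rts)


if __name__ == "__main__":
    # Invented reaction times in milliseconds, for illustration only.
    sessions = {
        "end of instruction": ([620, 640, 600], [580, 590, 570]),
        "after two years": ([650, 660, 640], [610, 620, 600]),
        "after four years": ([660, 680, 640], [620, 630, 610]),
    }
    for label, (unrelated, related) in sessions.items():
        print(label, priming_effect(unrelated, related))
```

If attrition had set in, the printed difference would shrink from one session to the next; a stable difference corresponds to the pattern described above.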
One of the most commonly used attrition methodologies is the ‘savings paradigm’ (de Bot and Stoessel, 2000). It highlights residual knowledge by comparing the relearning of old, previously-known items with the learning of new items. Informants are tested on a list of previously-known words from the disused language, and those forgotten are noted. The informants are asked to study these forgotten items, along with a number of new words which were not previously known. Sometimes instead of these new words, nonwords are used. If so, they are carefully matched with the previously-known words in terms of complexity, so that the learning burden is equivalent. The informants are then tested on both the previously-known words and new words/nonwords, and the percentage of ‘old’ words relearned is compared to the percentage of ‘new’ words/nonwords learned. If there is a higher percentage of old items than new items, then it is assumed that this is because some residual learning remained, facilitating the relearning. In other words, some learning effort is ‘saved’ by the residual learning. Using this methodology, there is no evidence for complete attrition, as there are always some savings effects (see Hansen et al., 2002, for an overview). This indicates that, once learned, vocabulary never completely disappears, but only becomes inactive. Thus, people relearning the vocabulary of a previously-known language, even after a very long time and with no apparent knowledge still evident, will enjoy a substantial advantage over people learning the vocabulary of the language for the first time. However, this conclusion must be limited to words, as, to my knowledge, there has as yet been no research on the long-term attrition and retention of formulaic language.
Similarly, Meara (2004) comments that the attrition methodologies suffer from the fact that they work with individual lexical items, and do not take into account that lexicons are structured networks of knowledge (see Section 2.4). He feels that computerized simulations can prove illuminative, in that they can model what attrition of such networks might look like. See Section 2.10 for a more detailed discussion.
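The comparison at the heart of the savings paradigm (the percentage of ‘old’ words relearned against the percentage of ‘new’ words or nonwords learned) can be sketched as follows. This is a minimal illustration with a hypothetical function name and invented counts, and it stops at the raw percentage difference, leaving aside any statistical test a study would need.

```python
def savings_effect(old_relearned, old_total, new_learned, new_total):
    """Return (% old items relearned, % new items learned, difference)."""
    if old_total <= 0 or new_total <= 0:
        raise ValueError("each list must contain at least one item")
    old_pct = 100 * old_relearned / old_total
    new_pct = 100 * new_learned / new_total
    return old_pct, new_pct, old_pct - new_pct


if __name__ == "__main__":
    # Invented counts: 14 of 20 forgotten words relearned, 9 of 20 new words learned.
    print(savings_effect(14, 20, 9, 20))
```

A positive difference is what the paradigm interprets as residual knowledge; a difference near zero would suggest no measurable savings for that informant.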
Quote 5.10 Hansen, Umeda, and McKinney on the ‘Matthew effect’ extending to vocabulary retention

... Stanovich’s (1986) insight from the reading research literature that ‘[t]he rich get richer’ also applies to the relearning of vocabulary. The larger the lexical network retained, the greater the chances of reactivating successful pathways to old words and the greater the chances of having the relevant infrastructure in which to integrate new words. Further language attrition studies, incorporating careful control of the original proficiency levels of individual attriters, will allow us to verify the aptness of extending Stanovich’s maxim to read, ‘[t]he rich get richer, and they stay richer.’ (2002: 672–673)
6 Example Research Projects

Although this research manual is mainly written with the more advanced researcher in mind, I am well aware that everyone has to start someplace. With this in mind, I offer the following ten research projects on which emerging researchers can develop and hone their skills. They have all been designed to be challenging, but still ‘do-able’, hopefully with enough background information to make both the goals and the required methodology clear. However, although the basic research designs are relatively straightforward, understanding the full implications of the results will require a more sophisticated mastery of the ideas presented throughout this volume. On a practical note, I do not mention a minimum number of participants for the experimental studies. As usual, ‘more is better’, but if the main purpose of the study is to develop research expertise, even a small number of participants can provide enough data to gain experience with the research techniques. However, if a more rigorous approach using inferential statistics is desired, the rule of thumb seems to be that around 30 participants are required to achieve a normal distribution of the data.

Research Project 1: Estimating the vocabulary size of native and/or nonnative speakers of English

Goal
1. To obtain valid estimates of the vocabulary size of your native and/or nonnative participants
2. To interpret the native results in terms of previous estimates of native lexical size (and/or)
3. To interpret the nonnative results in terms of what language skills their lexical resources will support.

Methodology
This project is essentially a replication of Goulden, Nation, and Read (1990; also available online). The first step is to read the original article, and sections of this book concerning vocabulary size and size measurement, especially Sections 1.1.2 and 5.2. The article provides five versions of a checklist test, and detailed instructions of how to administer the tests. Given that responses to any version of a test will be variable to some extent, you should use at least two of the versions and average the results. The first two versions are provided below.

Version 1
1 as 2 dog 3 editor 4 shake 5 pony 6 immense 7 butler 8 mare 9 denounce 10 borough
11 abstract 12 eccentric 13 receptacle 14 armadillo 15 boost 16 commissary 17 gentian 18 lotus 19 squeamish 20 waffle
21 aviary 22 chasuble 23 ferrule 24 liven 25 parallelogram 26 punkah 27 amice 28 chiton 29 roughy 30 barf
31 comeuppance 32 downer 33 geisha 34 logistics 35 panache 36 setout 37 cervicovaginal 38 abruption 39 kohl 40 acephalia
41 cupreous 42 cutability 43 regurge 44 lifemanship 45 atropia 46 sporophore 47 hypomagnesia 48 cowsucker 49 oleaginous 50 migrationist

Version 2
1 bag 2 face 3 entire 4 approve 5 tap 6 jersey 7 cavalry 8 mortgage 9 homage 10 colleague
11 avalanche 12 firmament 13 shrew 14 atrophy 15 broach 16 con 17 halloo 18 marquise 19 stationery 20 woodsman
21 bastinado 22 countermarch 23 furbish 24 meerschaum 25 patroon 26 regatta 27 asphyxiate 28 curricle 29 weta 30 bioenvironmental
31 detente 32 draconic 33 glaucoma 34 morph 35 permutate 36 thingamabob 37 piss 38 brazenfaced 39 loquat 40 anthelmintic
41 gamp 42 paraprotein 43 heterophyllous 44 squirearch 45 resorb 46 goldenhair 47 axbreaker 48 masonite 49 hematoid 50 polybrid
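Averaging the results from two or more versions is a small calculation, sketched below. The number of word families that each test item represents is deliberately left as a required argument, because it is not stated in this section and must be taken from the instructions in the original article; the value of 500 in the usage example is an assumption made for illustration, as are the two raw scores.

```python
def estimate_vocabulary_size(known_counts, words_per_item):
    """Average the size estimates from several versions of the checklist test.

    known_counts: number of items marked as known on each version taken.
    words_per_item: how many word families one test item stands for
                    (to be taken from the original article's instructions).
    """
    if not known_counts:
        raise ValueError("at least one test version is required")
    estimates = [count * words_per_item for count in known_counts]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Invented raw scores on Versions 1 and 2; 500 is an assumed multiplier.
    print(estimate_vocabulary_size([34, 36], words_per_item=500))
```

Keeping the per-version estimates alongside the average is worthwhile, since the spread between versions is itself of interest for the questions that follow.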
Questions to consider
1. Goulden, Nation, and Read (1990) report that their New Zealand university undergraduates averaged about 17,000 word families on these tests. How do your native-speaking participants compare? Why are their scores higher or lower than the New Zealand university students?
2. Section 1.1.2 outlines current thinking on how much vocabulary is required to use English in various ways. Based on these targets, what language abilities do your nonnative participants have the lexical resources to pursue? If you know your participants’ language proficiency goals, how much vocabulary do they have to learn in order to help achieve those goals?
3. The scores on the various test versions you give are likely to vary. What does the degree of variation tell you about the validity and reliability of the tests? That is, if the scores are quite different, does this reflect a weakness in the tests? How much of the variation between versions should be considered a normal reflection of people’s intrinsic variability in doing a series of tasks?
Research Project 2: Exploring word associations

Goal
To explore the nature of word associations across increasing levels of language proficiency, and to explore various word association evaluation techniques.

Methodology
Read the sections on word associations in this volume (2.4 and 5.5). Then select the stimulus items (words and/or phrases), considering the issues discussed in Section 4.5. Decide how many responses you will require your participants to produce for each stimulus (usually between one and three). If you are interested in exploring how this variable affects the nature of the responses, ask your participants to produce X number of responses for some stimuli and Y number for other stimuli. Fix the stimuli on an instrument (either paper- or computer-based), with clear instructions of what the participants are to do. Administer the instruments to participants of three or more
levels of language proficiency. Use native speakers for your highest level of proficiency.

Once collected, the responses will need to be evaluated. Run three separate analyses. First, use the traditional distinction between paradigmatic, syntagmatic and clang associations. Second, try Albrechtsen, Haastrup, and Henriksen’s (2008) notion of canonical associations. Third, use Fitzpatrick’s categorization system illustrated in Table 5.4.

The first comparison is between native and nonnative responses. How are the responses from your native and nonnative participants similar and how are they different? Another source of native responses is the Edinburgh Associative Thesaurus (EAT). Also compare your nonnative responses with those of the EAT. It has been posited that association responses reflect improving lexical knowledge and organization as language proficiency advances. How do the responses vary according to the different levels of language proficiency of your participants? Association responses typically have a great deal of commonality among native speakers, especially among the most frequent responses, but also show considerable variability among the less frequent responses. How do your native responses match up with those of the EAT?

Questions to consider
1. How well do the association responses differentiate between levels of proficiency? Can they differentiate between larger differences in proficiency (beginner versus intermediate learner; native versus nonnative speaker)? Can they differentiate between closer levels of proficiency (lower intermediate versus higher intermediate learners)?
2. How closely do your native responses compare to those from the EAT? If they are different, how much of the difference might be caused by nationality or education level? (The EAT consists of responses from British university students.)
3. Which of the evaluation methods (paradigmatic/syntagmatic/clang; canonical; Fitzpatrick’s categories) worked best in showing the differences between proficiency levels?
4. How did the responses differ according to whether the stimuli were individual words versus phrases?
5. How did the nature of the responses vary depending on the number of responses requested for each stimulus word/phrase?
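For the comparison with norm lists, one simple measure is the proportion of a participant’s responses that match the most frequent norm response for each stimulus, alongside the proportion that appear anywhere in the norms. The sketch below assumes the norms have already been typed into a dictionary of counts; that data structure, the function name, and the tiny set of counts are all invented for illustration and are not taken from the EAT.

```python
from collections import Counter


def norm_match_rates(responses, norms):
    """Return (share matching the top norm response, share found in the norms).

    responses: dict mapping stimulus -> the participant's response.
    norms: dict mapping stimulus -> Counter of lowercase norm responses.
    Stimuli without norm data are skipped.
    """
    scored = [(stimulus, response.strip().lower())
              for stimulus, response in responses.items()
              if norms.get(stimulus)]
    if not scored:
        raise ValueError("no responses could be compared with the norms")
    top_hits = 0
    any_hits = 0
    for stimulus, response in scored:
        counts = norms[stimulus]
        top_response, _ = counts.most_common(1)[0]
        top_hits += response == top_response
        any_hits += response in counts
    return top_hits / len(scored), any_hits / len(scored)


if __name__ == "__main__":
    # Invented norm counts, for illustration only.
    toy_norms = {
        "cold": Counter({"hot": 40, "ice": 12, "winter": 5}),
        "bread": Counter({"butter": 50, "food": 8}),
    }
    print(norm_match_rates({"cold": "Ice", "bread": "butter"}, toy_norms))
```

The first figure corresponds roughly to a canonical-response rate, while the second is more forgiving of the variability found among less frequent native responses.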
Research Project 3: Validate a vocabulary test with an interview approach

Goal
To determine the extent to which a target vocabulary test is producing a valid indication of the construct it is purporting to measure.

Methodology
Read Chapter 5 from this volume, and Schmitt, Schmitt, and Clapham (2001). Then choose a vocabulary test to analyze. Taking account of the issues brought up in the Chapter 5 discussion, decide what the test is purporting to measure. Consider whether the test focuses only on the form-meaning link, or whether it measures other types of word knowledge. Also, consider whether the test taps into a receptive or productive level of mastery.

Compile a list of the lexical items from the test (or a sample thereof). For each item, write down the element(s) of lexical knowledge which satisfy the criteria of ‘knowing’ the items in this particular test. You will use this list in an interview (see Step 2 below) to judge whether the participants actually ‘know’ the items on the test. The Schmitt et al. article explains how we did this with the Vocabulary Levels Test. We judged the VLT to be a ‘form-recognition’ format (see Section 2.8), and so we focused on the form-meaning link. Our interview list looked something like the one below, and on it the raters indicated whether they thought the form-meaning link was known or not. Because the raters (my wife and I) were educated native speakers, it was not necessary to specify the definitions of the target words on the list. However, you may find it useful to explicitly spell out your knowledge criteria on the list to have it handy for reference during the interview.

Item            Knows meaning   Does not know
1. birth        ———             ———
2. choice       ———             ———
3. cap          ———             ———
4. attack       ———             ———
5. cream        ———             ———
6. adopt        ———             ———
7. bake         ———             ———
8. burst        ———             ———
9. original     ———             ———
10. brave       ———             ———
...

You can do the rating yourself, but it is better to have two raters. In this way, if the raters agree (inter-rater reliability), you can be more confident of the results. For vocabulary tests, the inter-rater reliability should be at least .90, although .95+ is desirable. Next, find some participants for whom this test would be suitable. The administration will be given to individual participants in two steps:
Step 1: Allow the participant to take the test under conditions similar to those which would be in place in a normal testing situation (e.g. same amount of time, same instructions). Once the test is completed, take it from
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_07_cha06.indd 264
4/13/2010 1:02:12 PM
Example Research Projects 265
Once you have finished the interview, compare the results from the test with the results from the interview. Probably the easiest way to do this is by creating a contingency table, such as Table 7 in Schmitt et al. Questions to consider 1. How closely to do the test and interview results tally? What percentage of test items indicates the participants’ ‘true’ lexical knowledge, as indicated by the in-depth interviews? 2. Is there any pattern of test items which do and do not indicate ‘true’ knowledge? That is, are certain types of items inherently stronger or weaker than other types on this particular test? 3. Does the type of lexical item (e.g. higher frequency vocabulary, individual word versus formulaic sequence) make any difference in how well the test items function? 4. If you had two or more raters, how good was your inter-rater reliability? 5. Based on your results, how valid is the test?
Research Project 4: Create a technical vocabulary list Goal To create a list of technical vocabulary for a specialized field Methodology First, look at some of the word lists mentioned in Section 7.4 and their documentation, to get a feeling for what they look like, and how they were compiled. This project will follow some of the methodology used by Coxhead (2000) in creating the Academic Word List, so also read her article. Decide which field you wish to compile a list for. For this small-scale project, choose a field that is relatively specific (e.g. instead of the general field of engineering, it is better to choose the narrower subfield of electrical engineering). Consider the target field, and decide on the main categories of inquiry within it. For example, in electrical engineering, the categories might include power, microelectronics, telecommunications, and computing. Then collect a sample of texts for each of those categories. The sample of texts should be as large as your time allows. You should also sample from as wide a range of texts as possible. For example, instead of including one or two whole books, it is better to sample numerous 2,000-word extracts from a wide range of books. Coxhead describes in detail how she
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
the participant, but do not look at it. (This is so that you do not become biased in the second step.) Step 2: In this step you will interview the participant concerning their knowledge of the lexical items on the test using the list you have developed. Probe the participant until you (and your co-rater) are confident you can make a sound judgement concerning the participant’s knowledge of each lexical item.
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_07_cha06.indd 265
4/13/2010 1:02:12 PM
Researching Vocabulary
sampled from a wide range of academic texts, and this serves as a good model. She was also careful to sample in a balanced manner, i.e. having the same amount and range of texts from each of her categories.

Scan your texts and extracts into an electronic format, and build a corpus of your specialist field. Then use a concordancing program like WordSmith (see Section 7.3) to compile a frequency-based list of all the words in the corpus. From this list, you will then need to extract the technical words. To do this, you first need to eliminate all of the general English words which are common to all fields. Coxhead did this by eliminating all the words on her initial list that also occurred in the General Service List. You can use the GSL, or use frequency lists from the BNC and eliminate the most frequent 2,000 BNC words from your list. This should eliminate the high-frequency general English words from your list, and what remains should be the non-general words which still occur frequently in your specialized corpus.

To further refine your list, check that the remaining words occur across a range of categories in your corpus, and also across a range of texts. (It is no good including a word which occurs very frequently, but only in a single text.) Again, Coxhead provides detailed advice on how to do this, although your thresholds (e.g. minimum number of texts a word appears in) will likely be much lower than hers. You will have to decide your own thresholds based on what is sensible for your own data. The words that remain after this further refinement will make up your final technical list.

Questions to consider
1. How does the final technical list look? Does it seem reasonable? Are there some words included which do not seem to fit? Are there some obvious missing words?
2. If a word list for your target field already exists, how similar is your word list to it?
3. If a dictionary for your target field exists, contrast it to your list. Most dictionaries are compiled by expert intuition, so how does the one in your field compare to your empirically-based list?
4. Was the list-building process straightforward, or did you find the decision-making difficult? For example, was it difficult to decide on the threshold levels which best produced a viable list of technical vocabulary?
5. What are the pedagogical implications of your list? Would it be useful in developing materials to teach the particular field? Would it be a useful reference for students of the field (or even experts)?
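The filtering steps in this project (count frequencies, remove general English words, then apply range thresholds) can be sketched in a few lines of Python. This is only a minimal illustration and not Coxhead's actual procedure: the corpus layout (a dict mapping category names to lists of texts), the tiny `general_words` set standing in for the GSL or the top 2,000 BNC words, and the thresholds `min_freq`, `min_texts`, and `min_categories` are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def technical_word_list(corpus, general_words, min_freq=3, min_texts=2, min_categories=2):
    """Return candidate technical words, most frequent first.

    corpus: dict mapping a category name to a list of text strings
            (hypothetical layout; a real corpus would be read from files).
    general_words: set of general English words to exclude
            (a stand-in for the GSL or the top 2,000 BNC words).
    """
    freq = Counter()                 # total occurrences of each word
    texts_with = defaultdict(set)    # texts each word appears in
    cats_with = defaultdict(set)     # categories each word appears in

    for category, texts in corpus.items():
        for i, text in enumerate(texts):
            for word in tokenize(text):
                freq[word] += 1
                texts_with[word].add((category, i))
                cats_with[word].add(category)

    candidates = [
        w for w in freq
        if w not in general_words
        and freq[w] >= min_freq
        and len(texts_with[w]) >= min_texts
        and len(cats_with[w]) >= min_categories
    ]
    # Sort by descending frequency, then alphabetically for stable output.
    return sorted(candidates, key=lambda w: (-freq[w], w))


# Toy data only: three tiny "texts" in two categories.
corpus = {
    "power": ["The transformer steps the voltage down.",
              "Voltage across the transformer coil is measured."],
    "microelectronics": ["The voltage at the transistor gate is low."],
}
general_words = {"the", "is", "at", "a", "down", "across", "low", "steps", "measured"}

print(technical_word_list(corpus, general_words, min_freq=2, min_texts=2, min_categories=2))
```

Relaxing `min_categories` to 1 in this toy corpus lets a word through that is frequent in only one category, which is the kind of decision about thresholds the project asks you to make for your own data.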
Research Project 5: Compare the effectiveness of different vocabulary teaching techniques
Goal
To compare the effectiveness of different vocabulary teaching techniques in terms of long-term acquisition.
Methodology
First, read Chapters 4 and 5 in this volume. Then think about your participant pool at this early stage, because this is a longitudinal study that will require extended cooperation from your participants, and is probably best suited to researchers who already teach an existing class of students, or who have access to one.

Next, think about which teaching techniques you wish to explore. Although you can compare any teaching techniques, there should be some rationale why the ones you choose can be logically compared. For example, it would make sense to compare incidental learning approaches (e.g. learning vocabulary from reading) with explicit approaches (e.g. learning vocabulary through dictionary lookup), or to compare the learning accruing from oral versus written input, or to compare different teaching techniques that are commonly used in your school system.

You then need to decide how you are going to define ‘learning’. This will involve consideration of (1) which word knowledge aspects will be required (just the form-meaning link, or some other aspects like derivation knowledge or collocation), and (2) the level of mastery required (usually conceptualized as receptive versus productive mastery). For this project, it is probably better to find an existing test format to use, rather than developing a new one, as this is time consuming and requires considerable expertise to do well. One way to find such formats is to look up research studies in journals and books which have made a similar comparison to that which you wish to make, and see what measurement instruments were used. You will probably have to adapt those instruments to your study (e.g. in terms of target lexical items, or level of difficulty), but this is much easier than starting from scratch.

The next step is to develop a list of target lexical items. The most ecologically sound items would be those which are useful for your students and would be taught anyway. However, you need to control for previous knowledge, and this is most easily achieved by using either low-frequency items, or nonwords. See Section 5.1.2 for a detailed discussion of pre-existing knowledge and how to address this issue.

Comparisons of methodologies are usually carried out in one of two ways. The first is to use the different techniques on the same group of students. In this case, the students act as their own controls, i.e. the participants are exactly the same for the different methodologies, and so have the same levels of proficiency, aptitude, motivation, etc. The second way is to use different groups of students, but then it is necessary to determine that they are essentially equivalent in learning ability (usually indicated by some proficiency measure, or in the case of lexical acquisition, a vocabulary size measure). (There are also statistical ways of equating groups, but they are too complex to explain here.) Decide which approach makes the most sense for your research situation, and set up a research design where the competing methodologies are given an equal chance to succeed, often determined by having the same amount of classroom time. The study will look something like this:
Pretest (or potentially no test if low-frequency items or nonwords are used)
→ Treatment (same amount of time and attention given to each method)
→ Immediate posttest (optional, but shows whether treatment had an effect)
→ Delayed posttest (shows durable learning)

Delayed posttest scores – Pretest scores = acquisition gains
OR
No previous knowledge assumed, so delayed posttest scores = acquisition gains

The vocabulary gains from the different methods can then be compared. While this can be done by looking at descriptive statistics like means or medians, it is normal to check any differences for statistical significance. If two techniques are being compared, it is likely that some form of t-test or nonparametric equivalent will be appropriate. If three or more techniques are being compared, then some form of ANOVA or nonparametric equivalent may be required. This will require either knowledge of statistics, or access to someone who has this knowledge.

This project is one of the more challenging, because it requires considerable expertise in research design and statistical analysis. However, it can also be one of the most rewarding, as it can potentially give tangible answers concerning the teaching methodologies which are more effective for the type of students you are involved with.

Questions to consider
1. Was any teaching technique better than any other? If there was a difference in vocabulary gains, was it large enough to be statistically significant?
2. Your results come from your particular students and teaching situation, but would it be reasonable to argue that your results are also generalizable to other students and teaching situations? If so, which ones?
3. If you measured different aspects of word knowledge, or a different level of mastery, do you think the results would have been the same or different?
4. If you found little difference between the techniques, does this suggest that student factors (age, proficiency, motivation, etc.) might be more important to learning than the teaching techniques used?
5. Do your results suggest any implications for teaching change in your classroom or school system?
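As a rough illustration of the gain calculation and the two-technique comparison described in this project, the sketch below computes acquisition gains and a paired-samples t statistic using only the Python standard library. It assumes the design in which the same students experience both techniques; the scores are invented, and the function names `gains` and `paired_t` are hypothetical. The resulting t value still has to be checked against a t table (or a statistics package) to judge significance, and a nonparametric test may be more appropriate for small or skewed samples.

```python
import math
import statistics


def gains(pretest, delayed):
    """Acquisition gain per participant: delayed posttest score minus pretest score."""
    return [d - p for p, d in zip(pretest, delayed)]


def paired_t(gains_a, gains_b):
    """Paired-samples t statistic and degrees of freedom for two sets of gains
    from the same participants (same order in both lists)."""
    diffs = [a - b for a, b in zip(gains_a, gains_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented scores for five participants who experienced both techniques.
pre_a, delayed_a = [2, 1, 3, 0, 2], [8, 6, 9, 5, 7]   # technique A
pre_b, delayed_b = [1, 2, 2, 1, 2], [5, 5, 6, 4, 6]   # technique B

gains_a = gains(pre_a, delayed_a)
gains_b = gains(pre_b, delayed_b)
t, df = paired_t(gains_a, gains_b)
print("mean gains:", statistics.mean(gains_a), statistics.mean(gains_b))
print("t =", round(t, 2), "df =", df)
```

If different groups of students are used for each technique, an independent-samples test would be needed instead, and the groups' equivalence would have to be established first, as noted in the methodology.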
Research Project 6: Exploring formulaic language
Goal
To describe the larger patterning around collocations.
Methodology
This project explores lexical patterning through corpus analysis. Read Chapter 3 in this volume, with special attention to Section 3.5, and the sections dealing with corpora (1.1.4, 7.2, 7.3). You will be looking for variable expressions similar to the following described in Section 3.5:

Description
[ANIMATE OBJECT] think(s) nothing of [DOING SOME ACTIVITY WHICH IS SURPRISING, UNEXPECTED, OR UNUSUAL]

Usage
It is commonly used to express the meaning ‘someone/something habitually does something which we would not expect’.

Examples
Ron thinks nothing of writing a new book every six months.
She thinks nothing of going out at midnight to begin her night on the town.
The government thinks nothing of spending millions of the taxpayers’ dollars on worthless projects.

The first step in identifying and describing these expressions is accessing a corpus to query. Section 7.2 introduces an extensive range of corpus resources. For this project, it is probably best to use a corpus which represents general English, in order to find the phraseology which is common to English as it is most widely used. You will also need to access a concordancer, either one you can install on your own computer like WordSmith, or one built into an internet site, like the search engine available to interrogate the corpora on the BYU corpus website (see Section 7.2.1).

Once you have accessed these resources, you begin by choosing a number of target words. You can choose either a ‘variety pack’ of words (e.g. words from different word classes, frequencies, imageabilities, etc.) to see if lexical patterning varies according to the words’ characteristics, or choose a set of words which are similar (e.g. adjectives within the first 500 frequency band) to see if the patterning has any similarities.

Use your concordancer to find the main collocations for the target word. Most words have some collocations which can be extracted through t-score, MI, or other corpus analysis. (But note that this varies, and some words seem to lack identifiable collocates. In this case, simply choose another word to analyze.) Choose the ‘strongest’ collocation and call up concordance lines for it. For most concordancers, you simply type in the two words of the collocation. This should bring up a number of concordance lines with the ‘core collocation’ highlighted in the middle of each. Some examples of concordance lines for think nothing include:

Shirley’s a gal who thinks nothing of spending $3,000 or more
He thinks nothing of doing business with the
Joe thinks nothing of his hurtful remark
has no money but thinks nothing of flying to Hawaii or
She thinks nothing of staying up till three o’clock

You should also be able to sort the words to the left and right of the core collocation in most concordancers. Sort to the left (or right), and try to discern any patterning. Things to look for include:

● Any words which consistently appear in a particular position. These might be fixed elements of a larger variable expression, such as of after thinks nothing.
● Any recurring meaning. Variable expressions are typically used to convey particular meanings, and these can only be discovered by a semantic analysis of the text surrounding the core collocation.
● Any recurring grammatical patterns. Do these help you in your semantic analysis?

Now sort to the other side of the core collocation to find any patterning there. If you have found a variable expression which forms around the core collocation, write a description of it. Include the fixed elements in italics, and write the semantic constraints for the open ‘slots’ in CAPITALS, as in the following example from Section 3.6:

SOMETHING/SOMEONE (be) bordered/bordering on AN UNDESIRABLE STATE (OFTEN OF MIND)

Do the same type of analysis with a few other collocations of the same target word. Are the variable expressions similar or different? If you have access to a concordancer which does a ConCgram/kfNgram type of analysis, then also try this approach to open-slot pattern extraction. How do the results compare with your manual analysis?

Once you have finished your analysis of the first target word, move on to the other target words you have chosen and repeat the above analyses.
Questions to consider
1. How much extended patterning did you find?
2. Were you able to describe any variable expressions for the core collocations you looked up?
3. Did some kinds of collocations have more patterning than others? For example, did grammatical patterning make a difference (e.g. adjective + noun versus verb + object)? Did MI-type collocations lead to more variable expressions than t-score-type collocations?
4. Are there any similarities or differences in the lexical patterning you found according to the characteristics of the initial target words?
5. Do different core collocations for a target word each have their own identifiable variable expressions?
6. Which approach seems better able to identify variable expressions: a manual analysis (like you carried out), or an automated one (like kfNgrams)?
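Because this project contrasts MI-type and t-score-type collocations, it may help to see how the two measures respond to the same counts. The sketch below uses the commonly cited simple forms of the two measures (MI as the base-2 log of observed over expected co-occurrence, and t-score as observed minus expected divided by the square root of observed). Concordancers typically also adjust for the size of the collocation window, which is ignored here, and the counts are hypothetical rather than taken from any corpus.

```python
import math


def expected_cooccurrence(f_node, f_collocate, corpus_size):
    """Co-occurrences expected by chance if the two words were independent."""
    return f_node * f_collocate / corpus_size


def mi_score(observed, f_node, f_collocate, corpus_size):
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(observed / expected_cooccurrence(f_node, f_collocate, corpus_size))


def t_score(observed, f_node, f_collocate, corpus_size):
    """t-score: (observed - expected) divided by the square root of observed."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts: a one-million-word corpus, a node word occurring
# 1,000 times, a collocate occurring 500 times, and 64 co-occurrences.
N, F_NODE, F_COLL, OBSERVED = 1_000_000, 1_000, 500, 64

print("MI      =", mi_score(OBSERVED, F_NODE, F_COLL, N))
print("t-score =", t_score(OBSERVED, F_NODE, F_COLL, N))
```

Trying the same functions with a very rare collocate and a very frequent one shows why the two measures tend to rank collocates differently: MI rewards exclusivity of the pairing, whereas the t-score is driven largely by the raw number of co-occurrences.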
Research Project 7: Exploring the vocabulary task types in language textbooks
Goal
To explore how language textbooks introduce and teach vocabulary.
Methodology
In this project, you will analyze a number of language textbooks, and consider what vocabulary is introduced, what word knowledge aspects are addressed, and how it is practised and recycled. As a start, read the background information on vocabulary knowledge in Section 1.1.5 and Chapter 2.

Next, find a number of textbooks you would like to analyze. You can choose textbooks which are similar in kind (e.g. integrated four-skill textbooks teaching general English to intermediate students) and see if their treatment of vocabulary is similar or not. Alternatively, you could choose textbooks for different proficiency levels, or which teach different types of English (e.g. business English, academic English), and see if the vocabulary introduced and treatments used vary according to level or type. For this project, analyzing between five and ten textbooks should be about right.

Look at each textbook and first make a list of all the vocabulary which is explicitly introduced. Do not include the lexical items which occur as part of general language usage; rather, focus on the vocabulary which is highlighted in some way. Some examples of this highlighting could include items occurring on word lists, items which are defined, items which are in bold format to attract attention, items linked to a picture which illustrates their meaning, and items in ‘vocabulary boxes’. There will be many fuzzy cases, and you will have to develop criteria for your selected textbooks to decide consistently whether certain items are explicitly highlighted or not.

Once you have decided on the target vocabulary, go back and explore how that vocabulary is treated. Some aspects you should consider include the following, but you will no doubt find other aspects you will wish to follow up:

● What word knowledge aspects are addressed in the exercises?
● Is there any logical progression in the way different word knowledge aspects are addressed?
● Is new vocabulary merely highlighted, or is it taught (i.e. is information given which will help the student understand the new items)?
● After vocabulary is initially introduced, is it recycled in any principled way? Does any recycling extend beyond the unit the item was introduced in?
● Is the authors’ intention to teach the lexical items to a receptive or productive level of mastery? Or does this vary through the textbook?
If you have the resources, scan the books and then analyze them electronically with a concordancing package or with Lextutor (Section 7.3). If you are able to do this, the amount of repetition and recycling of lexical items will be calculated for you by the software. You will also be able to get an overview of the entire vocabulary content of the textbooks. You can then compare the frequency profiles of each textbook with another, which should provide useful insights into the relative difficulty of each book’s vocabulary content. When reporting your findings, you will want to put the results in a table format to make it easier for your readers to compare across the different textbooks.

Questions to consider
In addition to the questions outlined above, you may wish to consider the following:
1. Are the word knowledge aspects addressed appropriate for the level of student the textbook is written for?
2. Does it seem that there is enough information given and enough practice and recycling generated for the target items to reach the desired receptive or productive level of mastery?
3. What is the balance between individual words highlighted vs. formulaic sequences?
4. Are individual words taught by themselves, or as part of a word family?
5. Does it appear that the target lexical items are presented according to a rationale, which allows the principled selection of items according to level or type of English?
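If you do scan the textbooks, the recycling question can also be checked with a short script rather than a concordancing package. The sketch below is one possible approach, not a description of what Lextutor or any other tool does: it counts how often each highlighted item occurs in each unit and flags items that reappear after the unit in which they first occur. The inputs (`units`, `target_items`) and function names are hypothetical, and only single-word items with exact spelling are matched, so word-family members and formulaic sequences would need extra handling.

```python
import re
from collections import Counter


def recycling_profile(units, target_items):
    """For each highlighted item, count its occurrences in every unit.

    units: list of unit texts in textbook order (hypothetical input;
           in practice these would come from the scanned textbook).
    target_items: single-word items that the textbook highlights.
    Returns {item: [count in unit 1, count in unit 2, ...]}.
    """
    counts_per_unit = [Counter(re.findall(r"[a-z]+", u.lower())) for u in units]
    return {item: [c[item.lower()] for c in counts_per_unit] for item in target_items}


def recycled_beyond_first_unit(profile):
    """Items that reappear in a unit after the one in which they first occur."""
    result = []
    for item, counts in profile.items():
        units_seen = [i for i, n in enumerate(counts) if n > 0]
        if len(units_seen) > 1:
            result.append(item)
    return result


# Toy data: three very short "units" and three highlighted items.
units = [
    "The invoice is late. Send the invoice today.",
    "Check the invoice and the receipt.",
    "The receipt is missing.",
]
profile = recycling_profile(units, ["invoice", "receipt", "refund"])
print(profile)
print(recycled_beyond_first_unit(profile))
```

The per-unit counts can be pasted directly into the comparison table recommended above for reporting results across textbooks.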
Research Project 8: Exploring the repetition of vocabulary in texts
Goal
To determine whether the repetition of vocabulary in texts is enough to support incidental vocabulary learning.
Methodology
Research suggests that it takes something like eight to ten exposures to establish the form-meaning link from incidental vocabulary learning. Graded readers are written to maximize vocabulary recycling both by limiting the range of vocabulary used, and by using the principle of repeating vocabulary whenever possible. Also, children’s reading books often follow these principles, as do teenage novels to a lesser extent. Conversely, adult authentic texts generally have less repetition. In all of these text types, high-frequency vocabulary will almost always be repeated more often than low-frequency vocabulary.

In this project, you will explore the repetition of vocabulary in several types of texts, and think about the implications of that repetition (or lack thereof) for incidental vocabulary learning for L2 learners. The background reading for this project includes Sections 1.1.6, 1.2, 2.5, and 2.8. You will analyze four types of text:

● Graded readers meant for L2 learners (it would be interesting to look at several levels of graded reader in the analysis)
● Children’s books meant for L1 beginning readers
● Teenage novels or other literature meant for the developing L1 reader
● Authentic texts meant for adult L1 readers (e.g. newspapers, magazines, novels)
Collect materials from each of these categories. The texts designed for beginning readers (either in the L1 or L2) will generally be relatively short, and so a number of whole texts will be required. The teenage novels/literature and adult texts may be lengthy, and so it might be necessary to sample from the longer texts. However, many of these texts will be shorter (e.g. newspaper stories and magazine articles) and can be used in their entirety. Scan the texts into an electronic format, and analyze them electronically with a concordancing package or with Lextutor (Section 7.3). These will give you frequency lists and show the amount of repetition and recycling of lexical items within the texts. Compare the amount of repetition among the four text types, according to the questions below. Next, choose several target words which have been repeated to different degrees (i.e. some frequently repeated, some seldom repeated). How many words of text would a learner need to read in order to be exposed to a target word eight times? Determine (through interview or questionnaire) the number of L2 words the learners you are associated with read in a typical day. Alternatively, assume 1,000 words per day for the purposes of the project. Then calculate how many days of reading it would take to reach the eight-exposure threshold for each target word. Is the period ‘compact’ enough that learning can incrementally accrue, or is it so spread out that words will likely be forgotten between the exposure intervals?
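The exposure calculation in the last step can be sketched as follows. This is a minimal illustration under simplifying assumptions: the text is treated as a plain string, only exact word forms are matched (no lemmatization or word families), and the function names are hypothetical. The default of 1,000 words per day is the fallback figure suggested above.

```python
import math
import re


def words_until_nth_exposure(text, target, n=8):
    """Number of running words a reader must read to meet `target` n times.
    Returns None if the text contains fewer than n occurrences."""
    seen = 0
    for position, token in enumerate(re.findall(r"[a-z]+", text.lower()), start=1):
        if token == target:
            seen += 1
            if seen == n:
                return position
    return None


def days_to_threshold(words_needed, words_per_day=1000):
    """Days of reading needed to cover `words_needed` words, rounded up."""
    return math.ceil(words_needed / words_per_day)


# Toy text: the same three-word sentence repeated ten times.
toy_text = "the cat sat " * 10
needed = words_until_nth_exposure(toy_text, "cat", n=8)
print(needed, "words read before the eighth exposure")
print(days_to_threshold(needed, words_per_day=10), "days at 10 words per day")
```

Running this for several target words in each text type gives the figures needed to judge whether the exposures are compact enough for learning to accrue.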
Questions to consider
1. Is there more repetition for the lower-level texts than for the more advanced ones?
2. Do the texts at each level have similar amounts of repetition?
3. If you compared different levels of graded reader, does the amount of repetition vary according to level?
4. What kind of vocabulary is repeated more often, and what kind only occurs once or twice?
5. Given the amount of repetition you find, how suitable is each kind of text for supporting vocabulary learning by second-language learners?
6. How do the different text types vary in their overall frequency profiles? How do these profiles make the different text types suitable for L2 learners of different proficiencies?
7. Given the amount of reading L2 learners do, is there enough repetition to support incidental vocabulary learning? Or is the amount of reading required between exposures simply too great for learning to build up?
Research Project 9: Measuring incidental vocabulary learning from reading
Goal
To measure the amount of incidental vocabulary learning resulting from reading for pleasure.
Methodology
This project is a replication of the classic Clockwork Orange research design which was first used by Saragi, Nation, and Meister (1978). They had native English speakers read the novel A Clockwork Orange, which included Russian slang words called nadsat. Afterwards, they tested for knowledge of the nadsat words. Since none of the readers spoke Russian, or were exposed to it outside the novel, any knowledge of the words must have been acquired incidentally from reading the novel. This design was most recently applied to second-language learners by Pellicer Sánchez and Schmitt (in press). We used an easier novel (Things Fall Apart) which included words from the Nigerian language Ibo, and a more extensive vocabulary test battery which measured recognition of spelling, recall of word class, and both recognition and recall of meaning.

To prepare for this project, first read Section 1.2, and the Saragi et al. and the Pellicer Sánchez and Schmitt articles. In this project, you will use the novel Things Fall Apart and the methodology from Pellicer Sánchez and Schmitt. The article lists the 34 target words to use. It also gives examples of the formats for the spelling, word class, and meaning tests. Using these models, you will need to write tests for each word knowledge aspect for each target word.

Once you have the test battery prepared, give each of your participants a copy of Things Fall Apart to read. Tell them to read it as they normally would, to enjoy it, and that you will discuss it with them after they have finished. Do NOT tell them that you will be focusing on their vocabulary acquisition!
After they have read the novel, administer the vocabulary test battery. Because they will have no previous knowledge of, or exposure to, Ibo, any knowledge they have of the Ibo words must be considered learning gains from the novel.1
1. How much learning of spelling, word class, and meaning occurred incidentally from the pleasure reading? 2. Was there more learning of some word knowledge aspects than others? 3. Did the frequency of occurrence of the target words make any difference? That is, were more-frequently-repeated words learned better than words repeated less often? 4. How do your results compare to those reported by Pellicer Sánchez and Schmitt? If they are different, how much of the variation might have been caused by different participants and how much by different tests? 5. Based on your results, do you feel that reading novels for pleasure is a viable way to learn new vocabulary?
Research Project 10: Exploring the relationship between receptive and productive knowledge of vocabulary Goal To explore and compare the different levels of mastery of the form-meaning link. Methodology Receptive knowledge of vocabulary is usually thought to precede productive knowledge, and so receptive vocabulary sizes are usually larger than productive ones. However, this is complicated by several factors, as discussed in Section 2.8. Different word knowledge aspects are learned earlier or later, and so some may be at a productive level (e.g. spelling) while others may still be unknown or at a receptive level (e.g. collocation). Also, the degree of receptive and productive knowledge measured is highly dependent on the test instruments used. In this project, you will compare the receptive and productive levels of mastery of a single word knowledge aspect: the form-meaning link. To prepare, first read Sections 1.1.8, 2.1, 2.2, 2.5, 2.8, 5.2.3, and 7.4, and Laufer and Goldstein (2004) and Laufer, Elder, Hill, and Congdon (2004). You will be using the test format originally developed by Laufer and Goldstein and used in their CATSS test. However, we will use my terminology to label the different test components and discuss the results. Laufer and Goldstein insightfully realized that testing receptive versus productive knowledge of the form-meaning link involves two elements: which word knowledge aspect is required (form or meaning), and which
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
Questions to consider
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_07_cha06.indd 275
4/13/2010 1:02:14 PM
276 Researching Vocabulary
degree of mastery is required (recognition or recall). This leads to four possible levels of mastery, which I have relabelled as follows:

1. Form recall: d——— hund
2. Meaning recall: dog h———
3. Form recognition: hund a. cat b. dog c. mouse d. bird
4. Meaning recognition: dog a. katze b. hund c. maus d. vogel
(L1 = German [hund]; L2 = English [dog])

You will use these four formats to develop a form-meaning test battery for your participants. Note that the test battery makes extensive use of translation. Therefore, this project is only suitable for researchers who (a) have a pool of participants with the same L1, and (b) are competent in both languages.

To develop the tests, first access a frequency list covering your L2 vocabulary. Sample from this list from the highest frequency levels to the lowest frequency level where you think your participants might know only a few words. Pilot this list with participants similar to those you will use in the study. Find the frequency band between where most learners know the words well and where most learners barely recognize the words. Within this band, sample words at a uniform rate (e.g. ten words per 1,000 frequency band). Once you have selected the target words, write four items for each, following the model of the items illustrated above. Of course, the number of target words you can put on the test will be constrained by the amount of time you have to administer the test battery.

Administer the test battery to your participants. First, give all of the form recall items in frequency order, then all of the meaning recall items, then all of the form recognition items, and finally all of the meaning recognition items. Calculate the means for all of the form-meaning tests. If you have enough participants and the statistical expertise, run a statistical analysis on the results.

Questions to consider
1. Is there a hierarchy of form-meaning knowledge? For example, is meaning recognition the best known and form recall the least known?
2. Laufer and Goldstein (2004) and Laufer et al. (2004) found a hierarchy in their studies. How does your hierarchy (if any) compare with theirs?
3. If you find a hierarchy, check whether all of the individual participants follow it. Or do many of the participants produce a different order of vocabulary knowledge than the rest of the group?
4. Are there frequency bands which consistently map onto certain form-meaning levels of knowledge? For example, is there a frequency level where most of your participants have form-recognition knowledge, and a different one for meaning-recall?
5. Given your results, do you now feel that a typical meaning recognition multiple-choice test is an adequate measure of ‘knowing’ a word?
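As a rough illustration of the analysis step in this project, the following Python sketch computes the mean score for each of the four formats, orders the formats from least to best known, and flags participants whose individual scores do not follow the group order. The participant labels, the scores, and all function names are invented for illustration. They are not data or procedures from Laufer and Goldstein (2004) or from this book, and a real study would still need an appropriate statistical test.

```python
from statistics import mean

FORMATS = ["form_recall", "meaning_recall", "form_recognition", "meaning_recognition"]

def format_means(scores):
    """Mean proportion correct for each test format across participants."""
    return {f: mean(p[f] for p in scores.values()) for f in FORMATS}

def hierarchy(means):
    """Formats ordered from least known (lowest mean) to best known."""
    return sorted(means, key=means.get)

def follows_hierarchy(participant, order):
    """True if this participant's scores never decrease along the group order."""
    values = [participant[f] for f in order]
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical proportions correct, made up for this example.
scores = {
    "P1": {"form_recall": 0.30, "meaning_recall": 0.45,
           "form_recognition": 0.70, "meaning_recognition": 0.85},
    "P2": {"form_recall": 0.20, "meaning_recall": 0.50,
           "form_recognition": 0.60, "meaning_recognition": 0.90},
    "P3": {"form_recall": 0.55, "meaning_recall": 0.40,
           "form_recognition": 0.75, "meaning_recognition": 0.80},
}

means = format_means(scores)
order = hierarchy(means)
deviants = [pid for pid, s in scores.items() if not follows_hierarchy(s, order)]

print(order)     # group hierarchy, least known first
print(deviants)  # participants who do not follow it
```

Listing the deviating participants speaks directly to Question 3: a group-level hierarchy can hide individuals whose knowledge is ordered differently.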
Part 4
Resources
7
Vocabulary Resources

7.1 Instruments

7.1.1 Vocabulary Levels Test

These two versions of the VLT were developed by Schmitt, Schmitt, and Clapham (2001), where the validation evidence is presented. See Read (2000) for discussion of the test, and Nation and Gu (2007) for additional information on how to use and interpret the results. The test is © Norbert Schmitt, but is freely available for research and pedagogical purposes, as long as they are non-commercial.

This is a vocabulary test. You must choose the right word to go with each meaning. Write the number of that word next to its meaning. Here is an example.

1 business  2 clock  3 horse  4 pencil  5 shoe  6 wall
——— part of a house
——— animal with four legs
——— something used for writing

You answer it in the following way.

1 business  2 clock  3 horse  4 pencil  5 shoe  6 wall
6 part of a house
3 animal with four legs
4 something used for writing

Some words are in the test to make it more difficult. You do not have to find a meaning for these words. In the example above, these words are business, clock, and shoe. If you have no idea about the meaning of a word, do not guess. But if you think you might know the meaning, then you should try to find the answer.
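One way to mark a cluster is sketched below in Python, assuming one point for each definition matched to the correct word number and no point for a blank. That scoring rule, the function name, and the sample responses are assumptions made for illustration; the answer key is the worked example given above.

```python
def score_cluster(answer_key, responses):
    """Count definitions matched to the correct word number.

    A missing or None response (a blank) earns no point.
    """
    return sum(
        1
        for definition, correct_number in answer_key.items()
        if responses.get(definition) == correct_number
    )

# Answer key taken from the example cluster in the instructions.
example_key = {
    "part of a house": 6,             # wall
    "animal with four legs": 3,       # horse
    "something used for writing": 4,  # pencil
}

# Hypothetical test-taker: two correct answers and one blank.
responses = {
    "part of a house": 6,
    "animal with four legs": 3,
    "something used for writing": None,
}

cluster_score = score_cluster(example_key, responses)
print(cluster_score)
```

Summing the cluster scores within a level would give a score for that level under the same assumption.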
Version 1: The 2,000 word level

1 birth  2 dust  3 operation  4 row  5 sport  6 victory
——— game
——— winning
——— being born

1 choice  2 crop  3 flesh  4 salary  5 secret  6 temperature
——— heat
——— meat
——— money paid regularly for doing a job

1 cap  2 education  3 journey  4 parent  5 scale  6 trick
——— teaching and learning
——— numbers to measure with
——— going to a far place

1 attack  2 charm  3 lack  4 pen  5 shadow  6 treasure
——— gold and silver
——— pleasing quality
——— not having something

1 adopt  2 climb  3 examine  4 pour  5 satisfy  6 surround
——— go up
——— look at closely
——— be on every side

1 bake  2 connect  3 inquire  4 limit  5 recognize  6 wander
——— join together
——— walk without purpose
——— keep within a certain size

1 burst  2 concern  3 deliver  4 fold  5 improve  6 urge
——— break open
——— make better
——— take something to someone

1 original  2 private  3 royal  4 slow  5 sorry  6 total
——— first
——— not public
——— all added together
1 cream  2 factory  3 nail  4 pupil  5 sacrifice  6 wealth
——— part of milk
——— a lot of money
——— person who is studying

1 brave  2 electric  3 firm  4 hungry  5 local  6 usual
——— commonly done
——— wanting food
——— having no fear

Version 1: The 3,000 word level

1 belt  2 climate  3 executive  4 notion  5 palm  6 victim
——— idea
——— inner surface of your hand
——— strip of leather worn around the waist

1 acid  2 bishop  3 chill  4 ox  5 ridge  6 structure
——— cold feeling
——— farm animal
——— organization or framework

1 bench  2 charity  3 jar  4 mate  5 mirror  6 province
——— long seat
——— help to the poor
——— part of a country

1 betray  2 dispose  3 embrace  4 injure  5 proclaim  6 scare
——— frighten
——— say publicly
——— hurt seriously

1 encounter  2 illustrate  3 inspire  4 plead  5 seal  6 shift
——— meet
——— beg for help
——— close completely

1 assist  2 bother  3 condemn  4 erect  5 trim  6 whirl
——— help
——— cut neatly
——— spin around quickly
Version 1: The 3,000 word level – Continued

1 boot  2 device  3 lieutenant  4 marble  5 phrase  6 vein
——— army officer
——— a kind of stone
——— tube through which blood flows

1 apartment  2 candle  3 draft  4 horror  5 prospect  6 timber
——— a place to live
——— chance of something happening
——— first rough form of something written

1 annual  2 concealed  3 definite  4 mental  5 previous  6 savage
——— wild
——— clear and certain
——— happening once a year

1 dim  2 junior  3 magnificent  4 maternal  5 odd  6 weary
——— strange
——— wonderful
——— not clearly lit

Version 1: The 5,000 word level

1 balloon  2 federation  3 novelty  4 pail  5 veteran  6 ward
——— bucket
——— unusual interesting thing
——— rubber bag that is filled with air

1 alcohol  2 apron  3 hip  4 lure  5 mess  6 phase
——— stage of development
——— state of untidiness or dirtiness
——— cloth worn in front to protect your clothes

1 blend  2 devise  3 hug  4 lease  5 plague  6 reject
——— mix together
——— plan or invent
——— hold tightly in your arms

1 abolish  2 drip  3 insert  4 predict  5 soothe  6 thrive
——— bring to an end by law
——— guess about the future
——— calm or comfort someone
1 apparatus  2 compliment  3 ledge  4 revenue  5 scrap  6 tile
——— expression of admiration
——— set of instruments or machinery
——— money received by the Government

1 bulb  2 document  3 legion  4 mare  5 pulse  6 tub
——— female horse
——— large group of soldiers or people
——— a paper that provides information

1 concrete  2 era  3 fiber  4 loop  5 plank  6 summit
——— circular shape
——— top of a mountain
——— a long period of time

1 bleed  2 collapse  3 precede  4 reject  5 skip  6 tease
——— come before
——— fall down suddenly
——— move with quick steps and jumps

1 casual  2 desolate  3 fragrant  4 radical  5 unique  6 wholesome
——— sweet-smelling
——— only one of its kind
——— good for your health

1 gloomy  2 gross  3 infinite  4 limp  5 slim  6 vacant
——— empty
——— dark or sad
——— without end

Version 1: The 10,000 word level

1 antics  2 batch  3 connoisseur  4 foreboding  5 haunch  6 scaffold
——— foolish behavior
——— a group of things
——— person with a good knowledge of art or music

1 acquiesce  2 bask  3 crease  4 demolish  5 overhaul  6 rape
——— to accept without protest
——— sit or lie enjoying warmth
——— make a fold on cloth or paper
Version 1: The 10,000 word level – Continued

1 auspices  2 dregs  3 hostage  4 jumble  5 saliva  6 truce
——— confused mixture
——— natural liquid present in the mouth
——— worst and most useless parts of anything

1 casualty  2 flurry  3 froth  4 revelry  5 rut  6 seclusion
——— someone killed or injured
——— being away from other people
——— noisy and happy celebration

1 arsenal  2 barracks  3 deacon  4 felicity  5 predicament  6 spore
——— happiness
——— difficult situation
——— minister in a church

1 apparition  2 botany  3 expulsion  4 insolence  5 leash  6 puddle
——— ghost
——— study of plants
——— small pool of water

1 blaspheme  2 endorse  3 nurture  4 skid  5 squint  6 straggle
——— slip or slide
——— give care and food to
——— speak badly about God

1 clinch  2 jot  3 mutilate  4 smolder  5 topple  6 whiz
——— move very fast
——— injure or damage
——— burn slowly without flame

1 dubious  2 impudent  3 languid  4 motley  5 opaque  6 primeval
——— rude
——— very ancient
——— of many different kinds

1 auxiliary  2 candid  3 luscious  4 morose  5 pallid  6 pompous
——— bad-tempered
——— full of self-importance
——— helping, adding support
Version 1: Academic Vocabulary

1 benefit  2 labor  3 percent  4 principle  5 source  6 survey
——— work
——— part of 100
——— general idea used to guide one’s actions

1 element  2 fund  3 layer  4 philosophy  5 proportion  6 technique
——— money for a special purpose
——— skilled way of doing something
——— study of the meaning of life

1 consent  2 enforcement  3 investigation  4 parameter  5 sum  6 trend
——— total
——— agreement or permission
——— trying to find information about something

1 decade  2 fee  3 file  4 incidence  5 perspective  6 topic
——— 10 years
——— subject of a discussion
——— money paid for services

1 achieve  2 conceive  3 grant  4 link  5 modify  6 offset
——— change
——— connect together
——— finish successfully

1 convert  2 design  3 exclude  4 facilitate  5 indicate  6 survive
——— keep out
——— stay alive
——— change from one thing into another

1 anticipate  2 compile  3 convince  4 denote  5 manipulate  6 publish
——— control something skillfully
——— expect something will happen
——— produce books and newspapers

1 equivalent  2 financial  3 forthcoming  4 primary  5 random  6 visual
——— most important
——— concerning sight
——— concerning money
Version 1: Academic Vocabulary – Continued

1 colleague  2 erosion  3 format  4 inclination  5 panel  6 violation
——— action against the law
——— wearing away gradually
——— shape or size of something

1 alternative  2 ambiguous  3 empirical  4 ethnic  5 mutual  6 ultimate
——— last or most important
——— something different that can be chosen
——— concerning people from a certain nation

Version 2: The 2,000 word level

1 copy  2 event  3 motor  4 pity  5 profit  6 tip
——— end or highest point
——— this moves a car
——— thing made to be like another

1 accident  2 debt  3 fortune  4 pride  5 roar  6 thread
——— loud deep sound
——— something you must pay
——— having a high opinion of yourself

1 coffee  2 disease  3 justice  4 skirt  5 stage  6 wage
——— money for work
——— a piece of clothing
——— using the law in the right way

1 admire  2 complain  3 fix  4 hire  5 introduce  6 stretch
——— make wider or longer
——— bring in for the first time
——— have a high opinion of someone

1 arrange  2 develop  3 lean  4 owe  5 prefer  6 seize
——— grow
——— put in order
——— like more than something else

1 blame  2 elect  3 jump  4 manufacture  5 melt  6 threaten
——— make
——— choose by voting
——— become like water
1 clerk  2 frame  3 noise  4 respect  5 theater  6 wine
——— a drink
——— office worker
——— unwanted sound

1 dozen  2 empire  3 gift  4 opportunity  5 relief  6 tax
——— chance
——— twelve
——— money paid to the government

1 ancient  2 curious  3 difficult  4 entire  5 holy  6 social
——— not easy
——— very old
——— related to God

1 bitter  2 independent  3 lovely  4 merry  5 popular  6 slight
——— beautiful
——— small
——— liked by many people

Version 2: The 3,000 word level

1 bull  2 champion  3 dignity  4 hell  5 museum  6 solution
——— formal and serious manner
——— winner of a sporting event
——— building where valuable objects are shown

1 blanket  2 contest  3 generation  4 merit  5 plot  6 vacation
——— holiday
——— good quality
——— wool covering used on beds

1 abandon  2 dwell  3 oblige  4 pursue  5 quote  6 resolve
——— live in a place
——— follow in order to catch
——— leave something permanently

1 assemble  2 attach  3 peer  4 quit  5 scream  6 toss
——— look closely
——— stop doing something
——— cry out loudly in fear
Version 2: The 3,000 word level – Continued

1 comment  2 gown  3 import  4 nerve  5 pasture  6 tradition
——— long formal dress
——— goods from a foreign country
——— part of the body which carries feeling

1 administration  2 angel  3 frost  4 herd  5 fort  6 pond
——— group of animals
——— spirit who serves God
——— managing business and affairs

1 atmosphere  2 counsel  3 factor  4 hen  5 lawn  6 muscle
——— advice
——— a place covered with grass
——— female chicken

1 drift  2 endure  3 grasp  4 knit  5 register  6 tumble
——— suffer patiently
——— join wool threads together
——— hold firmly with your hands

1 brilliant  2 distinct  3 magic  4 naked  5 slender  6 stable
——— thin
——— steady
——— without clothes

1 aware  2 blank  3 desperate  4 normal  5 striking  6 supreme
——— usual
——— best or most important
——— knowing what is happening

Version 2: The 5,000 word level

1 analysis  2 curb  3 gravel  4 mortgage  5 scar  6 zeal
——— eagerness
——— loan to buy a house
——— small stones mixed with sand

1 contemplate  2 extract  3 gamble  4 launch  5 provoke  6 revive
——— think about deeply
——— bring back to health
——— make someone angry
1 cavalry  2 eve  3 ham  4 mound  5 steak  6 switch
——— small hill
——— day or night before a holiday
——— soldiers who fight from horses

1 chart  2 forge  3 mansion  4 outfit  5 sample  6 volunteer
——— map
——— large beautiful house
——— place where metals are made and shaped

1 artillery  2 creed  3 hydrogen  4 maple  5 pork  6 streak
——— a kind of tree
——— system of belief
——— large gun on wheels

1 circus  2 jungle  3 nomination  4 sermon  5 stool  6 trumpet
——— musical instrument
——— seat without a back or arms
——— speech given by a priest in a church

1 correspond  2 embroider  3 lurk  4 penetrate  5 prescribe  6 resent
——— exchange letters
——— hide and wait for someone
——— feel angry about something

1 demonstrate  2 embarrass  3 heave  4 obscure  5 relax  6 shatter
——— have a rest
——— break suddenly into small pieces
——— make someone feel shy or nervous

1 adequate  2 internal  3 mature  4 profound  5 solitary  6 tragic
——— enough
——— fully grown
——— alone away from other things

1 decent  2 frail  3 harsh  4 incredible  5 municipal  6 specific
——— weak
——— concerning a city
——— difficult to believe
Version 2: The 10,000 word level

1 alabaster  2 chandelier  3 dogma  4 keg  5 rasp  6 tentacle
——— small barrel
——— soft white stone
——— tool for shaping wood

1 benevolence  2 convoy  3 lien  4 octave  5 stint  6 throttle
——— kindness
——— set of musical notes
——— speed control for an engine

1 bourgeois  2 brocade  3 consonant  4 prelude  5 stupor  6 tier
——— middle class people
——— row or level of something
——— cloth with a pattern or gold or silver threads

1 alcove  2 impetus  3 maggot  4 parole  5 salve  6 vicar
——— priest
——— release from prison early
——— medicine to put on wounds

1 dissipate  2 flaunt  3 impede  4 loot  5 squirm  6 vie
——— steal
——— scatter or vanish
——— twist the body about uncomfortably

1 contaminate  2 cringe  3 immerse  4 peek  5 relay  6 scrawl
——— write carelessly
——— move back because of fear
——— put something under water

1 blurt  2 dabble  3 dent  4 pacify  5 strangle  6 swagger
——— walk in a proud way
——— kill by squeezing someone’s throat
——— say suddenly without thinking

1 illicit  2 lewd  3 mammoth  4 slick  5 temporal  6 vindictive
——— immense
——— against the law
——— wanting revenge
1 alkali  2 banter  3 coop  4 mosaic  5 stealth  6 viscount
——— light joking talk
——— a rank of British nobility
——— picture made of small pieces of glass or stone

1 indolent  2 nocturnal  3 obsolete  4 torrid  5 translucent  6 wily
——— lazy
——— no longer used
——— clever and tricky

Version 2: Academic Vocabulary

1 area  2 contract  3 definition  4 evidence  5 method  6 role
——— written agreement
——— way of doing something
——— reason for believing something is or is not true

1 debate  2 exposure  3 integration  4 option  5 scheme  6 stability
——— plan
——— choice
——— joining something into a whole

1 access  2 gender  3 implementation  4 license  5 orientation  6 psychology
——— male or female
——— study of the mind
——— entrance or way in

1 alter  2 coincide  3 deny  4 devote  5 release  6 specify
——— change
——— say something is not true
——— describe clearly and exactly

1 correspond  2 diminish  3 emerge  4 highlight  5 invoke  6 retain
——— keep
——— match or be in agreement with
——— give special attention to something

1 bond  2 channel  3 estimate  4 identify  5 mediate  6 minimize
——— make smaller
——— guess the number or size of something
——— recognizing and naming a person or thing
Version 2: Academic Vocabulary – Continued

1 accumulation  2 edition  3 guarantee  4 media  5 motivation  6 phenomenon
——— collecting things over time
——— promise to repair a broken product
——— feeling a strong reason or need to do something

1 adult  2 exploitation  3 infrastructure  4 schedule  5 termination  6 vehicle
——— end
——— machine used to move people or goods
——— list of things to do at certain times

1 abstract  2 adjacent  3 controversial  4 global  5 neutral  6 supplementary
——— next to
——— added to
——— concerning the whole world

1 explicit  2 final  3 negative  4 professional  5 rigid  6 sole
——— last
——— stiff
——— meaning ‘no’ or ‘not’
7.1.2 Vocabulary Size Test

The Vocabulary Size Test was developed to provide a reliable, accurate and comprehensive measure of a learner’s receptive vocabulary size from the first 1,000 to the fourteenth 1,000 word families of English. Each item in the test represents 100 word families. If a test-taker got every item correct, then it is assumed that that person knows the most frequent 14,000 word families of English. A test-taker’s score needs to be multiplied by 100 to get his/her total vocabulary size up to the fourteenth 1,000 word family level. See Beglar (2010) for validation information.

Instructions: Choose the best meaning for each word. If you do not know the word at all, do not guess. Wrong guesses will be taken away from your correct answers. However, if you think you might know the meaning or part of it, then you should try to find that answer.

First 1,000

1. see: They saw it.
a. cut
b. waited for
c. looked at
d. started

2. time: They have a lot of time.
a. money
b. food
c. hours
d. friends

3. period: It was a difficult period.
a. question
b. time
c. thing to do
d. book

4. figure: Is this the right figure?
a. answer
b. place
c. time
d. number

5. poor: We are poor.
a. have no money
b. feel happy
c. are very interested
d. do not like to work hard

6. drive: He drives fast.
a. swims
b. learns
c. throws balls
d. uses a car

7. jump: She tried to jump.
a. lie on top of the water
b. get off the ground suddenly
c. stop the car at the edge of the road
d. move very fast

8. shoe: Where is your shoe?
a. the person who looks after you
b. the thing you keep your money in
c. the thing you use for writing
d. the thing you wear on your foot

9. standard: Her standards are very high.
a. the bits at the back under her shoes
b. the marks she gets in school
c. the money she asks for
d. the levels she reaches in everything

10. basis: This was used as the basis.
a. answer
b. place to take a rest
c. next step
d. main part

Second 1,000

1. maintain: Can they maintain it?
a. keep it as it is
b. make it larger
c. get a better one than it
d. get it

2. stone: He sat on a stone.
a. hard thing
b. kind of chair
c. soft thing on the floor
d. part of a tree

3. upset: I am upset.
a. tired
b. famous
c. rich
d. unhappy

4. drawer: The drawer was empty.
a. sliding box
b. place where cars are kept
c. cupboard to keep things cold
d. animal house

5. patience: He has no patience.
a. will not wait happily
b. has no free time
c. has no faith
d. does not know what is fair

6. nil: His mark for that question was nil.
a. very bad
b. nothing
c. very good
d. in the middle

7. pub: They went to the pub.
a. place where people drink and talk
b. place that looks after money
c. large building with many shops
d. building for swimming

8. circle: Make a circle.
a. rough picture
b. space with nothing in it
c. round shape
d. large hole

9. microphone: Please use the microphone.
a. machine for making food hot
b. machine that makes sounds louder
c. machine that makes things look bigger
d. small telephone that can be carried around

10. pro: He’s a pro.
a. someone who is employed to find out important secrets
b. a stupid person
c. someone who writes for a newspaper
d. someone who is paid for playing sport etc
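The scoring arithmetic described in the introduction to this test (each item stands for 100 word families) can be sketched in Python as follows. The function name and the 140-item total (inferred from the 14,000-family ceiling at 100 families per item) are assumptions for illustration. The instructions tell test-takers that wrong guesses will be taken away, but no correction formula is given here, so none is applied in the sketch.

```python
WORD_FAMILIES_PER_ITEM = 100  # stated in the test description
TOTAL_ITEMS = 140             # assumption: 14,000 families / 100 per item

def estimated_vocabulary_size(items_correct: int) -> int:
    """Convert a raw score (items correct) into estimated word families known."""
    if not 0 <= items_correct <= TOTAL_ITEMS:
        raise ValueError(f"score must be between 0 and {TOTAL_ITEMS}")
    return items_correct * WORD_FAMILIES_PER_ITEM

size = estimated_vocabulary_size(87)  # hypothetical raw score
print(size)
```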
Third 1,000

1. soldier: He is a soldier.
a. person in a business
b. student
c. person who uses metal
d. person in the army

2. restore: It has been restored.
a. said again
b. given to a different person
c. given a lower price
d. made like new again

3. jug: He was holding a jug.
a. a container for pouring liquids
b. an informal discussion
c. a soft cap
d. a weapon that explodes

4. scrub: He is scrubbing it.
a. cutting shallow lines into it
b. repairing it
c. rubbing it hard to clean it
d. drawing simple pictures of it

5. dinosaur: The children were pretending to be dinosaurs.
a. robbers who work at sea
b. very small creatures with human form but with wings
c. large creatures with wings that breathe fire
d. animals that lived an extremely long time ago

6. strap: He broke the strap.
a. promise
b. top cover
c. shallow dish for food
d. strip of material for holding things together

7. pave: It was paved.
a. prevented from going through
b. divided
c. given gold edges
d. covered with a hard surface

8. dash: They dashed over it.
a. moved quickly
b. moved slowly
c. fought
d. looked quickly

9. rove: He couldn’t stop roving.
a. getting drunk
b. traveling around
c. making a musical sound through closed lips
d. working hard

10. lonesome: He felt lonesome.
a. ungrateful
b. very tired
c. lonely
d. full of energy

Fourth 1,000

1. compound: They made a new compound.
a. agreement
b. thing made of two or more parts
c. group of people forming a business
d. guess based on past experience

2. latter: I agree with the latter.
a. man from the church
b. reason given
c. last one
d. answer

3. candid: Please be candid.
a. be careful
b. show sympathy
c. show fairness to both sides
d. say what you really think

4. tummy: Look at my tummy.
a. cloth to cover the head
b. stomach
c. small furry animal
d. thumb

5. quiz: We made a quiz.
a. thing to hold arrows
b. serious mistake
c. set of questions
d. box for birds to make nests in

6. input: We need more input.
a. information, power, etc. put into something
b. workers
c. artificial filling for a hole in wood
d. money

7. crab: Do you like crabs?
a. sea creatures that walk sideways
b. very thin small cakes
c. tight, hard collars
d. large black insects that sing at night

8. vocabulary: You will need more vocabulary.
a. words
b. skill
c. money
d. guns

9. remedy: We found a good remedy.
a. way to fix a problem
b. place to eat in public
c. way to prepare food
d. rule about numbers

10. allege: They alleged it.
a. claimed it without proof
b. stole the ideas for it from someone else
c. provided facts to prove it
d. argued against the facts that supported it

Fifth 1,000

1. deficit: The company had a large deficit.
a. spent a lot more money than it earned
b. went down a lot in value
c. had a plan for its spending that used a lot of money
d. had a lot of money stored in the bank

2. weep: He wept.
a. finished his course
b. cried
c. died
d. worried

3. nun: We saw a nun.
a. long thin creature that lives in the earth
b. terrible accident
c. woman following a strict religious life
d. unexplained bright light in the sky

4. haunt: The house is haunted.
a. full of ornaments
b. rented
c. empty
d. full of ghosts

5. compost: We need some compost.
a. strong support
b. help to feel better
c. hard stuff made of stones and sand stuck together
d. rotted plant material

6. cube: I need one more cube.
a. sharp thing used for joining things
b. solid square block
c. tall cup with no saucer
d. piece of stiff paper folded in half

7. miniature: It is a miniature.
a. a very small thing of its kind
b. an instrument for looking at very small objects
c. a very small living creature
d. a small line to join letters in handwriting

8. peel: Shall I peel it?
a. let it sit in water for a long time
b. take the skin off it
c. make it white
d. cut it into thin pieces

9. fracture: They found a fracture.
a. break
b. small piece
c. short coat
d. rare jewel

10. bacterium: They didn’t find a single bacterium.
a. small living thing causing disease
b. plant with red or orange flowers
c. animal that carries water in lumps on its back
d. thing that has been stolen and sold to a shop

Sixth 1,000

1. devious: Your plans are devious.
a. tricky
b. well-developed
c. not well thought out
d. more expensive than necessary

2. premier: The premier spoke for an hour.
a. person who works in a law court
b. university teacher
c. adventurer
d. head of the government

3. butler: They have a butler.
a. man servant
b. machine for cutting up trees
c. private teacher
d. cool dark room under the house

4. accessory: They gave us some accessories.
a. papers giving us the right to enter a country
b. official orders
c. ideas to choose between
d. extra pieces
5. threshold: They raised the threshold.
a. flag
b. point or line where something changes
c. roof inside a building
d. cost of borrowing money

6. thesis: She has completed her thesis.
a. long written report of study carried out for a university degree
b. talk given by a judge at the end of a trial
c. first year of employment after becoming a teacher
d. extended course of hospital treatment

7. strangle: He strangled her.
a. killed her by pressing her throat
b. gave her all the things she wanted
c. took her away by force
d. admired her greatly

8. cavalier: He treated her in a cavalier manner.
a. without care
b. politely
c. awkwardly
d. as a brother would

9. malign: His malign influence is still felt.
a. evil
b. good
c. very important
d. secret

10. veer: The car veered.
a. went suddenly in another direction
b. moved shakily
c. made a very loud noise
d. slid sideways without the wheels turning

Seventh 1,000

1. olive: We bought olives.
a. oily fruit
b. scented pink or red flowers
c. men’s clothes for swimming
d. tools for digging up weeds

2. quilt: They made a quilt.
a. statement about who should get their property when they die
b. firm agreement
c. thick warm cover for a bed
d. feather pen

3. stealth: They did it by stealth.
a. spending a large amount of money
b. hurting someone so much that they agreed to their demands
c. moving secretly with extreme care and quietness
d. taking no notice of problems they met

4. shudder: The boy shuddered.
a. spoke with a low voice
b. almost fell
c. shook
d. called out loudly

5. bristle: The bristles are too hard.
a. questions
b. short stiff hairs
Vocabulary Resources 299
6. bloc: They have joined this bloc. a. musical group b. band of thieves c. small group of soldiers who are sent ahead of others d. group of countries with a common purpose 7. demography: This book is about demography. a. the study of patterns of land use b. the study of the use of pictures to show facts about numbers c. the study of the movement of water d. the study of population 8. gimmick: That’s a good gimmick. a. thing for standing on to work high above the ground Eighth 1,000 1. erratic: He was erratic. a. without fault b. very bad c. very polite d. unsteady 2. palette: He lost his palette. a. basket for carrying fish b. wish to eat food c. young female companion d. artist’s board for mixing paints 3. null: His influence was null. a. had good results b. was unhelpful
b. small thing with pockets for holding money c. attention-getting action or thing d. clever plan or trick 9. azalea: This azalea is very pretty. a. small tree with many flowers growing in groups b. light material made from natural threads c. long piece of material worn by women in India d. sea shell shaped like a fan 10. yoghurt: This yoghurt is disgusting. a. dark grey mud found at the bottom of rivers b. unhealthy, open sore c. thick, soured milk, often with sugar and flavouring d. large purple fruit with soft flesh
c. had no effect d. was long-lasting 4. kindergarten: This is a good kindergarten. a. activity that allows you to forget your worries b. place of learning for children too young for school c. strong, deep bag carried on the back d. place where you may borrow books
Copyright material from www.palgraveconnect.com - licensed to University of South Florida - PalgraveConnect - 2011-05-25
c. folding beds d. bottoms of the shoes
5. eclipse: There was an eclipse. a. a strong wind b. a loud noise of something hitting the water c. the killing of a large number of people d. the sun hidden by a planet
6. marrow: This is the marrow. a. symbol that brings good luck to a team b. soft centre of a bone c. control for guiding a plane d. increase in salary
7. locust: There were hundreds of locusts. a. insects with wings b. unpaid helpers c. people who do not eat meat d. brightly coloured wild flowers
8. authentic: It is authentic. a. real b. very noisy c. old d. like a desert
9. cabaret: We saw the cabaret. a. painting covering a whole wall b. song and dance performance c. small crawling insect d. person who is half fish, half woman
10. mumble: He started to mumble. a. think deeply b. shake uncontrollably c. stay further behind the others d. speak in an unclear way
Ninth 1,000
1. hallmark: Does it have a hallmark? a. stamp to show when it should be used by b. stamp to show the quality c. mark to show it is approved by the royal family d. mark or stain to prevent copying
2. puritan: He is a puritan. a. person who likes attention b. person with strict morals c. person with a moving home d. person who keeps money and hates spending it
3. monologue: Now he has a monologue. a. single piece of glass to hold over his eye to help him to see better b. long turn at talking without being interrupted c. position with all the power d. picture made by joining letters together in interesting ways
4. weir: We looked at the weir. a. person who behaves strangely b. wet and muddy place with water plants c. old metal musical instrument played by blowing d. thing built across a river to control the water
5. whim: He had lots of whims. a. old gold coins b. female horses c. strange ideas with no motive d. sore red lumps
6. perturb: I was perturbed. a. made to agree b. worried c. very puzzled d. very wet
7. regent: They chose a regent. a. an irresponsible person b. a person to run a meeting for a short time c. a ruler acting in place of the king or queen d. a person to represent them
8. octopus: They saw an octopus. a. a large bird that hunts at night b. a ship that can go under water c. a machine that flies by means of turning blades d. a sea creature with eight legs
9. fen: The story is set in the fens. a. a piece of low flat land partly covered by water b. a piece of high, hilly land with few trees c. a block of poor-quality houses in a city d. a time long ago
10. lintel: He painted the lintel. a. beam across the top of a door or window b. small boat used for getting to land from a big boat c. beautiful tree with spreading branches and green fruit d. board which shows the scene in a theatre
Tenth 1,000
1. awe: They looked at the mountain with awe. a. worry b. interest c. wonder d. respect
2. peasantry: He did a lot for the peasantry. a. local people b. place of worship c. businessmen’s club d. poor farmers
3. egalitarian: This organization is very egalitarian. a. does not provide much information about itself to the public b. dislikes change c. frequently asks a court of law for a judgement d. treats everyone who works for it as if they are equal
4. mystique: He has lost his mystique. a. his healthy body b. the secret way he makes other people think he has special power or skill c. the woman who has been his lover while he is married to someone else d. the hair on his top lip
5. upbeat: I’m feeling really upbeat about it. a. upset b. good c. hurt d. confused
6. cranny: We found it in the cranny! a. sale of unwanted objects b. narrow opening c. space for storing things under the roof of a house d. large wooden box
7. pigtail: Does she have a pigtail? a. a long rope of hair made by twisting bits together b. a lot of cloth hanging behind a dress c. a plant with pale pink flowers that hang down in short bunches d. a lover
8. crowbar: He used a crowbar. a. heavy iron pole with a curved end b. false name c. sharp tool for making holes in leather d. light metal walking stick
9. ruck: He got hurt in the ruck. a. hollow between the stomach and the top of the leg b. noisy street fight c. group of players gathered round the ball in some ball games d. race across a field of snow
10. lectern: He stood at the lectern. a. desk made to hold a book at a good height for reading b. table or block used for church sacrifices c. place where you buy drinks d. very edge
Eleventh 1,000
1. excrete: This was excreted recently. a. pushed or sent out b. made clear c. discovered by a science experiment d. put on a list of illegal things
2. mussel: They bought mussels. a. small glass balls for playing a game b. shellfish c. large purple fruits d. pieces of soft paper to keep the clothes clean when eating
3. yoga: She has started yoga. a. handwork done by knotting thread b. a form of exercise for the body and mind c. a game where a cork stuck with feathers is hit between two players d. a type of dance from eastern countries
4. counterclaim: They made a counterclaim. a. a demand made by one side in a law case to match the other side’s demand b. a request for a shop to take back things with faults c. an agreement between two companies to exchange work d. a top cover for a bed
5. puma: They saw a puma. a. small house made of mud bricks b. tree from hot, dry countries c. very strong wind that sucks up anything in its path d. large wild cat
6. pallor: His pallor caused them concern. a. his unusually high temperature b. his lack of interest in anything c. his group of friends d. the paleness of his skin
7. aperitif: She had an aperitif. a. a long chair for lying on with just one place to rest an arm b. a private singing teacher c. a large hat with tall feathers d. a drink taken before a meal
8. hutch: Please clean the hutch. a. thing with metal bars to keep dirt out of water pipes b. space in the back of a car used for bags etc c. round metal thing in the middle of a bicycle wheel d. cage for small animals
9. emir: We saw the emir. a. bird with two long curved tail feathers b. woman who cares for other people’s children in Eastern countries c. Middle Eastern chief with power in his own land d. house made from blocks of ice
10. hessian: She bought some hessian. a. oily pinkish fish b. stuff that produces a happy state of mind c. coarse cloth d. strong-tasting root for flavouring food
Twelfth 1,000
1. haze: We looked through the haze. a. small round window in a ship b. unclear air c. cover for a window made of strips of wood or plastic d. list of names
2. spleen: His spleen was damaged. a. knee bone b. organ found near the stomach c. pipe taking waste water from a house d. respect for himself
3. soliloquy: That was an excellent soliloquy! a. song for six people b. short clever saying with a deep meaning c. entertainment using lights and music d. speech in the theatre by a character who is alone
4. reptile: She looked at the reptile. a. old hand-written book b. animal with cold blood and a hard outside c. person who sells things by knocking on doors d. picture made by sticking many small pieces of different colours together
5. alum: This contains alum. a. a poisonous substance from a common plant b. a soft material made of artificial threads c. a tobacco powder once put in the nose d. a chemical compound usually involving aluminium
6. refectory: We met in the refectory. a. room for eating b. office where legal papers can be signed c. room for several people to sleep in d. room with glass walls for growing plants
7. caffeine: This contains a lot of caffeine. a. a substance that makes you sleepy b. threads from very tough leaves c. ideas that are not correct d. a substance that makes you excited
8. impale: He nearly got impaled. a. charged with a serious offence b. put in prison c. stuck through with a sharp instrument d. involved in a dispute
9. coven: She is the leader of a coven. a. a small singing group b. a business that is owned by the workers c. a secret society d. a group of church women who follow a strict religious life
10. trill: He practised the trill. a. ornament in a piece of music b. type of stringed instrument c. way of throwing a ball d. dance step of turning round very fast on the toes
Thirteenth 1,000
1. ubiquitous: Many weeds are ubiquitous. a. are difficult to get rid of b. have long, strong roots c. are found in most countries d. die away in the winter
2. talon: Just look at those talons! a. high points of mountains b. sharp hooks on the feet of a hunting bird c. heavy metal coats to protect against weapons d. people who make fools of themselves without realizing it
3. rouble: He had a lot of roubles. a. very precious red stones b. distant members of his family c. Russian money d. moral or other difficulties in the mind
4. jovial: He was very jovial. a. low on the social scale b. likely to criticize others c. full of fun d. friendly
5. communiqué: I saw their communiqué. a. critical report about an organization b. garden owned by many members of a community c. printed material used for advertising d. official announcement
6. plankton: We saw a lot of plankton. a. poisonous weeds that spread very quickly b. very small plants or animals found in water c. trees producing hard wood d. grey clay that often causes land to slip
7. skylark: We watched a skylark. a. show with aeroplanes flying in patterns b. man-made object going round the earth c. person who does funny tricks d. small bird that flies high as it sings
8. beagle: He owns two beagles. a. fast cars with roofs that fold down b. large guns that can shoot many people quickly c. small dogs with long ears d. houses built at holiday places
9. atoll: The atoll was beautiful. a. low island made of coral round a sea-water lake b. work of art created by weaving pictures from fine thread c. small crown with many precious jewels worn in the evening by women d. place where a river flows through a narrow place full of large rocks
10. didactic: The story is very didactic. a. tries hard to teach something b. is very difficult to believe c. deals with exciting actions d. is written in a way which makes the reader unsure of the meaning
Fourteenth 1,000
1. canonical: These are canonical examples. a. examples which break the usual rules b. examples taken from a religious book c. regular and widely accepted examples d. examples discovered very recently
2. atop: He was atop the hill. a. at the bottom of b. at the top of c. on this side of d. on the far side of
3. marsupial: It is a marsupial. a. an animal with hard feet b. a plant that grows for several years c. a plant with flowers that turn to face the sun d. an animal with a pocket for babies
4. augur: It augured well. a. promised good things for the future b. agreed well with what was expected c. had a colour that looked good with something else d. rang with a clear, beautiful sound
5. bawdy: It was very bawdy. a. unpredictable b. enjoyable c. rushed d. rude
6. gauche: He was gauche. a. talkative b. flexible c. awkward d. determined
7. thesaurus: She used a thesaurus. a. a kind of dictionary b. a chemical compound c. a special way of speaking d. an injection just under the skin
8. erythrocyte: It is an erythrocyte. a. a medicine to reduce pain b. a red part of the blood c. a reddish white metal d. a member of the whale family
9. cordillera: They were stopped by the cordillera. a. a special law b. an armed ship c. a line of mountains d. the eldest son of the king
10. limpid: He looked into her limpid eyes. a. clear b. tearful c. deep brown d. beautiful
7.1.3 Meara’s _lognostics measurement instruments
There are a number of measurement instruments on Paul Meara’s _lognostics website. For details, see the documentation on the lognostics site, discussion of Meara’s website in Section 6.5, and related sections in this book.
There are two vocabulary size tests in the Lex family:
● X_Lex A 5K vocabulary size test
● Y_Lex A 5–10K vocabulary size test
There are three tests which give a measure of productive vocabulary:
● P_Lex A program for evaluating the vocabulary used in short texts
● D_Tools A program that calculates the mean segmental TTR statistic vocd for short texts
● V_Size A program that estimates the productive vocabulary underlying short texts
The site includes two depth-of-knowledge tests:
● V_Quint An alternative assessment of vocabulary depth
● Lex_30 Online word association test
There is also a suite of language aptitude tests:
● Llama language aptitude tests
The site indicates that a number of other measurement instruments will also be added in the future, including a test of short-term memory.
7.2 Corpora
As we have seen throughout this book, corpora have transformed the way we think about and research vocabulary. It is hard to imagine any area of vocabulary research into acquisition, processing, pedagogy, or assessment where the insights available from corpus analysis would not be valuable. In fact, it is probably not too extreme to say that most sound vocabulary research will have some corpus element.
With this in mind, how are researchers to obtain appropriate corpora? In many cases, the research purpose will require the compilation of one or more corpora which are custom-designed to achieve that goal. For example, if a researcher wished to determine the effect of a syllabus change on student writing within a particular school, then it would be necessary to obtain learner writing samples from both before and after the syllabus change. Other corpora built of student writing would probably not be able to indicate how the particular students within that particular school environment would react to the syllabus change.
For other purposes, the use of pre-existing corpora makes good sense for a number of reasons. First and foremost, corpus compilation can be a time-consuming and expensive affair, and only organizations with substantial financial backing will be able to expend the hundreds of thousands of dollars and years of effort to put together larger corpora. For example,
the British National Corpus (100 million words) was put together with the resources of a consortium of several major publishers and universities (Longman Group Ltd (now Addison-Wesley Longman), Oxford University Press, Chambers Harrap, Oxford University Computing Services, the Unit for Computer Research on the English Language (Lancaster University), and the British Library Research and Development Department). Spoken corpora in particular are extremely time- and money-intensive to compile, even though they are typically much smaller: the CANCODE took eight years to collect, transcribe, and code 5 million words.
Besides the financial and time issues, corpus compilation takes a considerable amount of expertise. Corpus linguists working on major corpora are often full-time specialists, who have years of experience, time to keep up with the latest developments in corpus linguistics, and access to the latest technology. Although it is perfectly possible for teachers and novice researchers to put together and use basic corpora, it would take considerable time and effort to build the expertise which a larger corpus project requires.
The last reason is simple common sense: why reinvent the wheel? If an existing corpus is appropriate for the research aims, it seems silly not to use it.
This section describes a range of corpora which are available to different degrees (some free, some for a fee, some with open access, some with restricted access). I will give extended coverage to the most accessible and what I consider the most useful corpora, and give briefer annotations of the rest. Obviously, whether a corpus is useful or not will depend on particular research purposes, but hopefully the descriptions here will guide you as to whether any of the corpora might be worth exploring for the vocabulary research you have in mind.
There are more corpora than can be summarized in the space available, but the interested reader can find more details from several sources, including:
● David Lee’s comprehensive corpus website
● Richard Xiao’s corpus websites. The original website has also been turned into a written survey chapter for the book Corpus Linguistics (Lüdeling and Kyto, 2008). A companion website to the book contains the updated chapter.
● The corpus survey at the back of From Corpus to Classroom (O’Keeffe, McCarthy, and Carter, 2007)
Some of the following descriptions draw heavily on those sources, and others are mainly condensations of the information available on the respective corpus websites at the time of writing (July 2008). Note that many corpora cross over my categories (e.g. the MICASE is both spoken and academic) and that all quotations are drawn from the respective corpus websites.
Concept 7.1 Changeability of the internet
The internet provides a valuable source of information and analysis tools for vocabulary research. However, by its nature, it is constantly being revised and updated. This means that some of the addresses will inevitably change or be withdrawn in the future. All of the web addresses were correct and accessible as we went to press, and I ask the reader’s indulgence when some of the addresses eventually do not work as stated in this book.
7.2.1 Corpora representing general English (mainly written)
British National Corpus (BNC) [90 million written and 10 million spoken British English]
The BNC has been the ‘gold standard’ corpus of general English since its launch in the early 1990s, and a great deal of vocabulary research has utilized it. This is due to its large size (100 million words), and the fact that it is a balanced corpus, i.e. it was compiled according to predetermined percentages of a wide range of different types of English. This works to avoid bias, and gives some assurance that the corpus represents English reasonably well overall. The corpus also contains a substantial spoken component.
‘The BNC is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English from the later part of the 20th century, both spoken and written.’ The corpus includes ‘many different styles and varieties, and is not limited to any particular subject field, genre or register.’ Its written component makes up about 90% of the corpus, and the orthographically transcribed spoken component about 10%. Although it would have been desirable to have a 50/50 split, the cost prohibited this. Regardless, the 10 million word spoken component is still a considerable achievement.
The written part is made up of many kinds of text, which were selected according to three criteria: domain, medium, and time. Domain indicates the kind of writing. About 75% of the written texts are informative writings, of which roughly equal quantities were chosen from ‘the fields of applied sciences, arts, belief & thought, commerce & finance, leisure, natural & pure science, social science, [and] world affairs.’ About 25% are imaginative, that is, literary and creative works. Medium refers to the kind of publication in which the text occurs. About 60% of written texts come from books, 25% from periodicals (newspapers etc.), 5–10% from other kinds of miscellaneous published material (brochures, advertising leaflets, etc.), 5–10% from unpublished written material such as personal letters and diaries, essays and memoranda, etc, and a small amount (less than 5%) from material written to be spoken (for example, political speeches, play texts, broadcast scripts, etc.). In terms of time, the imaginative texts date from 1960, and the informative texts from 1975, and they end in 1993. Overall, the corpus includes samples of 45,000 words taken from various parts of single-author texts. In addition, shorter texts (up to a maximum of 45,000 words) and multi-author material like magazines and newspapers are included in full.
The spoken part contains two different elements: ‘a demographic part, containing transcriptions of spontaneous natural conversations made by members of the public and a context-governed part, containing transcriptions of recordings made at specific types of meeting and event.’ The demographic part was gathered by 124 volunteers with a balanced range of gender, age, social grouping, and location (38 across the UK). They used personal stereos to unobtrusively record all their conversations over two or three days, and after that logged the details of each conversation in a special notebook. The context-governed part contains roughly equal quantities of speech recorded in four broad categories of social context:
● Educational and informative events, such as lectures, news broadcasts, classroom discussion, tutorials.
● Business events such as sales demonstrations, trades union meetings, consultations, interviews.
● Institutional and public events, such as sermons, political speeches, council meetings, parliamentary proceedings.
● Leisure events, such as sports commentaries, after-dinner speeches, club meetings, radio phone-ins.
Overall, the BNC contains 4,054 texts which total 100,467,090 orthographic words. It takes about 1.5 Gb to store on a computer, which taxed university servers when it first came out, but which is not a problem for modern personal computers. Work began on the corpus in 1991, and it was completed in 1994. A slightly revised second edition, called BNC World Edition, was released in 2001, and the latest version of the full corpus came out in 2007 (BNC XML Edition). Two smaller sub-corpora drawn from the full BNC have also been made available, the BNC Sampler and the BNC Baby. All of these can be purchased on the BNC website.
BNC XML edition
The XML edition is annotated with word-class information (part-of-speech) and metatextual information (e.g. author, source). The XML edition is a revised version of BNC World (which has been superseded), with the main differences being that it has been transferred to an XML format, and that it has an improved concordance program, XAIRA, which offers more search options and a better user interface than the previous SARA search program. ‘It is available on DVD for installation on a stand-alone PC or on a Windows, Unix or OSX server. It is delivered with a copy of the XAIRA search program and all necessary XAIRA index files.’
BNC Baby
‘BNC Baby is a subset of BNC World. It consists of four 1-million word samples, each compiled as an example of a particular genre: fiction, newspapers, academic writing and spoken conversation. The texts have the same annotation as the full corpus (part of speech, meta data, etc). The BNC Baby is in XML format and can be searched with the XAIRA Tool. It is distributed on a CD together with the BNC Sampler and an XML version of the American English Brown corpus.’
BNC Sampler
‘The BNC Sampler is a subset of the full BNC. It comprises two samples of written and spoken material of one million words each, compiled to mirror the composition of the full BNC as far as possible. The word-class annotation of the BNC Sampler texts has been carefully checked and manually corrected. The Sampler was first created at Lancaster University during the creation of the BNC. The BNC Sampler is in XML format and can be searched with the XAIRA Tool. It is distributed on the BNC Baby CD together with the BNC Baby and an XML version of the American English Brown corpus.’
The BNC XML costs £75, and the BNC Baby+Sampler costs £21 (July 2008) and can be purchased from the BNC website. Considering the cost involved in compiling the various corpora, this is not expensive.
It is also possible to query the BNC through their website using the ‘BNC Simple Search’. It will give the frequency of a word or phrase in the full corpus and up to 50 randomly-selected sentences which contain the target lexical item. The use of wildcards is possible (e.g. bread_butter will bring up bread and butter, bread with butter, bread or butter, etc.). However, the target words/phrases are not lined up in the centre of the screen for easy comparison as is standard in concordancing software. Also, there is no facility for sorting or for a collocation search. Still, the site is free and is good for providing a number of authentic contexts for target lexemes.
Brigham Young University and Mark Davies host a more functional web site based on the BNC called the BYU-BNC: The British National Corpus (Davies, 2004). It shows the frequencies of the queried word or phrase in the spoken, fiction, newspaper, academic, and miscellaneous components of the corpus in a graphic or list format. With a single mouse-click, all of the instances of the word/phrase appear, with the node highlighted in a bold and underlined font. There are also wildcard options, including searches for all words from a single word class. The site allows the comparison of vocabulary in different registers, e.g. nouns near the word chair in academic versus fiction texts. It also allows semantically-oriented searches, which is good for comparing synonyms and other semantically-related words, such as comparing the most frequent nouns that appear with the adjectives small and little. For more information on the site’s capabilities, see the Corpus of Contemporary American English below, as it shares the same interface.
There is also some important independent BNC support material on the web. One key destination is David Lee’s corpus resources site (http://clix.to/davidlee00), which contains his BNC Index. Adam Kilgarriff has a website which provides BNC frequency lists. The Phrases in English (PIE) interface allows the free online phraseological interrogation of the BNC (World version) in strings up to eight words long.
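To make concrete what it means for concordancing software to line up the target word in the centre of the screen, the following is a minimal Python sketch of a key-word-in-context (KWIC) display. It is an illustration only and is not the code behind XAIRA, the BNC Simple Search, or the BYU-BNC interface; the function name kwic, the width parameter, and the sample sentences are all invented for the example.

```python
import re


def kwic(text, node, width=30):
    """Return concordance lines with the node word aligned in one column.

    `width` is the number of characters of context kept on each side
    (an arbitrary choice for this sketch).
    """
    lines = []
    # Match the node as a whole word, ignoring case.
    pattern = r"\b" + re.escape(node) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every node starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample text, not taken from any corpus.
sample = (
    "She sat in the chair by the window. The chair of the committee "
    "resigned. He pulled a chair up to the table."
)

for line in kwic(sample, "chair"):
    print(line)
```

Because the node always starts in the same column, the words immediately to its left and right can be scanned vertically, which is what makes patterns of use easy to compare.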
Quote 6.1 BNC website on the size of the BNC
To put these numbers [100 million words] into perspective, the average paperback book has about 250 pages per centimetre of thickness; assuming 400 words a page, we calculate that the whole corpus printed in small type on thin paper would take up about ten metres of shelf space. Reading the whole corpus aloud at a fairly rapid 150 words a minute, eight hours a day, 365 days a year, would take just over four years.
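The arithmetic behind this quote is easy to check. The short Python sketch below uses only the figures given in the quote (the variable names are illustrative). With a round 100 million words it reproduces the ten metres of shelf space, while the reading time comes out at roughly 3.8 years, a little under the 'just over four years' quoted, so the website's figure presumably rests on slightly different rounding or assumptions.

```python
WORDS = 100_000_000       # approximate size of the BNC in words
WORDS_PER_PAGE = 400
PAGES_PER_CM = 250
WORDS_PER_MINUTE = 150
HOURS_PER_DAY = 8
DAYS_PER_YEAR = 365

# Shelf space: words -> pages -> centimetres -> metres.
pages = WORDS / WORDS_PER_PAGE
shelf_metres = pages / PAGES_PER_CM / 100

# Reading time: words -> minutes -> hours -> reading days -> years.
minutes = WORDS / WORDS_PER_MINUTE
years = minutes / 60 / HOURS_PER_DAY / DAYS_PER_YEAR

print(f"{shelf_metres:.1f} metres of shelf space")   # 10.0
print(f"{years:.2f} years of reading aloud")         # 3.81
```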
Corpus of Contemporary American English (COCA) [309 million written and 79 million spoken American English]
The COCA developed by Mark Davies is a very exciting new corpus resource for a number of reasons. First, it represents the American variety of English. This is essential as a counterpart to the other main variety (British English), which is covered by the BNC. (Apologies to other smaller, but still important, varieties like Australian and South African English!) Second, it is large. The COCA contains more than 385 million words in over 150,000 texts, including 20 million words each year from 1990 to 2008 (as of December 15, 2008). This is nearly four times the size of the BNC, and vastly larger than any other available American English corpus. For example, the COCA compares very favorably with the American National Corpus, the other major American English corpus project, at least at the ANC’s current state of development. (A comparison between the COCA and ANC is available online.) Third, the size has not been achieved at the expense of balance, with the texts being equally divided among five genres/registers. The website gives the following description:
● Spoken: (79 million words) Transcripts of unscripted conversation from more than 150 different TV and radio programs (examples: All Things Considered (NPR), Newshour (PBS), Good Morning America (ABC), Today Show (NBC), 60 Minutes (CBS), Hannity and Colmes (Fox), Jerry Springer, etc.). [The website has a discussion of the naturalness and authenticity of this ‘unscripted’ language.]
● Fiction: (76 million words) Short stories and plays from literary magazines, children’s magazines, popular magazines, first chapters of first edition books 1990–present, and movie scripts.
● Popular Magazines: (81 million words) Nearly 100 different magazines, with a good mix (overall, and by year) between specific domains (news, health, home and gardening, women, financial, religion, sports, etc.). A few examples are Time, Men’s Health, Good Housekeeping, Cosmopolitan, Fortune, Christian Century, Sports Illustrated.
● Newspapers: (76 million words) Ten newspapers from across the US, including: USA Today, New York Times, Atlanta Journal Constitution, San Francisco Chronicle, etc. In most cases, there is a good mix between different sections of the newspaper, such as local news, opinion, sports, financial, etc.
● Academic Journals: (76 million words) Nearly 100 different peer-reviewed journals. These were selected to cover the entire range of the Library of Congress classification system (e.g. a certain percentage from B (philosophy, psychology, religion), D (world history), K (education), T (technology), etc.), both overall and by number of words per year.
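As a quick check on the balance claimed for the corpus, the sketch below adds up the word counts quoted in the list above (in millions) and expresses each genre as a share of the total. The dictionary name and layout are illustrative; the figures are simply those given in the website description at the time of writing and may since have changed.

```python
# Word counts in millions, as quoted in the COCA website description above.
coca_millions = {
    "Spoken": 79,
    "Fiction": 76,
    "Popular Magazines": 81,
    "Newspapers": 76,
    "Academic Journals": 76,
}

total = sum(coca_millions.values())
written = total - coca_millions["Spoken"]

# Each genre as a percentage of the whole corpus, to one decimal place.
shares = {genre: round(100 * n / total, 1) for genre, n in coca_millions.items()}

print(f"Total: {total} million words ({written} million written)")
for genre, share in shares.items():
    print(f"{genre}: {share}%")
```

The four written genres sum to 309 million words and the whole corpus to 388 million, which matches the headline figures, and each genre accounts for roughly a fifth of the total (between 19.6% and 20.9%).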
Fourth, as opposed to most corpora, the COCA will not be static. The plan is to update it at least twice each year from this point on (maintaining the balance proportions of the registers already in place). This promises to keep the COCA current, instead of being a ‘snapshot’ of English at a single point in time. This will also make it a useful resource for researching linguistic change in American English. Fifth, as seen above, it contains, like the BNC, a substantial element of unscripted spoken English. Sixth (and everyone’s favorite), the corpus is free to access online.
Another advantage of the corpus is its very powerful search interface (the same one used across the rest of the BYU corpus suite). It allows searches for exact words or phrases (linguistics, linguistics professor), by wildcards or part of speech, or by combinations of these. You can look for lemmas (all forms of a word, like swim, swam, swum), wildcards (un*ly or r?n*), and more complex searches such as re-X-ing, *term* (in terms of; to terms with). From the ‘frequency results’ window, a simple click on the word or phrase brings up a list of the target word or string in context in a lower window. The searches can be limited by any combination of genre/register that you define (spoken, academic, poetry, medical, etc.). It is possible to compare between registers, e.g., verbs that are more common in academic or fiction texts. The program also allows searches for surrounding words
(collocates) within a ten-word window (e.g. all adjectives within that span for girl). The interface also usefully allows comparison ‘between synonyms and other semantically-related words. One simple search, for example, compares the most frequent nouns that appear with sheer, complete, or utter (sheer nonsense, complete account, utter dismay). The interface also allows you to input information from WordNet (a semantically-organized lexicon of English) directly into the search form. This allows you to find the frequency and distribution of words with similar, more general, or more specific meanings.’ Interestingly for researchers focusing on linguistic change, one can compare language from different years from 1990 to the present time. The interface allows the creation and storage of personalized lists, which can be drawn upon as part of subsequent analyses. Overall, the COCA represents a useful resource, and one that deserves to be consulted in future vocabulary research. It is likely to be the best source for information on American English for some time to come, and even trumps the BNC as a resource for general English on some points, particularly its size and currency. However, it too has limitations. The spoken component of the BNC is probably a better representation of spontaneous speech in informal situations, especially the demographic component. Also, the limited span of the concordance lines may limit some kinds of analysis which require more context to interpret, e.g. when studying some grammatical structures, it may be necessary to look at several contiguous sentences. Finally, the online nature of the corpus limits the analyses to those that the interface supports. Some types of analysis that require a corpus to be downloaded onto one’s own computer to use other software (see the ‘Tools’ section below) are simply not possible. 
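The wildcard and collocate searches described above can be approximated off-line on any tokenized text. The following is a minimal sketch, not the COCA interface itself: the function names `wildcard_to_regex` and `collocates` are hypothetical, the toy sentence is invented for illustration, and the window is assumed to extend `span` tokens to each side of the node word.

```python
import re
from collections import Counter


def wildcard_to_regex(pattern):
    """Translate a wildcard query (* = any run of characters, ? = one character)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


def collocates(tokens, node, span=10):
    """Count the words found within `span` tokens either side of each hit for `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Invented toy text, for illustration only.
tokens = "the girl ran unhappily to the young girl who runs".split()

matcher = wildcard_to_regex("r?n*")
hits = [t for t in tokens if matcher.match(t)]   # words matching r?n*
near_girl = collocates(tokens, "girl", span=2)   # words within 2 tokens of 'girl'
```

A real study would add part-of-speech filtering (e.g. adjectives only) and a much larger text, but the windowing logic stays the same.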
The TIME Magazine Corpus [100 million words written American English] The TIME Corpus is another part of the BYU stable of corpora developed by Mark Davies, and uses the same interface. It includes ‘more than 100 million words of text of American English from 1923 to the present, as found in TIME magazine.’ The corpus is taken from 275,000+ texts from the TIME Archive, which is freely available on-line. As with all of the other BYU corpora, clicking on the search word/phrase brings up KWIC (Key Word In Context) lines. The full original texts are available on the corpus site through a hyperlink to the TIME Archive site (for copyright reasons), but in practice this is seamless, as the hyperlink is easily accessed. There is also the advantage of seeing the texts with the additional magazine features available on the archive site (e.g. related articles, quotes of the day), although with the disadvantage of the target word/phrase not being highlighted, which may require a bit of scanning to find. The website suggests the following ideas as examples of what could be usefully researched with the corpus:
● The overall frequency over time of words and phrases that were related to changes in society and culture, or historical events, such as flapper (flapper, flappers, flapperdom, etc.), cinemaddict, fascist, rocket, reds, hippy/hippies, impeach, new age, politically correct, e(-)mail, and global warming.
● Changes in the language itself, such as the rise and fall of words and phrases like far-out, famed, wangle, funky, beauteous, nifty, or freak out. You can also search for changes with grammatical constructions like end up V-ing, going to V, phrasal verbs with up (e.g. make up, show up), the use of whom, and the use of preposition stranding (e.g. someone to talk with).
● Parts of words (which show how word roots, prefixes, and suffixes are being used over time in other words), such as the roots -heart-, -home-, counter, and the suffixes -aholic (e.g. chocoholic), and -gate (e.g. Monicagate).
● You can also have the corpus generate a list of words that were used more in one period than another, even when you don’t know what the specified words might be. For example, you can find nouns whose usage increased a lot in the 1960s, verbs that drop off in usage after the 1930s, or adjectives that have been used much more since 2000 than previously.
● The corpus can also help to show how the meanings of words have changed over time, by looking at changes in collocates (co-occurring words). For example, the collocates of chip, engine, or web have changed recently, due to changes in technology. Notice also how this can signal cultural changes over time, such as adjectives used with wife in the 1920s–1930s (which might now be politically incorrect), or adjectives with families (earlier versus later).
American National Corpus (ANC) [22 million total, 4 million spoken American English] The ANC was envisioned as being the counterpart to the BNC, with a similar size (100 million words) and comparable across genres. Unfortunately, the project seems to have stalled, and there are currently only 22 million words available in its second edition, released in 2005. These have been annotated for lemma, part of speech, noun chunks, and verb chunks. As the corpus is incomplete, the data currently available are not yet balanced. The ANC 2nd edition can be purchased from the Linguistic Data Consortium catalog for $75, and comes on two DVDs. There is also a freely available 14 million word ‘open’ version of the ANC (non-copyright
restricted) available for download at which contains 11.4 million of the written part and the Charlotte Narratives and Switchboard spoken components. The frequency counts for the corpus are available at . (Note that the Switchboard corpus can be accessed independently from LDC Online, .) Any corpus can be useful, as the unique composition of each corpus gives somewhat different perspectives of the language contained within. The ANC provides 14 million words for free (open version), or 22 million for a reasonable fee (second edition), which is not inconsiderable, especially if the current corpus content is relevant for your topic of research. However, it must be said that COCA looks to be a better resource for researching general American English at the moment, especially considering the balancing issues. The ANC does contain a potentially interesting spoken component though, especially if one is interested in informal spontaneous speech, or telephone conversations. The second edition spoken component contains 24 unscripted telephone conversations between native speakers of American English, covering a contiguous 10-minute segment of each call, comprising 50,494 words (the CallHome component). The Switchboard component ‘consists of 2320 spontaneous [telephone] conversations averaging 6 minutes in length and comprising about 3 million words of text, spoken by over 500 speakers of both sexes from every major dialect of American English.’ These speakers come from a variety of American dialects, age groups, and education levels. There is also a set of narratives from residents of North Carolina (the Charlotte Narratives component). Interestingly for EAP researchers, the ANC also contains 50 transcripts from the MICASE corpus (see MICASE section below). However, the open version contains only the Charlotte Narratives and Switchboard components. If completed, the ANC could be an extremely useful resource for researching American English. 
However, there seems to have been little progress since the release of the second edition in 2005, and with the free availability of the much larger COCA, there may be little impetus to continue the project. This only goes to show the difficulties involved in creating large language corpora, even if one has the backing of a sizable consortium of partners.
The Brown University Standard Corpus of Present-Day Edited American English (Brown corpus) [1 million written American English] The Brown corpus was the first major corpus designed with the intent of computerized analysis. It was compiled by Henry Kučera and Nelson Francis at Brown University in Providence, Rhode Island. It contains about 1 million words of written American English, taken from 500 texts of approximately 2,000 words each, all published in 1961. These texts were distributed across 15 categories in rough proportion to the amount published in each of those genres:
PRESS: Reportage (44 texts)
PRESS: Editorial (27 texts)
PRESS: Reviews (17 texts)
RELIGION (17 texts)
SKILL AND HOBBIES (36 texts)
POPULAR LORE (48 texts)
BELLES-LETTRES: Biography, Memoirs, etc. (75 texts)
MISCELLANEOUS: US Government & House Organs (30 texts)
LEARNED (80 texts)
FICTION: General (29 texts)
FICTION: Mystery and Detective Fiction (24 texts)
FICTION: Science (6 texts)
FICTION: Adventure and Western (29 texts)
FICTION: Romance and Love Story (29 texts)
HUMOR (9 texts)
The corpus was eventually tagged with about 80 parts of speech, and a few other markers (e.g. compound forms, contractions, foreign words), and formed the model for a series of later corpora which mirrored its size and text selection criteria (the ‘Brown family’ corpora). See Kučera and Francis (1967), and the Brown Corpus Manual (available on the ICAME website) for more information. The corpus itself is available as part of the ICAME CD-ROM Corpus Collection, or as part of the BNC Baby/Sampler CD-ROM.
The Brown corpus was a good first try at computerized corpora, and it has been used in numerous research studies, including some recent ones. However, it (like all of the other ‘Brown family’ corpora) suffers from a number of limitations in comparison to today’s larger, more modern, corpora. First, it is small. This does not rule it out for researching highly frequent linguistic phenomena, particularly grammatical and morphological features, as this kind of feature occurs sufficiently often in 1 million words for much patterning to become evident. It might also be sufficient for exploring some aspects (e.g. most frequent meaning senses) of the most frequent vocabulary items. However, it is probably not large enough to give good information about the more contextualized aspects of this high-frequency vocabulary (e.g. collocation, register constraints). For example, even for relatively frequent
words, it takes many occurrences to establish patterns with lower frequency collocates. A case of this is plate. Plate is highly frequent, but unless a large corpus is consulted, it may not become obvious that tectonic is a relatively infrequent, yet strongly related collocate (as indicated by MI). It also takes a large corpus to show up some of the less common meaning senses, even of frequent words. Take the case of try. Among its many meaning senses, there is one for a score in the game of rugby. In every 100 instances of try in the British English component of the New Longman Corpus, about four carried this meaning sense. This is based on corpus data collected from Great Britain, where rugby is relatively popular. Unsurprisingly, this meaning sense is missing in the Brown corpus, partly because of its small size, and partly because it is based on English from America, where the game is not nearly as popular. Overall, when it comes to corpora, bigger is usually better, because the more texts, the greater the chance for diversity, which usually supplies more rounded information about the contextual types of word knowledge.
There is also a temporal element. The language in the Brown corpus is now over 45 years old, and it is reasonable to expect that some change in usage has occurred since it was compiled. This may not be such a problem for grammar and morphology, which are relatively stable (although they are changing even as you read this; see Schmitt and Marsden (2006: 72–74) for three examples of this). However, vocabulary is much more prone to change, particularly more vernacular types like slang. Unless one is researching the language of the 1960s, the Brown corpus is likely to give a somewhat outdated version of American English.
Lancaster-Oslo/Bergen Corpus (LOB) [1 million written British English] The LOB corpus was compiled in Europe to be the British English counterpart to the Brown corpus. It was built along the same lines, with 500 text extracts of about 2,000 words each from 1961 being sampled according to the same 15 categories. As such, it has the same limitations as the Brown corpus, but for British English. Perhaps its greatest use is as a comparison corpus to the Brown corpus. For other corpora based on the Brown format, see the ‘Other Corpora in the “Brown” Family’ section below.
HarperCollins COBUILD Bank of English (Bank of English) [524 million written and spoken] The Bank of English is one of the largest (semi-)accessible corpora of (mainly British) English, containing 524 million words as of July 2008, and it continues to be expanded. It was a joint project launched in 1991 by Collins Publishers and the University of Birmingham. It was led by John Sinclair, probably the most influential of the pioneering corpus linguists, and has
been the basis of a great amount of research by Birmingham-based scholars like Sinclair, Susan Hunston, and Rosamund Moon. The Bank of English contains written texts from thousands of different sources, including ‘newspapers, magazines, fiction and non-fiction books, brochures, reports, and websites’. It also has a spoken component, but this is largely scripted speech from television and radio broadcasts. The corpus can be accessed by subscription. Unfortunately, you do not get access to the whole corpus, but rather a much smaller 56 million word subcorpus. Also, the access fee is not cheap. A subscription for one month costs £50, for six months is £300, and one year is £500. This is pricey, especially considering the fact that the complete 100 million word BNC XML Edition costs only £75, and the even larger COCA is free on-line. However, there is a free concordance and collocation query page on the website which could be useful for individual inquiries (). It allows searches for words and phrases, and up to 250 concordance lines. There is also a facility to type in node words or phrases and the software will identify a list of collocates according to t-score or MI. Overall, the Bank of English would appear to be a very good research tool, but on the present terms (a high price for only 56 million words), it seems to be mainly reserved for in-house HarperCollins lexicographers and materials writers, and those affiliated with the University of Birmingham. New Longman Corpus [179 million written and spoken] Many of the corpus examples in this book are drawn from the New Longman Corpus, and you may be interested in knowing more about its composition. It is a composite of the various corpora in the Longman Corpus Network (). It amounts to 179 million words in total, including written and spoken British and American English. It is an in-house Longman resource, and is not available to the public. 
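The two collocation measures mentioned above, MI and t-score, rank collocates differently, which is why a rare but strongly associated collocate such as tectonic (for plate) stands out under MI. The sketch below uses one common textbook formulation of each measure and ignores any window-size adjustment; the function names are hypothetical, and all of the counts are invented purely for illustration (they are not taken from any corpus discussed here).

```python
import math


def expected_count(node_freq, collocate_freq, corpus_size):
    """Co-occurrences expected by chance if the two words were independent."""
    return node_freq * collocate_freq / corpus_size


def mutual_information(observed, node_freq, collocate_freq, corpus_size):
    """MI: log2 of observed over expected co-occurrence."""
    expected = expected_count(node_freq, collocate_freq, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, node_freq, collocate_freq, corpus_size):
    """t-score: (observed - expected) scaled by the square root of observed."""
    expected = expected_count(node_freq, collocate_freq, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts in a 1-million-word corpus (illustration only).
N = 1_000_000
PLATE = 100  # invented frequency of the node word 'plate'

# A rare collocate that nearly always occurs next to the node.
mi_tectonic = mutual_information(observed=8, node_freq=PLATE, collocate_freq=10, corpus_size=N)
t_tectonic = t_score(observed=8, node_freq=PLATE, collocate_freq=10, corpus_size=N)

# A very frequent word that co-occurs often, but mostly by chance.
mi_the = mutual_information(observed=20, node_freq=PLATE, collocate_freq=60_000, corpus_size=N)
t_the = t_score(observed=20, node_freq=PLATE, collocate_freq=60_000, corpus_size=N)
```

With these invented figures, MI ranks the rare collocate far above the frequent one, while the t-score ranks the frequent one slightly higher. This is the usual pattern: MI favours exclusive pairings, and t-score favours frequent ones.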
Other corpora in the ‘Brown family’ There are a number of other corpora which have used the Brown corpus as a model, using the same technique of sampling 500 texts to build a corpus of 1 million words. These corpora representing various permutations of English (different national varieties, different time settings) are summarized below in Table 7.1 adapted from Richard Xiao’s website.
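The shared sampling design (500 texts of roughly 2,000 words each) can be checked with simple arithmetic. The sketch below uses the 15 category counts listed for the Brown corpus earlier in this section; the variable names are illustrative, and the 2,000-word figure is the approximate text length given above, not an exact count.

```python
# Number of texts per category in the Brown corpus, as listed in this section.
BROWN_CATEGORIES = {
    "PRESS: Reportage": 44,
    "PRESS: Editorial": 27,
    "PRESS: Reviews": 17,
    "RELIGION": 17,
    "SKILL AND HOBBIES": 36,
    "POPULAR LORE": 48,
    "BELLES-LETTRES": 75,
    "MISCELLANEOUS": 30,
    "LEARNED": 80,
    "FICTION: General": 29,
    "FICTION: Mystery and Detective Fiction": 24,
    "FICTION: Science": 6,
    "FICTION: Adventure and Western": 29,
    "FICTION: Romance and Love Story": 29,
    "HUMOR": 9,
}

WORDS_PER_TEXT = 2_000  # approximate length of each sampled text

total_texts = sum(BROWN_CATEGORIES.values())
approx_words = total_texts * WORDS_PER_TEXT

# Share of the corpus taken up by the five fiction categories.
fiction_texts = sum(n for name, n in BROWN_CATEGORIES.items() if name.startswith("FICTION"))
fiction_share = fiction_texts / total_texts

print(total_texts, approx_words, f"{fiction_share:.1%}")
```

The same check applies to any corpus that claims to follow the Brown format.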
Table 7.1 Written Corpora of the Brown family
Corpus | Language Variety | Period
Freiburg-Brown Corpus of American English (FROWN) | American English | 1991–1992
Freiburg-LOB Corpus of British English (FLOB) | British English | 1991–1992
Kolhapur Corpus of Indian English | Indian English | 1978
Macquarie Corpus of Written Australian English (ACE) | Australian English | 1986
Wellington Corpus of Written New Zealand English (WWC) | New Zealand English | 1986–1990
SUBTLEXus Database [51 million words] Another approach to frequency is taken by researchers in the Department of Experimental Psychology at the University of Gent. They compiled a frequency measure based on the vocabulary in American movies. They show that their frequency figures are more closely related to reaction times in lexical decision tasks than the frequency information from the Brown corpus. As such, their frequency information may be better to use in setting up psycholinguistic experiments than the frequency information from smaller corpora like the Brown corpus. However, it is unclear whether their frequency information is better than that obtained from larger, balanced corpora like the BNC. The database and its rationale are available at ().
7.2.2 Corpora representing spoken English
London-Lund Corpus [500,000 spoken] The London-Lund corpus was the first electronic corpus of spontaneous language. It contains half a million words of spoken British English, resulting from a combination of two projects: the Survey of English Usage (SEU) and the Survey of Spoken English (SSE). It consists of 100 texts, each of 5,000 words, recorded from 1953 to 1987. It distinguishes between dialogues and monologues in its organization. It is notable for being one of the few spoken corpora which has been annotated prosodically, i.e., with features like stress, tones, and pausing marked in the transcript. It is available on the ICAME CD-ROM (see below).
Michigan Corpus of Academic Spoken English (MICASE) [1.8 million spoken] The MICASE is one of the most influential of the smaller specialized corpora. It is an on-line corpus consisting of 152 transcripts (1,571 speakers) of spoken academic English recorded at the University of Michigan, including lectures, seminars, labs, dissertation defences, interviews, meetings, and tutorials. The corpus can be searched free on-line, and is highly interactive, allowing filtering for gender, age, academic position, native/nonnative speaker status, L1, and speech event type. The text corpus can be ordered for off-line use from $50 for an individual licence from . The corpus sound files can also be ordered at extra cost, although some 70 of them are accessible on-line in a streamed version. In addition, a comparable written academic corpus is now available for online access (Michigan Corpus of Upper-level Student Papers; MICUSP).
British Academic Spoken English Corpus (BASE) [1.6 million spoken] The BASE was designed as a British counterpart to the MICASE, but is not as broad, including only lectures (160) and seminars (39). These were recorded in a variety of university departments, distributed evenly across four broad disciplinary bands. Moreover, most of the recordings are on digital video rather than audio tape. The lecture portion of the BASE can be accessed through the corpus analysis interface Sketch Engine, which can be accessed for a yearly €55 individual license subscription fee through , although this includes access to a number of other corpora as well, including the BNC, and the BASE’s written counterpart, the British Academic Written English Corpus (BAWE). There is also a free 30-day trial subscription with full access to all resources. In addition, the text files can be downloaded from the BASE website.
Corpus of Spoken Professional American English (CSPAE) [2 million spoken] The CSPAE contains 2 million words of spoken American English (1994–1998), including 1 million from White House question and answer sessions, and 1 million from academic discussions such as faculty council meetings and committee meetings related to testing. Two versions are available at : $49 for the raw text version and $79 for the tagged version (both individual user license).
Santa Barbara Corpus of Spoken American English (SBCSAE) [250,000 spoken American English] The SBCSAE is based on hundreds of recordings of spontaneous speech from across the United States, including speakers from different regions, ages, occupations, and ethnic and social backgrounds. It represents spoken language in daily use as varied as gossip and bedtime stories to sales pitches and sermons. The corpus might be particularly useful for research into speech recognition, as each speech file is accompanied by a transcript in which phrases are time-stamped, linking directly to the audio. The SBCSAE can be purchased from the LDC website.
Wellington Corpus of Spoken New Zealand English (WSC) [1 million spoken] The 1 million spoken words in the WSC were collected between 1988 and 1994. The corpus is made up of 2,000-word extracts of formal, semi-formal, and informal speech. The corpus contains an unusually high proportion of private material (75% of the corpus consists of informal dialogue, and 50% of private conversations), which makes the corpus a good candidate for research into informal spoken registers. It is available as part of the ICAME CD-ROM.
Vienna-Oxford International Corpus of English (VOICE) [1 million spoken] The VOICE is a spoken corpus of English as a Lingua Franca. It consists of transcripts of a large number (1,250) of mainly nonnative speakers from approximately 50 different L1s (mainly European) using English to communicate with each other in naturally occurring, non-scripted face-to-face interactions. About 10% of the speakers in the corpus are native English speakers. The 1 million words come from about 120 hours of recorded and transcribed lingua franca interactions. The corpus was released in May 2009, and it is freely available via a user-friendly on-line search interface.
Cambridge and Nottingham Corpus of Discourse in English (CANCODE) [5 million spoken] Although unavailable to non-Nottingham staff and research students, it is worth mentioning the CANCODE, simply because a great deal of research into spoken discourse has been based on it, particularly by Ronald Carter, Michael McCarthy, and Svenja Adolphs, with the most prominent output being the Cambridge Grammar of English (Carter and McCarthy, 2006). It also informs the various student textbooks that Cambridge University Press produces. It is made up of 5 million words of unscripted British English, in contexts of use including casual conversation, workplace, and academic settings across different speaker relationships from intimate to professional.
7.2.3 Corpora representing national varieties of English
In addition to the corpora already discussed which focus on either American or British English, and the Kolhapur (Indian English), ACE (Australian English), and WWC (New Zealand English) members of the ‘Brown family’ corpora, there are a number of other corpora which focus on the different national varieties of English. Perhaps the most interesting set belongs to the International Corpus of English (ICE). The ICE project was begun in 1990, with a goal of collecting material for the comparative study of English worldwide. Each national component of the ICE consists of 1 million words of spoken and written English produced after 1989. They are all compiled following the Brown corpus design (i.e. 500 texts × 2,000 words each) and a common scheme for grammatical annotation. They are available either through download from the ICE website, or on CD-ROMs available from addresses shown on the website.
ICE Great Britain
ICE Hong Kong
ICE East Africa
ICE India
ICE New Zealand
ICE Philippines
ICE Singapore
Hong Kong Corpus of Spoken English (HKCSE) [2 million spoken] (<http://engl.polyu.edu.hk/department/academicstaff/Personal/ChengWinnie/HKCorpus_SpokenEnglish.htm>) The HKCSE has 2 million words representing four spoken genres of naturally occurring speech: academic discourse (e.g. lectures, seminars, workshops), business discourse (service encounters, meetings, job interviews), conversations, and public discourse (speeches, press briefings, discussion forums). The corpus is made up of about 200 hours of speech. All of this has been transcribed orthographically, and 53% is also prosodically transcribed.
Scottish Corpus of Texts and Speech (SCOTS) [3.2 million written and 0.8 million spoken] The SCOTS corpus provides data on the Scottish national variety of English. It is accessed via an on-line interface that allows searches for words or phrases, which can be filtered along a number of parameters. The texts cover the period from 1945 to 2007, with most of the spoken texts dating from 2000. See the website for the on-line search engine and more details.
7.2.4 Corpora representing academic/business English
Various corpora described elsewhere in this section are either wholly or partly made up of academic language (e.g. MICASE, BASE, HKCSE, BNC). In addition to these, the following may be of interest.
Louvain Corpus of Native English Essays (LOCNESS) [324,304 words written] The LOCNESS corpus was compiled by the Centre for English Corpus Linguistics (CECL) at the Université Catholique de Louvain to provide a native baseline comparison to their ICLE learner corpus (see below). Mirroring the ICLE, it contains argumentative essays written by native university or pre-university students. They include 114 British A-level essays (60,209 words), 90 British university essays (95,695), and 232 American university essays (168,400). The corpus can be ordered from the CECL at .
Wolverhampton Business English Corpus [10 million written] The Wolverhampton Business English Corpus contains business English texts drawn from 23 websites between 1999 and 2000. It is available in original web formatting, plain text, and SGML encoded files. In addition, a much larger corpus of written professional English (currently 17 million words, eventually 100+ million?) is currently being developed by the Professional English Research Consortium (PERC).
7.2.5 Corpora representing young native English
The Child Language Data Exchange System (CHILDES) [about 20 million words covering 25 languages] The CHILDES database is a widely-used resource for L1 developmental language. Transcripts of normal English-speaking children make up about half of the total CHILDES database, with other components including language impairments and bilingual acquisition. The data are transcribed in CHAT format and can be analysed using the CLAN programs available on the CHILDES website. CLAN allows lexical, morpho-syntactic, discourse, and phonological analysis. The database, analysis tools, and documentation are freely available on the CHILDES website. CHILDES is the child language part of the wider TALKBANK system, which also includes databanks for aphasia and conversation analysis.
Bergen Corpus of London Teenage Language (COLT) [500,000 words spoken] The COLT corpus was collected in 1993 and consists of the language of 13–17-year-old teenagers from five different boroughs in London. The speakers in the corpus are classified by age (six groups), gender, and social class (three groups), with most of the speech settings either school (48%) or home (32%). A 150,000 word subcorpus has been prosodically annotated. COLT can be ordered as part of the ICAME CD-ROM corpus collection, and holders of the CD-ROM can browse and search the corpus on-line.
7.2.6 Corpora representing learner English
International Corpus of Learner English (ICLE) [3+ million written] The ICLE is the foremost of a number of learner-oriented corpora to come out of Sylviane Granger’s Centre for English Corpus Linguistics (CECL) at the Université Catholique de Louvain. It contains over 3 million words from advanced learners of English from 21 L1s (although some are incomplete): Bulgarian, Brazilian Portuguese, Chinese, Czech, Dutch, Finnish, French, German, Greek, Italian, Japanese, Lithuanian, Norwegian, Pakistani, Polish, Portuguese, Russian, South African (Setswana), Spanish, Swedish, and Turkish. The data consist of argumentative university essays written on a set of similar topics. The original corpus was not tagged, but a tagged version is forthcoming. The ICLE is available on CD-ROM for €181.50 from the I6doc.com website.
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_08_cha07.indd 325
4/13/2010 2:48:00 PM
Louvain International Database of Spoken English Interlanguage (LINDSEI) [100,000 spoken]
LINDSEI is another Louvain corpus, being the spoken counterpart to ICLE. Each L1 subcorpus will contain transcripts of fifty 15-minute interviews with third- and fourth-year university students. There are currently about 100,000 words from French students, and the website reports that other mother tongue components are currently being compiled. The corpus is scheduled for release in the near future.

Japanese EFL Learner Corpus (JEFLL) [700,000 words written]
The JEFLL (directed by Yukio Tono) is a collection of 20-minute in-class free compositions written by more than 10,000 Japanese EFL learners, mainly junior and senior high school students. The essays in each subcorpus are comparable across topics, proficiency, school years, school types, and other factors. The corpus should be available for on-line query by the time this book is published.

There are a number of other learner corpora available which are specific to one particular L1. See general corpus references at the beginning and end of this section for more information.

7.2.7 Corpora representing languages other than English

Non-English corpora can either be parallel corpora or monolingual corpora. Parallel corpora contain versions of the same text in two or more languages, making it possible to compare those languages. Monolingual corpora are corpora of one non-English national language.

7.2.7.1 Parallel corpora
The Canadian Hansard Corpus [various sizes] A good example of a two-language parallel corpus is the Canadian Hansard Corpus. It contains legislative discourse from the country’s parliament published in French and English. One version has 1.3 million pairs of aligned text chunks (i.e. sentences or smaller fragments) from the Hansards (official records) of the 36th Canadian Parliament (1997–2000), making up about 2 million words each of English and French (USC) (). An on-line version (TransSearch: ) includes all the Hansard texts from 1986 to 2006 (about 273 million words). Finally, a CD-ROM version covers the mid-1970s through 1988 ().
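To make the idea of 'aligned text chunks' concrete, the following is a minimal sketch of how a two-language parallel corpus might be represented and queried. The sentence pairs, the `AlignedPair` type, and the `bilingual_concordance` function are all hypothetical illustrations; they are not Hansard data and do not reproduce the TransSearch interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One aligned chunk: a sentence (or fragment) and its translation."""
    english: str
    french: str


# Invented toy data standing in for aligned Hansard-style chunks.
PAIRS = [
    AlignedPair("The committee will meet tomorrow.",
                "Le comité se réunira demain."),
    AlignedPair("The bill was passed by the House.",
                "Le projet de loi a été adopté par la Chambre."),
    AlignedPair("The committee adopted the report.",
                "Le comité a adopté le rapport."),
]


def bilingual_concordance(pairs, query, side="english"):
    """Return every aligned pair whose chosen side contains the query word."""
    q = query.lower()
    hits = []
    for pair in pairs:
        text = getattr(pair, side).lower()
        # Compare whole tokens so that 'bill' does not match 'billion'.
        tokens = [tok.strip(".,;:!?") for tok in text.split()]
        if q in tokens:
            hits.append(pair)
    return hits


if __name__ == "__main__":
    for hit in bilingual_concordance(PAIRS, "committee"):
        print(f"{hit.english}  |  {hit.french}")
```

Because each hit carries both sides of the pair, a search on one language immediately shows how the item was rendered in the other, which is the basic appeal of a parallel corpus for vocabulary research.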
Richard Xiao’s survey (2008) describes a number of other two-language parallel corpora: English-Norwegian, English-Swedish, Slovene-English, Chinese-English, as well as several multiple-language parallel corpora.

European Parliament Proceedings Parallel Corpus (EUROPARL)
An interesting example of a multiple-language parallel corpus is EUROPARL. It covers proceedings from the European Parliament, and includes versions in 11 European languages: Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, and Swedish, with around 40 million words for most languages. There are also ten parallel two-language corpora, with each of the other languages matched with English (e.g. Danish-English, German-English), and each containing about 35 million English words and their parallel language equivalents. Although the language focuses on political issues, the availability of so many parallel languages offers many research opportunities. These corpus resources can be freely downloaded from the website.

7.2.7.2 Monolingual corpora

There is a welcome trend towards the increasing compilation of non-English corpora, and Table 7.2, taken from From Corpus to Classroom: Language Use and Language Teaching (O’Keeffe et al., 2007: 294–296), summarizes a number of these. In addition, Mark Davies and the BYU website offer three Spanish/Portuguese corpora:

The Corpus del Español (CdE) [100 million written Spanish]
The Corpus del Español is the more comprehensive of the two Spanish corpora available online on the BYU website. It contains about 20 million words from the 1900s, 20 million from the 1800s, 40 million from the 1500s–1700s, and 20 million from the 1200s–1400s. It uses the same interface as other BYU corpora, which allows, among other things, searches and comparisons by frequency between the different genre/register categories (spoken, fiction, newspaper, and academic), and between different historical periods (the centuries from the 1200s to the 1900s).

Corpus del Español: Registers [20 million written Spanish]
The Corpus del Español: Registers (Davies, Biber, Jones, and Tracy, 2008) is an enhanced version of the 1900s component of the Corpus del Español, which has been equally divided between the spoken, fiction, newspaper, and academic genre/registers.
Table 7.2 Some examples of non-English corpora

Banca dati dell’italiano parlato (BADIP)
▪ 500,000 words of spoken Italian developed at the University of Graz (Austria)
▪ Accessible on-line edition

Basque Spoken Corpus
▪ 42 narratives by native Basque/Euskara speakers, who tell the story of a silent movie they have just watched to someone else
▪ Available with sound files in MP3 format as well as transcripts

Chambers-Rostand Corpus of Journalistic French
▪ Almost 1 million words of journalistic French
▪ Made up of 1,723 articles published in 2002 and 2003, taken from three French daily newspapers: Le Monde, L’Humanité, La Dépêche du Midi
▪ Articles are categorized into types: editorial, cultural, sports, national news, international news, finance

Chinese-English Translation Base
▪ More than 100,000 English translation units together with their Chinese translation equivalents and vice versa

Corpus di Italiano Scritto (CORIS)
▪ 100 million words of written Italian sampled from categories such as press, academic prose, legal and administrative and ephemera
▪ Accessible online

Corpas Náisiúnta na Gaeilge/National Corpus of Irish
▪ Consists of approximately 30 million words of text from a variety of contemporary books, newspapers, periodicals and dialogue
▪ Approximately 8 million words are SGML tagged

Corpas Na Gaeilge 1600–1882: The Irish language Corpus. 2004. Dublin: Royal Irish Academy.

Corpus Oral de Referencia del Español Contemporáneo (COREC)
▪ 1,100,000 words of spoken Spanish collected at Universidad Autónoma de Madrid
▪ Administrative, scientists, conversational and familiar, education, humanistic, instructions (megafonía), legal, playful, politicians, journalistic
Sample of corpus

The CREA corpus of Spanish
▪ 133 million words
▪ Sampled from a wide range of written (90%) and spoken (10%) text categories produced in all Spanish-speaking countries between 1975 and 1999 (divided into five-year periods). The domains covered in the corpus include science and technology, social sciences, religion and thought, politics and economics, arts, leisure and ordinary life, health, and fiction
▪ The texts in the corpus are distributed evenly between Spain and America

Czech National Corpus (CNC)
▪ Written component: 100 million words including fiction and non-fiction texts
▪ Spoken component: 800,000 words of transcription of spontaneous spoken language sampled according to four sociolinguistic criteria: speaker sex, age, educational level, and discourse type

Hungarian National Corpus (HNC)
▪ 153.7 million words of texts produced from the mid-1990s onwards
▪ Divided into five subcorpora, each representing a written text type: media (52.7%), literature (9.43%), scientific texts (13.34%), official documents (12.95%), and informal texts (e.g. electronic forum discussion, 11.58%)

Le corpus BAF (English-French parallel corpus)
▪ Circa 400,000 words per language
▪ Contains four subsets of texts: institutional, scientific articles, technical documentation, Jules Verne’s novel De la Terre à la Lune in French and English

TRACTOR archive
▪ Contains monolingual and multilingual language resources available on-line in the following languages: Bulgarian, Croatian, Czech, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Polish, Romanian, Russian, Serbian, Slovak, Slovene, Swedish, Turkish, Ukrainian, and Uzbek

(O’Keeffe et al., 2007: 294–296)
The Corpus del Español: Registers site describes itself as follows: ‘This site allows you to find the frequency of nearly 150 different grammatical features (pronouns, tense, clauses, etc.) in 20 different registers of Modern Spanish (e.g. conversation, fiction, newspapers, academic). It also lets you see examples of each of these constructions in context, from a 20 million word corpus of Spanish, taken from the 100 million word Corpus del Español.’ You can perform two types of searches: finding the frequency of a given feature in all 20 registers, or finding which features are more common in one register than in another register (e.g. conversation versus newspapers).

The Corpus do Português (CdP) [45 million written Portuguese]
The Corpus do Português website (Davies and Ferreira, 2006) is similar to the CdE site, except that it gives access to Portuguese texts. It contains more than 45 million words in almost 57,000 Portuguese texts. ‘There are 20 million words from the 1900s, 10 million from the 1800s, and 15 million words from the 1300s–1700s. For the 1900s, there are 6 million words from fiction, 6 million from newspapers and magazines, 6 million from academic texts, and 2 million from spoken. For each of these four genres (and therefore overall) the texts from the 1900s are evenly divided between texts from Portugal and texts from Brazil.’ The interface does all of the things mentioned in previous discussions of BYU corpora, including comparing the frequency and distribution of words, phrases, and grammatical constructions across texts, by genre/register, dialect (European and Brazilian Portuguese), and historical period (from the 1300s to the 1900s).

There are a number of other non-English corpora in various stages of compilation. Some of those which were completed by the time of publishing are briefly introduced below.
Lancaster Corpus of Mandarin Chinese (LCMC)
A 1 million word Mandarin corpus which is part of the ‘Brown family’, and so has 500 samples of written text, each of about 2,000 words. The sampling window was 1991 +/– 3 years.

The Reference Corpus of Polish (PELCRA)
The PELCRA corpus has 93 million words in 81,000 texts, searchable on-line with a number of analysis tools.

Russian Reference Corpus (BOKR)
The BOKR was designed as a Russian counterpart to the BNC. It contains 150 million words of modern Russian, following the sampling framework of the BNC. It can be searched on-line.

Hellenic National Corpus
The Hellenic corpus contains 47 million words of written modern Greek, mostly sampled from 1990 onwards. It can be searched on-line for free.

German National Corpus
The German National Corpus is made up of two parts. The first is a balanced 100 million word ‘core’ roughly comparable to the BNC, and the second is a much larger opportunistic subcorpus. The website appears to be available in German only for the moment. Another project is the huge German tracking corpus (currently more than 3 billion words) being compiled by the Institut für Deutsche Sprache in Mannheim. It can be accessed through the COSMAS 2 interface.

Slovak National Corpus
The aim is to build the Slovak National Corpus into a 200 million word reference corpus. At the time of writing, it was up to 30 million words, all of which are searchable.

Again, the websites at the end of this section and Xiao’s website/survey list a number of other non-English monolingual corpora, including Chinese, German, Portuguese, Dutch, Welsh, and Czech.

7.2.8 Corpus compilations

If you are an academic researcher, one way of easily obtaining multiple corpus resources is to buy the ICAME (the International Computer Archive of Modern and Medieval English) CD-ROM, which includes 21 corpora for Norwegian Kroner 3,500 (US$690/UK£344/€435, as of July 2008). Below is a listing of these corpora. See the ICAME website for purchasing information, samples from the corpora, and links to their manuals.
Written:
1. Brown Corpus untagged/tagged
2. LOB Corpus untagged/tagged
3. Freiburg-LOB (FLOB)
4. Freiburg-Brown (Frown)
5. Kolhapur Corpus (India)
6. Australian Corpus of English (ACE)
7. Wellington Written Corpus (New Zealand)
8. The International Corpus of English – East African component

Spoken:
9. London Lund Corpus
10. Lancaster/IBM Spoken English Corpus (SEC)
11. Corpus of London Teenage Language (COLT)
12. Wellington Spoken Corpus (New Zealand)
13. The International Corpus of English – East African component

Historical:
14. The Helsinki Corpus of English Texts: Diachronic Part
15. The Helsinki Corpus of Older Scots
16. Corpus of Early English Correspondence, sampler
17. The Newdigate Newsletters
18. Lampeter Corpus
19. Innsbruck Computer-Archive of Machine-Readable English Texts (ICAMET)

Parsed:
20. Polytechnic of Wales Corpus
21. Lancaster Parsed Corpus (LOB)

The Lexical Computing website also offers access to a number of English, Chinese, French, German, Greek, Italian, Japanese, Persian, Portuguese, Russian, Slovenian, and Spanish corpora, including the BNC and the BASE, for a yearly €55 site licence subscription fee. See the website for details.

Other distributors of corpus resources include:
● Centre for Spoken Language Understanding
● European Language Resources Association
● European Network in Language and Speech
● European National Activities for Basic Language Resources
● Linguistic Data Consortium
● Oxford Text Archive
● Trans-European Language Resources Infrastructure
7.2.9 Web-based sources of corpora
The discussion above has focused on electronic corpora, but what about using the internet as a source, given that it has a massive size which no corpus can match? It definitely has its uses (see below), but first it is useful to highlight its limitations. Although Google (and other search engines) are good at finding internet matches to user queries, Mark Davies outlines a number of advantages corpora have over internet searches. First, internet search engines are not good at semantically-based searches, because they do not do collocates, and much of the meaning of lexical items depends on the context those items reside in, including their collocations. Also, the search engines can’t use collocates to compare word meanings in different genres, or to see how they’re changing over time. It is also difficult to search by words that are related in meaning, such as all of the synonyms of a given word. Second, the search engines do not allow searches by part of speech or lemma (e.g. all of the forms of a word), which makes grammatical analysis difficult. Third, search engines do not really facilitate researching the changes in linguistic elements over time (e.g. is wireless used more or less now than twenty years ago?). Fourth, neither are they very handy at looking at differences between different styles or types of English. Fifth, you need to know the words you are looking for with search engines in order to type in the query. The engines are not designed for you to set up parameters and then let them find relevant words for you. On the other hand, corpus concordancers will produce lists of words for you, e.g. the most frequent words in academic speech. Finally, search engines do not give accurate figures for the frequency of word strings. Davies relates a search he made for the word string might be taken for a. Google showed 92,400 hits. However, when he paged through the hits, they ran out at about 530. According to Davies, ‘Google usually doesn’t “know” the frequency of anything more than single words – it’s usually just guessing.’ In this case, the Google ‘guess’ was about 200 times what it should have been.

David Lee (the Web as a Corpus page of his www.devoted.to/corpora website) adds several more ‘quality control’ warnings: (1) not everything on the web is language of a high standard, as many native speakers write (and type) badly; (2) nonnative speakers of a language (especially English) put up web pages too, and the quality is highly variable (just as with natives); (3) search engines such as Google give different results on different days, and have gaps, omissions and inclusions that are hard to explain (due to copyrighted, proprietary technology). Finally, Ute Römer pointed out to me that corpora are usually compiled in a principled way to answer particular research questions. However, with the web, you do not know what the total ‘corpus’ consists of, which makes reasonable analysis and interpretation difficult.

Nevertheless, given the vast amount of language on the internet, it remains a very interesting alternative. This is particularly true for languages with no established corpora available (still the majority of languages around the world).

There are several ways to harness the internet. The most basic is to use a search engine like Google, but this has serious disadvantages as we have seen. A better approach is to use software specially designed to extract specified elements from the vast internet pool. In this approach, software can extract words/phrases/texts from the internet according to predetermined criteria. One good way to do this is through Webcorp. It is a suite of tools which allows access to the internet as a corpus. It works like a normal concordancer, providing frequency lists (in frequency or alphabetical order) of particular web pages. This can be useful for finding words/phrases which are too new or too rare to appear in normal corpora. Webcorp also supplies concordance lines for target lexical items, which can be sorted left or right of the node. You can use wildcards and lemma searches. You can also limit the search to particular site domains. Perhaps the most exciting feature is the ability to call up the collocates of the target search item (something search engines like Google cannot do).
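The concordancer functions just described (concordance lines, sorting to the right of the node, and collocate lists) can be illustrated with a minimal sketch. This is not Webcorp's implementation: the function names, the simple regular-expression tokenizer, and the sample text are assumptions made purely for illustration, and the sketch ignores sentence boundaries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, node, span=4):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((left, tok, right))
    return lines


def sort_right(lines):
    """Sort concordance lines by the words to the right of the node."""
    return sorted(lines, key=lambda line: line[2])


def collocates(lines):
    """Count every word that occurs within the span of the node."""
    counts = Counter()
    for left, _, right in lines:
        counts.update(left)
        counts.update(right)
    return counts


if __name__ == "__main__":
    # Invented sample text, standing in for a downloaded web page.
    text = ("She made a strong case. He drank strong tea. "
            "They felt a strong wind and drank more tea.")
    hits = sort_right(kwic(tokenize(text), "strong", span=2))
    for left, node, right in hits:
        print(f"{' '.join(left):>12}  [{node}]  {' '.join(right)}")
    print(collocates(hits).most_common(3))
```

A raw co-occurrence count like this is only the first step; concordancers normally go on to weight collocates with an association measure such as the t-score or MI.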
Webcorp’s abilities begin to address many of the shortcomings outlined above, and with it, the internet can start to be used as a legitimate corpus-like resource. However, there is one snag limiting Webcorp’s utility, and that is its speed. The searches are slow (they reminded me of my multi-minute corpus searches in the mid-1990s), but the website suggests that increases in speed will be forthcoming.

A software program designed to select texts from the Internet according to predefined criteria is REAP. It is a pedagogical tool which first tests learners for their vocabulary knowledge, and then selects texts from the internet based on the learner’s indication of interest in a number of topic categories (e.g. science, sports), and other criteria such as reading level, text contiguity, and length. Those texts then have the target vocabulary highlighted with hyperlinks that take the learner to an electronic dictionary entry for the target word. In addition, any other word in the text can be clicked on to call up the dictionary entry for it. REAP tracks the learner’s progress (completed documents, word look-ups, vocabulary exercises completed, comprehension question results, and a list of the learner’s focus vocabulary), and this is summarized for the teacher or researcher in a table format. Although the purpose of the program is pedagogical, it should also be possible to use it for text selection when building a corpus from the internet, in order to achieve some sort of balancing. It is also an exciting research tool for the acquisition of vocabulary from both incidental (reading) and intentional (dictionary glossing and vocabulary activities) approaches.

David Lee’s website also lists a host of other concordancers designed to analyze the web, including KWiCFinder/WebKWiC and WebConc.

7.2.10 Bibliographies concerning corpora

In addition to the corpora references already given, the following have a great deal of useful information:

Stanford Natural Language Processing Group
The NLP list of resources contains references to a large number of corpora, corpus tools, and other corpus-related material.

Yukio Tono’s corpus website
This site gives information on a number of learner corpora in Europe, America, and Asia.

Mike Scott’s Web
Mike Scott’s links to other corpus linguistics websites.

Yvonne Breyer’s Gateway to Corpus Linguistics website
Descriptions and links to corpora, concordancers, markup tools, a bibliography, research centers, and other corpus resources.

7.3 Concordancers/tools

There is now a large and diverse array of language analysis tools available. David Lee’s website has perhaps the most comprehensive listing of these resources. I have selected some of the better-known and more widely-used tools for comment below, but see his website for a much, much wider range of possibilities.
Concordancing packages

WordSmith Tools
This is the concordancing package of choice among most of the corpus linguists that I know. It does most of what you want, and more recent versions have been better able to handle the larger corpora, like the BNC. The most recent 5.0 version also has ConcGrams, which allows the investigation of ‘open slot’ formulaic sequences (see below). It is available for download for £50. If you are serious about vocabulary research, it is well worth the money.

MonoConc Pro
This is the other major player in commercial concordancers. It is similar to WordSmith, but perhaps not quite as customizable. It costs $85.

AntConc
David Lee considers this the best free concordancer available. It does most of the things that commercial concordancers do, including frequencies, concordances, collocations, and clusters/N-Grams.

Web-based concordancers

Wmatrix
Wmatrix is a suite of tools developed by Paul Rayson for corpus annotation and analysis. It is a web-based environment which is accessed via a web browser through a password (after a one-month trial period has expired, a yearly fee of around £50 applies). A ‘tag wizard’ will automatically tag your corpus for grammatical part-of-speech with the CLAWS utility, and for semantic category with the USAS utility. Wmatrix also generates frequency lists (lemmatized option available) and concordances, either all-inclusive or according to the POS and semantic tags. The program also does keyword comparisons and N-Grams. An interesting addition is Collapsed-Grams (C-Grams), which are merged versions of the N-Gram lists. They show you which 2-grams are subsets of 3-grams, which 3-grams are subsets of 4-grams, and so forth:¹
at the end of world war ii
the end of world war ii
the end of world war
end of world war
end of world
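The N-Gram and C-Gram ideas can be sketched in a few lines. The following is a simplified illustration only: the function names are hypothetical, and the nesting test is an assumption about what ‘subset’ means here (a shorter N-Gram occurring as a contiguous run inside a longer one), not a description of how Wmatrix itself computes C-Grams.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens, n):
    """Frequency list of the n-grams in a token sequence."""
    return Counter(ngrams(tokens, n))


def is_subgram(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    k = len(short)
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def nested_under(long, candidates):
    """Candidate n-grams that are contained in `long`, longest first."""
    nested = [c for c in candidates
              if len(c) < len(long) and is_subgram(c, long)]
    return sorted(nested, key=len, reverse=True)


if __name__ == "__main__":
    tokens = "at the end of world war ii".split()
    shorter = ngrams(tokens, 4) + ngrams(tokens, 3)
    for gram in nested_under(tuple(tokens), shorter):
        print(" ".join(gram))
```

In a real corpus the candidates would come from the frequency lists for each N, so that only N-Grams above some frequency threshold are collapsed into their longer parents.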
Phrases in English (PIE)
PIE (by Bill Fletcher) allows the free on-line phraseological interrogation of the BNC (World version) in strings up to eight words long. It incorporates a database of N-Grams from one to six words long occurring three or more times in the BNC. It is possible to explore the N-Gram lists and their frequencies, and to search for specific N-Grams and/or their collocations, e.g. 2-grams of the pattern ‘ADJ day’, to find the most frequent adjectives describing day. There is a phrase pattern discovery tool, and the ability to see N-Gram concordances from the BNC. In essence, PIE is a kfNgram program (see below) for the BNC.

Word Neighbors
Word Neighbors is a basic pedagogically-oriented web concordancer which might be useful to researchers as a quick reference source. It shows the various derivatives of the target word or phrase and their frequencies, e.g. for stimulate, it shows stimulus (n. 2107), stimuli (n. 1276), stimulation (n. 962), stimulants (n. 85), stimulant (n. 79), stimulate (v. 1235), stimulated (v. 821), stimulates (v. 236), stimulating (v. 130), stimulating (adj. 691), stimulated (adj. 197), stimulant (adj. 23). Word Neighbors also gives collocations up to a +/– 4 word span, and looks for phrases up to 7 words. The program is linked to several dictionaries for definitions. Word Neighbors’ limitations include sometimes questionable part-of-speech classifications. However, this is a problem common to all corpora which have been coded by automatic tagging software, and the relevant caveat is prominently given on the web page. Also, while the available texts total a very respectable 141 million words (divided into seven categories), there is no documentation concerning the source of these texts, and so it is difficult to know how representative they are. Nevertheless, Word Neighbors is likely to be a useful first point of reference when researchers need a quick idea of a word or phrase’s characteristics.

Just the Word
A very quick and easy website that directly gives collocations for a search word, without the concordance lines. It shows results by POS, and graph bars give an indication of the t-score strength. Results are based on an 80 million word subset of the BNC.

Concordancers for identifying ‘open slot’ patterns

kfNgram
A number of concordancing programs generate N-Grams, that is, contiguous sequences of words of varying length, for example 2-grams (bigrams), 3-grams (trigrams), 4-grams, and so on. However, as previously explained, some of the most interesting and potentially important patterning in language consists of sequences with some fixed elements and some ‘open slot’ elements (see Section 3.5). There are now software programs available which
can identify these flexible sequences. One of these programs is kfNgram, written by Bill Fletcher, and freely available. It generates lists of N-Grams, but also of ‘phrase-frames’ (also known as skipgrams), i.e. groups of N-Grams which are identical but for a single word. An example from the website illustrates this. From BNC written text data, kfNgram identified the open slot sequence as * as the, with a frequency of 4,566, and 5 variants. This information is shown as follows:

as * as the          4566    5
    as well as the   2674
    as far as the     874
    as soon as the    652
    as long as the    316
    as much as the     50

ConcGrams
Another program which can identify open slot sequences is the ConcGram List Builder ($20). It is similar to kfNgram, but has the advantage that the words do not have to be in the same order. This is important, as the following discussion summarized from the website explains. There are many formulaic sequences which do not occur in one fixed grammatical pattern. The relationships of verbs/adverbs, verbs/nouns, nouns/adverbs, quantifier/noun, and many other sequence components are flexible and may occur in non-fixed patterns. For example, most adjectives can be used both attributively and predicatively. The bigram challenging exercise would show in an N-Gram search, but when the adjective is used predicatively as in the exercise turned out to be quite challenging, it would not. The positions for challenging in this case would be –1 and +6 respectively, but in both cases, the collocation is important. The result is that many co-occurrence patterns that occur in noncontiguous sequences may not be discovered by traditional N-Gram analysis. An additional problem is that user-nominated searches are limited by the requirement that the user must enter (and therefore know) items to enable the search to take place. The automated concgram search provided by ConcGram is able to reveal all formulaic sequence patterns in a corpus (both contiguous and noncontiguous, with both positional (AB, BA) and constituent (ACB) variation) and, since it is automated, the user does not have to first enter one or more search items. To do this, it starts by automatically extracting all of the 2-word concgrams in the corpus. It can then use these to build up a list of 3-word, 4-word, and 5-word concgrams. Alternatively, the user can select the initial word to build from. Either way, quite a lot of patterns are identified, and so the program also has statistical tests for determining statistically significant
cut-off points to limit the lists to the most important sequences. The t-score test will tend to identify the more frequent patterns (customer service) and the MI score will highlight the strongly associated (Coca Cola). One shortcoming of ConcGrams is that it is too slow to be practical for large corpora, e.g. the BNC, but works fine with smaller corpora/subcorpora of 1 or 2 million words. While this may be addressed in future versions of ConcGrams, for the moment, the current versions of kfNgram and Collocate work better with larger corpora. Collocate () Michael Barlow’s Collocate program ($45) does similar things to the above two programs, and includes frequency and statistical information about collocations and N-Grams found. It also allows the user to specify search words around which N-Gram extractions are made. The program seems to work with larger corpora, with Barlow illustrating an analysis on a 46 million word corpus. FrameNet/FrameGrapher () The Berkeley FrameNet project has developed a very interesting alternative way at looking at the patterning in language based on Charles Filmore’s ideas on frame semantics. While the above packages search for lexical strings with both fixed and open components, FrameNet (and its graphic interface FrameGrapher – both free) illustrate the semantic relationships around a word based on case relationships (e.g. agent, goal, circumstances, degree). The program has 825+ ‘frames’ ranging from abandonment, abounding_with, absorb_heat, all the way to within_distance, word_relations, and working_on. A very small sample (only 6 of the 19 case information segments) of the frame ‘Destroying’ from the website illustrates the kind of information that FrameNet provides. The output is in color; in the example below, these are rendered as follows: Red = BOLD CAPS Light blue = ITALIC CAPS Sky blue = Bold + Italics CAPS Dark blue = UNDERLINED CAPS Black = BOLD + UNDERLINED CAPS
Destroying

Definition: A DESTROYER (a conscious entity) or CAUSE (an event, or an entity involved in such an event) affects the UNDERGOER negatively so that the UNDERGOER no longer exists.

Core:

CAUSE [CAUSE]
The event or entity which is responsible for the destruction of the UNDERGOER.
TORNADOS VAPORIZED this town a few decades back.

DESTROYER [AGT] (Semantic Type: Sentient)
The conscious entity, generally a person, that performs the intentional action that results in the UNDERGOER’s destruction.
WHO can UNMAKE the ring?

UNDERGOER [UND]
The entity which is destroyed by the DESTROYER.
Who can UNMAKE THE RING?

Non-Core:

DEGREE [DEGR] (Semantic Type: Degree)
The degree to which the destruction is completed.
I DESTROYED all signs of our presence completely.

MEANS [MNS] (Semantic Type: State_of_affairs)
An intentional action performed by the DESTROYER that accomplishes the destruction.
Samptu OBLITERATED the land of Abde WITH A GREAT FLOOD, leaving only the sea.

Lexical Units
annihilate.v, annihilation.n, blow up.v, demolish.v, demolition.n, destroy.v, destruction.n, destructive.a, devastate.v, devastation.n, dismantle.v, dismantlement.n, lay_waste.v, level.v, obliterate.v, obliteration.n, raze.v, unmake.v, vaporize.v

In addition, the case relationships are illustrated graphically:

[Figure: FrameGrapher display of the frames related to ‘Destroying’. Frames shown: Transitive_action, Objective_influence, Event, Change_of_state_initial_state, Change_of_state_endstate, Destroying, Cause_to_fragment; 38 children total.]
(FrameGrapher, accessed July 2008)

Thus FrameGrapher gives the ‘big picture’ of how words associate with each other to create meaning, rather than just which words sequence together. As such, it would seem to be a very useful complement to N-Gram/ConcGram analyses. In addition to the English version, Spanish FrameNet was just released at the time of writing.
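On the N-Gram side of such analyses, the contrast noted earlier between the t-score and the MI score can be made concrete with a short sketch. It is only an illustration under stated assumptions: it looks at adjacent word pairs only, takes the expected frequency to be f(w1) × f(w2) / N, and uses the commonly cited formulas MI = log2(O/E) and t = (O − E)/√O. The packages described above may calculate their statistics differently (e.g. over a wider collocation span), and the function name is my own.

```python
import math
from collections import Counter


def association_scores(tokens):
    """Return {(w1, w2): (frequency, MI, t_score)} for adjacent word pairs.

    Assumption: expected frequency E = f(w1) * f(w2) / N, with N the number
    of tokens. Real collocation tools may use a different span or formula.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        expected = unigrams[w1] * unigrams[w2] / n
        mi = math.log2(observed / expected)               # strength of association
        t = (observed - expected) / math.sqrt(observed)   # weighted towards frequency
        scores[(w1, w2)] = (observed, mi, t)
    return scores


# Toy data: "a b" is frequent, "c d" is rare but exclusive.
toy = association_scores("a b a b c d".split())
for pair in [("a", "b"), ("c", "d")]:
    freq, mi, t = toy[pair]
    print(pair, freq, round(mi, 2), round(t, 2))
```

In the toy data the rare but exclusive pair receives the higher MI score, while the more frequent pair receives the higher t-score, which mirrors the Coca Cola versus customer service contrast.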
Compleat Lexical Tutor (Lextutor) ()
This website is so good it deserves a category of its own. Created and continuously updated and improved by Tom Cobb in Montreal, Lextutor is the most essential tool in the vocabulary researcher’s toolbox. It has a number of really useful functions, some of which are described below. Fabulous.

● Frequency analysis
Cut and paste a text into the web window (alternatively download larger texts) and Lextutor tells which frequency band the words in the text belong to, up to the 20,000th level (which will typically be all or nearly all of the words). The results are given in three ways. First, a frequency summary is given, showing what percentage of the text lies in each frequency band (see Table 5.3, p. 209, for an example of this, although it does not do justice to the colorized web output). Second, the text is given, with each word color-coded for frequency. Finally, lists of the words in each frequency band are given, according to token, type, and word family. This tool is excellent for getting an overview of the frequency profile of a text, and in highlighting low-frequency vocabulary that may be a problem for lower-proficiency learners in a study.

● Range analysis
The Range programs tell you about the distribution of words or other lexical units across a set of two or more texts. The texts can be comparable corpora or subdivisions of a corpus, or a set of texts supplied by a user. Lextutor can use its internal corpora to make comparisons between speech and writing in English (using BNC Sampler data), between speech and writing in French (150,000 words of each), and between the Press, Academic, and Fiction components of the Brown Corpus. You can also upload up to 25 of your own texts and see how many of them each word appears in, and in which specific texts it appears.

● Vocabulary Tests
There are several vocabulary tests available, including the Vocabulary Levels Test, Vocabulary Size Test, the Word Associates Test, a test of the first 1,000 words, and a checklist test.

● Other tools
A range of other corpus-based research tools include a concordancer, frequency word lists, an N-Gram extractor, a frequency level-based cloze passage generator and a traditional nth-word cloze builder. There are also tools for helping to build your own corpora.

● Reaction time experiment builder
Lextutor has ventured into the psycholinguistic paradigm with a basic reaction-time experiment builder. You type in the words to be recognized, and the nonword distractors, and the program will build a word-recognition experiment where participants type 1 for ‘real word’, 3 for ‘nonword’, and 2 to move to the next stimulus. It then gives reaction time summaries for each of the real words.

● Pedagogical tools
It is important to note that Lextutor is as useful for pedagogic purposes as research ones, with features such as concordance line builders, spelling information and activities, and cloze builders. Teachers would be well advised to become familiar with these and other Lextutor features.
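For readers who want to see what the frequency and range analyses described above involve, the following is a minimal sketch, not Lextutor’s own code. It assumes you have your own frequency-band word lists (the tiny `bands` dictionary here is invented for illustration), treats any unlisted word as ‘off-list’, and matches word forms exactly rather than by word family. All function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_profile(text, bands):
    """Return {band name: percentage of tokens} plus an 'off-list' entry.

    `bands` maps a band name to a set of word forms; a token is assigned
    to the first band that contains it.
    """
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                counts[name] += 1
                break
        else:
            counts["off-list"] += 1
    names = list(bands) + ["off-list"]
    return {name: 100 * counts[name] / len(tokens) for name in names}


def word_range(texts):
    """Return {word: [names of the texts in which it occurs]}."""
    found_in = {}
    for name, text in texts.items():
        for word in set(tokenize(text)):
            found_in.setdefault(word, []).append(name)
    return found_in


# Invented toy bands, for illustration only.
bands = {"1k": {"the", "cat", "sat", "on"}, "2k": {"mat"}}
print(frequency_profile("The cat sat on the mat quietly.", bands))
print(word_range({"text1": "the cat", "text2": "the dog"}))
```

The number of texts a word appears in (its range) is simply the length of its list in the second result.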
Tools for showing semantic associations

WordNet ()
WordNet is a freely-downloadable program which provides a range of information about queried words. It first acts like a dictionary, giving the various meaning senses, with definitions and examples. It then shows the various derived forms. It also gives thesaurus-like information, providing lists of synonyms, antonyms, hypernyms (X is one way to ...), and troponyms (particular ways to ...), as well as information as to how commonly the word is used. It is a quick and easy resource for obtaining semantically-based information about vocabulary of interest.

WordNet is perhaps most accessible with a graphical interface, so that all of the associative links are more obvious. One free internet site that does this is the Visuwords Online Graphical Dictionary (). You type in a word, and it produces a 3D network of the connections, color-coded for word class (nouns = blue, verbs = green) and connection type (is a part of = turquoise, opposes = red). Rolling the cursor over any of the nodes brings up definitions and examples. A commercial graphical interface, Visual Thesaurus, costs $40 and has more features (). It allows you to rotate the 3D networks in any direction, and when you click on any of the nodes, that node automatically starts its own new network. This makes browsing through the semantic space around a topic area very easy.
Experiment generator packages

E-Prime ()
This is one of the mainstream commercial programs which facilitate the design, administration, and analysis of psycholinguistic research designs, such as word recognition, reaction time, and a multitude of others. It takes some time to learn, but once mastered, it allows great flexibility in research design, very precise timing of experimental output, and easy randomization of stimuli. It is expensive ($795–995) and so is probably more practical for research groups or university departments than for individual researchers.

DMDX ()
DMDX is a Win 32-based display system used in psychological laboratories around the world to measure reaction times to visual and auditory stimuli. It was programmed by Jonathan Forster at the University of Arizona. It is free software, but requires some technical expertise to use.

MiniJudge ()
MiniJudge is a free on-line tool designed to allow researchers to gather participant judgements about linguistic features. Although originally created for syntacticians, it looks as if it can also be used for lexical judgements.

Program for handling speech

PRAAT ()
Lexical researchers interested in working with oral vocabulary might want to consider PRAAT, a free, comprehensive speech analysis, synthesis, and manipulation package. It can analyze speech according to a number of parameters, manipulate existing speech (e.g. change pitch and duration contours), synthesize new speech, and set up listening experiments.

Translation software
There are several internet translators available (e.g. iGoogle Translate, and Babel Fish), as well as numerous commercial translation packages, many of them expensive. Babylon is one that gets good reviews, and a free version is available at their website ().
Statistical packages and other analysis tools

SPSS ()
The most widely-used statistical package in applied linguistics is SPSS (Statistical Package for the Social Sciences). It carries out a wide range of statistical procedures, and is widely supported by after-market instructional material. My students and I have found Andy Field’s (2005) manual to be very useful, explaining the statistics themselves and their underlying assumptions, and also giving clear step-by-step instructions on how to make SPSS work. SPSS comes in modules, and is expensive, but there seem to be discounts for academic staff and students.

AMOS
Part of the SPSS family, AMOS is a popular package for carrying out structural equation modelling.

NVivo ()
NVivo is a software program that facilitates the organization and exploration of qualitative data, helping to find trends in output like interview data. It allows the user to interrogate the data and create categories and connections, which NVivo can then graphically illustrate, helping to make the underlying patterns more salient. It can thus be useful for organizing and understanding qualitative data, such as learners’ opinions about vocabulary learning which are gathered during open-ended interview sessions or through the collection of their email messages.

ITEMAN ()
A very useful program for doing classical test item analysis. This can help to determine the effectiveness of key and distractor options on multiple choice vocabulary tests, as well as analysing survey (e.g. Likert-type rating scale) data. It is produced by Assessment Systems Corporation.

ConQuest ()
While classical test analyses (e.g. with ITEMAN) can be just as informative as IRT (Item Response Theory) analyses if the testing population and response behaviors are relatively homogeneous (e.g. Cseresznyés, 2008), the more technical IRT approach can be useful with more divergent test output (as is often the case with vocabulary scores). There are three main software programs which do IRT analysis. The first is ConQuest. It is available with manual for AUS$699 from the online ACER website.
WINSTEPS and Facets ()
These two programs are also popular means for doing IRT analyses. WINSTEPS does Rasch analysis for persons and items, while Facets is capable of many-facet analyses for persons, items, judges, tasks, and more. Both are available at the website for $149 each.
7.4 Vocabulary lists

Frequency lists
There are a number of frequency lists based on the BNC and other corpora. Here are a few of them.

BNC
The book Word Frequencies in Written and Spoken English (Leech, Rayson, and Wilson, 2001) gives fairly comprehensive BNC frequency data. The companion website () gives frequency lists for the whole BNC, for the spoken versus written components, for the conversational (i.e. demographic) versus task-oriented (i.e. context-governed) parts of the spoken component, and for the imaginative versus informative parts of the written component. There are also ranked frequency word lists according to parts of speech (e.g. all verbs), as well as frequencies for individual part-of-speech tags (e.g. NN1, VDG) based on the BNC Sampler.

Adam Kilgarriff’s BNC website () includes lemmatized and unlemmatized frequency lists in various formats, and variances of word frequencies. However, the above Leech, Rayson, and Wilson book/website uses a newer text classification system, and contains fewer word-tagging errors, and so largely supersedes the Kilgarriff lists.

Brown Corpus
The word list from the Brown Corpus, originally published as the Computational Analysis of Present-Day American English (Kučera and Francis, 1967), is available on the web at the following sites:
MRC Psycholinguistic Database (not disambiguated by parts of speech)
ICAME word lists (POS-differentiated version)

SUBTLEXus
The Department of Experimental Psychology at the University of Ghent has a web page presenting the SUBTLEXus frequency lists. They are based on a corpus of 51 million words of subtext from American movies (34.9m) and television series (16.1m). It was found that the frequency lists from this corpus accounted for more of the variance in accuracy and reaction times in psycholinguistic studies than the Kučera and Francis (1967) and CELEX (1993) frequency data which psycholinguists often use. This is partially because the SUBTLEXus corpus is larger than these corpora (Kučera and Francis = 1m; CELEX ≈ 18m), but the SUBTLEXus also better reflects the fact that most people in America watch more television and movies than read books and newspapers. The lists can be accessed through the University of Ghent website (http://expsy.ugent.be/subtlexus/).

Other corpora
The ICAME word list website also contains frequency lists of the POS-tagged LOB (1960s, BrE), untagged FLOB (1990s, BrE), and untagged FROWN (1990s, AmE) corpora.

English trigram frequencies
In addition to the frequency of words, there is a list of the most frequent trigrams in English (e.g. the, and, ing, tio), based on frequency per 10,000 words of the Brown Corpus. See .

Word/phrase Lists

Academic Word List (AWL) ()
Averil Coxhead’s website at Victoria University of Wellington provides information on the AWL. The full AWL is given in ten AWL sublists with headwords and derivative forms. There is also a list with headwords only, and one with the most frequent words listed by sublist. The much-cited article outlining the development of the list is in TESOL Quarterly (Coxhead, 2000), but there is some background information about the list and the underlying academic corpus on the site. There are also web links to useful pedagogic sites which make use of the AWL, including Sandra Haywood’s (see below). The list was originally created as a Masters dissertation project, and shows the type of research that postgraduate students can do at the Masters level if they have ambition and good supervision.

The General Service List (GSL)
John Bauman’s General Service List page () gives background information about the GSL and frequency/rank listings. The original GSL list was modified by Bauman and Brent Culligan to include headwords according to the standard set out in Bauer and Nation (1995), which led to a total of 2,284 headwords. They then attached frequency information to these headwords based on frequency figures from the Brown Corpus. The resultant list is given in rank order. The GSL list is also available on Sandra Haywood’s website, divided into 500-word segments.

James Dickins offers an extended version of the GSL downloadable in Excel format (). This allows sorting of the list in a number of ways, including:
● order in which they appear in the printed GSL
● headwords
● lemmatized headword
● McArthur category
● word class
● word count.
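As a brief aside on the English trigram frequencies mentioned above, the kind of count involved can be sketched in a few lines. This is only my illustration of the idea, assuming that trigrams are three-letter sequences counted within words and that the rate is expressed per 10,000 running words; the published list may have been compiled differently. The function name is hypothetical.

```python
import re
from collections import Counter


def trigram_rates(text, per=10_000):
    """Return {letter trigram: occurrences per `per` running words}.

    Assumption: trigrams are counted inside words only, never across
    word boundaries.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return {}
    counts = Counter(
        word[i:i + 3] for word in words for i in range(len(word) - 2)
    )
    return {tri: count * per / len(words) for tri, count in counts.items()}


rates = trigram_rates("The other thing")
print(sorted(rates, key=rates.get, reverse=True)[:3])
```

With a real corpus in place of the three-word example, sorting the result by rate gives a ranked trigram list of the same general kind.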
Academic Formulas List (AFL)
Simpson-Vlach and Ellis (in press) developed a list of three- to five-word formulaic sequences (which they term formulas) which are typical of academic discourse. They list 200 formulas which are more common in written academic discourse compared to written non-academic discourse. Similarly they list 200 which are more common in spoken academic discourse. They also identify 207 formulas which were relatively more frequent in both written and spoken academic discourse, which they consider core formulas. The formulas are categorized into a number of functional categories: referential expressions (identification and focus, contrast and comparison, deictics and locatives, vagueness markers), stance expressions (hedges, epistemic stance, obligation and directive, ability and possibility, evaluation, intention/volition), and discourse organizing expressions (metadiscourse and textual reference, topic introduction and focus, topic elaboration, discourse markers). Simpson-Vlach and Ellis’s selection procedure for these formulas is interesting in that it combined three main criteria (range, MI, and frequency) in a way that was determined empirically (Ellis, Simpson-Vlach, and Maynard, 2008).

Function Word List
Nation (2001) includes a list of the function words in English in Appendix 6, pp. 430–431.
7.5 Websites

A number of scholars/institutions host language-based websites which include various material useful to the vocabulary researcher and teacher. Below are some of the most notable.

Paul Nation’s LALS Vocabulary website ()
The leading specialist in second-language vocabulary pedagogy has a personal website well worth visiting. To start with, his personal publications list is a mini vocabulary bibliography in itself, and many are downloadable. He also offers his large vocabulary bibliography, sorted alphabetically and by topic. The RANGE program, with either GSL/AWL lists or with BNC lists, is available for download. The website includes the GSL and AWL word lists, but in addition has a very interesting set of lists of survival vocabulary for 19 languages which includes common expressions like greetings and closings, numbers, units of money and weight and size, directions, and conversation gambits (e.g. please speak slowly). There is a list of graded readers divided by difficulty (i.e. vocabulary size) level. One of the highlights of the website is the multitude of vocabulary tests available:
● one receptive version of the revised Vocabulary Levels Test (VLT)
● two productive versions of the VLT
● bilingual 1,000 and 2,000 receptive versions of the VLT (Chinese, Indonesian, Japanese, Russian, Samoan, Tagalog, Thai, Tongan, Vietnamese)
● a basic True/False VLT version focusing on the first 1,000 word level. It is aimed at beginners, using very simple vocabulary and pictures to define the target words
● a monolingual English version of the Vocabulary Size Test (VST)
● a bilingual Mandarin version of the VST.

Finally, for any researchers or students needing inspiration about vocabulary research topics, Nation offers a multitude grouped according to 11 categories, mirroring the organization of his book Learning Vocabulary in Another Language (2001).

Paul Meara’s _lognostics Vocabulary website ()
Meara’s _lognostics website includes a variety of material focusing on vocabulary acquisition, and features the VARGA (Vocabulary Acquisition Research Group Archive), which contains annotated bibliographies of most of the research on vocabulary acquisition since 1970. You can download the bibliography by individual year, or search the website database through keyword and range of years. This is the best vocabulary bibliography available, especially given that most publications have abstracts and the fact that Meara was the pioneer in collecting vocabulary research beginning with his CILT publication Vocabulary in a Second Language, in 1983. There is also a selection of downloadable papers from Meara and his colleagues. Equally notable is an interesting range of innovative vocabulary tests, language aptitude tests, and assessment tools which Meara and his colleagues have developed, all downloadable in ZIP files: X_Lex, Y_Lex, P_Lex, D_Tools, V_Size, V_Quint, and Llama. There is also an online association test (Lex_30). The website also promises some future programs, including WA, a program for handling word association data. Other information on the site includes entries on the VocabularyWiki page on the Kent-Rosanoff association list, Spanish word frequency lists, the MacArthur Communicative Development Inventory (an assessment scale for monolingual children’s lexical growth). Finally, there are links to the websites of a number of other prominent vocabulary researchers.

Batia Laufer’s Vocabulary website ()
Batia Laufer’s university website contains an impressive personal publications bibliography, but is most notable for the CATSS test (Computer Adaptive Test of Size and Strength) available on-line (see Sections 2.8 and 5.2.3).

Rob Waring’s personal website ()
Rob Waring’s site has some useful information on vocabulary and reading. For vocabulary, his bibliography is unusual in that it is coded for entry into a database program (such as Filemaker) which makes it much more searchable. Unfortunately, at the time of writing, it only contained references up until 2002. There are also other resources like a listing of word lists, and a page on how to find vocabulary resources on the internet. On the reading front, there is an extensive reading resources page and a link to the Extensive Reading Foundation’s website. There is also something not commonly seen: an extensive listening page.

Andy Gillet’s Vocabulary in EAP website ()
This site includes a range of vocabulary material, including information on selecting which words to learn, using the GSL and AWL word lists. It also has particularly useful sets of non-GSL/AWL vocabulary which occurs particularly frequently in the fields of criminal law, environmental science, business, science and technology, music, health science, computer science, and mathematics. For example, there are 376 words for criminal law, including the A–Z sample below:

abet, blameworthy, complicity, defendant, exculpatory, felony, grievous, homicide, indecent, judicial, kidnapping, liability, maliciously, negligence, offence, parole, quashed, rape, self-defence, tort, unjustifiable, verdict, warrant, (no X, Y, or Z)

There are numerous exercises to learn new words, and to work on vocabulary learning strategies, such as dictionary use, vocabulary notebooks, and lexical inferencing. Word parts are highlighted, with a useful list of affixes and an online quiz in different word classes to check affix knowledge.
Sandra Haywood’s AWL website () This website focuses on pedagogical tools for the AWL, demonstrates a number of vocabulary exercises focusing on AWL vocabulary, and features
two AWL tools. The AWL Highlighter marks all of the AWL words in a text in bold, making them more noticeable for learners and helping researchers to see them in context:

Data was collected by the International Labour Office on hourly rates of pay in fifty different occupations, and on consumer prices for a sample of household items in about 100 countries. After analysis, it was shown that the worth of an hour’s work, in terms of purchasing power, varied considerably from one country to another.

The AWL Gapmaker creates cloze tests by replacing AWL words in a text with a gap. Learners can practise filling in the gaps, then checking their work by comparing their answers to a list of the deleted words. Researchers may also find this an easy way to create cloze tests focusing on AWL vocabulary for their studies.

Hong Kong Polytechnic University’s Vocabulary site ()
The vocabulary section of this learning website has loads of material, including information and exercises on phraseology, word parts and affixes, synonyms, and vocabulary games. I found the academic crossword puzzles and hangman particularly addicting. There is also an English-Chinese bilingual dictionary.

Gerry Luton’s Vocabulary website ()
This site focuses on academic vocabulary, first presenting the AWL and a rationale for using it. He then provides multiple choice matching exercises for the vocabulary in each sublist, divided into manageable 10-word blocks. The site gives feedback about the correctness of the answers, and provides a score. The exercises were created with Gerry’s Vocabulary Teacher, which is available for purchase from the site for $50. (There is a free demonstration version.) It consists of a vast collection of sentences in context, illustrating over 2,600 words with a minimum of 15 contexts each, for a total of over 50,000 sentences and 750,000 words of data, with the facility to edit, delete, or add additional sentences. Teachers can select target words for their students to study, and use the program to create example sentences, multiple choice matching exercises, and cloze exercises. It is an excellent way to promote the introduction and recycling of academic vocabulary for intermediate to advanced learners of English.
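The gap-making idea behind the AWL Gapmaker described above is simple enough to sketch. The following is not Haywood’s implementation, only a rough illustration under stated assumptions: the target list is a hypothetical handful of word forms rather than the real AWL, and words are matched as exact forms, so every inflected form you want gapped must be in the list.

```python
import re


def make_cloze(text, target_words, gap="______"):
    """Replace each target word in `text` with a gap.

    Returns the gapped text and the list of deleted words in order,
    which can serve as the answer key.
    """
    targets = {word.lower() for word in target_words}
    removed = []

    def replace(match):
        word = match.group(0)
        if word.lower() in targets:
            removed.append(word)
            return gap
        return word

    gapped = re.sub(r"[A-Za-z]+", replace, text)
    return gapped, removed


# Hypothetical target forms, for illustration only.
targets = ["data", "analysis", "varied"]
gapped, answers = make_cloze(
    "Data was collected on hourly rates. After analysis, it varied.", targets
)
print(gapped)
print(answers)
```

A highlighter works the same way, except that the matched word is wrapped in some emphasis marker instead of being replaced.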
Lexxica ()
This site has a number of vocabulary learning activities including word games and flashcards, graded reading materials, a vocabulary test, and an application for teachers to manage their students’ use of the materials. The compilers of the site are very conversant with vocabulary research, with the result that the materials on the site are sound theoretically. For example, the flashcards are presented using the principle of spaced repetition (also known as expanding rehearsal), i.e. cards are repeated at gradually increasing time intervals as the word becomes better learned. The site is also one of the few to work on visual and listening speed. The target words are selected according to frequency criteria, and students are given feedback about how far their current vocabulary size will take them according to various learning goals (using general English, taking the TOEFL or TOEIC tests, using the Interchange textbook). Overall, the site is a good example of how current vocabulary research can be transformed into high-quality vocabulary learning materials.

Gabriella Nuttall’s Vocabulary Resource Centre ()
This website has links to other vocabulary websites, papers and presentations on vocabulary teaching, and various learner activities.

Dave’s ESL Cafe ()
This is a wide-ranging ESL website without too much on vocabulary, but it does include useful lists of idioms and phrasal verbs with definitions and examples.
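The principle of spaced repetition (expanding rehearsal) mentioned above for Lexxica’s flashcards can be illustrated with a small scheduling sketch. This is not Lexxica’s algorithm, which is not described here. It simply assumes that the gap before the next review doubles after each successful recall and falls back to one day after a failure; the function names, the doubling factor, and the one-day starting interval are all my own assumptions.

```python
def next_interval(current_days, recalled, factor=2.0, first_interval=1.0):
    """Return the gap (in days) before the next review of a card.

    Assumed rule: expand the gap after a successful recall, and restart
    from the first interval after a failure.
    """
    if not recalled:
        return first_interval
    return max(first_interval, current_days * factor)


def review_schedule(outcomes, factor=2.0, first_interval=1.0):
    """Return the day on which each review happens, given recall outcomes."""
    interval = first_interval
    day = 0.0
    schedule = []
    for recalled in outcomes:
        day += interval
        schedule.append(day)
        interval = next_interval(interval, recalled, factor, first_interval)
    return schedule


# Three successes, one failure, then a success.
print(review_schedule([True, True, True, False, True]))
```

With these assumed settings the reviews fall on days 1, 3, 7, 15, and 16: the gaps widen while the word is recalled and shrink again once it is forgotten.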
7.6 Bibliographies

In addition to the general vocabulary bibliographies on Nation’s, Meara’s and Waring’s websites, and the corpus bibliographies on David Lee’s and Richard Xiao’s sites, the following bibliographies are useful places to search for lexical references.

Paul Meara published annotated bibliographies on vocabulary research spanning the period 1960–1990 in three volumes:
1. (1983). Vocabulary in a Second Language. London: Centre for Information on Language Teaching and Research. (1960–1980)
2. (1987). Vocabulary in a Second Language, Vol. 2. London: Centre for Information on Language Teaching and Research. (1980–1985)
3. (1992). Vocabulary in a Second Language, Vol. 3. Reading in a Foreign Language 9, 1. (1986–1990)
(Computational) Theories of Contextual Vocabulary Acquisition ()
This bibliography from 1884 to the present has a large number of references on L1 acquisition, particularly from context.

Phraseology Bibliography ()
A bibliography on phraseology with research up until 2003 is available at this website.

Formulaic Language Bibliography
Probably the best bibliography for formulaic sequences is the massive bibliography at the end of Alison Wray’s 2002 book Formulaic Language and the Lexicon.

Stanford Natural Language Processing Group Bibliography ()
The Stanford Natural Language Processing Group has an extensive annotated bibliography on corpus building tools (e.g. parsers and taggers), corpora, and other computational linguistics stuff.

Learner Corpus Bibliography ()
The Centre for English Corpus Linguistics hosts an extensive bibliography on learner corpora, maintained by Magali Paquot. At the time of writing, it had 370 references.

Brian Richards and David Malvern’s Lexical Diversity Bibliography ()
This bibliography covers publications (up to 1997) concerning type-token measures of lexical diversity, and was compiled as part of their project to develop their D measurement software (see Section 5.2.4).

7.7 Important personalities in the field of vocabulary studies

A vast number of scholars have contributed to our understanding of vocabulary through the ages. The following is a list of researchers who I feel have made sustained contributions to the field. Of course, the list is not comprehensive, and many worthy personalities do not appear, e.g., corpus linguists. In particular, it reflects my research interest of L2 vocabulary studies, and so does not include many scholars whose primary interest is L1 vocabulary. I offer this personal selection as an initial list of the vocabulary scholars whose work can usefully inform your own research.
Currently active researchers whose primary interest is L2 vocabulary Each of these scholars has made vocabulary their main area of research endeavor, and has published widely on a variety of lexical topics across a
10.1057/9780230293977 - Researching Vocabulary, Norbert Schmitt 9781403_985354_08_cha07.indd 352
4/13/2010 2:48:03 PM
Vocabulary Resources 353
sustained period of time. They are the leading authorities in the area of second-language vocabulary.
Paul Meara Vocabulary acquisition and processing, the mental lexicon, word associations, matrix models of vocabulary acquisition, vocabulary tests, particularly checklist tests, the Lex family of vocabulary tests/tools, VARGA, created a large and successful distance PhD program at the University of Swansea for students focusing on vocabulary issues.
Paul Nation Vocabulary pedagogy, word lists, frequency levels of vocabulary, creator of the original Vocabulary Levels Test, Vocabulary Size test, vocabulary and reading, word knowledge taxonomy, word families and affixation, the four-strand approach to vocabulary teaching, author of the landmark Teaching and Learning Vocabulary (1990) and Learning Vocabulary in Another Language (2001) books.
John Read Vocabulary testing, Word Associates Test, author of Assessing Vocabulary (2000).
Norbert Schmitt Vocabulary acquisition, acquisition and use of formulaic language, research based on the word knowledge framework, developed the revised versions of the Vocabulary Levels Test, vocabulary learning strategies, vocabulary and reading/listening, explicit versus implicit lexical knowledge, author of Vocabulary in Language Teaching (2000).
Current researchers with an interest in vocabulary studies
This category includes scholars who have made important contributions to the field of second language vocabulary studies, although it is not necessarily their main research area. It also includes developing scholars with a lexical focus who have already made a promising start to their research careers.
Batia Laufer Learning and usage difficulties caused by similarity in word form, lexical frequency profile, CATSS test, lexical coverage requirements, vocabulary pedagogy, task-induced involvement load.
Joe Barcroft Vocabulary acquisition, research methodology, cognitive constraints to acquisition, acquisition of form versus meaning.
Frank Boers Facilitating factors of vocabulary acquisition, including awareness of metaphor categories, formulaic language, and etymology.
Ronald Carter Vocabulary in discourse and literature, corpus linguistics, written vs. spoken vocabulary, author of Vocabulary: Applied Linguistic Perspectives (1998, 2nd edn.).
Tom Cobb The Compleat Lexical Tutor, the use of computers and technology in vocabulary learning, intentional versus incidental vocabulary learning, vocabulary and reading.
Averil Coxhead Creation and utilization of the Academic Word List, vocabulary pedagogy.
Kees de Bot The mental lexicon, modelling the multilingual lexicon, language attrition.
Annette de Groot Word recognition and the mental lexicon, bilingualism and multilingualism.
Nick Ellis Cognitive factors in vocabulary acquisition, effects of frequency, implicit versus explicit knowledge.
Rod Ellis Best known for researching and reviewing SLA in general, he has also done considerable work on vocabulary acquisition.
Tess Fitzpatrick Word associations, vocabulary acquisition, formulaic language, supervises the Swansea distance vocabulary PhD programme.
Doug Biber Corpus linguistics, formulaic language, particularly lexical bundles.
Keith Folse Vocabulary pedagogy and teacher education on lexical issues, author of Vocabulary Myths (2004).
Dee Gardner Vocabulary and reading, corpus-based vocabulary analysis.
Kirsten Haastrup Lexical inferencing, lexical processing, vocabulary acquisition, vocabulary and reading.
Birgit Henriksen The nature of vocabulary knowledge, vocabulary pedagogy, reading/writing and vocabulary.
Marlise Horst Incidental vocabulary learning from reading, the use of online tools for learning academic vocabulary.
Jan Hulstijn Incidental versus intentional vocabulary learning, glossing, task-induced involvement load, cognitive aspects of language learning.
Nan Jiang Psycholinguistic approaches to vocabulary acquisition and processing, semantic representation and transfer.
Keiko Koda Effects of L1 word form on the learning of second-language form, second-language reading.
Kon Kuiper Formulaic language, particularly in spoken contexts under time pressure.
Michael McCarthy Written versus spoken vocabulary, author of Vocabulary (1990), and several books in the Cambridge University Press English Vocabulary in Use textbook series.
Sylviane Granger Development and analysis of learner corpora, formulaic language.
Margaret McKeown L1 vocabulary instruction, use of dictionaries.
Rosamund Moon Multi-word units, COBUILD dictionaries and materials.
William Nagy The relationship between vocabulary knowledge and L1 reading, vocabulary instruction.
Diana Pulido The influence of vocabulary knowledge in reading, factors affecting lexical development, learner involvement in lexical development tasks.
Paul Rayson Computational analysis of vocabulary.
Susanna Rott Incidental learning of vocabulary, number of exposures necessary for learning, formulaic language.
Rob Schoonen Assessing vocabulary depth of knowledge, automaticity of lexical processing as prerequisite of L2 reading and writing.
Norman Segalowitz Automaticity of lexical and language processing.
David Singleton The second-language mental lexicon, author of Language and the Lexicon (2000).
Rob Waring Vocabulary and reading, extensive reading, editor of the Heinle Cengage graded reader series (Foundations Reading Library), receptive versus productive knowledge.
Stuart Webb Acquiring depth of vocabulary knowledge from incidental and intentional input, using word knowledge test batteries.
Sima Paribakht Lexical inferencing and vocabulary learning from reading, the Vocabulary Knowledge Scale, vocabulary pedagogy.
Bert Weltens Lexical attrition.
Mari Wesche Lexical inferencing, vocabulary acquisition through reading, L1 influences in initial acquisition, the Vocabulary Knowledge Scale.
Alison Wray Wrote the seminal overview of formulaic language Formulaic Language and the Lexicon (2002), created FLaRN (Formulaic Language Research Network), and is still the leading specialist in this area.
Cheryl Zimmerman Pedagogical issues and vocabulary learning, knowledge of derivative forms, teacher training in vocabulary issues.
Past masters
These scholars have made important contributions in the past:
Isabel Beck L1 vocabulary instruction, vocabulary learning from reading.
John Carroll L1 vocabulary acquisition, co-author of The American Heritage Word Frequency Book (Carroll, Davies, and Richman, 1971), word parts and affixes.
Hermann Ebbinghaus One of the first scholars (1885) to systematically research how vocabulary is learned.
Jim Nattinger and Jeanette DeCarrico Authors of Lexical Phrases and Language Teaching (1992), the seminal book which first highlighted the importance of formulaic language.
Harold Palmer Although primarily interested in an oral approach to language learning, he collaborated with West in what became known as the ‘vocabulary control movement’.
Charles Ogden and Ivor Richards Created and promoted Basic English, featuring an 850-word lexicon (), which competed with the concurrent frequency-based approach to vocabulary control.
Brent Wolter Word associations, vocabulary acquisition, psycholinguistic approaches to second-language vocabulary acquisition.
Håkan Ringbom The effects of the L1 on L2 and L3 vocabulary acquisition, crosslinguistic influences in the second language lexicon, vocabulary learning.
Steven Stahl L1 vocabulary acquisition.
Edward Thorndike and Irving Lorge Compilers of the influential early word lists, including The Teacher’s Book of 30,000 Words (1944).
Michael West One of the first scholars to systematically consider the influence of frequency of occurrence on vocabulary learning, compiled the General Service List (1953), used a frequency-based approach to writing graded reading materials.
Dave and Jane Willis Vocabulary in the syllabus, authors of The Lexical Syllabus () and the COBUILD English Course, formulaic language, task-based instruction.
John Sinclair The father of corpus linguistics, the idiom principle, guiding light behind COBUILD and the Bank of English Corpus.
1
Vocabulary Use and Acquisition
1. Vocabulary and lexis will be used interchangeably in this book.
2. It is beyond the scope of this book to discuss statistics and how to carry them out. There are numerous textbooks on statistics, and one source which my students have found particularly useful is Field (2005), who shows how to perform statistics with the widely-used statistical program SPSS.
3. DIALANG is a European project for the development of diagnostic language tests in 14 European languages. It offers separate tests for reading, writing, listening, grammatical structures, and vocabulary in each of the languages.
4. The Common European Framework (2007) does not stipulate required vocabulary sizes for the various levels, but rather describes learner performance expectations at each level. The C2 descriptors for reading and vocabulary include the following, for which a 5,000 word family lexicon would appear inadequate (although firm research on this is lacking):
● Can understand and interpret critically virtually all forms of the written language including abstract, structurally complex, or highly colloquial literary and non-literary writings.
● Can understand a wide range of long and complex texts, appreciating subtle distinctions of style and implicit as well as explicit meaning.
● Can exploit a comprehensive and reliable mastery of a very wide range of language to formulate thoughts precisely, give emphasis, differentiate and eliminate ambiguity ... No signs of having to restrict what he/she wants to say.
● Has a good command of a very broad lexical repertoire including idiomatic expressions and colloquialisms; shows awareness of connotative levels of meaning.
5. Terminology is a problem in this area. Wray (2002: 9) found over 50 terms to describe the notion that recurrent multi-word lexical items can have a single meaning or function. Some highlight the characteristic of multiple words (multi-word units, multi-word chunks), others the fixedness of the items (fixed expressions, frozen phrases), some the recurrent phraseology (phrasal vocabulary, routine formulas), while still others focus on the psycholinguistic notion that these multi-word lexical items are stored and processed in the mind as wholes (chunks, prefabricated routines). In this book, I will generally use the term formulaic language as a cover term for this phenomenon, and formulaic sequences for the individual phrasal items. Finer distinctions will be discussed in Chapter 3, which covers formulaic language in more detail. Likewise, I will use the term lexical item to include vocabulary made up of either individual word forms or formulaic sequences.
6. I reserve the term collocation for two-word partnerships, which is a subcategory of the umbrella term formulaic language.
7. Corpus size figures as of November 13, 2008.
Notes
2
Issues of Vocabulary Acquisition and Use
1. I know of no association research that used formulaic sequences as prompts.
2. Actually, Michael West does not deserve all of the credit for the GSL. It was the culmination of a long series of conferences and studies in which a number of linguists of the age played a prominent part, including Harold Palmer, Lawrence Faucett, Edward Thorndike, and Irving Lorge. See Howatt (2004) for a discussion of the development of the GSL.
3. Note the classical tradeoff between depth of assessment and sampling rate: Webb’s assessment is extensive, but only allows for the measurement of ten lexical items.
4. The tests were not in this order in the study.
5. Early eye-movement measures (e.g. first-pass reading time) are thought to reflect integration processes. Thus more frequent or more predictable lexical items have faster early measures. Items which are known will be easier to integrate into the unfolding meaning of a text. Unfamiliar items will be harder to integrate and will have longer reading times and fixation counts. In contrast, late measures (e.g. total reading time) reflect recovery when processing is difficult. So if an item is ambiguous or doesn’t fit with a context, long total reading times and fixation counts are found. Thus, if non-natives read kick the bucket and it is clear that this doesn’t mean ‘kick a pail’, there will be recovery time required, and therefore longer total reading time and more total fixations.
6. N400 is also sensitive to a range of lexical properties, including whether a letter string is a word in a language, frequency, phonological priming, morphological context within a string, and the sequential probabilities of the likelihood of words occurring in succession. Thus N400 amplitude reflects a combination of lexical and semantic/conceptual factors (Osterhout et al., 2006: 205).
3
Formulaic Language
1. There is evidence that individual lexical bundles are generally preferred in either spoken or written discourse, but seldom in both, at least in academic discourse (Biber et al., 2004).
2. This statistical section draws heavily on information in Church and Hanks (1990), Dunning (1993), Evert (2004), and Manning and Schütze (1999). I am particularly indebted to my former PhD student Phil Durrant for showing me the calculations and formulas underlying common strength of association measures, and for allowing me to closely shadow his PhD thesis account of those measures. See Durrant (2008) for a fuller discussion of these methodologies.
3. Another less widely-used approach to correcting this problem is adjusting the MI formula to give greater weight to the ‘observed occurrences’ portion of the equation (Evert, 2004). Proposed corrections include local MI ($O \times \log_2 \frac{O}{E}$), MI2 ($\log_2 \frac{O^2}{E}$), and MI3 ($\log_2 \frac{O^3}{E}$).
4. John Sinclair first introduced me to the idea of variable expressions using this example during my visit to the Tuscan Word Centre. This is an extension of his original analysis.
5. A number of these studies have come out of the Centre for English Corpus Linguistics (Université catholique de Louvain) headed by Sylviane Granger. The researchers in this centre usually analyzed the academic prose output of L2 university students in the International Corpus of Learner English (ICLE, Granger, n.d.), and often compared it to the equivalent(ish) native university student output found in the Louvain Corpus of Native English Essays (LOCNESS, a native corpus compiled to mirror the ICLE). (See Section 6.2 for a more detailed description of these corpora, and how to order them.)
6. Although Nesselhauf finds evidence for extensive erroneous use of collocations, the issue of whether collocations are more problematic than non-collocations is not satisfactorily resolved (Durrant, 2007).
4
Issues in Research Methodology
1. It must be said that little is known about the relationship between explicit/declarative and implicit/procedural lexical knowledge. I see this as a key area for vocabulary research, probably incorporating some of the psycholinguistic and neurolinguistic methodologies discussed in Section 2.11.
5
Measuring Vocabulary
1. It is important to note that Bauer and Nation formed their hierarchy based solely on linguistic criteria, and not on any acquisition evidence. While there is reason to believe that the hierarchy should reflect acquisition order to a considerable degree, this needs to be empirically explored, and would be an excellent research project.
2. Note that the PVLT predates the Academic Word List (Coxhead, 2000) and so uses the older listing of academic vocabulary, the University Word List (Xue and Nation, 1984).
3. There was a theoretical basis for including academic words in the profile analysis. Frequency analysis of academic texts shows that the first 2,000 GSL words plus academic vocabulary typically covers a large percentage of academic texts, e.g. GSL + AWL = 86% (Coxhead, 2000). Thus VocabProfile has been more useful for the analysis of academic writing, rather than general English writing, where the percentage of academic vocabulary is lower, and the AWL words less important.
4. Laufer’s (1995) Beyond 2000 also produces a single figure based on an LFP-type analysis.
5. Barcroft (2002) proposes the Lexical Production Scoring Protocol-Written (LPSP-Written) as a way of quantifying the ability to spell words:
0.00 points None of word is written; this includes:
● nothing is written
● the letters present do not meet any ‘for 0.25’ criteria
● L1 word only is written
0.25 points 1/4 of word is written; this includes:
● any 1 letter is correct
● 25–49.9% of the letters are present
● correct number of syllables
0.50 points 1/2 of word is written; this includes:
● 25–49.9% of letters correct
● 50–74.9% of letters present
0.75 points 3/4 of word written; this includes:
● 50–99.9% of letters correct
● 75–100% of letters present
1 point Entire word is written; this includes:
● 100% letters correct.
6
Example Research Projects
1. Of course, there may be contexts where this assumption does not hold true, e.g. if the study was carried out in Nigeria.
7
Vocabulary Resources
1. Thanks to Paul Rayson for this example.
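The adjusted association measures listed in note 3 to Chapter 3 can be compared numerically. The Python sketch below is illustrative only: the function name `association_scores` and the sample counts are hypothetical, and it assumes the usual definition of MI as log2(O/E), where O is the observed and E the expected co-occurrence frequency of a word pair (see Evert, 2004, for the measures themselves).

```python
import math


def association_scores(observed: float, expected: float) -> dict:
    """Return MI and the three adjusted variants for one word pair.

    observed (O): how often the pair actually co-occurs in the corpus.
    expected (E): how often it would co-occur by chance.
    Both values are supplied by the caller; this sketch does not
    compute E from corpus counts.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must both be positive")

    mi = math.log2(observed / expected)  # assumed baseline: log2(O/E)
    return {
        "MI": mi,
        "local_MI": observed * mi,                   # O x log2(O/E)
        "MI2": math.log2(observed ** 2 / expected),  # log2(O^2/E)
        "MI3": math.log2(observed ** 3 / expected),  # log2(O^3/E)
    }


# Hypothetical counts: the pair occurs 8 times, 0.5 expected by chance.
scores = association_scores(observed=8, expected=0.5)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts, MI is 4 while MI3 is 10, which shows how raising O to a higher power gives more weight to the observed frequency.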
Adolphs, S. and Durow, V. (2004). Social-cultural integration and the development of formulaic sequences. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. pp. 107–126. Adolphs, S. and Schmitt, N. (2003). Lexical coverage of spoken discourse. Applied Linguistics 24, 4: 425–438. Aitchison, J. (2003). Words in the Mind (3rd edn). Oxford: Blackwell. Albrechtsen, D., Haastrup, K., and Henriksen, B. (2008). Vocabulary and Writing in a First and Second Language: Process and Development. Basingstoke: Palgrave Macmillan. Alderson, J.C. (2005). Diagnosing Foreign Language Proficiency. London: Continuum. Alderson, J.C. (2007). Judging the frequency of English words. Applied Linguistics 28, 3: 383–409. Alderson, J.C., Clapham, C.M., and Steel, D. (1997). Metalinguistic knowledge, language aptitude, and language proficiency. Language Teaching Research 1: 93–121. Al-Homoud, F. and Schmitt, N. (2009). Extensive reading in a challenging environment: A comparison of extensive and intensive reading approaches in Saudi Arabia. Language Teaching Research 13, 4: 383–402. Altenberg, B. (1998). On the phraseology of spoken English: The evidence of recurrent word-combinations. In Cowie, A.P. (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Oxford University Press. pp. 101–122. Altenberg, B. and Granger, S. (2001). The grammatical and lexical patterning of make in native and non-native student writing. Applied Linguistics 22, 2: 173–194. Altmann, G.T.M. and Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition 73: 247–264. Anderson, R.C. and Freebody, P. (1981). Vocabulary knowledge. In Guthrie, J.T. (ed.), Comprehension and Teaching: Research Reviews. Newark, DE: International Reading Association. Anderson, R.C. and Freebody, P. (1983). Reading comprehension and the assessment and acquisition of word knowledge. In Hutson, B.A. (ed.), Advances in Reading/ Language Research. 
Greenwich, CT: JAI Press. pp. 132–255. Arnaud, P.J.L., Bejoint, H., and Thoiron, P. (1985). A quoi sert le programme lexical? Les Langues Modernes 79, 3/4: 72–85. Ashby, M. (2006). Prosody and idioms in English. Journal of Pragmatics 38, 10: 1580–1597. Atay, D. and Kurt, G. (2006). Elementary school EFL learners’ vocabulary learning: The effects of post-reading activities. Canadian Modern Language Review 63, 2: 255–273. Bachman, L.F. (1990). Fundamental Considerations in Language Testing. Oxford: Oxford University Press. Bachman, L.F. and Palmer, A.S. (1996). Language Testing in Practice. Oxford: Oxford University Press. Baddeley, A. (1990). Human Memory: Theory and Practice. Needham Heights, MA: Allyn and Bacon. Bahns, J. and Eldaw, M. (1993). Should we teach EFL students collocations? System 21: 101–114.
References
Bahrick, H.P. (1984). Fifty years of language attrition: Implications for programmatic research. Modern Language Journal 68: 105–118. Barcroft, J. (2002). Semantic and structural elaboration in L2 lexical acquisition. Language Learning 52, 2: 323–363. Bardovi-Harlig, K. (2002). A new starting point? Investigating formulaic use and input in future expression. Studies in Second Language Acquisition 24: 189–198. Barfield, A. (2003). Collocation Recognition and Production: Research Insights. Chuo University, Japan. Barrow, J., Nakashimi, Y., and Ishino, H. (1999). Assessing Japanese College students’ vocabulary knowledge with a self-checking familiarity survey. System 27: 223–247. Bates, E. and MacWhinney, B. (1987). Competition, variation, and language learning. In MacWhinney, B. (ed.), Mechanisms of Language Acquisition. Hillsdale, NJ: Lawrence Erlbaum. pp. 157–193. Bauer, L. and Nation, I.S.P. (1993). Word families. International Journal of Lexicography 6: 253–279. Beeckmans, R., Eyckmans, J., Jansens, V., Dufranne, M., and van de Velde, H. (2001). Examining the Yes/No vocabulary test: Some methodological issues in theory and practice. Language Testing 18, 3: 235–274. Beglar, D. (2010). A Rasch-based validation of the vocabulary size test. Language Testing. Beglar D. and Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing 16: 131–162. Beks, B. (2001). Le degré des connaissances lexicales [The degree of lexical knowledge]. Unpublished MA thesis, Vrije Universiteit Amsterdam. Bell, H. (2002). Using frequency counts to assess L2 texts. PhD thesis, University of Wales. Bensoussan, M. and Laufer, B. (1984). Lexical guessing in context in EFL reading comprehension. Journal of Research in Reading 7, 1: 15–32. Bertram, R., Baayen, R., and Schreuder, R. (2000). Effects of family size for complex words. Journal of Memory and Language 42: 390–405. Bertram, R., Laine, M., and Virkkala, M. (2000). 
The role of derivational morphology in vocabulary acquisition: Get by with a little help from my morpheme friends. Scandinavian Journal of Psychology 41, 4: 287–296. Biber, D., Conrad, S., and Cortes, V. (2004). If you look at ... : Lexical bundles in university teaching and textbooks. Applied Linguistics 25, 3: 371–405. Biber, D., Johansson, S., Leech, G., Conrad, S., and Finegan, E. (1999). Longman Grammar of Spoken and Written English. Harlow: Longman. Bishop, H. (2004). The effect of typographic salience on the look up and comprehension of unknown formulaic sequences. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. Bley-Vroman, R. (2002). Frequency in production, comprehension, and acquisition. Studies in Second Language Acquisition 24, 2: 209–13. Blum, S. and Levenston, E.A. (1978). Universals of lexical simplification. Language Learning 28, 2: 399–416. Bogaards, P. and Laufer, B. (eds). (2004). Vocabulary in a Second Language. Amsterdam: John Benjamins. Bonk, W.J. (2001). Testing ESL learners’ knowledge of collocations. In Hudson, T. and Brown, J.D. (eds.), A Focus on Language Test Development: Expanding the Language Proficiency Construct Across a Variety of Tests. (Technical Report #21).
Honolulu: University of Hawai’i, Second Language Teaching and Curriculum Center. pp. 113–142. Brown, D. (in press). What aspects of vocabulary knowledge do textbooks give attention to? Language Teaching Research. Brown, F.G. (1983). Principles of Educational and Psychological Testing. New York: Holt, Rinehart and Winston. Brown, R. (1973). A First Language. London: Allen and Unwin. Brown, R. and McNeill, D. (1966). The ‘tip of the tongue’ phenomenon. Journal of Verbal Learning and Verbal Behavior 5: 325–337. Brown, R., Waring, R., and Donkaewbua, S. (2008). Vocabulary acquisition from reading, reading-while-listening, and listening to stories. Reading in a Foreign Language 20, 2: 136–163. Cameron, L. (2002). Measuring vocabulary size in English as an Additional Language. Language Teaching Research 6, 2: 145–173. Cain, K., Oakhill, J., and Lemmon, K. (2005). The relation between children’s reading comprehension level and their comprehension of idioms. Journal of Experimental Child Psychology 90, 1: 65–87. Carey, S. (1978). The child as word learner. In Halle, M., Bresnan, J., and Miller, G.A. (eds.), Linguistic Theory and Psychological Reality. Cambridge, MA: MIT Press. pp. 264–293. Carrell, P.L. and Grabe, W. (2002). Reading. In Schmitt, N. (ed.), An Introduction to Applied Linguistics. London: Arnold. Carroll, J.B., Davies, P., and Richman, B. (1971). The American Heritage Word Frequency Book. New York: American Heritage Publishing. Carter, R. (1998). Vocabulary: Applied Linguistic Perspectives (2nd edn). London: Routledge. Carter, R. and McCarthy, M. (2006). Cambridge Grammar of English. Cambridge: Cambridge University Press. Chamot, A.U. (1987). The learning strategies of ESL students. In Wenden, A. and Rubin, J. (eds.), Learner Strategies in Language Learning. New York: Prentice Hall. Cheng, W., Greaves, C., and Warren, M. (2006). From n-gram to skipgram to concgram. International Journal of Corpus Linguistics 11, 4: 411–433. Cho, K-S. and Krashen, S. (1994). 
Acquisition of vocabulary from the Sweet Valley Kids Series: adult ESL acquisition. Journal of Reading 37: 662–667. Church, K.W. and Hanks, P. (1990). Word association norms, mutual information, and lexicography. Computational Linguistics 16, 1: 22–29. Clarke, D.E. and Nation, I.S.P. (1980). Guessing the meanings of words from context: Strategy and techniques. System 8, 3: 211–220. Clear, J. (1993). Tools for the study of collocation. In Baker, M., Francis, G., and Tognini-Bonelli, E. (eds), Text and Technology: In Honour of John Sinclair. Amsterdam: Benjamins. pp. 271–292. Coady, J. and Huckin, T. (eds). (1997). Second Language Vocabulary Acquisition. Cambridge: Cambridge University Press. Cohen, A.D. (1989). Attrition in the productive lexicon of two Portuguese third language speakers. Studies in Second Language Acquisition 11, 2: 135–149. Cohen, A., Glasman, H., Rosenbaum-Cohen, P.R., Ferrara, J., and Fine, J. (1988). Reading English for specialized purposes: Discourse analysis and the use of student informants. In Carrell, P.L., Devine, J., and Eskey, D. (eds), Interactive Approaches to Second Language Reading. Cambridge: Cambridge University Press. pp. 152–167. Coltheart, M. (1981). The MRC Psycholinguistic Database. Quarterly Journal of Experimental Psychology, 33A: 497–505.
Conklin, K., Dijkstra, T., and van Heuven, W. (under review). Bilingual processing of grammatical gender information specific to one language: Evidence from eyetracking. Conklin, K. and Schmitt, N. (2008). Formulaic sequences: Are they processed more quickly than nonformulaic language by native and nonnative speakers? Applied Linguistics 29, 1: 72–89. Cooper, T.C. (1999). Processing of idioms by L2 learners of English. TESOL Quarterly 33: 233–262. Coulmas, F. (1979). On the sociolinguistic relevance of routine formulae. Journal of Pragmatics 3: 239–66. Coulmas, F. (1981). Conversational Routine. The Hague: Mouton. Cowie, A. (ed.). (1998). Phraseology: Theory, Analysis, and Applications. Oxford: Oxford University Press. Coxhead, A. (2000). A new academic word list. TESOL Quarterly 34: 213–238. Craik, F.I.M. and Lockhart, R.S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior 11: 671–684. Cruttenden, A. (1981). Item-learning and system-learning. Journal of Psycholinguistic Research 10: 79–88. Crystal, D. (1987). The Cambridge Encyclopedia of Language. Cambridge: Cambridge University Press. Cseresznyés, M. (2008) The reading tests. In Alderson, J.C., Nagy, E., and Öveges, E. (eds), English Language Education in Hungary Part 2. Accessed July 2008 from . Cutler, A., Mehler, J., Norris, D. and Segui, J. (1986). Limits on bilingualism. Nature 340: 229–30. Cutler, A. and Norris, D.G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance 14: 113–121. Dagut, M.B. and Laufer, B. (1985). Avoidance of phrasal verbs by English learners, speakers of Hebrew – a case for contrastive analysis. Studies in Second Language Acquisition 7: 73–79. Dale, E. (1965). Vocabulary measurement: Techniques and major findings. Elementary English 42: 895–901. Daller, H., Milton, J., and Treffers-Daller, J. (2007). 
Modelling and Assessing Vocabulary Knowledge. Cambridge: Cambridge University Press. D’Anna, C.A., Zechmeister, E.B., and Hall, J.W. (1991). Toward a meaningful definition of vocabulary size. Journal of Reading Behavior 23, 1: 109–122. Davies, M. (2002) Corpus del Español (100 million words, 1200s–1900s). Available online at . Davies, M. (2004). BYU-BNC: The British National Corpus. Available on-line at . Davies, M. (2007). TIME Magazine Corpus (100 million words, 1920s–2000s). Available on-line at . Davies, M. (accessed 2008) American Corpus of English. Accessed July 2008. Available on-line at . Davies, M., Biber, D., Jones, J., and Tracy, N. (2008) Corpus del Español: Registers. Accessed July 2008 at . Davies, M. and Ferreira, M. (2006). Corpus do Português (45 million words, 1300s–1900s). Available on-line at . Davis, M.H., Di Betta, A.M., Macdonald, M.J.E., and Gaskell, M.G. (2008). Learning and consolidation of novel spoken words. Journal of Cognitive Neuroscience 21, 4: 803–820.
Davou, M. (2008). Formulaic language in second language oral production. Presentation given at the Formulaic Language Research Network conference, University of Nottingham, June 2008. Day, R.R. and Bamford, J. (1998). Extensive Reading in the Second Language Classroom. Cambridge: Cambridge University Press. de Bot, K. (1992). A bilingual production model: Levelt’s speaking model adapted. Applied Linguistics 13, 1: 1–24. de Bot, K., Lowrie, W., and Verspoor, M. (2005). Second Language Acquisition. London: Routledge. de Bot, K. and Stoessel, S. (2000). In search of yesterday’s words: Reactivating a long forgotten language. Applied Linguistics 21, 3: 364–384. Dechert, H. (1983). How a story is done in a second language. In Faerch, C. and Kasper, G. (eds), Strategies in Interlanguage Communication. London: Longman. pp. 175–195. De Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and writing. In Mair, C. and Hundt, M. (eds), Corpus Linguistics and Linguistic Theory. Amsterdam: Rodopi. pp. 51–68. De Cock, S., Granger, S., Leech, G., and McEnery, T. (1998). An automated approach to the phrasicon of EFL learners. In Granger, S. (ed.), Learner English on Computer. London: Addison Wesley Longman. pp. 67–79. de Groot, A.M.B. (1992). Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition 18, 5: 1001–1018. de Groot, A.M.B. (2006). Effects of stimulus characteristics and background music on foreign language vocabulary learning and forgetting. Language Learning 56, 3: 463–506. de Groot, A.M.B. and Keijzer, R. (2000). What is hard to learn is easy to forget: The roles of word concreteness, cognate status, and word frequency in foreign-language vocabulary learning and forgetting. Language Learning 50, 1: 1–56. de Groot, A.M.B. and van Hell, J.G. (2005). The learning of foreign language vocabulary. In Kroll, J.F. and de Groot, A.M.B. (eds), Handbook of Bilingualism. Oxford: Oxford University Press. DeKeyser, R. 
(2003). Implicit and explicit learning. In Doughty, C.J. and Long, M.H. (eds), The Handbook of Second Language Acquisition. Malden, MA: Blackwell. pp. 313–348. Dörnyei, Z. (2001a). Motivational Strategies in the Language Classroom. Cambridge: Cambridge University Press. Dörnyei, Z. (2001b). Teaching and Researching Motivation. Harlow, UK: Pearson Education. Dörnyei, Z. (2005). The Psychology of the Language Learner. Mahwah, NJ: Lawrence Erlbaum. Dörnyei, Z. (2007). Research Methods in Applied Linguistics. Oxford: Oxford University Press. Dörnyei, Z. (2009). The Psychology of Second Language Acquisition. Oxford: Oxford University Press. Dörnyei, Z., Durow, V., and Zahran, K. (2004). Individual differences and their effects on formulaic sequence acquisition. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. pp. 87–106. Doughty, C.J. (2003). Instructed SLA: Constraints, compensation, and enhancement. In Doughty, C.J. and Long, M.H. (eds), The Handbook of Second Language Acquisition. Malden, MA: Blackwell.
Drew, P. and Holt, E. (1998). Figures of speech: Figurative expressions and the management of topic transition in conversation. Language in Society 27: 495–522. Dumay, N. and Gaskell, M.G. (2007). Sleep-associated changes in the mental representation of spoken words. Psychological Science 18: 35–39. Dumay, N., Gaskell, M.G., and Feng, X. (2004). A day in the life of a spoken word. In Forbus, K., Gentner, D., and Regier, T. (eds), Proceedings of the Twenty-Sixth Annual Conference of the Cognitive Science Society. Mahwah, NJ: Erlbaum. pp. 339–344. Durrant, P. (2007). Review of Nadja Nesselhauf’s Collocations in a learner corpus. Functions of Language 14, 2: 251–261. Durrant, P. (2008). High Frequency Collocations and Second Language Learning. Unpublished PhD dissertation, University of Nottingham. Durrant, P. and Schmitt, N. (2009). To what extent do native and non-native writers make use of collocations? International Review of Applied Linguistics 47: 157–177. Durrant, P. and Schmitt, N. (in press). Adult learners’ retention of collocations from exposure. Second Language Research. Dunning, T. (1993). Accurate methods for the statistics of surprise and coincidence. Computational Linguistics 19: 61–74. Duyck, W., van Assche, E., Drieghe, D., and Hartsuiker, R. (2007). Visual word recognition by bilinguals in a sentence context: Evidence for nonselective lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition 33, 4: 663–679. Ellis, N.C. (1996). Sequencing in SLA: Phonological memory, chunking, and points of order. Studies in Second Language Acquisition 18: 91–126. Ellis, N.C. (1997). Vocabulary acquisition: Word structure, collocation, word-class, and meaning. In Schmitt, N. (ed.), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. Ellis, N.C. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. 
Studies in Second Language Acquisition 24: 143–188. Ellis, N.C. (2006a). Language acquisition as rational contingency learning. Applied Linguistics 27, 1: 1–24. Ellis, N.C. (2006b). Selective attention and transfer phenomena in L2 acquisition: Contingency, cue competition, salience, interference, overshadowing, blocking, and perceptual learning. Applied Linguistics 27, 2: 164–194. Ellis, N.C. and Beaton, A. (1993). Psycholinguistic determinants of foreign language vocabulary learning. Language Learning 43: 559–617. Ellis, N.C., Simpson-Vlach, R., and Maynard, C. (2008). Formulaic language in native and second-language speakers: Psycholinguistics, corpus linguistics, and TESOL. TESOL Quarterly 41, 3: 375–396. Ellis, R. and He, X. (1999). The roles of modified input and output in the incidental acquisition of word meanings. Studies in Second Language Acquisition 21: 285–301. Ellis, R., Tanaka, Y. and Yamazaki, A. (1994). Classroom interaction, comprehension, and the acquisition of L2 word meanings. Language Learning 44: 449–491. Entwisle, D.R. (1966). Word Associations of Young Children. Baltimore, MD: Johns Hopkins Press. Entwisle, D.R., Forsyth, D.F., and Muuss, R. (1964). The syntactic-paradigmatic shift in children’s word associations. Journal of Verbal Learning and Verbal Behavior 3: 19–29. Ehrman, M.E., Leaver, B.L., and Oxford, R.L. (2003). A brief overview of individual differences in second language learning. System 31: 313–30.
Erman, B. and Warren, B. (2000). The idiom principle and the open choice principle. Text 20, 1: 29–62. Ervin, S.M. (1961). Changes with age in the verbal determinants of word association. American Journal of Psychology 74: 361–372. Evert, S. (2004). Computational approaches to collocations. Available on-line at . Evert, S. and Krenn, B. (2001). Methods for the qualitative evaluations of lexical association measures. Paper presented at the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France. Available on-line at . Fan, M. (2000). How big is the gap and how to narrow it? An investigation into the active and passive vocabulary knowledge of L2 learners. RELC Journal 31, 2: 105–119. Fan, M. (2003). Frequency of use, perceived usefulness, and actual usefulness of second language vocabulary strategies: A study of Hong Kong learners. Modern Language Journal 87, 2: 222–240. Farghal, M. and Obiedat, H. (1995). Collocations: A neglected variable in EFL. International Journal of Applied Linguistics 28, 4: 313–331. Fellbaum, C. (ed.). (2007). Idioms and Collocations. London: Continuum. Field, A. (2005). Discovering Statistics Using SPSS. London: Sage. Firth, J. (1935). The technique of semantics. Transactions of the Philological Society: 36–72. Fitzpatrick, T. (2006). Habits and rabbits: Word associations and the L2 lexicon. EUROSLA Yearbook 6: 121–145. Fitzpatrick, T. (2007). Word association patterns: Unpacking the assumptions. International Journal of Applied Linguistics 17: 319–31. Flowerdew, J. (1992). Definitions in science lectures. Applied Linguistics 13, 2: 201–221. Folse, K.S. (2004). Vocabulary Myths. Ann Arbor: University of Michigan Press. Folse, K.S. (2006). The effect of type of written exercise on L2 vocabulary retention. TESOL Quarterly 40, 2: 273–293. Foster, P. (2001). Rules and routines: A consideration of their role in the task-based language production of native and non-native speakers. 
In Bygate, M., Skehan, P., and Swain, M. (eds), Researching Pedagogic Tasks: Second Language Learning, Teaching, and Testing. Harlow: Longman. pp. 75–94. Francis, W.N. and Kucera, H. (1982). Frequency Analysis of English Usage. Boston: Houghton Mifflin. Fraser, C.A. (1999). Lexical processing strategy use and vocabulary learning through reading. Studies in Second Language Acquisition 21: 225–241. Frayn, M. (2002). Spies. London: Faber and Faber. Fukkink, R.G. and De Glopper, K. (1998). Effects of instruction in deriving word meaning from context: A meta-analysis. Review of Educational Research 68, 4: 450–69. Gaskell, M.G. and Dumay, N. (2003). Lexical competition and the acquisition of novel words. Cognition 89: 105–132. Gibbs, R., Bogdanovich, J., Sykes, J., and Barr, D. (1997). Metaphor in idiom comprehension. Journal of Memory and Language 37: 141–154. Gläser, R. (1998). The stylistic potential of phraseological units in the light of genre analysis. In Cowie, A. (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Oxford University Press. pp. 125–143. Gleason, J.B. (2005). The Development of Language. Boston: Pearson Education.
Goulden, R., Nation, P., and Read, J. (1990). How large can a receptive vocabulary be? Applied Linguistics 11, 4: 341–363. Grabe, W. and Stoller, F.L. (2002). Teaching and Researching Reading. Harlow: Longman. Graesser, A.C., McNamara, D.S., Louwerse, M.M., and Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavioral Research Methods, Instruments, and Computers, 36: 193–202. Grainger, J. and Dijkstra, T. (1992). On the representation and use of language information in bilinguals. In Harris, R.J. (ed.), Cognitive Processing in Bilinguals. Amsterdam: North Holland. pp. 207–220. Granger, S. (1993). Cognates: An aid or a barrier to successful L2 vocabulary development. ITL: Review of Applied Linguistics 99–100: 43–56. Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and formulae. In Cowie, A.P. (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Oxford University Press. pp. 145–160. Granger, S. and Meunier, F. (eds). (2008). Phraseology: An Interdisciplinary Perspective. Amsterdam: John Benjamins. Granger, S., Paquot, M., and Rayson, P. (2006). Extraction of multi-word units from EFL and native English corpora: The phraseology of the verb ‘make’. In Häcki Buhofer, A. and Burger, H. (eds), Phraseology in Motion I: Methoden und Kritik. Akten der Internationalen Tagung zur Phraseologie (Basel, 2004). Baltmannsweiler: Schneider Verlag Hohengehren. pp. 57–68. Greidanus, T., Bogaards, P., van der Linden, E., Nienhuis, L., and de Wolf, T. (2004). The construction and validation of a deep word knowledge test for advanced learners of French. In Bogaards, P. and Laufer, B. (eds), Vocabulary in a Second Language. Amsterdam: John Benjamins. Greidanus, T. and Nienhuis, L. (2001). Testing the quality of word knowledge in a second language by means of word associations: Types of distractors and types of associations. Modern Language Journal 85, 4: 567–577. Grendel, M. (1993). 
Verlies en herstel van lexicale kennis [Attrition and recovery of lexical knowledge]. Unpublished doctoral dissertation, University of Nijmegen. Groot, P.J.M. (2000). Computer assisted second language vocabulary acquisition. Language Learning and Technology 4, 1: 60–81. Gu, Y. and Johnson, R.K. (1996). Vocabulary learning strategies and language learning outcomes. Language Learning 46, 4: 643–679. Gyllstad, H. (2005). Words that go together well: Developing test formats for measuring learner knowledge of English collocations. In Heinat, F. and Klingvall, E. (eds), The Department of English in Lund: Working Papers in Linguistics 5. pp. 1–31. Available on-line at . Gyllstad, H. (2007). Testing English collocations. Lund University: PhD thesis. Haastrup, K. (1991). Lexical Inferencing Procedures. Tübingen: Gunter Narr Verlag. Haastrup, K. (2008). Lexical inferencing procedures in two languages. In Albrechtsen, D., Haastrup, K., and Henriksen, B., Vocabulary and Writing in a First and Second Language: Process and Development. Basingstoke: Palgrave Macmillan. pp. 67–111. Haastrup, K. and Henriksen, B. (2000). Vocabulary acquisition: Acquiring depth of knowledge through network building. International Journal of Applied Linguistics 10, 2: 221–240. Hall, C.J. (2002). The automatic cognate form assumption: Evidence for the parasitic model of vocabulary development. International Review of Applied Linguistics in Language Teaching 40: 69–87.
Hansen, L. and Chen, Y-L. (2001). What counts in the acquisition and attrition of numeral classifiers? JALT (Japan Association for Language Teaching) Journal 23, 1: 90–110. Hansen, L., Umeda, Y., and McKinney, M. (2002). Savings in the relearning of second language vocabulary: The effects of time and proficiency. Language Learning 52, 4: 653–678. Harley, T.A. (2008). The Psychology of Language: From Data to Theory (3rd edn). Hove: Psychology Press. Haynes, M. (1993). Patterns and perils of guessing in second language reading. In Huckin, T., Haynes, M., and Coady, J. (eds), Second Language Reading and Vocabulary Learning. Norwood, NJ: Ablex. pp. 46–65. Hazenberg, S. and Hulstijn, J.H. (1996). Defining a minimal receptive second-language vocabulary for non-native university students: An empirical investigation. Applied Linguistics 17, 2: 145–163. Heigham, J. and Croker, R.A. (2009). Qualitative Research in Applied Linguistics: A Practical Introduction. Basingstoke: Palgrave Macmillan. Hemchua, S. and Schmitt, N. (2006). An analysis of lexical errors in the English compositions of Thai learners. Prospect 21, 3: 3–25. Henriksen, B. (1999). Three dimensions of vocabulary development. Studies in Second Language Acquisition 21, 2: 303–317. Henriksen, B. (2008). Declarative lexical knowledge. In Albrechtsen, D., Haastrup, K., and Henriksen, B., Vocabulary and Writing in a First and Second Language. Basingstoke: Palgrave Macmillan. Hill, M. and Laufer, B. (2003). Type of task, time-on-task and electronic dictionaries in incidental vocabulary acquisition. International Review of Applied Linguistics in Language Teaching 41, 2: 87–106. Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. London: Routledge. Hoffman, S. and Lehmann, H.M. (2000). Collocational evidence from the British National Corpus. In Kirk, J.M. (ed.), Corpora Galore: Analyses and Techniques in Describing English. 
Papers from the Nineteenth International Conference on English Language Research on Computerised Corpora (ICAME 1998). Amsterdam: Rodopi. pp. 17–32. Hofland, K. and Johansson, S. (1982). Word Frequencies in British and American English. Bergen: Norwegian Computing Centre for the Humanities. Holley, F.M. and King, J.K. (1971). Vocabulary glosses in foreign language reading materials. Language Learning 21: 213–219. Horst, M. (2005). Learning L2 vocabulary through extensive reading: A measurement study. Canadian Modern Language Review 61, 3: 355–382. Horst, M., Cobb, T., and Meara, P. (1998). Beyond A Clockwork Orange: Acquiring second language vocabulary through reading. Reading in a Foreign Language 11, 2: 207–223. Horst, M. and Collins, L. (2006). From faible to strong: How does their vocabulary grow? Canadian Modern Language Review 63, 1: 83–106. Howarth, P. (1998). The phraseology of learners’ academic writing. In Cowie, A. (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Oxford University Press. pp. 161–186. Howatt, A.P.R. (2004). A History of English Language Teaching (2nd edn). Oxford: Oxford University Press. Hu, M. and Nation, I.S.P. (2000). Vocabulary density and reading comprehension. Reading in a Foreign Language 23, 1: 403–430.
Hughes, A. (2003). Testing for Language Teachers. Cambridge: Cambridge University Press. Hughes, G. (2000). A History of English Words. Oxford: Blackwell. Huibregtse, I., Admiraal, W., and Meara, P. (2002). Scores on a yes–no vocabulary test: Correction for guessing and response style. Language Testing 19: 227–245. Hulstijn, J.H. (1992). Retention of inferred and given word meanings: Experiments in incidental vocabulary learning. In Arnaud, P.J.L. and Béjoint, H. (eds), Vocabulary and Applied Linguistics. London: Macmillan. pp. 113–125. Hulstijn, J.H. (2007). Psycholinguistic perspectives on language and its acquisition. In Cummins, J. and Davison, C. (eds), International Handbook of English Language Teaching. New York: Springer. pp. 783–795. Hulstijn, J.H., Hollander, M. and Greidanus, T. (1996). Incidental vocabulary learning by advanced foreign language students: The influence of marginal glosses, dictionary use, and reoccurrence of unknown words. Modern Language Journal 80: 327–339. Hulstijn, J.H. and Laufer, B. (2001). Some empirical evidence for the involvement load hypothesis in vocabulary acquisition. Language Learning 51, 3: 539–558. Hulstijn, J.H. and Trompetter, P. (1998). Incidental learning of second language vocabulary in computer-assisted reading and writing tasks. In Albrechtsen, D., Hendricksen, B., Mees, M., and Poulsen, E. (eds), Perspectives on Foreign and Second Language Pedagogy. Odense, Denmark: Odense University Press. pp. 191–200. Hulstijn, J.H., van Gelderen, A., and Schoonen, R. (2009). Automatization in second-language acquisition: What does the coefficient of variation tell us? Applied Psycholinguistics 30, 4: 555–582. Hunston, S. (2002). Corpora in Applied Linguistics. Cambridge: Cambridge University Press. Hunston, S. (2007). Semantic prosody revisited. International Journal of Corpus Linguistics 12, 2: 249–268. Hunston, S. and Francis, G. (2000). Pattern Grammar. Amsterdam: John Benjamins. Hunt, A. and Beglar, D. (1998). 
Current research and practice in teaching vocabulary. The Language Teacher. Accessed January 1998. Available on-line at . Hunt, A. and Beglar, D. (2005). A framework for developing EFL reading vocabulary. Reading in a Foreign Language 17, 1: 23–59. Hutchison, K.A. (2003). Is semantic priming due to association strength or feature overlap? A micro-analytic review. Psychonomic Bulletin and Review 10, 4: 785–813. Hyland, K. and Tse, P. (2007). Is there an ‘Academic Vocabulary’? TESOL Quarterly 41, 2: 235–253. Irujo, S. (1986). A piece of cake: Learning and teaching idioms. ELT Journal 40: 236–242. Irujo, S. (1993). Steering clear: Avoidance in the production of idioms. International Review of Applied Linguistics in Language Teaching 31: 205–219. Jackendoff, R. (1995). The boundaries of the lexicon. In Everaert, M., van der Linden, E., Schenk, A., and Schreuder, R. (eds), Idioms: Structural and Psychological Perspectives. Hillsdale, NJ: Erlbaum. pp. 133–166. Jacobs, G.M., Dufon, P., and Fong, C.H. (1994). L1 and L2 vocabulary glosses in L2 reading passages: Their effectiveness for increasing comprehension and vocabulary knowledge. Journal of Research in Reading 17: 19–28. Jarvis, S. (2000). Methodological rigor in the study of transfer: Identifying L1 influence in the interlanguage lexicon. Language Learning 50, 2: 245–309.
Jarvis, S. (2002). Short texts, best-fitting curves and new measures of lexical diversity. Language Testing 19, 1: 57–84. Jezzard, P., Matthews, P.M., and Smith, S.M. (eds). (2003). Functional Magnetic Resonance Imaging: An Introduction to Methods. Oxford: Oxford University Press. Jiang, N. (2002). Form-meaning mapping in vocabulary acquisition in a second language. Studies in Second Language Acquisition 24: 617–637. Jiang, N. and Nekrasova, T.M. (2007). The processing of formulaic sequences by second language speakers. Modern Language Journal 91, 3: 433–445. Joe, A. (1995). Text-based tasks and incidental vocabulary learning: A case study. Second Language Research 11: 149–158. Joe, A. (1998). What effects do text-based tasks promoting generation have on incidental vocabulary acquisition? Applied Linguistics 19, 3: 357–377. Joe, A.G. (2006). The nature of encounters with vocabulary and long-term vocabulary acquisition. Unpublished PhD thesis, Victoria University of Wellington. Johansson, S. and Hofland, K. (1989). Frequency Analysis of English Vocabulary and Grammar Volumes 1 & 2. Oxford: Clarendon Press. Johnston, M.H. (1974). Word associations of schizophrenic children. Psychological Reports 35: 663–674. Keller, E. (1981). Gambits: Conversation strategy signals. In Coulmas, F. (ed.), Conversational Routine. The Hague: Mouton. pp. 93–113. Kilgarriff, A. BNC frequency lists. Available on-line at . Kiss, G.R., Armstrong, C., Milroy, R., and Piper, J. (1973). An associative thesaurus of English and its computer analysis. In Aitken, A.J., Bailey, R.W., and Hamilton-Smith, N. (eds), The Computer and Literary Studies. Edinburgh: Edinburgh University Press. Knight, S. (1994). Dictionary: The tool of last resort in foreign language reading? A new perspective. Modern Language Journal 78: 285–299. Koda, K. (1997). Orthographic knowledge in L2 lexical processing. In Coady, J. and Huckin, T. (eds), Second Language Vocabulary Acquisition. Cambridge: Cambridge University Press. 
Koda, K. (1998). The role of phonemic awareness in second language reading. Second Language Research 14, 2: 194–215. Kruse, H., Pankhurst, J., and Sharwood Smith, M. (1987). A multiple word association probe in second language acquisition research. Studies in Second Language Acquisition 9, 2: 141–154. Kučera, H. and Francis, W.N. (1967). Computational Analysis of Present-Day American English. Providence, Rhode Island: Brown University Press. Kuhn, M.R. and Stahl, S.A. (1998). Teaching children to learn word meanings from context: A synthesis and some questions. Journal of Literacy Research 30, 1: 119–38. Kuiper, K. (1996). Smooth Talkers. Hillsdale, NJ: Lawrence Erlbaum. Kuiper, K. (2004). Formulaic performance in conventionalised varieties of speech. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. pp. 37–54. Kuiper, K. (2009). Formulaic Genres. Basingstoke: Palgrave Macmillan. Kuiper, K., Columbus, G., and Schmitt, N. (2009). Acquiring phrasal vocabulary. In Foster-Cohen, S. (ed.), Advances in Language Acquisition. Basingstoke: Palgrave Macmillan. Kuiper, K. and Flindall, M. (2000). Social rituals, formulaic speech and small talk at the supermarket checkout. In Coupland, J. (ed.), Small Talk. Harlow: Longman. pp. 183–207.
Kuiper, K. and Haggo, D. (1984). Livestock auctions, oral poetry, and ordinary language. Language in Society 13: 205–234. Kuiper, K., van Egmond, M-E., Kempen, G., and Sprenger, S. (2007). Slipping on superlemmas: Multi-word lexical items in speech production. The Mental Lexicon 2, 3: 313–357. Kutas, M., Van Petten, C.K., and Kluender, R. (2006). Psycholinguistics electrified II (1994–2005). In Traxler, M.J. and Gernsbacher, M.A. (eds), Handbook of Psycholinguistics (2nd edn). London: Academic Press. pp. 659–724. Lambert, W.E. and Moore, N. (1966). Word association responses: Comparisons of American and French monolinguals with Canadian monolinguals and bilinguals. Journal of Personality and Social Psychology 3, 3: 313–320. Larsen-Freeman, D. (1975). The acquisition of grammatical morphemes by adult ESL students. TESOL Quarterly 9: 409–430. Laufer, B. (1988). The concept of ‘synforms’ (similar lexical forms) in vocabulary acquisition. Language and Education 2, 2: 113–132. Laufer, B. (1989). What percentage of text-lexis is essential for comprehension? In Lauren, C. and Nordman, M. (eds), Special Language: From Humans Thinking to Thinking Machines. Clevedon: Multilingual Matters. Laufer, B. (1992). How much lexis is necessary for reading comprehension? In Arnaud, P.J.L. and Béjoint, H. (eds), Vocabulary and Applied Linguistics. London: Macmillan. pp. 126–132. Laufer, B. (1995). Beyond 2000 – a measure of productive lexicon in a second language. In Eubank, L., Sharwood-Smith, M., and Selinker, L. (eds), The Current State of Interlanguage. Amsterdam: John Benjamins. pp. 265–272. Laufer, B. (1997). What’s in a word that makes it hard or easy? Intralexical factors affecting the difficulty of vocabulary acquisition. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. pp. 140–155. Laufer, B. (1998). The development of passive and active vocabulary in a second language: Same or different? 
Applied Linguistics 12: 255–271. Laufer, B. (2000a). Task effect on instructed vocabulary learning: The hypothesis of ‘involvement’. Selected Papers from AILA ‘99 Tokyo. Tokyo: Waseda University Press. pp. 47–62. Laufer, B. (2000b). Avoidance of idioms in a second language: The effect of L1-L2 degree of similarity. Studia Linguistica 54: 186–196. Laufer, B. (2001). Quantitative evaluation of vocabulary: How it can be done and what it is good for. In Elder, C., Hill, K., Brown, A., Iwashita, N., Grove, L., Lumley, T., and McNamara, T. (eds), Experimenting with Uncertainty. Cambridge: Cambridge University Press. Laufer, B. (2005a). Focus on form in second language vocabulary learning. EUROSLA Yearbook 5: 223–250. Laufer, B. (2005b). Lexical frequency profiles: From Monte Carlo to the real world. A response to Meara (2005). Applied Linguistics 26, 4: 582–588. Laufer, B., Elder, C., Hill, K., and Congdon, P. (2004). Size and strength: Do we need both to measure vocabulary knowledge? Language Testing 21, 2: 202–226. Laufer, B. and Eliasson, S. (1993). What causes avoidance in L2 learning: L1-L2 difference, L1-L2 similarity, or L2 complexity? Studies in Second Language Acquisition 15: 35–48. Laufer, B. and Goldstein, Z. (2004). Testing vocabulary knowledge: Size, strength, and computer adaptiveness. Language Learning 54, 3: 399–436.
Laufer, B. and Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics 22, 1: 1–26. Laufer, B. and Nation, P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics 16, 3: 307–322. Laufer, B. and Nation, P. (1999). A vocabulary-size test of controlled productive vocabulary. Language Testing 16, 1: 33–51. Laufer, B. and Paribakht, T.S. (1998). The relationship between passive and active vocabularies: Effects of language learning context. Language Learning 48, 3: 365–391. Laufer, B. and Shmueli, K. (1997). Memorizing new words: Does teaching have anything to do with it? RELC Journal 28: 89–108. Leech, G., Rayson, P., and Wilson, A. (2001). Word Frequencies in Written and Spoken English. Harlow: Longman. Lennon, P. (2000). The lexical element in spoken second language fluency. In Riggenbach, H. (ed.), Perspectives on Fluency. Ann Arbor: University of Michigan Press. pp. 25–42. Levelt, W.J.M. (1989). Speaking. From Intention to Articulation. Cambridge, Mass: MIT Press. Li, J. and Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study. Journal of Second Language Writing 18: 85–102. Li, J. and Schmitt, N. (in press). The development of collocation use in academic texts by advanced L2 learners: A multiple case-study approach. In Woods, D. (ed.), Perspectives on Formulaic Language in Communication and Acquisition. London: Continuum. Li, P., Farkas, I., and MacWhinney, B. (2004). Early lexical development in a self organising neural network. Neural Networks 17: 1345–1362. Liao, P. (2006). EFL learners’ beliefs about and strategy use of translation in English learning. RELC Journal 37, 2: 191–215. Lin, P.M.S. (in preparation). Are formulaic sequences phonologically coherent as we assumed? Unpublished PhD thesis, University of Nottingham. Liu, N. and Nation, I.S.P. (1985). 
Factors affecting guessing vocabulary in context. RELC Journal 16: 33–42. Longman Language Activator. (1993). Harlow: Longman. Lotto, L. and de Groot, A.M.B. (1998). Effects of learning method and word type on acquiring vocabulary in an unfamiliar language. Language Learning 48, 1: 31–69. Lüdeling, A. and Kytö, M. (eds). (2009). Corpus Linguistics: An International Handbook. Berlin: Mouton de Gruyter. Luppescu, S. and Day, R.R. (1993). Reading, dictionaries, and vocabulary learning. Language Learning 43, 2: 263–287. MacAndrew, R. (2002). Inspector Logan. Cambridge English Readers Series. Cambridge: Cambridge University Press. Malvern, D., Richards, B.J., Chipere, N., and Durán, P. (2004). Lexical Diversity and Language Development: Quantification and Assessment. Basingstoke: Palgrave Macmillan. Manning, C.D. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Mason, B. and Krashen, S. (2004). Is form-focused vocabulary instruction worthwhile? RELC Journal 35, 2: 179–185. McArthur, T. (ed.). (1992). The Oxford Companion to the English Language. Oxford: Oxford University Press. McCarthy, M. (1990). Vocabulary. Oxford: Oxford University Press.
McCarthy, M. and Carter, R. (1997). Written and spoken vocabulary. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. McCarthy, M. and Carter, R. (2002). This that and the other: Multi-word clusters in spoken English as visible patterns of interaction. Teanga (Yearbook of the Irish Association for Applied Linguistics) 21: 30–52. McCarthy, P.M. and Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing 24, 4: 459–488. McCrostie, J. (2007). Investigating the accuracy of teachers’ word frequency intuitions. RELC Journal 38, 1: 53–66. McDonough, K. and Trofimovich, P. (2008). Using Priming Methods in Second Language Research. London: Routledge/Taylor and Francis. McGee, I. (2008). Word frequency estimates revisited – A response to Alderson (2007). Applied Linguistics 29, 3: 509–514. McMillion, A. and Shaw, P. (2008). The balance of speed and accuracy in advanced L2 reading comprehension. Nordic Journal of English Studies, December. McMillion, A. and Shaw, P. (2009). Comprehension and compensatory processing in advanced L2 readers. In Brantmeier, C. (ed.), Empirical Research on Adult Foreign Language Reading. New York: Information Age Publishing. McNeill, A. (1996). Vocabulary knowledge profiles: Evidence from Chinese-speaking ESL teachers. Hong Kong Journal of Applied Linguistics 1: 39–63. Meara, P. (1980). Vocabulary acquisition: A neglected aspect of language learning. Language Teaching & Linguistics: Abstracts 13, 4: 221–246. Meara, P. (1983). Word associations in a foreign language. Nottingham Linguistic Circular 11, 2: 29–38. Meara, P. (1987). Vocabulary in a second language, Vol. 2. Specialised Bibliography 4. London: CILT. Meara, P. (1990). A note on passive vocabulary. Second Language Research 6, 2: 150–154. Meara, P. (1992). EFL Vocabulary Tests. University College, Swansea: Centre for Applied Language Studies. Meara, P. (1996a). 
The classical research in vocabulary acquisition. In Anderman, G. and Rogers, M. (eds), Words, Words, Words. Clevedon: Multilingual Matters. pp. 27–40. Accessed on-line at . Meara, P.M. (1996b). The dimensions of lexical competence. In Brown, G., Malmkjaer, K., and Williams, J. (eds), Performance and Competence in Second Language Acquisition. Cambridge: Cambridge University Press. pp. 35–53. Meara, P. (1996c). The vocabulary knowledge framework. Available on-line at . Meara, P. (1997). Towards a new approach to modelling vocabulary acquisition. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. pp. 109–121. Meara, P. (1999). Lexis: Acquisition. In Spolsky, B. (ed.), Concise Encyclopedia of Educational Linguistics. Amsterdam: Elsevier. pp. 565–567. Meara, P. (2004). Modelling vocabulary loss. Applied Linguistics 25, 2: 137–155. Meara, P. (2005). Lexical frequency profiles: A Monte Carlo analysis. Applied Linguistics 26, 1: 32–47. Meara, P. (2006). Emergent properties of multilingual lexicons. Applied Linguistics 27, 4: 620–644. Meara, P. (accessed 2008). P_Lex v2.0: The Manual. Available on the _lognostics website, .
Meara, P. (2009). Connected Words: Word Associations and Second Language Vocabulary Acquisition. Amsterdam: John Benjamins. Meara, P.M. and Bell, H. (2001). P_Lex: A simple and effective way of describing the lexical characteristics of short L2 texts. Prospect 16, 3: 5–19. Meara, P.M. (in preparation b). L2 vocabulary profiles and Zipf’s law. Cited in the V_Size: The Manual. Meara, P.M. and Miralpeix, I. Accessed on-line at . Meara, P. and Miralpeix, I. (accessed 2008a). V_Size: The Manual. Available on the _lognostics website, . Meara, P. and Miralpeix, I. (accessed 2008b). D_Tools: The Manual. Available on the _lognostics website, . Meara, P. and Wolter, B. (2004). V_Links: Beyond vocabulary depth. In Albrechtsen, D., Haastrup, K., and Henriksen, B. (eds), Angles on the English-Speaking World. Copenhagen: Museum Tusculanum/University of Copenhagen Press. pp. 85–96. Mel’cuk, I.A. (1995). Phrasemes in language and phraseology in linguistics. In Everaert, M., van der Linden, E.-J., Schenk, A., and Schreuder, R. (eds), Idioms: Structural and Psychological Perspectives. Hillsdale, NJ: Lawrence Erlbaum. pp. 167–232. Melka, F. (1997). Receptive vs. productive aspects of vocabulary. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. pp. 84–102. Milton, J. and Hales, T. (1997). Applying a lexical profiling system to technical English. In Ryan, A. and Wray, A. (eds), Evolving Models of Language. Clevedon: Multilingual Matters. pp. 72–83. Milton, J. and Hopkins, N. (2006). Comparing phonological and orthographic vocabulary size: Do vocabulary tests underestimate the knowledge of some learners? Canadian Modern Language Review 63, 1: 127–147. Milton, J. and Meara, P. (1998). Are the British really bad at learning foreign languages? Language Learning Journal 18: 68–76. Miralpeix, I. (2007). Lexical knowledge in instructed language learning: The effects of age and exposure. 
International Journal of English Studies 7, 2: 61–83. Miralpeix, I. (2008). The influence of age on vocabulary acquisition in English as a Foreign Language. PhD thesis, Universitat de Barcelona. Mochida, A. and Harrington, M. (2006). The yes/no test as a measure of receptive vocabulary knowledge. Language Testing 23: 73–98. Mochizuki, M. (2002). Exploration of two aspects of vocabulary knowledge: Paradigmatic and collocational. Annual Review of English Language Education in Japan 13:121–129. Mondria, J-A. (2003). The effects of inferring, verifying, and memorizing on the retention of L2 word meanings. Studies in Second Language Acquisition 25: 473–499. Moon, R. (1997). Vocabulary connections: Multi-word items in English. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. pp. 40–63. Moon, R. (1998). Fixed Expressions and Idioms in English: A Corpus-based Approach. Oxford: Oxford University Press. Morris, L. and Cobb, T. (2004). Vocabulary profiles as predictors of the academic performance of Teaching English as a Second Language trainees. System 32: 75–87. Moss, H.E. and Older, L.J.E. (1996). Birkbeck Word Association Norms. London: Taylor and Francis. Myles, F., Hooper, J., and Mitchell, R. (1998). Rote or rule? Exploring the role of formulaic language in classroom foreign language learning. Language Learning 48, 3: 323–363.
Nagy, W.E., Anderson, R., Schommer, M., Scott, J.A., and Stallman, A. (1989). Morphological families in the internal lexicon. Reading Research Quarterly 24, 3: 263–282. Nagy, W.E., Diakidoy, I.N., and Anderson, R.C. (1993). The acquisition of morphology: Learning the contribution of suffixes to the meanings of derivatives. Journal of Reading Behavior 25: 155–170. Namei, S. (2004). Bilingual lexical development: A Persian-Swedish word association study. International Journal of Applied Linguistics 14, 3: 363–388. Nassaji, H. (2003). L2 vocabulary learning from context: Strategies, knowledge sources, and their relationship with success in L2 lexical inferencing. TESOL Quarterly 37, 4: 645–670. Nation, P. (1983). Testing and teaching vocabulary. Guidelines 5: 12–25. Nation, I.S.P. (1990). Teaching and Learning Vocabulary. New York: Newbury House. Nation, I.S.P. (1993). Using dictionaries to estimate vocabulary size: Essential, but rarely followed, procedures. Language Testing 10, 1: 27–40. Nation, P. (1995). The word on words: An interview with Paul Nation. The Language Teacher 19, 2: 5–7. Nation, I.S.P. (2001). Learning Vocabulary in Another Language. Cambridge: Cambridge University Press. Nation, I.S.P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review 63, 1: 59–82. Nation, I.S.P. (2008). Teaching Vocabulary: Strategies and Techniques. Boston: Heinle. Nation, P. and Gu, P.Y. (2007). Focus on Vocabulary. Sydney: National Centre for English Language Teaching and Research. Nation, I.S.P. and Hwang, K. (1995). Where would general service vocabulary stop and special purposes vocabulary begin? System 23, 1: 35–41. Nation, P. and Meara, P. (2002). Vocabulary. In Schmitt, N. (ed.), An Introduction to Applied Linguistics. London: Arnold. Nation, I.S.P. and Wang, K. (1999). Graded readers and vocabulary. Reading in a Foreign Language 12: 355–380. Nation, I.S.P. and Waring, R. (1997). 
Vocabulary size, text coverage, and word lists. In Schmitt, N. and McCarthy, M. (eds.), Vocabulary: Description, Acquisition, and Pedagogy. pp. 6–19. Nattinger, J.R. and DeCarrico, J.S. (1992). Lexical Phrases and Language Teaching. Oxford: Oxford University Press. Nelson, K. (1973). Structure and Strategy in Learning to Talk. Monographs of the Society for Research in Child Development, Serial no. 149, nos 1–2. Nelson, K. (1981). Individual differences in language development: Implications for development and language. Developmental Psychology 17: 170–187. Nelson, D.L., McEvoy, C.L., and Schreiber, T.A. (1998). The University of South Florida word association, rhyme, and word fragment norms. Available on-line at . Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for teaching. Applied Linguistics 24/2: 223–242. Nesselhauf, N. (2004). How learner corpus analysis can contribute to language teaching: A study of support verb constructions. In Aston, G., Bernardini, S., and Stewart, D. (eds), Corpora and Language Learners. Amsterdam: John Benjamins. pp. 109–124. Nesselhauf, N. (2005). Collocations in a Learner Corpus. Amsterdam: John Benjamins. Newton, J. (1995). Task-based interaction and incidental vocabulary learning: A case study. Second Language Research 11: 159–177.
Nippold, M.A. (1991). Evaluating and enhancing idiom comprehension in language-disordered students. Language, Speech, and Hearing Services in Schools 22: 100–106. Cited in Cain, K., Oakhill, J., and Lemmon, K. (2005). The relation between children’s reading comprehension level and their comprehension of idioms. Journal of Experimental Child Psychology 90, 1: 65–87. Nissen, H.B. and Henriksen, B. (2006). Word class influence on word association test results. International Journal of Applied Linguistics 16, 3: 389–408. Nooteboom, S.G. (1999). Sloppiness in uttering stock phrases. International Congress of Phonetic Sciences. San Francisco. Nurweni, A., and Read, J. (1999). The English vocabulary knowledge of Indonesian university students. English for Specific Purposes 18: 161–175. O’Keeffe, A., McCarthy, M., and Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press. Olshtain, E. (1989). Is second language attrition the reversal of second language acquisition? Studies in Second Language Acquisition 11, 2: 151–165. Oppenheim, N. (2000). The importance of recurrent sequences for nonnative speaker fluency and cognition. In Riggenbach, H. (ed.), Perspectives on Fluency. Ann Arbor: University of Michigan Press. pp. 220–240. Osterhout, L., McLaughlin, J., Pitkänen, I., Frenck-Mestre, C., and Molinaro, N. (2006). Novice learners, longitudinal designs, and event-related potentials: A means for exploring the neurocognition of second language processing. Language Learning 56, Supplement 1: 199–230. Oxford, R.L. (1990). Language Learning Strategies: What Every Teacher Should Know. New York: Newbury House. Palmer, H.E., West, M.P., and Faucett, L. (1936). Interim Report on Vocabulary Selection for the Teaching of English as a Foreign Language. Report of the Carnegie Conference, New York 1934, and London 1935. London: P.S. King and Son. Paribakht, T.S. (2005). 
The influence of L1 lexicalization on L2 lexical inferencing: A Study of Farsi-speaking EFL learners. Language Learning 55, 4: 701–748. Paribakht, T.S. and Wesche, M.B. (1993). Reading comprehension and second language development in a comprehension-based ESL program. TESL Canada Journal 11: 9–29. Paribakht, T.S. and Wesche, M. (1997). Vocabulary enhancement activities and reading for meaning in second language vocabulary acquisition. In Coady, J. and Huckin, T. (eds), Second Language Vocabulary Acquisition. Cambridge University Press. pp. 174–200. Paribakht, T.S. and Wesche, M. (1999). Reading and ‘incidental’ L2 vocabulary acquisition: An introspective study of lexical inferencing. Studies in Second Language Acquisition 21: 195–224. Partington, A. (1998). Patterns and Meanings. Amsterdam: John Benjamins. Pawley, A. and Syder, F.H. (1983). Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In Richards, J.C. and Schmidt, R.W. (eds), Language and Communication. London: Longman. pp. 191–225. Pellicer Sánchez, A. and Schmitt, N. (in press). Incidental vocabulary acquisition from an authentic novel: A Clockwork Orange/multiple word knowledge approach. Reading in a Foreign Language. Peters, A. (1983). The Units of Language Acquisition. Cambridge: Cambridge University Press. Phongphio, T. and Schmitt, N. (2006). Learning English multi-word verbs in Thailand. Thai TESOL Bulletin 19: 122–136.
Pienemann, M. and Johnston, M. (1987). Factors influencing the development of language proficiency. In Nunan, D. (ed.), Applying Second Language Acquisition Research. Adelaide: National Curriculum Resource Centre. pp. 45–141. Pigada, M. and Schmitt, N. (2006). Vocabulary acquisition from extensive reading: A case study. Reading in a Foreign Language 18, 1: 1–28. Pintrich, P.R., Smith, D.A.F., Garcia, T., and McKeachie, W.J. (1991). A Manual for the Use of The Motivated Strategies for Learning Questionnaire (MSLQ). Ann Arbor: University of Michigan Press. Politzer, R.L. (1978). Paradigmatic and syntagmatic associations of first year French students. In Honsa, V. and Hardman-de-Baudents, J. (eds), Papers in Linguistics and Child Language. The Hague: Mouton. pp. 203–210. Postman, L. and Keppel, G. (1970). Norms of Word Association. New York: Academic Press. Prince, P. (1996). Second language vocabulary learning: The role of context versus translations as a function of proficiency. Modern Language Journal 80: 478–493. Pulvermüller, F. (2005). Brain mechanisms linking language and action. Nature Reviews Neuroscience 6, 7: 576–582. Available on-line at . Qian, D.D. (2002). Investigating the relationship between vocabulary knowledge and academic reading performance: An assessment perspective. Language Learning 52, 3: 513–536. Ramachandran, S.D. and Rahim, H.A. (2004). Meaning recall and retention: The impact of the translation method on elementary level learners’ vocabulary learning. RELC Journal 35, 2: 161–178. Rastle, K., Harrington, J., and Coltheart, M. (2002). 358,534 nonwords: The ARC Nonword Database. Quarterly Journal of Experimental Psychology, 55A: 1339–1362. Rayson, P. (2008). Software demonstration: Identification of multiword expressions with Wmatrix. Presentation given at the Formulaic Language Research Network (FLaRN) conference, University of Nottingham. Read, J. (1988). Measuring the vocabulary knowledge of second language learners. 
RELC Journal 19: 12–25. Read, J. (1993). The development of a new measure of L2 vocabulary knowledge. Language Testing 10: 355–371. Read, J. (1998). Validating a test to measure depth of vocabulary knowledge. In Kunnan, A. (ed.), Validation in Language Assessment. Mahwah, NJ: Lawrence Erlbaum. pp. 41–60. Read, J. (2000). Assessing Vocabulary. Cambridge: Cambridge University Press. Read, J. (2004). Research in teaching vocabulary. Annual Review of Applied Linguistics 24: 146–161. Read, J. (2005). Applying lexical statistics to the IELTS listening test. Research Notes 20: 12–16. Renandya, W.A., Rajan, B.R.S., and Jacobs, G.M. (1999). Extensive reading with adult learners of English as a second language. RELC Journal 30: 39–60. Reppen, R. and Simpson, R. (2002). Corpus linguistics. In Schmitt, N. (ed.), An Introduction to Applied Linguistics. London: Arnold. Richards, B. and Malvern, D. (2007). Validity and threats to the validity of vocabulary measurement. In Daller, H., Milton, J., and Treffers-Daller, J. (eds), Modelling and Assessing Vocabulary Knowledge. Cambridge: Cambridge University Press. pp. 79–92. Richards, J.C. (1976). The role of vocabulary teaching. TESOL Quarterly 10, 1: 77–89.
Ringbom, H. (2007). Cross-linguistic Similarity in Foreign Language Learning. Clevedon: Multilingual Matters. Rosenzweig, M.R. (1961). Comparisons among word-association responses in English, French, German, and Italian. American Journal of Psychology 74: 347–360. Rosenzweig, M.R. (1964). Word associations of French workmen: Comparisons with associations of French students and American workmen and students. Journal of Verbal Behavior and Verbal Learning 3: 57–69. Rott, S. (1999). The effect of exposure frequency on intermediate language learners’ incidental vocabulary acquisition through reading. Studies in Second Language Acquisition 21, 1: 589–619. Rott, S., Williams, J. and Cameron, R. (2002). The effect of multiple-choice L1 glosses and input-output cycles on lexical acquisition and retention. Language Teaching Research 6, 3: 183–222. Ruke-Dravina, V. (1971). Word associations in monolingual and multilingual individuals. Linguistics 74: 66–85. Ryan, A. (1997). Learning the orthographic form of L2 vocabulary – A receptive and a productive process. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition and Pedagogy. Cambridge: Cambridge University Press. Saragi, T., Nation, I.S.P., and Meister, F. (1978). Vocabulary learning and reading. System 6, 2: 72–78. Scarcella, R. and Zimmerman, C.B. (1998). Academic words and gender: ESL student performance on a test of academic lexicon. Studies in Second Language Acquisition 20: 27–49. Schmidt, R.W. (1983). Interaction, acculturation, and the acquisition of communicative competence: A case study of an adult. In N. Wolfson and E. Judd (eds), Sociolinguistics and Language Acquisition. Rowley, MA: Newbury House. pp. 137–174. Schmitt, N. (1997). Vocabulary learning strategies. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. Cambridge: Cambridge University Press. Schmitt, N. (1998a). 
Tracking the incidental acquisition of second language vocabulary: A longitudinal study. Language Learning 48, 2: 281–317. Schmitt, N. (1998b). Measuring collocational knowledge: Key issues and an experimental assessment procedure. ITL Review of Applied Linguistics 119–120: 27–47. Schmitt, N. (1998c). Quantifying word association responses: What is native-like? System 26: 389–401. Schmitt, N. (2000). Vocabulary in Language Teaching. Cambridge: Cambridge University Press. Schmitt, N. (ed.). (2004). Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam: John Benjamins. Schmitt, N. (2005). Grammar: Rules or patterning? Applied Linguistics Forum 26, 2: 1–2. Available on-line at . Schmitt, N. (2008). Instructed second language vocabulary learning. Language Teaching Research 12, 3: 329–363. Schmitt, N. and Carter, R. (2004). Formulaic sequences in action: An introduction. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. pp. 1–22. Schmitt, N., Dörnyei, Z., Adolphs, S., and Durow, V. (2004). Knowledge and acquisition of formulaic sequences: A longitudinal study. In Schmitt, N. (ed.), Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam: John Benjamins. pp. 55–86. Schmitt, N. and Dunham, B. (1999). Exploring native and non-native intuitions of word frequency. Second Language Research 15, 4: 389–411.
Schmitt, N., Grandage, S., and Adolphs, S. (2004). Are corpus-derived recurrent clusters psycholinguistically valid? In Schmitt, N. (ed.), Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam: John Benjamins. pp. 127–151. Schmitt, N., Jiang, X., and Grabe, W. (in press). The percentage of words known in a text and reading comprehension. Modern Language Journal. Schmitt, N. and Marsden, R. (2006). Why is English Like That? Historical Answers to Hard ELT Questions. Ann Arbor: University of Michigan Press. Schmitt, N. and McCarthy, M. (eds). (1997). Vocabulary: Description, Acquisition and Pedagogy. Cambridge: Cambridge University Press. Schmitt, N. and Meara, P. (1997). Researching vocabulary through a word knowledge framework: Word associations and verbal suffixes. Studies in Second Language Acquisition 19: 17–36. Schmitt, N., Schmitt, D., and Clapham, C. (2001). Developing and exploring the behaviour of two new versions of the Vocabulary Levels Test. Language Testing 18, 1: 55–88. Schmitt, N. and Zimmerman, C.B. (2002). Derivative word forms: What do learners know? TESOL Quarterly 36, 2: 145–171. Schonell, F.J., Meddleton, I.G., and Shaw, B.A. (1956). A Study of the Oral Vocabulary of Adults. Brisbane: University of Queensland Press. Schoonen, R., van Gelderen, A., de Glopper, K., Hulstijn, J., Simis, A., Snellings, P., and Stevenson, M. (2003). First language and second language writing: The role of linguistic fluency, linguistic knowledge, and metacognitive knowledge. Language Learning 53, 1: 165–202. Schoonen, R. and Verhallen, M. (2008). The assessment of deep word knowledge in young first and second language learners. Language Testing 25, 2: 211–236. Scrivener, J. (2005). Learning Teaching: A Guidebook for English Language Teachers. Oxford: Macmillan. Segalowitz, N. and Hulstijn, J. (2005). Automaticity in bilingualism and second language learning. In Kroll, J.F. and de Groot, A.M.B. (eds), Handbook of Bilingualism: Psycholinguistic Approaches. 
Oxford: Oxford University Press. pp. 371–388. Segalowitz, N.S., Poulsen, C., and Segalowitz, S.J. (1999). RT coefficient of variation is differentially sensitive to executive control involvement in an attention switching task. Brain and Cognition 38: 255–258. Segalowitz, N.S. and Segalowitz, S.J. (1993). Skilled performance, practice, and the differentiation of speed-up from automatization effects: Evidence from second language word recognition. Applied Psycholinguistics 14: 369–385. Segalowitz, S.J., Segalowitz, N.S., and Wood, A.G. (1998). Assessing the development of automaticity in second language word recognition. Applied Psycholinguistics 19: 53–67. Shapiro, B. J. (1969). The subjective estimation of relative word frequency, Journal of Verbal Learning and Verbal Behaviour 8: 248–51. Sharp, D. and Cole, M. (1972). Patterns of responding in the word associations of West African children. Child Development 43: 55–65. Shillaw, J. (1995). Using a word list as a focus for vocabulary learning. The Language Teacher 19, 2: 58–59. Shillaw, J. (1996). The application of RASCH modelling to yes/no vocabulary tests. Available on-line at . Simpson-Vlach, R. and Ellis, N.C. (in press). An academic formulas list: New methods in phraseology research. Applied Linguistics. Sinclair, J.M. (1991). Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sinclair, J. (2004). Trust the Text: Lexis, Corpus, Discourse. London: Routledge. Singleton, D. (1999). Exploring the Second Language Mental Lexicon. Cambridge: Cambridge University Press. Singleton, D. (2000). Language and the Lexicon. London: Arnold. Siyanova, A., Conklin, K., and Schmitt, N. (under review). Processing of idioms by native speakers and proficient L2 learners: An eye-tracking study. Siyanova, A. and Schmitt, N. (2007). Native and nonnative use of multi-word vs. one-word verbs. International Review of Applied Linguistics, 45: 119–139. Siyanova, A. and Schmitt, N. (2008). L2 learner production and processing of collocation: A multi-study perspective. Canadian Modern Language Review 64, 3: 429–458. Söderman, T. (1993). Word associations of foreign language learners and native speakers – different response types and their relevance to lexical development. In Hammarberg, B. (ed.), Problems, Process and Product in Language Learning. Abo: AfinLA. Sorhus, H. (1977). To hear ourselves – Implications for teaching English as a second language. English Language Teaching Journal 31: 211–21. Staehr, L. (2009). Vocabulary knowledge and advanced listening comprehension in English as a Foreign Language. Studies in Second Language Acquisition 31: 1–31. Stubbs, M. (1995). Collocations and semantic profiles: On the cause of the trouble with quantitative methods. Functions of Language 2: 1–33. Stubbs, M. (2002). Words and Phrases: Corpus Studies of Lexical Semantics. Oxford: Blackwell. Sunderman, G. and Kroll, J.F. (2006). First language activation during second language lexical processing. Studies in Second Language Acquisition 28: 387–422. Sutarsyah, C., Nation, P., and Kennedy, G. (1994). How useful is EAP vocabulary for ESP? A corpus based study. RELC Journal 25, 2: 34–50. Swan, M. (1997). The influence of the mother tongue on second language vocabulary acquisition and use. In Schmitt, N. and McCarthy, M. (eds), Vocabulary: Description, Acquisition, and Pedagogy. 
Cambridge: Cambridge University Press. Takala, S. (1984). Evaluation of Students’ Knowledge of English Vocabulary in the Finnish Comprehensive School. (Rep. No. 350). Jyväskylä, Finland: Institute of Educational Research. Teliya, V., Bragina, N., Oparina, E., and Sandomirskaya, I. (1998). Phraseology as a language of culture: Its role in the representation of a collective mentality. In Cowie, A. (ed.), Phraseology: Theory, Analysis, and Applications. Oxford: Oxford University Press. pp. 55–75. Thorndike, E.L. and Lorge, I. (1944). The Teacher’s Word Book of 30,000 Words. New York: Teachers College, Columbia University. Tomasello, M. (2003). Constructing a Language: A Usage-based Theory of Language Acquisition. Cambridge, MA: Harvard University Press. Tomiyama, M. (1999). The first stage of second language attrition: A case study of a Japanese returnee. In Hansen, L. (ed.), Second Language Attrition in Japanese Contexts. Oxford: Oxford University Press. pp. 59–79. Tremblay, A., Baayen, H., Derwing, B., and Libben, G. (2008). Lexical bundles and working memory: An ERP study. Presentation given at the (FLaRN) Formulaic Language Research Network Conference. University of Nottingham, June 19–20, 2008. Tseng, W-T., Dörnyei, Z., and Schmitt, N. (2006). A new approach to assessing strategic learning: The case of self-regulation in vocabulary acquisition. Applied Linguistics 27, 1: 78–102.
Tseng, W-T. and Schmitt, N. (2008). Towards a self-regulating model of vocabulary learning: A structural equation modeling approach. Language Learning 58, 2: 357–400. Underwood, G., Schmitt, N., and Galpin, A. (2004). The eyes have it: An eye-movement study into the processing of formulaic sequences. In Schmitt, N. (ed.), Formulaic Sequences. Amsterdam: John Benjamins. van Gelderen, A., Schoonen, R., de Glopper, K., Hulstijn, J., Simis, A., Snellings, P., and Stevenson, M. (2004). Linguistic knowledge, processing speed, and metacognitive knowledge in first- and second-language reading comprehension: A componential analysis. Journal of Educational Psychology 96, 1: 19–30. van Hout, R. and Vermeer, A. (2007). Comparing measures of lexical richness. In Daller, H., Milton, J., and Treffers-Daller, J. (eds), Modelling and Assessing Vocabulary Knowledge. Cambridge: Cambridge University Press. pp. 93–115. van Lancker, D., Canter, G.J., and Terbeek, D. (1981). Disambiguation of ditropic sentences: Acoustic and phonetic cues. Journal of Speech and Hearing Research 24: 330–335. Walters, J.M. (2004). Teaching the use of context to infer meaning: A longitudinal survey of L1 and L2 vocabulary research. Language Teaching 37, 4: 243–52. Walters, J. (2006). Methods of teaching inferring meaning from context. RELC Journal 37, 2: 176–190. Wang, M. and Koda, K. (2005). Commonalities and differences in word identification skills among learners of English as a second language. Language Learning 55, 1: 71–98. Waring, R. (1999). Tasks for assessing second language receptive and productive vocabulary. Unpublished PhD thesis, University of Wales, Swansea. Available online at . Waring, R. and Takaki, M. (2003). At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language 15: 130–163. Watanabe, Y. (1997). Input, intake, and retention: Effects of increased processing on incidental learning of foreign vocabulary. 
Studies in Second Language Acquisition 19: 287–307. Webb, S. (2005). Receptive and productive vocabulary learning: The effects of reading and writing on word knowledge. Studies in Second Language Acquisition 27: 33–52. Webb, S. (2007a). Learning word pairs and glossed sentences: The effects of a single context on vocabulary knowledge. Language Teaching Research 11, 1: 63–81. Webb, S. (2007b). The effects of repetition on vocabulary knowledge. Applied Linguistics 28, 1: 46–65. Weltens, B. (1989). The Attrition of French as a Foreign Language. Dordrecht: Foris. Weltens, B. and Grendel, M. (1993). Attrition of vocabulary knowledge. In Schreuder, R. and Weltens, B. (eds), The Bilingual Lexicon. Amsterdam: John Benjamins. Weltens, B., Van Els, T.J.M., and Schils, E. (1989). The long-term retention of French by Dutch students. Studies in Second Language Acquisition 11, 2: 205–216. Wesche, M. and Paribakht, T.S. (1996). Assessing L2 vocabulary knowledge: Depth versus breadth. Canadian Modern Language Review 53, 1: 13–40. Wesche, M. and Paribakht, T.S. (2009). Lexical Inferencing in a First and Second Language: Cross-linguistic Dimensions. Clevedon: Multilingual Matters. West, M. (1953). A General Service List of English Words. London: Longman, Green and Co.
Wilks, C. and Meara, P. (2002). Untangling word webs: Graph theory and the notion of density in second language word association networks. Second Language Research 18, 4: 303–324. Wilkins, D.A. (1972). Linguistics in Language Teaching. London: Arnold. Winne, P.H. and Perry, N.E. (2000). Measuring self-regulated learning. In Boekaerts, M., Pintrich, P.R., and Zeidner, M. (eds), Handbook of Self-Regulation. San Diego, CA: Academic Press. Wong Fillmore, L. (1976). The second time around. Doctoral dissertation, Stanford University. Wood, D. (2002). Formulaic language in acquisition and production: Implications for teaching. TESL Canada Journal 20, 1: 1–15. Woodrow, H. and Lowell, F. (1916). Children’s association frequency tables. Psychology Monographs 22, 5, No. 97. Wray, A. (2000). Formulaic sequences in second language teaching: Principle and practice. Applied Linguistics 21, 4: 463–489. Wray, A. (2002). Formulaic Language and the Lexicon. Cambridge: Cambridge University Press. Wray, A. (2008). Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press. Wray, A. and Bloomer, A. (2006). Projects in Linguistics: A Practical Guide to Researching Language. London: Hodder Arnold. Wray, A. and Perkins, M.R. (2000). The functions of formulaic language: An integrated model. Language and Communication 20: 1–28. Xiao, R. (2008). Well-known and influential corpora. In Lüdeling, A. and Kytö, M. (eds), Corpus Linguistics: An International Handbook. Berlin: Mouton de Gruyter. pp. 383–457. Xue, G. and Nation, I.S.P. (1984). A university word list. Language Learning and Communication 3, 2: 215–229. Yorio, C.A. (1980). Conventionalized language forms and the development of communicative competence. TESOL Quarterly 14, 4: 433–442. Yoshii, M. (2006). L1 and L2 glosses: Their effects on incidental vocabulary learning. Language Learning and Technology 10, 3: 85–101. Zahar, R., Cobb, T., and Spada, N. (2001). 
Acquiring vocabulary through reading: Effects of frequency and contextual richness. Canadian Modern Language Review 57, 3: 541–572. Zareva, A. (2007). Structure of the second language mental lexicon: How does it compare to native speakers’ lexical organization? Second Language Research 23, 2: 123–153. Zechmeister, E.B., Chronis, A.M., Cull, W.L., D’Anna, C.A. and Healy, N.A. (1995). Growth of a functionally important lexicon. Journal of Reading Behavior 27, 2: 201–212. Zechmeister, E.B., D’Anna, C.A., Hall, J.W., Paus, C.H., and Smith, J.A. (1993). Metacognitive and other knowledge about the mental lexicon: Do we know how many words we know? Applied Linguistics 14, 2: 188–206. Zeidner, M., Boekaerts, M., and Pintrich, P.R. (2000). Self-regulation: Directions and challenges for future research. In Boekaerts, M., Pintrich, P.R., and Zeidner, M. (eds), Handbook of Self-Regulation. San Diego, CA: Academic Press. Zimmerman, C.B. (1997). Historical trends in second language vocabulary instruction. In Coady, J. and Huckin, T. (eds), Second Language Vocabulary Acquisition. Cambridge: Cambridge University Press.
Index
absolute results, 167
academic vocabulary, 78–9
Academic Word List (AWL), 79
ARC Nonword Database, 164, 180
attrition, 23
  measuring, 256–9
automaticity, 17, 106–7
  measurement of, 242
British National Corpus, 13, 309–12
characteristics of lexical items, 48–9
coefficient of variance, 245
cognates, 73–4
computer simulations of vocabulary, 97–105
content vs. function words, 54–5
corpora, 307–35
  American National Corpus, 315–36
  British Academic Spoken Corpus, 321
  British National Corpus, 309–12
  Brown Corpus, 316–18
  ‘Brown family’ corpora, 319–20
  CANCODE, 322–3
  CHILDES Corpus, 325
  COBUILD Corpus (Bank of English), 318–19
  COLT Corpus, 325
  Corpus of Contemporary American English, 312–14
  ICAME Corpus Compilation, 331–2
  International Corpus of English, 323
  International Corpus of Learner English, 325
  Lancaster–Oslo/Bergen Corpus, 318
  London–Lund Corpus, 320
  MICASE / MICUSP, 321
  Non-English corpora, 327–31
  SUBTLEXUS, 319–20
  Time Corpus, 314–15
  VOICE Corpus, 322
corpus analysis, 12–15
  and frequency, 13–14
  and formulaic language, 123–32
  concordancers / tools, 335–43
counting units of vocabulary size, 188–93
cross-association, 56
cut-points, 187
definitions in vocabulary items, 174–7
delayed posttests, 155–8
depth of knowledge, 15–17
  measurement of, 216–42
effect size, 166–7
engagement, 26–8
equivalence of tests, 177
exposures, number necessary for learning, 30–1
extensive reading, 32
form, 24–5
form-meaning link, 49–52
formulaic language
  acquisition of, 136–41
  amount in language, 9–10, 40, 117–18
  and fluency, 11–12
  identification, 120–32
  nonnative use of, 142
  processing of, 134–6
  project, 268
  psycholinguistic reality of, 141–2
  types of, 10–11, 119–20
  with open slots, 132–4
frequency, 13, 63–71
General Service List, 13, 75–6
general vocabulary, 75–6
glosses, 34
Graph Theory, 254
guessing from context, see lexical inferencing
importance of vocabulary, 3–5
incidental vocabulary learning, 29–31
  project, 274
incremental nature of vocabulary learning, 19–22
internet, using as a corpus, 333–5
interviews, in validating measurement, 182–3
intrinsic difficulty of lexical items, 55–7
Involvement Load Hypothesis, 27
L1 influence, 25–6, 71–5
learning strategies, 89–97
Lexical Frequency Profile, 104
lexical inferencing, 32–4
Lextutor, 39, 71, 199–200, 209, 341–2
lists of vocabulary, description of, 345–7
  project, 265
Mason and Krashen critique, 169–72
meaning, 52–5
MRC Psycholinguistic Database, 163–4
multiple measures of vocabulary, 152–5
mutual information score, 130–1
network connections, 58–62
organization, measuring lexical, 247–56
participants, 150–2
pre-existing knowledge, controlling for, 179–80
productive vs. receptive mastery, see receptive vs. productive mastery
psycholinguistic approaches to vocabulary research, 105–16
qualitative research, 149–50
receptive vs. productive mastery, 21, 36, 79–89
  project, 275
reliability, 183–7
repetition of vocabulary in texts
  project, 272
sample size, 164–5
sampling from dictionaries, 193–6
size of vocabulary
  measurement of, 187–216
  of native English speakers, 6
  of nonnative English speakers, 9, 19
  project, 260
  requirements for using English, 7
spoken discourse and vocabulary, 38
synonymy, 52
target items, selection of, 158–64
task types
  project, 271
teaching pedagogy
  project, 266
technical vocabulary, 77–8
tests and measures of vocabulary
  BNC 20,000 Profile, 208
  CATSS Test, 84–7, 202, 216
  checklist tests, 199–202
  Coh-Metrix, 215–16
  collocation measures, 229–36
  Lexical Frequency Profile, 205–7
  Peabody Picture Vocabulary Test, 196
  P_Lex, 208–10
  Productive Vocabulary Levels Test, 203–5
  project, 263
  Schmitt and Zimmerman scale, 221–3
  Test of English Derivatives, 228–9
  Type-token measures, 212–15
  Vocabulary Levels Test, 197–8, 279–92
  Vocabulary Knowledge Scale, 218–21
  Vocabulary Size Test, 198–9, 293–306
  vocd (D_Tools), 213–14
  V_Links, 254–6
  V_Size, 210–12
  V_Quint, 255
  Word Associates Format, 226–8
theories of vocabulary acquisition, 36
t-score, 126
validity, 181–3
variable expressions, 132–5
vocabulary and other language proficiencies, 4–5
vocabulary control movement, 13
websites, 347–51
word associations, 18, 59–62, 248–56
  word association norms, 252
  project, 262
word families
  number of members, 8
word knowledge framework, 16–17, 79–80, 211
Zipf’s Law, 64, 211